Skip to content

Support generalized dtype np.ndarray transformers#1518

Open
MasterSkepticista wants to merge 8 commits into
developfrom
karansh1/nparray_tobytes
Open

Support generalized dtype np.ndarray transformers#1518
MasterSkepticista wants to merge 8 commits into
developfrom
karansh1/nparray_tobytes

Conversation

@MasterSkepticista

@MasterSkepticista MasterSkepticista commented Apr 4, 2025

Copy link
Copy Markdown
Collaborator

Status quo

Today, only np.float32 arrays are transported over gRPC. There is a need for supporting wider range of (still numpy array) dtypes e.g. key shares in secure aggregation are strings.

There are two other issues:

  • Use of other numerics (np.intX, np.floatX) leads to silent typecasts to np.float32. User is not aware of this and also may lead to loss of precision.
  • Other dtypes fail with undefined behavior and/or garbage output.

Proposal

This PR proposes a generalized lossless NumpyArrayToBytes transformer which supports all dtypes that numpy supports. A new field in the tensor metadata is inserted: string dtype.

Impact

Secure aggregation currently creates a dependency on the component to serialize keys as json-encoded strings. It is not possible to separate TensorCodec from the Aggregator or Collaborator component without an informed way of sending arbitrary dtype arrays over communication channel.

Changes tested, ready for review (modulo any tests that failed due to tip moving ahead).

Signed-off-by: Shah, Karan <[email protected]>
Signed-off-by: Shah, Karan <[email protected]>
@MasterSkepticista MasterSkepticista force-pushed the karansh1/nparray_tobytes branch from 2a28559 to 1a4923a Compare April 4, 2025 11:13

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for reviewer(s): Changes in this file are unrelated to the PR. Variables are renamed for readability and python GC action

@MasterSkepticista MasterSkepticista force-pushed the karansh1/nparray_tobytes branch from 1a4923a to 439a6a7 Compare April 4, 2025 11:27
@payalcha

payalcha commented Apr 4, 2025

Copy link
Copy Markdown
Collaborator

All tests are failing. Attached participant logs for reference.
collaborator1.log
aggregator.log
collaborator2.log

@theakshaypant theakshaypant left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the reasons for the CI failing seems to be the absence of "dtype" in metadata. I believe we also need to modify places where metadata dictionary is generated from proto as it does not take proto.dtype into account.
Specifically these lines need to be changed such that

        metadata_dict[tensor_proto.name] = [
            {
                "int_to_float": proto.int_to_float,
                "int_list": proto.int_list,
                "bool_list": proto.bool_list,
                "dtype": proto.dtype,
            }
            for proto in tensor_proto.transformer_metadata
        ]

Another such instance can be found here.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

openfl/protocols/base.proto:25

  • The 'dtype' field is defined as a repeated string, yet the transformer sets a single dtype value. Consider changing the field to 'string dtype = 4;' to match the expected single value.
repeated string dtype = 4;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants