
Improve efficiency of __getitem__ #101

@MarcAntoineSchmidtQC

Description

Some of our __getitem__ methods are currently inefficient. For example, column subsetting on a CategoricalMatrix converts the full matrix to a csc_matrix, even when only a few columns are requested.

Here's a list to update with potential improvements:

  • DenseMatrix: nothing to do. Already optimized with np.ndarray
  • SparseMatrix: nothing to do. Already optimized with sps.csc_matrix
  • CategoricalMatrix:
    • row: nothing to do, trivial
    • column: create a SparseMatrix containing only the selected columns/rows, instead of converting the full matrix
  • SplitMatrix:
    • Thoroughly test all the potential ways to index
  • StandardizedMatrix:
    • Not sure whether column subsetting works when there is only one row; needs checking
  • Write docstrings for expected behavior
  • Write tests covering all expected behavior
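
A rough sketch of the CategoricalMatrix column case above, to make the idea concrete. It assumes the categorical data is available as an integer code per row and that the requested column indices are unique. The function name `categorical_column_subset` and its parameters are hypothetical and are not part of the existing code base. Only NumPy and SciPy calls are used.

```python
import numpy as np
import scipy.sparse as sps


def categorical_column_subset(codes, n_categories, cols):
    """Build the one-hot columns `cols` as a CSC matrix.

    Hypothetical helper: `codes` holds one integer category per row.
    The full (n_rows x n_categories) matrix is never materialized.
    Assumes the entries of `cols` are unique.
    """
    codes = np.asarray(codes)
    cols = np.asarray(cols)
    n_rows = codes.shape[0]

    # Map each category to its position in the output, -1 if not selected.
    position = np.full(n_categories, -1, dtype=np.intp)
    position[cols] = np.arange(cols.shape[0])

    # Output column for every row; rows whose category was not selected drop out.
    out_col = position[codes]
    keep = out_col >= 0
    rows = np.nonzero(keep)[0]

    data = np.ones(rows.shape[0])
    return sps.csc_matrix(
        (data, (rows, out_col[keep])), shape=(n_rows, cols.shape[0])
    )


# Compare against a dense one-hot reference on a small example.
codes = np.array([0, 2, 1, 2, 0])
cols = np.array([2, 0])
subset = categorical_column_subset(codes, n_categories=3, cols=cols)
reference = np.eye(3)[codes][:, cols]
```

The work here is proportional to the number of rows plus the number of categories, rather than to the size of the fully converted matrix. The same dense-reference comparison could be reused as a test pattern for the other matrix types.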
