cosine_similarity_graph_csr_data

tangles.util.graph.similarity.cosine_similarity_graph_csr_data(mat: csr_matrix, sim_thresh: float = 0.25, weight_range: list[float] | None = None, chunk_size: int = 100, verbose: bool = False)

Creates a similarity graph on the data based on cosine similarity between the data points. Works with sparse matrices and takes less memory than cosine_similarity_graph().

Parameters

mat : scipy.sparse.csr_matrix

The data.

sim_thresh : float

Minimum cosine similarity required for an edge between two data points.

weight_range : list of floats, optional

If not None, scale the edge weights to this range.

chunk_size : int

The size of the chunks.

verbose : bool

If True, print the number of nonzero elements in the matrix as the calculation proceeds.

Returns

adj_matrix : scipy.sparse.csr_matrix

A sparse adjacency matrix.
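
The following is a minimal sketch of how a chunked, thresholded cosine similarity graph can be built on sparse data. It is not the tangles implementation: the function name `chunked_cosine_graph_sketch` is hypothetical, and treating rows as data points, removing self-loops, and omitting the `weight_range` scaling are all assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix, diags, vstack


def chunked_cosine_graph_sketch(mat, sim_thresh=0.25, chunk_size=100):
    """Illustrative sketch only; hypothetical helper, not part of tangles."""
    mat = csr_matrix(mat, dtype=float)

    # L2-normalise the rows so that the dot product of two rows
    # equals their cosine similarity (assumption: rows are data points).
    norms = np.sqrt(np.asarray(mat.multiply(mat).sum(axis=1)).ravel())
    norms[norms == 0] = 1.0  # all-zero rows stay all-zero
    normed = (diags(1.0 / norms) @ mat).tocsr()

    blocks = []
    for start in range(0, normed.shape[0], chunk_size):
        # Similarities between one chunk of rows and all rows.
        block = (normed[start:start + chunk_size] @ normed.T).tocsr()
        # Discard entries below the threshold before keeping the block,
        # so only the sparse, thresholded part of each chunk is stored.
        block.data[block.data < sim_thresh] = 0.0
        block.eliminate_zeros()
        blocks.append(block)

    adj = vstack(blocks).tocsr()
    # Assumption: a data point has no edge to itself, so clear the diagonal.
    adj = (adj - diags(adj.diagonal())).tocsr()
    adj.eliminate_zeros()
    return adj


# Three data points: the first two are nearly parallel, the third is orthogonal.
data = csr_matrix(np.array([[1.0, 0.0, 1.0],
                            [1.0, 0.0, 0.9],
                            [0.0, 1.0, 0.0]]))
adj = chunked_cosine_graph_sketch(data, sim_thresh=0.25, chunk_size=2)
```

Processing the rows in chunks means that only one `chunk_size`-row block of similarities exists before thresholding at any time, which is one plausible reason a chunked approach needs less memory than computing the full similarity matrix at once.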