cosine_similarity
- tangles.util.graph.similarity.cosine_similarity(data: ndarray, sim_thresh: float = 1e-10, max_neighbours: int = None, return_sparse: bool = True, sequential: bool = True, chunk_size: int = 1000) -> ndarray | csr_matrix
Return the cosine similarity matrix of the rows of the matrix data.
Parameters
- data : np.ndarray
The data matrix. Each row is treated as one vector.
- sim_thresh : float
Similarities smaller than sim_thresh are set to 0.
- return_sparse : bool
Whether to return a sparse matrix.
- sequential : bool
Whether to compute the similarities sequentially, which uses less memory. With sequential == True the computation is a bit slower for small matrices.
- chunk_size : int
Number of similarities computed per chunk. Only used when the similarities are computed sequentially (sequential == True).
Returns
- np.ndarray or scipy.sparse.csr_matrix
A matrix of shape (data.shape[0], data.shape[0]) containing the cosine similarities of the rows.
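The following NumPy sketch illustrates what the parameters describe: row-wise cosine similarity, thresholding with sim_thresh, and computing the result in chunks of rows. It is not the tangles implementation. The function name cosine_similarity_sketch is hypothetical, and the sketch makes three assumptions: all-zero rows get similarity 0, "smaller than sim_thresh" is read literally so negative similarities are also set to 0, and chunk_size is taken to mean a number of rows. It always returns a dense array and does not model max_neighbours or return_sparse.

```python
import numpy as np


def cosine_similarity_sketch(data, sim_thresh=1e-10, chunk_size=1000):
    """Illustrative only: dense cosine similarities of the rows of `data`.

    Hypothetical helper, not part of tangles.
    """
    data = np.asarray(data, dtype=float)
    norms = np.linalg.norm(data, axis=1, keepdims=True)
    # Normalise each row to unit length; all-zero rows stay zero (assumption).
    unit = np.divide(data, norms, out=np.zeros_like(data), where=norms > 0)

    n = data.shape[0]
    sim = np.empty((n, n))
    # Compute the dot products one block of rows at a time, so only a
    # (chunk_size, n) block is produced per step.
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        block = unit[start:stop] @ unit.T
        # Literal reading of the threshold rule: everything below it becomes 0.
        block[block < sim_thresh] = 0.0
        sim[start:stop] = block
    return sim


data = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0], [-1.0, 0.0]])
sim = cosine_similarity_sketch(data, chunk_size=2)
print(sim.shape)
```

Because the sketch keeps a dense result, chunking here mainly shows the control flow. A memory saving would be expected chiefly when each thresholded block is stored in sparse form, which is presumably why the function offers return_sparse.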