genevector.metrics
genevector/metrics.py — co-expression target functions.
- compute_mi_gpu(X, gene_names, n_bins=10, signed=False, device='cuda')
GPU-accelerated mutual information (MI) computation that builds joint histograms with PyTorch scatter_add.
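The scatter-add idea behind joint histograms can be illustrated on the CPU with NumPy's `np.add.at`, which accumulates counts at repeated indices in the same way. This is only a sketch of the technique, not the library's GPU code; `joint_histogram` and its arguments are hypothetical names introduced for illustration.

```python
import numpy as np

def joint_histogram(a, b, n_a, n_b):
    """Count co-occurrences of bin indices (a[i], b[i]). Hypothetical helper."""
    a = np.asarray(a, dtype=np.int64)
    b = np.asarray(b, dtype=np.int64)
    flat = a * n_b + b                      # one flat cell index per observation
    counts = np.zeros(n_a * n_b, dtype=np.int64)
    np.add.at(counts, flat, 1)              # scatter-add: repeated indices accumulate
    return counts.reshape(n_a, n_b)

# Four cells, gene a has 3 possible bins, gene b has 2.
hist = joint_histogram([0, 1, 1, 2], [0, 1, 1, 0], n_a=3, n_b=2)
```

Flattening the pair of bin indices into a single index is what lets one scatter-add call fill the whole 2D histogram.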
- compute_mi_rust(X, gene_names, n_bins=10, signed=False)
Rust-accelerated MI computation via PyO3.
- compute_mi_vectorized(X, gene_names, n_bins=10, signed=False)
Compute MI for all gene pairs using vectorized discretization.
- Parameters:
X (sparse or dense matrix, shape (n_cells, n_genes))
gene_names (list of str)
n_bins (int)
signed (bool) – If True, multiply MI by sign of Pearson correlation.
- Returns:
mi_scores – Nested mapping in which mi_scores[gene_a][gene_b] is the MI score (float) for that gene pair.
- Return type:
dict of dict
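As a rough illustration of the quantities described above, the following sketch computes MI between two already-discretized genes, applies the Pearson sign when a signed score is wanted, and stores results in a dict of dicts. It is an assumption-laden example rather than the library's implementation: `mutual_information` and `signed_mi` are hypothetical helpers, MI is measured in nats, and self-pairs are skipped here by choice.

```python
import numpy as np

def mutual_information(a, b):
    """MI in nats between two integer-coded vectors. Hypothetical helper."""
    a = np.asarray(a)
    b = np.asarray(b)
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1)                    # joint histogram of bin pairs
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)          # marginal of a, shape (n_a, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)          # marginal of b, shape (1, n_b)
    nz = p_ab > 0                                  # skip empty cells (0 * log 0 = 0)
    return float((p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])).sum())

def signed_mi(a_disc, b_disc, a_raw, b_raw):
    """MI multiplied by the sign of the Pearson correlation of the raw values."""
    sign = np.sign(np.corrcoef(a_raw, b_raw)[0, 1])
    return mutual_information(a_disc, b_disc) * float(sign)

gene_names = ["A", "B"]                            # illustrative gene names
raw = np.array([[0.0, 2.0], [0.0, 3.0], [1.0, 0.0], [4.0, 0.0]])
disc = np.array([[0, 1], [0, 1], [1, 0], [1, 0]])  # toy bin indices for raw

mi_scores = {g: {} for g in gene_names}
for i, gi in enumerate(gene_names):
    for j, gj in enumerate(gene_names):
        if i != j:                                 # assumption: skip self-pairs
            mi_scores[gi][gj] = signed_mi(disc[:, i], disc[:, j],
                                          raw[:, i], raw[:, j])
```

In this toy data the two genes are perfectly dependent but anticorrelated, so the signed score is negative while the unsigned MI is positive.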
- discretize_genes(X, n_bins=10)
Discretize each gene’s expression into integer bin indices.
- Parameters:
X (scipy.sparse.csr_matrix or np.ndarray) – Cells x genes expression matrix.
n_bins (int) – Number of bins per gene (excluding the zero bin).
- Returns:
X_disc (np.ndarray, shape (n_cells, n_genes), dtype=np.int32) – Discretized expression. 0 = zero expression, 1..n_bins = quantile bins of nonzero expression.
n_bins_per_gene (np.ndarray, shape (n_genes,), dtype=np.int32) – Actual number of bins used per gene (may be < n_bins if a gene has fewer unique nonzero values).
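The binning scheme described above (0 for zero expression, quantile bins for nonzero values, fewer bins when a gene has few unique values) might be sketched as follows for a dense matrix. This is one possible reading, not the library's code: `discretize_column` and `discretize_dense` are hypothetical helpers, and the quantile interpolation and tie handling are assumptions.

```python
import numpy as np

def discretize_column(col, n_bins=10):
    """Return (bin indices, number of bins used) for one gene. Hypothetical."""
    col = np.asarray(col, dtype=float)
    out = np.zeros(col.shape, dtype=np.int32)      # 0 marks zero expression
    nonzero = col != 0
    if not nonzero.any():
        return out, 0
    values = col[nonzero]
    # Interior quantile edges; duplicate edges collapse via np.unique.
    qs = np.linspace(0, 1, n_bins + 1)[1:-1]
    edges = np.unique(np.quantile(values, qs))
    raw_bins = np.searchsorted(edges, values, side="left")
    # Relabel so the occupied bins are contiguous: 0..k-1.
    _, compact = np.unique(raw_bins, return_inverse=True)
    out[nonzero] = compact + 1                     # shift so nonzero bins start at 1
    return out, int(compact.max()) + 1

def discretize_dense(X, n_bins=10):
    """Apply discretize_column to every column of a dense matrix."""
    X = np.asarray(X)
    cols = [discretize_column(X[:, j], n_bins) for j in range(X.shape[1])]
    X_disc = np.stack([c for c, _ in cols], axis=1)
    n_used = np.array([k for _, k in cols], dtype=np.int32)
    return X_disc, n_used

X = np.array([[0, 0], [0, 5], [1, 5], [2, 0], [3, 0], [4, 0]])
X_disc, n_used = discretize_dense(X, n_bins=2)
```

The second gene has a single unique nonzero value, so it ends up with one nonzero bin even though two were requested.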
- get_target_function(name)
Look up a registered target function by name.
- Parameters:
name (str) – Name of the registered target.
- Returns:
The target function.
- Return type:
callable
- Raises:
ValueError – If name is not registered.
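A name-based lookup that raises ValueError for unknown names is commonly built on a dictionary registry. The sketch below shows that general pattern only; `_TARGETS`, `register_target`, `lookup_target`, and the `"constant"` target are all hypothetical and are not taken from genevector.

```python
_TARGETS = {}  # hypothetical registry: name -> callable

def register_target(name):
    """Decorator that stores a function in the registry under `name`."""
    def decorator(fn):
        _TARGETS[name] = fn
        return fn
    return decorator

def lookup_target(name):
    """Return the registered callable, or raise ValueError if unknown."""
    try:
        return _TARGETS[name]
    except KeyError:
        known = ", ".join(sorted(_TARGETS))
        raise ValueError(f"Unknown target {name!r}; registered: {known}") from None

@register_target("constant")
def target_constant(X, gene_names, **kwargs):
    """Toy target: score 1.0 for every pair of distinct genes."""
    return {a: {b: 1.0 for b in gene_names if b != a} for a in gene_names}

fn = lookup_target("constant")
scores = fn(None, ["A", "B"])
```

Converting the KeyError into a ValueError keeps the error type aligned with the documented contract and lets the message list the valid names.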
- target_cosine(X, gene_names, **kwargs)
Cosine similarity between gene expression vectors (each gene is a vector across cells).
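Treating each gene as a column vector across cells, pairwise cosine similarity reduces to a matrix product of unit-normalized columns. The following is a minimal dense NumPy sketch of that calculation; `cosine_similarity_genes` is a hypothetical helper, and mapping all-zero genes to similarity 0 is an assumption.

```python
import numpy as np

def cosine_similarity_genes(X):
    """Genes x genes cosine similarity for a dense cells x genes matrix."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)      # one L2 norm per gene (column)
    norms[norms == 0] = 1.0                # all-zero genes yield 0, not NaN
    Xn = X / norms                         # unit-length columns
    return Xn.T @ Xn                       # dot products of unit vectors

# Two cells, three genes: genes 0 and 1 are parallel, gene 2 is orthogonal.
X = np.array([[1.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
S = cosine_similarity_genes(X)
```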
- target_jaccard(X, gene_names, **kwargs)
Jaccard index on binarized expression (gene detected / not detected).
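For binarized expression, the Jaccard index of two genes is the number of cells where both are detected divided by the number of cells where either is detected. The sketch below computes this for all pairs on a dense matrix; `jaccard_genes` is a hypothetical helper, and both the detection threshold (`> 0`) and the value 0 for genes never detected are assumptions.

```python
import numpy as np

def jaccard_genes(X):
    """Genes x genes Jaccard index on detected / not detected. Hypothetical."""
    B = (np.asarray(X) > 0).astype(np.int64)       # 1 where the gene is detected
    inter = B.T @ B                                # cells where both genes are detected
    detected = B.sum(axis=0)                       # cells per gene
    union = detected[:, None] + detected[None, :] - inter
    with np.errstate(divide="ignore", invalid="ignore"):
        # Pairs with an empty union are set to 0 instead of NaN.
        return np.where(union > 0, inter / union, 0.0)

# Three cells, three genes; gene 2 is never detected.
X = np.array([[1, 0, 0],
              [2, 3, 0],
              [0, 4, 0]])
J = jaccard_genes(X)
```

Genes 0 and 1 share one of the three cells in which either is detected, so their index is 1/3.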