geolatent.api.latent#
High-level API: inspect_latent_space.
This module exposes the primary entry point for embedding / latent-space analysis.
Unlike visualize_decision_geometry(), which
renders model decision boundaries, inspect_latent_space focuses on the
geometric structure of the representation itself: how well-separated are
class clusters, what is the intrinsic dimensionality of the manifold, and do
subgroups form coherent neighbourhoods?
The function accepts arbitrary high-dimensional embeddings (word vectors, image feature maps, VAE latent codes, transformer hidden states, and so on) and reduces them to three dimensions before rendering: the first 3 principal components by default, or t-SNE / UMAP coordinates when requested via projection_method.
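To make the default reduction step concrete, the following is a minimal sketch of a 3-component PCA projection written directly in NumPy. The helper name pca_project_3d is hypothetical and not part of geolatent; the library's own projection code may differ (for example, it may delegate to an external PCA implementation).

```python
import numpy as np


def pca_project_3d(embeddings: np.ndarray) -> np.ndarray:
    """Project (n_samples, n_dims) embeddings onto their first 3 principal components.

    Hypothetical helper for illustration only; not part of geolatent.
    """
    x = np.asarray(embeddings, dtype=float)
    if x.ndim != 2 or x.shape[0] < 3 or x.shape[1] < 3:
        raise ValueError("expected a 2-D array with at least 3 rows and 3 columns")
    # PCA operates on mean-centred data.
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T


rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 64))   # synthetic stand-in for real embeddings
coords = pca_project_3d(demo)
print(coords.shape)  # (200, 3)
```

The three output columns are ordered by decreasing variance, which is why a PCA view tends to preserve the coarse, global layout of the clusters.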
Usage#
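from geolatent import inspect_latent_space

# bert_output: BERT sentence embeddings, shape (512, 768)
# sentence_classes: one topic label per sentence, shape (512,)
# (both are assumed to have been computed beforehand)
fig = inspect_latent_space(
    embeddings=bert_output,
    labels=sentence_classes,
    projection_method="tsne",
    title="BERT Sentence Embeddings: Topic Clusters",
)
fig.show()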
Functions#
inspect_latent_space: Visualise the geometric structure of high-dimensional embeddings in 3-D.
Module Contents#
- geolatent.api.latent.inspect_latent_space(embeddings: numpy.ndarray, labels: numpy.ndarray, *, config: geolatent.config.themes.VisualizationConfig | None = None, projection_method: str = 'pca', show_scatter: bool = True, show_centroids: bool = True, show_ellipsoids: bool = True, show_convex_hulls: bool = False, ellipsoid_confidence: float = 0.9, class_names: Dict | None = None, title: str | None = None, point_size: int | None = None, scatter_opacity: float | None = None) → plotly.graph_objects.Figure
Visualise the geometric structure of high-dimensional embeddings in 3-D.
Projects embeddings from their native dimensionality to 3-D using the chosen dimensionality-reduction algorithm, then renders an interactive scene with class-coloured scatter clouds, centroid markers, and optional structural overlays (confidence ellipsoids, convex hulls).
- Parameters:
embeddings (array-like of shape (n_samples, n_dims)) – High-dimensional embedding vectors. Suitable inputs include transformer hidden states, GAN latent codes, learned feature maps, or any continuous representation to be analysed structurally.
labels (array-like of shape (n_samples,)) – Integer or string class / group labels for colour coding.
config (VisualizationConfig, optional) – Theme and rendering configuration. Defaults to DARK_SCIENTIFIC.
projection_method ({"pca", "tsne", "umap"}) – Dimensionality-reduction algorithm. "pca" is fast and preserves global geometry; "tsne" / "umap" are better at revealing local cluster structure at the cost of interpretability.
show_scatter (bool) – Whether to render the data-point scatter cloud.
show_centroids (bool) – Whether to render class-centroid diamond markers.
show_ellipsoids (bool) – Whether to overlay Mahalanobis-distance confidence ellipsoids (default True; these are the primary structural indicator in latent-space analysis).
show_convex_hulls (bool) – Whether to overlay transparent convex-hull surfaces per class.
ellipsoid_confidence (float) – Confidence level for the Mahalanobis ellipsoid (default 0.90).
class_names (dict, optional) – Mapping from label value to display string.
title (str, optional) – Figure title.
point_size (int, optional) – Override the default scatter marker size.
scatter_opacity (float, optional) – Override the default scatter marker opacity.
- Returns:
fig
- Return type:
plotly.graph_objects.Figure
- Raises:
ValueError – If embeddings or labels fail shape / finiteness validation.
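As an illustration of what shape / finiteness validation could look like, here is a small sketch. The function validate_inputs is a hypothetical stand-in; the exact checks and error messages used inside geolatent are not documented here and may differ.

```python
import numpy as np


def validate_inputs(embeddings, labels):
    """Hypothetical validator; geolatent's actual checks may differ."""
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    if x.ndim != 2:
        raise ValueError(f"embeddings must be 2-D, got shape {x.shape}")
    if y.ndim != 1 or y.shape[0] != x.shape[0]:
        raise ValueError(f"labels must have shape ({x.shape[0]},), got {y.shape}")
    if not np.isfinite(x).all():
        raise ValueError("embeddings contain NaN or infinite values")
    return x, y


good = np.zeros((4, 8))
x, y = validate_inputs(good, ["a", "a", "b", "b"])  # passes

bad = good.copy()
bad[0, 0] = np.nan
try:
    validate_inputs(bad, [0, 0, 1, 1])
except ValueError as err:
    print(err)  # embeddings contain NaN or infinite values
```

Running a check like this before calling inspect_latent_space makes it easier to locate the offending rows, since the ValueError raised by the library only signals that validation failed.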
Examples
>>> from geolatent import inspect_latent_space
>>> import numpy as np
>>>
>>> # Simulate 4 Gaussian clusters in a 128-D embedding space
>>> rng = np.random.default_rng(0)
>>> embeddings = np.vstack([
...     rng.normal(loc=c, scale=1.0, size=(100, 128))
...     for c in [0, 3, 6, 9]
... ])
>>> labels = np.repeat([0, 1, 2, 3], 100)
>>> fig = inspect_latent_space(
...     embeddings, labels,
...     projection_method="pca",
...     title="128-D Gaussian Clusters: PCA Projection",
... )
>>> fig.show()
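The overlays controlled by show_ellipsoids and ellipsoid_confidence can be understood through the textbook construction of a Mahalanobis confidence ellipsoid. The sketch below assumes each projected class is roughly Gaussian in 3-D, so that the squared Mahalanobis distance follows a chi-square distribution with 3 degrees of freedom. Both helper functions are hypothetical, and whether geolatent computes its ellipsoids in exactly this way is an assumption.

```python
import math

import numpy as np


def chi2_quantile_3dof(confidence: float) -> float:
    """Chi-square quantile for 3 degrees of freedom, found by bisection."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    def cdf(x: float) -> float:
        # Closed-form chi-square CDF for 3 degrees of freedom.
        return math.erf(math.sqrt(x / 2.0)) - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

    lo, hi = 0.0, 1.0
    while cdf(hi) < confidence:  # grow the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def confidence_ellipsoid(points_3d, confidence: float = 0.9):
    """Return (center, radii, axes) of a Mahalanobis confidence ellipsoid.

    Hypothetical helper; columns of `axes` are the ellipsoid's principal directions.
    """
    pts = np.asarray(points_3d, dtype=float)
    center = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)
    eigvals, axes = np.linalg.eigh(cov)
    radii = np.sqrt(np.clip(eigvals, 0.0, None) * chi2_quantile_3dof(confidence))
    return center, radii, axes


rng = np.random.default_rng(0)
cluster = rng.normal(size=(5000, 3)) * np.array([3.0, 1.0, 0.5])  # synthetic class
center, radii, axes = confidence_ellipsoid(cluster, confidence=0.9)

# A point is inside when its coordinates along the axes, scaled by the radii,
# have squared norm at most 1.
local = (cluster - center) @ axes
inside = ((local / radii) ** 2).sum(axis=1) <= 1.0
print(round(chi2_quantile_3dof(0.9), 2))  # 6.25
```

For Gaussian data, the fraction of points falling inside the ellipsoid should be close to the requested confidence level; strong departures from that fraction suggest the class is not well described by a single Gaussian blob in the projected space.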