geolatent.api.latent

High-level API: inspect_latent_space.

This module exposes the primary entry point for embedding / latent-space analysis. Unlike visualize_decision_geometry(), which renders model decision boundaries, inspect_latent_space focuses on the geometric structure of the representation itself: how well-separated are class clusters, what is the intrinsic dimensionality of the manifold, and do subgroups form coherent neighbourhoods?

The function supports arbitrary high-dimensional embeddings — word vectors, image feature maps, VAE latent codes, transformer hidden states, etc. — and reduces them to 3 principal components (or t-SNE / UMAP coordinates) before rendering.
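The PCA reduction described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not geolatent's own code: the helper name `pca_project_3d` is hypothetical, and the library may use a different solver or preprocessing.

```python
import numpy as np


def pca_project_3d(embeddings: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Project (n_samples, n_dims) embeddings onto their top 3 principal components.

    Hypothetical helper for illustration; not part of the geolatent API.
    """
    X = np.asarray(embeddings, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Thin SVD: rows of Vt are the principal directions, sorted by singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    coords = X_centered @ Vt[:3].T
    # Squared singular values are proportional to the variance along each direction.
    variance = singular_values**2
    explained_ratio = variance[:3] / variance.sum()
    return coords, explained_ratio


rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 64))
coords, ratio = pca_project_3d(demo)
print(coords.shape)
```

The explained-variance ratio gives a rough sense of how much structure survives the reduction: a low total suggests the 3-D view hides much of the original geometry.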

Usage

from geolatent import inspect_latent_space

# bert_output: BERT sentence embeddings, shape (512, 768) = (n_samples, n_dims)
# sentence_classes: one class label per sentence, shape (512,)
fig = inspect_latent_space(
    embeddings=bert_output,
    labels=sentence_classes,
    projection_method="tsne",
    title="BERT Sentence Embeddings — Topic Clusters",
)
fig.show()

Functions

inspect_latent_space(→ plotly.graph_objects.Figure)

Visualise the geometric structure of high-dimensional embeddings in 3-D.

Module Contents

geolatent.api.latent.inspect_latent_space(embeddings: numpy.ndarray, labels: numpy.ndarray, *, config: geolatent.config.themes.VisualizationConfig | None = None, projection_method: str = 'pca', show_scatter: bool = True, show_centroids: bool = True, show_ellipsoids: bool = True, show_convex_hulls: bool = False, ellipsoid_confidence: float = 0.9, class_names: Dict | None = None, title: str | None = None, point_size: int | None = None, scatter_opacity: float | None = None) → plotly.graph_objects.Figure

Visualise the geometric structure of high-dimensional embeddings in 3-D.

Projects embeddings from their native dimensionality to 3-D using the chosen dimensionality-reduction algorithm, then renders an interactive scene with class-coloured scatter clouds, centroid markers, and optional structural overlays (confidence ellipsoids, convex hulls).
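The centroid markers correspond to one mean position per class in the projected space. A minimal sketch of that grouping step follows, assuming the points are already in 3-D; the helper `class_centroids` is hypothetical and not part of geolatent.

```python
import numpy as np


def class_centroids(points_3d, labels) -> dict:
    """Return {label: mean 3-D position} for each distinct label.

    Hypothetical helper for illustration; works for integer or string labels.
    """
    points_3d = np.asarray(points_3d, dtype=float)
    labels = np.asarray(labels)
    return {
        # .item() converts the NumPy scalar to a plain Python int or str key.
        label.item(): points_3d[labels == label].mean(axis=0)
        for label in np.unique(labels)
    }


points = np.array([[0, 0, 0], [2, 2, 2], [10, 10, 10], [12, 12, 12]])
groups = ["a", "a", "b", "b"]
centroids = class_centroids(points, groups)
print(centroids["a"], centroids["b"])
```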

Parameters:
  • embeddings (array-like of shape (n_samples, n_dims)) – High-dimensional embedding vectors. Suitable inputs include transformer hidden states, GAN latent codes, learned feature maps, or any continuous representation to be analysed structurally.

  • labels (array-like of shape (n_samples,)) – Integer or string class / group labels for colour coding.

  • config (VisualizationConfig, optional) – Theme and rendering configuration. Defaults to DARK_SCIENTIFIC.

  • projection_method ({"pca", "tsne", "umap"}) – Dimensionality-reduction algorithm (default "pca"). "pca" is fast and linear, so it preserves global geometry; "tsne" and "umap" are better at revealing local cluster structure, but distances between clusters and apparent cluster sizes in their output are not reliably meaningful.

  • show_scatter (bool) – Whether to render the data-point scatter cloud (default True).

  • show_centroids (bool) – Whether to render class-centroid diamond markers (default True).

  • show_ellipsoids (bool) – Whether to overlay Mahalanobis-distance confidence ellipsoids (default True — these are the primary structural indicator in latent-space analysis).

  • show_convex_hulls (bool) – Whether to overlay transparent convex-hull surfaces per class (default False).

  • ellipsoid_confidence (float) – Confidence level for the Mahalanobis ellipsoid (default 0.90).

  • class_names (dict, optional) – Mapping from label value to display string.

  • title (str, optional) – Figure title.

  • point_size (int, optional) – Override the default scatter marker size.

  • scatter_opacity (float, optional) – Override the default scatter marker opacity.
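To make the ellipsoid_confidence parameter concrete, here is one common way to build a Mahalanobis confidence ellipsoid for a single class. It is a sketch under stated assumptions, not geolatent's documented construction: it assumes the class is roughly Gaussian in the projected space and takes the squared-distance threshold from a chi-square quantile with 3 degrees of freedom. The helper name `confidence_ellipsoid` is hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def confidence_ellipsoid(points_3d, confidence: float = 0.90):
    """Return (center, semi_axes, axes) of a Gaussian confidence ellipsoid.

    Hypothetical helper for illustration; assumes approximately Gaussian data.
    """
    pts = np.asarray(points_3d, dtype=float)
    center = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)
    # Eigenvectors give the ellipsoid orientation, eigenvalues the variance per axis.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Squared Mahalanobis distance of a Gaussian follows chi-square with d dof.
    threshold = chi2.ppf(confidence, df=pts.shape[1])
    semi_axes = np.sqrt(threshold * eigvals)
    return center, semi_axes, eigvecs


rng = np.random.default_rng(0)
cluster = rng.normal(size=(5000, 3))
center, semi_axes, axes = confidence_ellipsoid(cluster, confidence=0.90)

# Check: fraction of points whose squared Mahalanobis distance is within the threshold.
diff = cluster - center
inv_cov = np.linalg.inv(np.cov(cluster, rowvar=False))
d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
inside = float(np.mean(d2 <= chi2.ppf(0.90, df=3)))
print(f"fraction inside: {inside:.3f}")
```

For Gaussian data the fraction inside should land close to the requested confidence level. For t-SNE or UMAP output, which is usually far from Gaussian, the nominal level is at best a rough guide.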

Returns:

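fig – Interactive 3-D figure containing the enabled scatter, centroid, ellipsoid, and convex-hull overlays.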

Return type:

plotly.graph_objects.Figure

Raises:

ValueError – If embeddings or labels fail shape / finiteness validation.
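The exact validation rules are not listed here, so the following is only a guess at what "shape / finiteness validation" might cover, written as a standalone pre-check. The helper `validate_inputs` and its error messages are hypothetical.

```python
import numpy as np


def validate_inputs(embeddings, labels):
    """Check shape and finiteness before plotting.

    Hypothetical pre-check; geolatent's actual rules may differ.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    if X.ndim != 2:
        raise ValueError(f"embeddings must be 2-D (n_samples, n_dims), got shape {X.shape}")
    if y.ndim != 1 or y.shape[0] != X.shape[0]:
        raise ValueError(
            f"labels must have shape ({X.shape[0]},) to match embeddings, got {y.shape}"
        )
    if not np.isfinite(X).all():
        raise ValueError("embeddings contain NaN or infinite values")
    return X, y


bad = np.ones((4, 8))
bad[2, 5] = np.nan
try:
    validate_inputs(bad, [0, 0, 1, 1])
    caught = None
except ValueError as exc:
    caught = str(exc)
print(caught)
```

Running a check like this before calling inspect_latent_space makes it easier to tell a data problem (for example NaNs from a failed forward pass) from a plotting problem.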

Examples

>>> from geolatent import inspect_latent_space
>>> import numpy as np
>>>
>>> # Simulate 4 Gaussian clusters in a 128-D embedding space
>>> rng = np.random.default_rng(0)
>>> embeddings = np.vstack([
...     rng.normal(loc=c, scale=1.0, size=(100, 128))
...     for c in [0, 3, 6, 9]
... ])
>>> labels = np.repeat([0, 1, 2, 3], 100)
>>> fig = inspect_latent_space(
...     embeddings, labels,
...     projection_method="pca",
...     title="128-D Gaussian Clusters — PCA Projection",
... )
>>> fig.show()