geolatent.api

Public API sub-package for geolatent.

Submodules

Functions

visualize_decision_geometry(→ plotly.graph_objects.Figure)

Render the decision geometry of a scikit-learn-compatible classifier in 3-D.

inspect_latent_space(→ plotly.graph_objects.Figure)

Visualise the geometric structure of high-dimensional embeddings in 3-D.

Package Contents

geolatent.api.visualize_decision_geometry(model: object, X: numpy.ndarray, y: numpy.ndarray, *, config: geolatent.config.themes.VisualizationConfig | None = None, projection_method: str = 'pca', predict_fn=None, feature_names: List[str] | None = None, mesh_resolution: int = 30, show_surface: bool = True, show_confidence: bool = True, show_scatter: bool = True, show_centroids: bool = True, show_ellipsoids: bool = False, show_convex_hulls: bool = False, ellipsoid_confidence: float = 0.9, class_names: Dict | None = None, title: str | None = None, batch_size: int | None = None) → plotly.graph_objects.Figure

Render the decision geometry of a scikit-learn-compatible classifier in 3-D.

The input feature matrix X is standardised and projected to three dimensions, by default with PCA (t-SNE and UMAP are available for scatter-only visualisation). When PCA is used, the model’s predictions are evaluated on a regular 3-D grid that is inverse-transformed back into the original feature space, so the decision-boundary isosurfaces reflect the model’s actual geometry rather than an approximation in an arbitrary slice. The "sensitivity" projection also supports decision-surface rendering; see projection_method.
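The PCA path described above can be sketched with plain scikit-learn. This is an illustration under stated assumptions, not geolatent's internal code: the grid bounds, the resolution `res`, the choice of LogisticRegression, and every variable name are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           n_classes=3, random_state=0)

# The classifier is trained in the original 12-D feature space.
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Standardise, then project to 3 principal components.
scaler = StandardScaler().fit(X)
pca = PCA(n_components=3).fit(scaler.transform(X))
Z = pca.transform(scaler.transform(X))  # shape (n_samples, 3)

# Regular grid spanning the projected data (assumed bounds: data min/max).
res = 10
axes = [np.linspace(Z[:, i].min(), Z[:, i].max(), res) for i in range(3)]
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

# Map every grid point back to the original feature space and query the model.
X_grid = scaler.inverse_transform(pca.inverse_transform(grid))
proba = clf.predict_proba(X_grid)  # shape (res**3, n_classes)
```

Each row of `proba` is the model's own prediction at one grid point, so an isosurface drawn through this volume follows the trained classifier rather than a surrogate fitted in the 3-D space.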

Parameters:
  • model (sklearn-compatible estimator) – Must implement predict(X). Should also implement predict_proba(X), which enables richer confidence-surface rendering (recommended).

  • X (array-like of shape (n_samples, n_features)) – Training feature matrix. Will be standardised and projected internally.

  • y (array-like of shape (n_samples,)) – Class label vector. Integer or string labels are both supported.

  • config (VisualizationConfig, optional) – Custom theme and rendering configuration. Defaults to DARK_SCIENTIFIC.

  • projection_method ({"pca", "tsne", "umap", "sensitivity"}) – Dimensionality-reduction algorithm. "pca" and "sensitivity" both support decision-surface rendering. "sensitivity" uses finite-difference Jacobians to find axes the model actually cares about and works with any callable (sklearn, PyTorch, XGBoost, etc.).

  • predict_fn (callable, optional) – Required when projection_method="sensitivity" with a non-sklearn model. For sklearn models it is auto-derived from model.predict_proba or model.predict when not supplied.

  • feature_names (list of str, optional) – Names of the input features. Shown on axes and sensitivity labels.

  • mesh_resolution (int) – Grid resolution per dimension for the prediction mesh. The mesh contains mesh_resolution³ points (27,000 at the default of 30), each of which is passed through the model. Default 30.

  • show_surface (bool) – Whether to render the decision boundary / probability surfaces.

  • show_confidence (bool) – When True and the model exposes predict_proba, render nested confidence isosurfaces in addition to the primary boundary shell.

  • show_scatter (bool) – Whether to render the data-point scatter cloud.

  • show_centroids (bool) – Whether to render class-centroid diamond markers.

  • show_ellipsoids (bool) – Whether to overlay Mahalanobis-distance confidence ellipsoids.

  • show_convex_hulls (bool) – Whether to overlay transparent convex-hull surfaces per class.

  • ellipsoid_confidence (float) – Confidence level for ellipsoid construction (default 0.90 → 90 % region).

  • class_names (dict, optional) – Mapping from class label to human-readable display string.

  • title (str, optional) – Figure title. Overrides config.title when supplied.

  • batch_size (int, optional) – Batch size for model inference on the prediction mesh.
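The "sensitivity" projection is described only as using finite-difference Jacobians to find the axes the model cares about. One common way to do that, shown here as a guess rather than as geolatent's algorithm, is to average JᵀJ over the samples and keep its leading eigenvectors. The helper names, the step size `eps`, and the toy `predict_fn` below are all hypothetical.

```python
import numpy as np

def finite_difference_jacobian(predict_fn, x, eps=1e-4):
    """Central-difference Jacobian of predict_fn at a single point x, shape (d,)."""
    d = x.shape[0]
    steps = eps * np.eye(d)            # row i perturbs feature i only
    out_plus = predict_fn(x + steps)   # shape (d, n_outputs)
    out_minus = predict_fn(x - steps)  # shape (d, n_outputs)
    return ((out_plus - out_minus) / (2 * eps)).T  # shape (n_outputs, d)

def sensitivity_axes(predict_fn, X, n_axes=3, eps=1e-4):
    """Leading eigenvectors of the sample-averaged J^T J (assumed approach)."""
    d = X.shape[1]
    C = np.zeros((d, d))
    for x in X:
        J = finite_difference_jacobian(predict_fn, x, eps)
        C += J.T @ J
    C /= len(X)
    eigvals, eigvecs = np.linalg.eigh(C)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_axes]  # keep the largest n_axes
    return eigvecs[:, order], eigvals[order]

def toy_predict_fn(X):
    """Two-class probabilities that depend only on features 0 and 1."""
    logits = 2.0 * X[:, 0] - 1.0 * X[:, 1]
    p = 1.0 / (1.0 + np.exp(-logits))
    return np.column_stack([1.0 - p, p])

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
axes, strengths = sensitivity_axes(toy_predict_fn, X)
```

For the toy model, the first column of `axes` aligns (up to sign) with the direction (2, -1, 0, 0, 0), and the remaining eigenvalues are numerically zero because the model ignores the other features. Any callable that maps an (n, d) array to an (n, k) array can be used as `predict_fn` in this sketch.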

Returns:

fig – Interactive 3-D Plotly figure.

Return type:

plotly.graph_objects.Figure

Raises:
  • TypeError – If model does not expose a predict method.

  • ValueError – If X or y fail validation (shape, NaN, insufficient classes).

Examples

>>> from sklearn.ensemble import GradientBoostingClassifier
>>> from sklearn.datasets import make_classification
>>> from geolatent import visualize_decision_geometry
>>>
>>> X, y = make_classification(n_samples=400, n_features=20, n_classes=3,
...                            n_informative=10, random_state=0)
>>> clf = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)
>>> fig = visualize_decision_geometry(clf, X, y,
...                                   title="GBM — 3-class Decision Geometry")
>>> fig.show()

geolatent.api.inspect_latent_space(embeddings: numpy.ndarray, labels: numpy.ndarray, *, config: geolatent.config.themes.VisualizationConfig | None = None, projection_method: str = 'pca', show_scatter: bool = True, show_centroids: bool = True, show_ellipsoids: bool = True, show_convex_hulls: bool = False, ellipsoid_confidence: float = 0.9, class_names: Dict | None = None, title: str | None = None, point_size: int | None = None, scatter_opacity: float | None = None) → plotly.graph_objects.Figure

Visualise the geometric structure of high-dimensional embeddings in 3-D.

Projects embeddings from their native dimensionality to 3-D using the chosen dimensionality-reduction algorithm, then renders an interactive scene with class-coloured scatter clouds, centroid markers, and optional structural overlays (confidence ellipsoids, convex hulls).
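The geometry behind the scatter, centroid, and convex-hull layers can be approximated outside the library. The sketch below assumes a PCA projection and uses SciPy's ConvexHull; the `overlays` dictionary and its keys are hypothetical and do not reflect geolatent's internal data structures.

```python
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.decomposition import PCA

# Four Gaussian clusters in a 128-D embedding space.
rng = np.random.default_rng(0)
embeddings = np.vstack([
    rng.normal(loc=c, scale=1.0, size=(100, 128))
    for c in [0, 3, 6, 9]
])
labels = np.repeat([0, 1, 2, 3], 100)

# Project to 3-D.
Z = PCA(n_components=3).fit_transform(embeddings)

# Per-class centroid and convex-hull triangles in the projected space.
overlays = {}
for label in np.unique(labels):
    pts = Z[labels == label]
    hull = ConvexHull(pts)
    overlays[int(label)] = {
        "centroid": pts.mean(axis=0),  # shape (3,)
        "triangles": hull.simplices,   # (n_facets, 3) indices into pts
    }
```

The `triangles` array has the layout a triangle-mesh trace expects (three vertex indices per facet), which is why a hull is a convenient overlay for a 3-D scatter.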

Parameters:
  • embeddings (array-like of shape (n_samples, n_dims)) – High-dimensional embedding vectors. Suitable inputs include transformer hidden states, GAN latent codes, learned feature maps, or any continuous representation to be analysed structurally.

  • labels (array-like of shape (n_samples,)) – Integer or string class / group labels for colour coding.

  • config (VisualizationConfig, optional) – Theme and rendering configuration. Defaults to DARK_SCIENTIFIC.

  • projection_method ({"pca", "tsne", "umap"}) – Dimensionality-reduction algorithm. "pca" is fast and preserves global geometry; "tsne" / "umap" are better at revealing local cluster structure at the cost of interpretability.

  • show_scatter (bool) – Whether to render the data-point scatter cloud.

  • show_centroids (bool) – Whether to render class-centroid diamond markers.

  • show_ellipsoids (bool) – Whether to overlay Mahalanobis-distance confidence ellipsoids (default True — these are the primary structural indicator in latent-space analysis).

  • show_convex_hulls (bool) – Whether to overlay transparent convex-hull surfaces per class.

  • ellipsoid_confidence (float) – Confidence level for the Mahalanobis ellipsoid (default 0.90).

  • class_names (dict, optional) – Mapping from label value to display string.

  • title (str, optional) – Figure title.

  • point_size (int, optional) – Override the default scatter marker size.

  • scatter_opacity (float, optional) – Override the default scatter marker opacity.
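To make ellipsoid_confidence concrete: for roughly Gaussian class clouds, a Mahalanobis ellipsoid at confidence level p is usually built by scaling the covariance eigen-axes by the chi-square quantile with as many degrees of freedom as there are dimensions. The sketch below follows that textbook construction; whether geolatent uses exactly this scaling is an assumption, and `confidence_ellipsoid` is a hypothetical helper.

```python
import numpy as np
from scipy.stats import chi2

def confidence_ellipsoid(points, confidence=0.90):
    """Return centre, semi-axis lengths, and axis directions of the ellipsoid."""
    mean = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    # Squared Mahalanobis radius that encloses `confidence` of a Gaussian.
    r2 = chi2.ppf(confidence, df=points.shape[1])
    eigvals, eigvecs = np.linalg.eigh(cov)
    radii = np.sqrt(eigvals * r2)
    return mean, radii, eigvecs

rng = np.random.default_rng(0)
true_cov = np.array([[4.0, 1.0, 0.0],
                     [1.0, 2.0, 0.0],
                     [0.0, 0.0, 1.0]])
points = rng.multivariate_normal(mean=[1.0, -2.0, 0.5], cov=true_cov, size=5000)

mean, radii, directions = confidence_ellipsoid(points, confidence=0.90)

# Fraction of samples inside the ellipsoid, measured in its own coordinates.
local = (points - mean) @ directions / radii
fraction_inside = np.mean((local ** 2).sum(axis=1) <= 1.0)
```

With Gaussian data, `fraction_inside` lands close to the requested 0.90. Clouds with heavy tails or curved shapes will deviate, which is worth keeping in mind when reading ellipsoids on t-SNE or UMAP projections.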

Returns:

fig – Interactive 3-D Plotly figure.

Return type:

plotly.graph_objects.Figure

Raises:

ValueError – If embeddings or labels fail shape / finiteness validation.

Examples

>>> from geolatent import inspect_latent_space
>>> import numpy as np
>>>
>>> # Simulate 4 Gaussian clusters in a 128-D embedding space
>>> rng = np.random.default_rng(0)
>>> embeddings = np.vstack([
...     rng.normal(loc=c, scale=1.0, size=(100, 128))
...     for c in [0, 3, 6, 9]
... ])
>>> labels = np.repeat([0, 1, 2, 3], 100)
>>> fig = inspect_latent_space(
...     embeddings, labels,
...     projection_method="pca",
...     title="128-D Gaussian Clusters — PCA Projection",
... )
>>> fig.show()