geolatent.api#
Public API sub-package for geolatent.
Submodules#
Functions#
| visualize_decision_geometry | Render the decision geometry of a scikit-learn-compatible classifier in 3-D. |
| inspect_latent_space | Visualise the geometric structure of high-dimensional embeddings in 3-D. |
Package Contents#
- geolatent.api.visualize_decision_geometry(model: object, X: numpy.ndarray, y: numpy.ndarray, *, config: geolatent.config.themes.VisualizationConfig | None = None, projection_method: str = 'pca', predict_fn=None, feature_names: List[str] | None = None, mesh_resolution: int = 30, show_surface: bool = True, show_confidence: bool = True, show_scatter: bool = True, show_centroids: bool = True, show_ellipsoids: bool = False, show_convex_hulls: bool = False, ellipsoid_confidence: float = 0.9, class_names: Dict | None = None, title: str | None = None, batch_size: int | None = None) → plotly.graph_objects.Figure
Render the decision geometry of a scikit-learn-compatible classifier in 3-D.
The input feature matrix X is projected to 3 principal components via PCA (or via t-SNE / UMAP for pure scatter visualisation). When PCA is used, the model's decision function is evaluated on a regular 3-D grid that is inverse-transformed back into the original feature space, so the rendered decision-boundary isosurfaces reflect the model's actual geometry rather than an approximation in an arbitrary slice.
- Parameters:
model (sklearn-compatible estimator) – Must implement predict(X). Implementing predict_proba(X) as well is recommended, since it enables richer confidence-surface rendering.
X (array-like of shape (n_samples, n_features)) – Training feature matrix. Standardised and projected internally.
y (array-like of shape (n_samples,)) – Class label vector. Integer or string labels are both supported.
config (VisualizationConfig, optional) – Custom theme and rendering configuration. Defaults to DARK_SCIENTIFIC.
projection_method ({"pca", "tsne", "umap", "sensitivity"}) – Dimensionality-reduction algorithm. "pca" and "sensitivity" both support decision-surface rendering. "sensitivity" uses finite-difference Jacobians to find the axes the model actually responds to and works with any callable (sklearn, PyTorch, XGBoost, etc.).
predict_fn (callable, optional) – Required when projection_method="sensitivity" is used with a non-sklearn model. For sklearn models it is derived automatically from model.predict_proba or model.predict when not supplied.
feature_names (list of str, optional) – Names of the input features. Shown on axes and sensitivity labels.
mesh_resolution (int) – Grid resolution per dimension for the prediction mesh. The model is evaluated at mesh_resolution³ grid points (27,000 at the default of 30). Default 30.
show_surface (bool) – Whether to render the decision boundary / probability surfaces.
show_confidence (bool) – When True and the model exposes predict_proba, render nested confidence isosurfaces in addition to the primary boundary shell.
show_scatter (bool) – Whether to render the data-point scatter cloud.
show_centroids (bool) – Whether to render class-centroid diamond markers.
show_ellipsoids (bool) – Whether to overlay Mahalanobis-distance confidence ellipsoids.
show_convex_hulls (bool) – Whether to overlay transparent convex-hull surfaces per class.
ellipsoid_confidence (float) – Confidence level for ellipsoid construction (default 0.90 → 90 % region).
class_names (dict, optional) – Mapping from class label to human-readable display string.
title (str, optional) – Figure title. Overrides config.title when supplied.
batch_size (int, optional) – Batch size for model inference on the prediction mesh.
- Returns:
fig – Interactive 3-D Plotly figure.
- Return type:
plotly.graph_objects.Figure
- Raises:
TypeError – If model does not expose a predict method.
ValueError – If X or y fail validation (shape, NaN, insufficient classes).
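The "sensitivity" projection is described above only as using finite-difference Jacobians to find the axes the model cares about. The following is a minimal sketch of one way such axes could be computed, not geolatent's actual implementation: it averages J^T J over the samples and keeps the leading eigenvectors. The helper names (finite_difference_jacobian, sensitivity_axes), the step size eps, and the toy predict_fn are all assumptions made for illustration.

```python
import numpy as np


def finite_difference_jacobian(predict_fn, x, eps=1e-4):
    """Central-difference Jacobian of predict_fn at a single point x.

    Returns an array of shape (n_outputs, n_features), or (n_features,)
    when predict_fn yields one scalar per sample.
    """
    x = np.asarray(x, dtype=float)
    columns = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        plus = np.asarray(predict_fn((x + step)[None, :]))[0]
        minus = np.asarray(predict_fn((x - step)[None, :]))[0]
        columns.append((plus - minus) / (2.0 * eps))
    return np.stack(columns, axis=-1)


def sensitivity_axes(predict_fn, X, n_axes=3, eps=1e-4):
    """Leading eigenvectors of the sample-averaged J^T J matrix (hypothetical helper)."""
    n_features = X.shape[1]
    gram = np.zeros((n_features, n_features))
    for x in X:
        jac = np.atleast_2d(finite_difference_jacobian(predict_fn, x, eps))
        gram += jac.T @ jac
    gram /= len(X)
    eigvals, eigvecs = np.linalg.eigh(gram)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_axes]       # largest first
    return eigvecs[:, order].T, eigvals[order]


# Toy callable (illustrative): a logistic model that only uses features 0 and 1.
w = np.array([3.0, -4.0, 0.0, 0.0, 0.0, 0.0])


def predict_fn(points):
    p1 = 1.0 / (1.0 + np.exp(-(points @ w)))
    return np.stack([1.0 - p1, p1], axis=1)


rng = np.random.default_rng(0)
X_demo = 0.3 * rng.normal(size=(40, 6))
axes, strengths = sensitivity_axes(predict_fn, X_demo)
```

In this toy setup the first recovered axis lines up with w, the only direction along which the predictions change; the remaining axes carry essentially no sensitivity.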
Examples
>>> from sklearn.ensemble import GradientBoostingClassifier
>>> from sklearn.datasets import make_classification
>>> from geolatent import visualize_decision_geometry
>>>
>>> X, y = make_classification(n_samples=400, n_features=20, n_classes=3,
...                            n_informative=10, random_state=0)
>>> clf = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)
>>> fig = visualize_decision_geometry(clf, X, y,
...                                   title="GBM — 3-class Decision Geometry")
>>> fig.show()
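The PCA path described for visualize_decision_geometry (project to 3-D, build a regular grid, inverse-transform it to feature space, evaluate the model) can be sketched with NumPy alone. This is an illustration under stated assumptions rather than the library's code: the nearest-centroid predict_proba stands in for a fitted classifier, PCA is done through an SVD, standardisation is skipped, and the batch loop only mimics what batch_size presumably controls.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian classes in 5-D (illustrative only).
X = np.vstack([rng.normal(-2.0, 1.0, size=(50, 5)),
               rng.normal(+2.0, 1.0, size=(50, 5))])
y = np.repeat([0, 1], 50)

# Hypothetical stand-in for a fitted classifier: softmax over negative
# squared distances to the class centroids.
centroids = np.stack([X[y == k].mean(axis=0) for k in (0, 1)])


def predict_proba(points):
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


# PCA to 3 components via SVD of the centred data.
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:3]                               # shape (3, n_features)
Z = (X - mean) @ components.T                     # shape (n_samples, 3)

# Regular 3-D grid spanning the projected data.
mesh_resolution = 12                              # 12**3 = 1728 grid points
axes_1d = [np.linspace(Z[:, i].min(), Z[:, i].max(), mesh_resolution)
           for i in range(3)]
grid = np.stack(np.meshgrid(*axes_1d, indexing="ij"), axis=-1).reshape(-1, 3)

# Inverse-transform the grid into feature space and evaluate in batches.
grid_features = grid @ components + mean
batch_size = 500
proba = np.concatenate([predict_proba(grid_features[i:i + batch_size])
                        for i in range(0, len(grid_features), batch_size)])

# Scalar field whose 0.5 level set is the class-1 decision boundary.
field = proba[:, 1].reshape(mesh_resolution, mesh_resolution, mesh_resolution)
```

A field like this is the kind of input an isosurface renderer needs. The grid size grows cubically with mesh_resolution, which is why batching the inference matters for expensive models.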
- geolatent.api.inspect_latent_space(embeddings: numpy.ndarray, labels: numpy.ndarray, *, config: geolatent.config.themes.VisualizationConfig | None = None, projection_method: str = 'pca', show_scatter: bool = True, show_centroids: bool = True, show_ellipsoids: bool = True, show_convex_hulls: bool = False, ellipsoid_confidence: float = 0.9, class_names: Dict | None = None, title: str | None = None, point_size: int | None = None, scatter_opacity: float | None = None) → plotly.graph_objects.Figure
Visualise the geometric structure of high-dimensional embeddings in 3-D.
Projects embeddings from their native dimensionality to 3-D using the chosen dimensionality-reduction algorithm, then renders an interactive scene with class-coloured scatter clouds, centroid markers, and optional structural overlays (confidence ellipsoids, convex hulls).
- Parameters:
embeddings (array-like of shape (n_samples, n_dims)) – High-dimensional embedding vectors. Suitable inputs include transformer hidden states, GAN latent codes, learned feature maps, or any continuous representation to be analysed structurally.
labels (array-like of shape (n_samples,)) – Integer or string class / group labels for colour coding.
config (VisualizationConfig, optional) – Theme and rendering configuration. Defaults to DARK_SCIENTIFIC.
projection_method ({"pca", "tsne", "umap"}) – Dimensionality-reduction algorithm. "pca" is fast and preserves global geometry; "tsne" / "umap" are better at revealing local cluster structure at the cost of interpretability.
show_scatter (bool) – Whether to render the data-point scatter cloud.
show_centroids (bool) – Whether to render class-centroid diamond markers.
show_ellipsoids (bool) – Whether to overlay Mahalanobis-distance confidence ellipsoids (default True, since these are the primary structural indicator in latent-space analysis).
show_convex_hulls (bool) – Whether to overlay transparent convex-hull surfaces per class.
ellipsoid_confidence (float) – Confidence level for the Mahalanobis ellipsoid (default 0.90).
class_names (dict, optional) – Mapping from label value to display string.
title (str, optional) – Figure title.
point_size (int, optional) – Override the default scatter marker size.
scatter_opacity (float, optional) – Override the default scatter marker opacity.
- Returns:
fig
- Return type:
plotly.graph_objects.Figure
- Raises:
ValueError – If embeddings or labels fail shape / finiteness validation.
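To make the ellipsoid_confidence parameter concrete, here is a minimal sketch of how a Mahalanobis-distance confidence ellipsoid can be built for one class in the projected 3-D space. It assumes the class is roughly Gaussian, so the squared Mahalanobis radius is the chi-square quantile with 3 degrees of freedom. Whether geolatent uses exactly this construction is not stated in this page, and the helper names are hypothetical. The quantile is found by bisection on the closed-form 3-dof CDF to keep the sketch dependency-free; scipy.stats.chi2.ppf would be the usual choice.

```python
import math

import numpy as np


def chi2_cdf_3dof(x):
    """Closed-form chi-square CDF for 3 degrees of freedom."""
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


def chi2_quantile_3dof(confidence, lo=0.0, hi=100.0):
    """Invert the CDF by bisection (the CDF is monotonically increasing)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_3dof(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ellipsoid_surface(points, confidence=0.9, n=20):
    """Surface points of the confidence ellipsoid of a 3-D point cloud."""
    mu = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    radius = math.sqrt(chi2_quantile_3dof(confidence))
    chol = np.linalg.cholesky(cov)                 # cov = chol @ chol.T
    theta, phi = np.meshgrid(np.linspace(0.0, np.pi, n),
                             np.linspace(0.0, 2.0 * np.pi, n),
                             indexing="ij")
    sphere = np.stack([np.sin(theta) * np.cos(phi),
                       np.sin(theta) * np.sin(phi),
                       np.cos(theta)], axis=-1).reshape(-1, 3)
    # Map the unit sphere through the Cholesky factor and scale by the radius.
    return mu + radius * sphere @ chol.T, mu, cov


rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.3, 0.0], [0.0, 1.0, 0.2], [0.0, 0.0, 0.5]])
cluster = rng.normal(size=(500, 3)) @ mixing
surface, mu, cov = ellipsoid_surface(cluster, confidence=0.9)

# Every surface point sits at the same squared Mahalanobis distance.
diff = surface - mu
d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
```

With confidence=0.9 the squared radius is about 6.25, so under the Gaussian assumption roughly 90 % of a class's points would fall inside the shell.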
Examples
>>> from geolatent import inspect_latent_space
>>> import numpy as np
>>>
>>> # Simulate 4 Gaussian clusters in a 128-D embedding space
>>> rng = np.random.default_rng(0)
>>> embeddings = np.vstack([
...     rng.normal(loc=c, scale=1.0, size=(100, 128))
...     for c in [0, 3, 6, 9]
... ])
>>> labels = np.repeat([0, 1, 2, 3], 100)
>>> fig = inspect_latent_space(
...     embeddings, labels,
...     projection_method="pca",
...     title="128-D Gaussian Clusters — PCA Projection",
... )
>>> fig.show()