geolatent.core#

Core computation sub-package for geolatent.

Contains the dimensionality-reduction pipeline, prediction-mesh builder, and geometric utility functions.

Submodules#

Classes#

GeometryUtils

Collection of geometric-analysis routines and Plotly trace generators.

MeshBuilder

Constructs prediction meshes for 3-D decision-surface rendering.

PredictionMesh

Outputs of a decision-surface prediction sweep over a 3-D grid.

DimensionalityProjector

Unified dimensionality-reduction pipeline.

ProjectionResult

Container returned by DimensionalityProjector.fit_transform().

Package Contents#

class geolatent.core.GeometryUtils(config: geolatent.config.themes.VisualizationConfig)[source]#

Collection of geometric-analysis routines and Plotly trace generators.

All methods accept the 3-D projected coordinate array X_proj and the label vector y, and return fully configured Plotly traces (a single trace or a list of traces, depending on the method) that can be added directly to a Scene3D.

Parameters:

config (VisualizationConfig) – Master configuration; colour palette and render settings are read here.

config#
compute_class_ellipsoids(X_proj: numpy.ndarray, y: numpy.ndarray, *, confidence: float = 0.9, n_grid: int = 40, class_names: Dict | None = None) List[plotly.graph_objects.Surface][source]#

Generate parametric ellipsoid traces for each class.

The ellipsoid is derived from the empirical covariance matrix of the class subset and scaled so that it encloses the fraction confidence of the probability mass under a multivariate-Gaussian assumption.

Parameters:
  • X_proj (np.ndarray of shape (n_samples, 3)) – Projected data coordinates.

  • y (np.ndarray of shape (n_samples,)) – Class label vector.

  • confidence (float) – Fraction of the distribution to enclose (e.g., 0.90 → 90 % region).

  • n_grid (int) – Resolution of the parametric ellipsoid surface mesh.

  • class_names (dict, optional) – Mapping from class label to display string.

Returns:

traces

Return type:

list of go.Surface
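The covariance-based construction described for this method can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's implementation: the helper names chi2_quantile_3dof and ellipsoid_surface are hypothetical, and the scaling assumes that the squared Mahalanobis radius of a 3-D Gaussian follows a chi-square distribution with 3 degrees of freedom.

```python
import math

import numpy as np


def chi2_cdf_3dof(x: float) -> float:
    """CDF of the chi-square distribution with 3 degrees of freedom."""
    if x <= 0.0:
        return 0.0
    return math.erf(math.sqrt(x / 2.0)) - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


def chi2_quantile_3dof(p: float) -> float:
    """Invert the CDF by bisection (hypothetical helper, avoids a SciPy dependency)."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_3dof(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ellipsoid_surface(points: np.ndarray, confidence: float = 0.9, n_grid: int = 40):
    """Return (x, y, z) surface grids of the confidence ellipsoid of `points`."""
    mean = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Semi-axis lengths: sqrt(eigenvalue * chi-square quantile).
    radii = np.sqrt(np.clip(eigvals, 0.0, None) * chi2_quantile_3dof(confidence))
    u = np.linspace(0.0, 2.0 * np.pi, n_grid)
    v = np.linspace(0.0, np.pi, n_grid)
    # Unit sphere in parametric form, shape (n_grid, n_grid, 3).
    sphere = np.stack(
        [np.outer(np.cos(u), np.sin(v)),
         np.outer(np.sin(u), np.sin(v)),
         np.outer(np.ones_like(u), np.cos(v))],
        axis=-1,
    )
    # Scale along the principal axes, rotate into data coordinates, then shift.
    surface = (sphere * radii) @ eigvecs.T + mean
    return surface[..., 0], surface[..., 1], surface[..., 2]


rng = np.random.default_rng(0)
pts = rng.normal(size=(500, 3))  # synthetic stand-in for one class subset
xs, ys, zs = ellipsoid_surface(pts, confidence=0.9, n_grid=40)
```

The three returned grids have shape (n_grid, n_grid) and could be passed as the x, y, and z arguments of a go.Surface trace.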

compute_class_centroids(X_proj: numpy.ndarray, y: numpy.ndarray, *, class_names: Dict | None = None) plotly.graph_objects.Scatter3d[source]#

Build a single Scatter3d trace for all class centroids.

Parameters:
  • X_proj (np.ndarray of shape (n_samples, 3))

  • y (np.ndarray of shape (n_samples,))

  • class_names (dict, optional)

Returns:

trace – Star-shaped markers at the class-mean coordinates.

Return type:

go.Scatter3d
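As a rough sketch of the class-mean computation behind this trace, the snippet below groups the projected points by label and averages each group. The function name class_centroids is hypothetical and the data are made up for illustration.

```python
import numpy as np


def class_centroids(X_proj: np.ndarray, y: np.ndarray):
    """Return the sorted unique labels and the mean 3-D coordinate of each class."""
    labels = np.unique(y)
    centroids = np.stack([X_proj[y == label].mean(axis=0) for label in labels])
    return labels, centroids


X_proj = np.array([[0.0, 0.0, 0.0],
                   [2.0, 2.0, 2.0],
                   [10.0, 0.0, 0.0],
                   [12.0, 2.0, 4.0]])
y = np.array([0, 0, 1, 1])
labels, centroids = class_centroids(X_proj, y)
# centroids has one row per class: [1, 1, 1] for class 0 and [11, 1, 2] for class 1
```

The columns of centroids could then be supplied as the x, y, and z coordinates of a single go.Scatter3d trace.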

compute_convex_hull_traces(X_proj: numpy.ndarray, y: numpy.ndarray, *, class_names: Dict | None = None) List[plotly.graph_objects.Mesh3d][source]#

Generate convex-hull surface traces for each class cluster.

Parameters:
  • X_proj (np.ndarray of shape (n_samples, 3))

  • y (np.ndarray of shape (n_samples,))

  • class_names (dict, optional)

Returns:

traces – One mesh per class, rendered at very low opacity.

Return type:

list of go.Mesh3d

class geolatent.core.MeshBuilder(resolution: int = 30, padding_fraction: float = 0.12, batch_size: int | None = None)[source]#

Constructs prediction meshes for 3-D decision-surface rendering.

Parameters:
  • resolution (int) – Number of grid points per spatial dimension. Total vertex count equals resolution³, so the default of 30 yields 27,000 vertices, which is usually enough for a smooth surface while keeping inference fast for most models.

  • padding_fraction (float) – Fractional extension applied beyond the data bounding box on each axis. A value of 0.12 extends the grid by 12 % of the data range on every side, preventing clipping at the edges of the scatter cloud.

  • batch_size (int or None) – Maximum number of points passed to the model in a single predict call. None passes all points in one call. Set to e.g. 4096 for memory-constrained models.
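The interplay of resolution and padding_fraction can be illustrated with a short NumPy sketch. The function padded_grid is hypothetical, and the "ij" vertex ordering is an assumption; the library may order its flattened vertices differently.

```python
import numpy as np


def padded_grid(X_proj: np.ndarray, resolution: int = 30, padding_fraction: float = 0.12):
    """Build a regular 3-D grid covering the padded bounding box of X_proj."""
    lo = X_proj.min(axis=0)
    hi = X_proj.max(axis=0)
    pad = (hi - lo) * padding_fraction           # per-axis padding on each side
    bounds = np.stack([lo - pad, hi + pad], axis=1)  # shape (3, 2)
    axes = [np.linspace(b[0], b[1], resolution) for b in bounds]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    points = np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])
    return points, bounds


X_proj = np.array([[0.0, 0.0, 0.0],
                   [1.0, 2.0, 4.0]])
points, bounds = padded_grid(X_proj, resolution=25)
# points has 25**3 = 15625 rows; the x-axis spans [-0.12, 1.12]
```

With a data range of 1, 2, and 4 along the three axes, a padding_fraction of 0.12 extends each side by 0.12, 0.24, and 0.48 respectively.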

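Examples

>>> from geolatent.core.mesh_builder import MeshBuilder
>>> # clf: fitted binary classifier exposing predict_proba; projector: fitted PCA projector; X_3d: projected training data
>>> builder = MeshBuilder(resolution=25)
>>> mesh = builder.build_prediction_mesh(clf, projector, X_3d)
>>> mesh.probabilities.shape
(15625, 2)

resolution = 30#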
padding_fraction = 0.12#
batch_size = None#
build_prediction_mesh(model: object, projector: geolatent.core.projector.DimensionalityProjector, X_proj: numpy.ndarray) PredictionMesh[source]#

Build a prediction mesh for model over the region spanned by X_proj.

Parameters:
  • model (sklearn-compatible estimator) – Must implement at least predict(X).

  • projector (DimensionalityProjector) – Fitted projector that supports inverse_transform (currently PCA or the identity projection).

  • X_proj (np.ndarray of shape (n_samples, 3)) – Training data in projected 3-D space, used to define the bounding box.

Returns:

mesh

Return type:

PredictionMesh

Raises:

ValueError – If projector does not support inverse_transform.
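One plausible reading of the sweep is: map each grid vertex back to feature space, then call the model in chunks of at most batch_size points. The sketch below follows that reading. The function predict_on_grid, its supports_inverse flag, and the stand-in inverse_fn and predict_fn callables are all hypothetical and only mimic the documented behaviour.

```python
import numpy as np


def predict_on_grid(predict_fn, inverse_fn, grid_points, batch_size=None, supports_inverse=True):
    """Predict on grid vertices in chunks after mapping them to feature space."""
    if not supports_inverse:
        raise ValueError("projector does not support inverse_transform")
    n_points = len(grid_points)
    # batch_size=None means a single call over all points.
    step = max(1, n_points if batch_size is None else batch_size)
    outputs = []
    for start in range(0, n_points, step):
        chunk = grid_points[start:start + step]
        outputs.append(predict_fn(inverse_fn(chunk)))
    return np.concatenate(outputs, axis=0)


rng = np.random.default_rng(0)
basis = rng.normal(size=(3, 5))                            # stand-in linear inverse map, 3-D to 5-D
inverse_fn = lambda pts: pts @ basis                       # plays the role of inverse_transform
predict_fn = lambda feats: (feats[:, 0] > 0).astype(int)   # plays the role of model.predict
grid_points = rng.normal(size=(10, 3))

all_at_once = predict_on_grid(predict_fn, inverse_fn, grid_points)
in_batches = predict_on_grid(predict_fn, inverse_fn, grid_points, batch_size=4)
```

Batching changes only peak memory use; both calls return the same predictions in the same vertex order.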

class geolatent.core.PredictionMesh[source]#

Outputs of a decision-surface prediction sweep over a 3-D grid.

x#

Flattened x-coordinates of the grid vertices in projected space.

Type:

np.ndarray of shape (resolution**3,)

y#

Flattened y-coordinates.

Type:

np.ndarray of shape (resolution**3,)

z#

Flattened z-coordinates.

Type:

np.ndarray of shape (resolution**3,)

predictions#

Predicted class index (integer) or regression target at each vertex.

Type:

np.ndarray of shape (resolution**3,)

probabilities#

Per-class probability at each vertex; None when the model does not expose predict_proba.

Type:

np.ndarray of shape (resolution**3, n_classes) or None

grid_shape#

Logical shape (resolution, resolution, resolution) of the 3-D grid.

Type:

tuple of 3 ints

n_classes#

Number of unique predicted class labels (meaningful only for classifiers).

Type:

int

unique_classes#

Sorted array of unique class labels found in predictions.

Type:

np.ndarray

bounds#

Per-axis bounding box: [[xmin, xmax], [ymin, ymax], [zmin, zmax]].

Type:

np.ndarray of shape (3, 2)

is_regression#

True when the model output is treated as a continuous regression value.

Type:

bool

x: numpy.ndarray#
y: numpy.ndarray#
z: numpy.ndarray#
predictions: numpy.ndarray#
probabilities: numpy.ndarray | None#
grid_shape: Tuple[int, int, int]#
n_classes: int#
unique_classes: numpy.ndarray#
bounds: numpy.ndarray#
is_regression: bool = False#
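The flattened per-vertex arrays and grid_shape are related by a reshape. The sketch below builds stand-in arrays with the documented shapes rather than a real PredictionMesh, and it assumes the vertices are flattened in C order from an "ij"-indexed grid, which the source does not state.

```python
import numpy as np

resolution = 4
grid_shape = (resolution, resolution, resolution)

axis = np.linspace(-1.0, 1.0, resolution)
gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
x, y, z = gx.ravel(), gy.ravel(), gz.ravel()   # each of shape (resolution**3,)

# Stand-in for model output: class 1 where x + y + z > 0, else class 0.
predictions = (x + y + z > 0).astype(int)

# Recover the 3-D volume, e.g. for slicing or isosurface extraction.
pred_volume = predictions.reshape(grid_shape)
unique_classes = np.unique(predictions)
n_classes = len(unique_classes)
```

Under that ordering assumption, pred_volume[i, j, k] is the prediction at the vertex with flat index i * resolution**2 + j * resolution + k.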
class geolatent.core.DimensionalityProjector(config: geolatent.config.themes.ProjectionConfig)[source]#

Unified dimensionality-reduction pipeline.

Supports PCA (with invertible transform), t-SNE, and UMAP. The projector follows the fit/transform convention so it can be reused across multiple datasets sharing the same embedding geometry — e.g., projecting a test set using the PCA basis fitted on training data.

Parameters:

config (ProjectionConfig) – Projection hyper-parameters and algorithm selection.

supports_inverse_transform#

True when the underlying algorithm provides a meaningful inverse mapping from the low-dimensional space back to the original feature space. Currently True only for PCA and the identity projection (when the input is already ≤ 3-dimensional).

Type:

bool

is_fitted#

True after fit() or fit_transform() has been called.

Type:

bool

Examples

>>> from geolatent.config.themes import ProjectionConfig
>>> from geolatent.core.projector import DimensionalityProjector
>>> import numpy as np
>>> cfg = ProjectionConfig(method="pca", scale_input=True)
>>> proj = DimensionalityProjector(cfg)
>>> X = np.random.randn(300, 50)
>>> result = proj.fit_transform(X)
>>> result.coordinates.shape
(300, 3)
>>> isinstance(result.cumulative_variance, float)
True

config#
supports_inverse_transform: bool = False#
is_fitted: bool = False#
fit(X: numpy.ndarray) DimensionalityProjector[source]#

Fit the projection on training data.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix. Will be standardised internally if config.scale_input is True.

Returns:

self – The fitted projector instance, supporting method chaining.

Return type:

DimensionalityProjector

transform(X: numpy.ndarray) numpy.ndarray[source]#

Project X using the already-fitted transform.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix; must have the same number of features as the data used for fitting.

Returns:

X_proj – 3-D projected coordinates.

Return type:

np.ndarray of shape (n_samples, 3)
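The fit/transform convention described for this class can be illustrated with a small PCA stand-in written in plain NumPy. TinyPCA3 is a hypothetical class, not part of geolatent; it standardises with the training statistics, which is assumed to correspond to scale_input=True, and reuses the fitted basis for unseen data.

```python
import numpy as np


class TinyPCA3:
    """Hypothetical stand-in: standardise, then project onto 3 principal axes."""

    def fit(self, X: np.ndarray) -> "TinyPCA3":
        self.mean_ = X.mean(axis=0)
        self.scale_ = X.std(axis=0)
        self.scale_[self.scale_ == 0] = 1.0      # guard against constant features
        Z = (X - self.mean_) / self.scale_
        _, _, vt = np.linalg.svd(Z, full_matrices=False)
        self.components_ = vt[:3]                # (3, n_features), orthonormal rows
        return self

    def transform(self, X: np.ndarray) -> np.ndarray:
        # Reuses the training mean, scale, and basis; nothing is refitted here.
        return ((X - self.mean_) / self.scale_) @ self.components_.T


rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))
X_test = rng.normal(size=(50, 10))

pca = TinyPCA3().fit(X_train)
train_3d = pca.transform(X_train)   # shape (200, 3)
test_3d = pca.transform(X_test)     # shape (50, 3), same embedding geometry
```

Because transform never refits, training and test points land in the same coordinate system and can share one scene.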

fit_transform(X: numpy.ndarray, predict_fn=None, feature_names: List[str] | None = None) ProjectionResult[source]#

Fit the projection and immediately transform the input data.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix.

  • predict_fn (callable, optional) – Required when config.method == "sensitivity". Any callable that accepts an (n, n_features) array and returns predictions of shape (n,) or (n, n_classes). Works with sklearn, PyTorch, XGBoost, or any custom function.

  • feature_names (list of str, optional) – Names of the input features, used for axis labels.

Returns:

result – Projected coordinates with associated diagnostic metadata.

Return type:

ProjectionResult

inverse_transform(X_proj: numpy.ndarray) numpy.ndarray[source]#

Map projected coordinates back to the original feature space.

This method is only meaningful for PCA and the identity projection. For t-SNE / UMAP, it raises NotImplementedError.
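For PCA, the inverse mapping amounts to multiplying by the component matrix and adding back the mean. The sketch below shows this in NumPy under the assumption that no input scaling is applied; the variable names are illustrative and not taken from the library.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))

mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:3]                       # (3, n_features), orthonormal rows

X_proj = (X - mean) @ components.T        # forward projection, shape (100, 3)
X_back = X_proj @ components + mean       # inverse mapping, shape (100, 8)

# The round trip is lossy when n_features > 3, but projecting the
# reconstruction again reproduces the same 3-D coordinates.
reprojected = (X_back - mean) @ components.T
```

t-SNE and UMAP have no comparable linear basis, which is why no such mapping is offered for them.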

Parameters:

X_proj (np.ndarray of shape (n_points, 3)) – Points in the 3-D projected space.

Returns:

X_original – Best rank-3 approximation of the corresponding feature vectors in the original (possibly high-dimensional) feature space.

Return type:

np.ndarray of shape (n_points, n_features)

Raises:

NotImplementedError – If the fitted method has no inverse mapping (t-SNE or UMAP).

class geolatent.core.ProjectionResult[source]#

Container returned by DimensionalityProjector.fit_transform().

coordinates#

3-D projected coordinates for each input sample.

Type:

np.ndarray of shape (n_samples, 3)

explained_variance_ratio#

Fraction of total variance captured by each principal component. None when a non-PCA method is used.

Type:

np.ndarray of shape (3,) or None

cumulative_variance#

Sum of explained_variance_ratio; convenience scalar indicating how much of the total variance is preserved in the projection. None when explained_variance_ratio is None.

Type:

float or None

method#

Name of the projection algorithm that produced these coordinates.

Type:

str

original_dim#

Number of features in the original (un-projected) space.

Type:

int

axis_labels#

Suggested axis labels for the three projected dimensions.

Type:

list of str

coordinates: numpy.ndarray#
explained_variance_ratio: numpy.ndarray | None#
cumulative_variance: float | None#
method: str#
original_dim: int#
axis_labels: List[str] = ['PC 1', 'PC 2', 'PC 3']#
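The two variance fields can be related with a short NumPy sketch. It assumes the PCA path computes variances from the singular values of the centred data, which is the standard definition but is not confirmed by the source.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 50))

Xc = X - X.mean(axis=0)
singular_values = np.linalg.svd(Xc, compute_uv=False)   # sorted in descending order
variance = singular_values**2 / (len(X) - 1)            # variance along each principal axis

explained_variance_ratio = variance[:3] / variance.sum()      # shape (3,)
cumulative_variance = float(explained_variance_ratio.sum())   # scalar in (0, 1]
```

For unstructured noise like this, three components retain only a small share of the variance. A low cumulative_variance is a hint that the 3-D view discards much of the data's structure.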