geolatent.core.projector#

Dimensionality-reduction pipeline for geolatent.

Central class: DimensionalityProjector.

Design principles#

  • The projector acts as a stateful transform pipeline (fit → transform) following the scikit-learn estimator convention, which keeps it composable with existing ML pipelines.

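  • Of the supported algorithms, only PCA provides an analytic linear inverse; after truncation to three components that inverse recovers the best rank-3 approximation of the input rather than the input itself. t-SNE and UMAP projections are therefore restricted to latent-space visualisation; requesting a decision surface with either method raises a clear error rather than silently producing a misleading result.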

  • Input standardisation (zero-mean, unit-variance) is applied internally when ProjectionConfig.scale_input is True, ensuring numerical stability across all supported algorithms and preventing features with large dynamic range from dominating the projection.

  • The projector stores the fitted scaler alongside the projection model so that the full round-trip X → (scale) → (project) → X_3d → (unproject) → (unscale) → X̂ is encapsulated in a single object.
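The round trip described in the last principle can be sketched with plain NumPy. This is an illustrative stand-in, not geolatent's implementation: the helper name `fit_roundtrip` and its closures are hypothetical, and PCA is computed directly from an SVD of the standardised data.

```python
import numpy as np


def fit_roundtrip(X, n_components=3):
    """Fit a scaler and a PCA basis on X; return (project, unproject) closures.

    Hypothetical helper for illustration only.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0.0] = 1.0  # constant features: avoid division by zero
    Z = (X - mean) / std   # zero-mean, unit-variance input

    # Rows of Vt are the principal directions of the standardised data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]  # shape (n_components, n_features)

    def project(X_new):
        # X -> (scale) -> (project) -> X_3d
        return ((X_new - mean) / std) @ components.T

    def unproject(X_3d):
        # X_3d -> (unproject) -> (unscale) -> X_hat
        return (X_3d @ components) * std + mean

    return project, unproject


# Data that is exactly rank 3, so three components lose nothing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 10))

project, unproject = fit_roundtrip(X)
X_3d = project(X)
X_hat = unproject(X_3d)
```

Because the example data has rank 3, `X_hat` matches `X` up to floating-point error. For general data the reconstruction is only the best rank-3 approximation in the standardised space.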

Classes#

ProjectionResult

Container returned by DimensionalityProjector.fit_transform().

DimensionalityProjector

Unified dimensionality-reduction pipeline.

Module Contents#

class geolatent.core.projector.ProjectionResult[source]#

Container returned by DimensionalityProjector.fit_transform().

coordinates#

3-D projected coordinates for each input sample.

Type:

np.ndarray of shape (n_samples, 3)

explained_variance_ratio#

Fraction of total variance captured by each principal component. None when a non-PCA method is used.

Type:

np.ndarray of shape (3,) or None

cumulative_variance#

Sum of explained_variance_ratio; a convenience scalar giving the fraction of total variance retained by the 3-D projection. None when a non-PCA method is used.

Type:

float or None

method#

Name of the projection algorithm that produced these coordinates.

Type:

str

original_dim#

Number of features in the original (un-projected) space.

Type:

int

axis_labels#

Suggested axis labels for the three projected dimensions.

Type:

list of str

coordinates: numpy.ndarray#
explained_variance_ratio: numpy.ndarray | None#
cumulative_variance: float | None#
method: str#
original_dim: int#
axis_labels: List[str] = ['PC 1', 'PC 2', 'PC 3']#
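How the two variance fields relate can be sketched with NumPy alone. This is a sketch under assumptions, not the library's code: `variance_summary` is a hypothetical helper, and it uses the standard PCA identity that each component's variance share is its squared singular value divided by the sum of all squared singular values.

```python
import numpy as np


def variance_summary(X, n_components=3):
    """Return (explained_variance_ratio, cumulative_variance) for a PCA of X.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)                   # PCA requires centred data
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    variances = s ** 2                        # proportional to per-component variance
    ratio = variances[:n_components] / variances.sum()
    return ratio, float(ratio.sum())


rng = np.random.default_rng(1)
X = rng.normal(size=(300, 50))
ratio, cumulative = variance_summary(X)
```

Here `ratio` plays the role of explained_variance_ratio and `cumulative` the role of cumulative_variance. A low cumulative value indicates that the 3-D view discards most of the structure in the data.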
class geolatent.core.projector.DimensionalityProjector(config: geolatent.config.themes.ProjectionConfig)[source]#

Unified dimensionality-reduction pipeline.

Supports PCA (with invertible transform), t-SNE, and UMAP. The projector follows the fit/transform convention so it can be reused across multiple datasets sharing the same embedding geometry, for example projecting a test set onto the PCA basis fitted on training data.

Parameters:

config (ProjectionConfig) – Projection hyper-parameters and algorithm selection.

supports_inverse_transform#

True when the underlying algorithm provides a meaningful inverse mapping from the low-dimensional space back to the original feature space. Currently True only for PCA and the identity projection (when the input is already ≤ 3-dimensional).

Type:

bool

is_fitted#

True after fit() or fit_transform() has been called.

Type:

bool

Examples

>>> from geolatent.config.themes import ProjectionConfig
>>> from geolatent.core.projector import DimensionalityProjector
>>> import numpy as np
>>> cfg = ProjectionConfig(method="pca", scale_input=True)
>>> proj = DimensionalityProjector(cfg)
>>> X = np.random.randn(300, 50)
>>> result = proj.fit_transform(X)
>>> result.coordinates.shape
(300, 3)
>>> result.cumulative_variance
0.312...
config#
supports_inverse_transform: bool = False#
is_fitted: bool = False#
fit(X: numpy.ndarray) → DimensionalityProjector[source]#

Fit the projection on training data.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix. Will be standardised internally if config.scale_input is True.

Returns:

self – The fitted projector instance, supporting method chaining.

Return type:

DimensionalityProjector

transform(X: numpy.ndarray) → numpy.ndarray[source]#

Project X using the already-fitted transform.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix; must have the same number of features as the data used for fitting.

Returns:

X_proj – 3-D projected coordinates.

Return type:

np.ndarray of shape (n_samples, 3)
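The fit-then-transform split means held-out data is projected with statistics learned from the training set only. A minimal NumPy sketch of that pattern follows; `TinyPCAProjector` is a hypothetical stand-in that mirrors the documented method names, not the real DimensionalityProjector, and its error types are assumptions.

```python
import numpy as np


class TinyPCAProjector:
    """Hypothetical PCA-only stand-in with a fit/transform interface."""

    def __init__(self, n_components=3):
        self.n_components = n_components
        self.is_fitted = False

    def fit(self, X):
        # Learn the centring vector and principal directions from X only.
        self.mean_ = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[: self.n_components]
        self.is_fitted = True
        return self  # returning self allows method chaining

    def transform(self, X):
        if not self.is_fitted:
            raise RuntimeError("call fit() before transform()")
        if X.shape[1] != self.mean_.shape[0]:
            raise ValueError("X must have the same number of features as the fit data")
        # Reuse the training mean and basis; nothing is re-estimated here.
        return (X - self.mean_) @ self.components_.T


rng = np.random.default_rng(2)
X_train = rng.normal(size=(120, 8))
X_test = rng.normal(size=(30, 8))

proj = TinyPCAProjector().fit(X_train)
train_3d = proj.transform(X_train)
test_3d = proj.transform(X_test)
```

Both outputs live in the same coordinate system, so training and test points can be plotted together without refitting.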

fit_transform(X: numpy.ndarray, predict_fn=None, feature_names: List[str] | None = None) → ProjectionResult[source]#

Fit the projection and immediately transform the input data.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix.

  • predict_fn (callable, optional) – Required when config.method == "sensitivity". Any callable that accepts an (n, n_features) array and returns predictions of shape (n,) or (n, n_classes). Works with sklearn, PyTorch, XGBoost, or any custom function.

  • feature_names (list of str, optional) – Names of the input features, used for axis labels.

Returns:

result – Projected coordinates with associated diagnostic metadata.

Return type:

ProjectionResult
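The predict_fn contract is purely about array shapes, so any plain function can satisfy it. The sketch below shows two callables that fit the description, plus a hypothetical `check_predict_fn` helper that probes a callable before it is handed to the projector. None of these names are part of geolatent.

```python
import numpy as np


def linear_score(X):
    """Regression-style callable: (n, n_features) -> (n,)."""
    weights = np.arange(1, X.shape[1] + 1, dtype=float)
    return X @ weights


def two_class_probs(X):
    """Classifier-style callable: (n, n_features) -> (n, n_classes)."""
    total = X.sum(axis=1)
    logits = np.stack([total, -total], axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerically stable softmax
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def check_predict_fn(predict_fn, n_features, n_probe=4):
    """Hypothetical helper: call predict_fn on a probe batch and validate the shape."""
    probe = np.zeros((n_probe, n_features))
    out = np.asarray(predict_fn(probe))
    if out.ndim not in (1, 2) or out.shape[0] != n_probe:
        raise ValueError(f"unexpected prediction shape {out.shape}")
    return out.shape


score_shape = check_predict_fn(linear_score, n_features=5)
probs_shape = check_predict_fn(two_class_probs, n_features=5)
```

A model object can be adapted the same way by passing a bound method or a small wrapper function that converts its output to a NumPy array.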

inverse_transform(X_proj: numpy.ndarray) numpy.ndarray[source]#

Map projected coordinates back to the original feature space.

This method is only meaningful for PCA and the identity projection. For t-SNE / UMAP, it raises NotImplementedError.

Parameters:

X_proj (np.ndarray of shape (n_points, 3)) – Points in the 3-D projected space.

Returns:

X_original – Best rank-3 approximation of the corresponding feature vectors in the original (possibly high-dimensional) feature space.

Return type:

np.ndarray of shape (n_points, n_features)

Raises: