geolatent.core.projector#

Dimensionality-reduction pipeline for geolatent.

Central class: DimensionalityProjector.

Design principles#

The projector acts as a stateful transform pipeline (fit → transform) following the scikit-learn estimator convention, which keeps it composable with existing ML pipelines.
Of the supported algorithms, only PCA provides a closed-form linear inverse mapping; because the projection keeps three components, the inverse recovers a rank-3 approximation of the original features rather than an exact copy. t-SNE and UMAP projections are therefore restricted to latent-space visualisation: requesting a decision surface with these methods raises an explicit error rather than silently producing a misleading result.
Input standardisation (zero-mean, unit-variance) is applied internally when ProjectionConfig.scale_input is True. This improves numerical stability and prevents features with a large dynamic range from dominating the projection.
The projector stores the fitted scaler alongside the projection model so that the full round-trip X → (scale) → (project) → X_3d → (unproject) → (unscale) → X̂ is encapsulated in a single object.
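The round trip described above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not geolatent's implementation: the PCA step is done with an SVD, and all variable names (`mean`, `std`, `components`, `X_hat`) are chosen here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Ten features with deliberately different scales (illustrative data only).
X = rng.normal(size=(200, 10)) * np.arange(1, 11)

# (scale) zero-mean, unit-variance standardisation
mean = X.mean(axis=0)
std = X.std(axis=0)
std[std == 0] = 1.0  # guard against constant features
X_scaled = (X - mean) / std

# (project) PCA via SVD of the standardised data, keeping three components
_, _, Vt = np.linalg.svd(X_scaled, full_matrices=False)
components = Vt[:3]               # shape (3, n_features)
X_3d = X_scaled @ components.T    # shape (n_samples, 3)

# (unproject) back to the scaled space, then (unscale) to original units
X_hat = (X_3d @ components) * std + mean

# Relative reconstruction error: share of the centred data lost by the
# rank-3 approximation (0 would mean a lossless round trip).
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X - mean)
```

Keeping `mean`, `std`, and `components` together is what makes the inverse path possible; losing any one of them breaks the round trip, which is presumably why the projector stores the scaler next to the projection model.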

Classes#

`ProjectionResult`	Container returned by `DimensionalityProjector.fit_transform()`.
`DimensionalityProjector`	Unified dimensionality-reduction pipeline.

Module Contents#

class geolatent.core.projector.ProjectionResult[source]#

Container returned by DimensionalityProjector.fit_transform().

coordinates#

3-D projected coordinates for each input sample.

Type: np.ndarray of shape (n_samples, 3)

explained_variance_ratio#

Fraction of total variance captured by each principal component. None when a non-PCA method is used.

Type: np.ndarray of shape (3,) or None

cumulative_variance#

Sum of explained_variance_ratio; a convenience scalar giving the fraction of total variance retained by the 3-D projection. None when a non-PCA method is used.

Type: float or None
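As a rough illustration of how these two quantities relate, the following NumPy sketch computes them from the singular values of centred data. It assumes a plain PCA without input scaling and is not taken from geolatent's source; the variable names simply mirror the attribute names above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))  # uncorrelated features, illustrative only

# Centre the data; squared singular values are proportional to the
# variance along each principal axis.
Xc = X - X.mean(axis=0)
singular_values = np.linalg.svd(Xc, compute_uv=False)
variances = singular_values**2 / (X.shape[0] - 1)

# Share of total variance captured by the three leading components.
explained_variance_ratio = variances[:3] / variances.sum()  # shape (3,)
cumulative_variance = float(explained_variance_ratio.sum())
```

For uncorrelated features such as these, three components can only capture a small share of the total variance, so a low `cumulative_variance` is a useful warning that the 3-D view discards most of the structure.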

method#

Name of the projection algorithm that produced these coordinates.

Type: str

original_dim#

Number of features in the original (un-projected) space.

Type: int

axis_labels#

Suggested axis labels for the three projected dimensions.

Type: list of str

coordinates: numpy.ndarray#

explained_variance_ratio: numpy.ndarray | None#

cumulative_variance: float | None#

method: str#

original_dim: int#

axis_labels: List[str] = ['PC 1', 'PC 2', 'PC 3']#

class geolatent.core.projector.DimensionalityProjector(config: geolatent.config.themes.ProjectionConfig)[source]#

Unified dimensionality-reduction pipeline.

Supports PCA (with invertible transform), t-SNE, and UMAP. The projector follows the fit/transform convention so it can be reused across multiple datasets sharing the same embedding geometry — e.g., projecting a test set using the PCA basis fitted on training data.

Parameters: config (ProjectionConfig) – Projection hyper-parameters and algorithm selection.

supports_inverse_transform#

True when the underlying algorithm provides a meaningful inverse mapping from the low-dimensional space back to the original feature space. Currently True only for PCA and the identity projection (when the input is already ≤ 3-dimensional).

Type: bool

is_fitted#

True after fit() or fit_transform() has been called.

Type: bool

Examples

>>> from geolatent.config.themes import ProjectionConfig
>>> from geolatent.core.projector import DimensionalityProjector
>>> import numpy as np
>>> cfg = ProjectionConfig(method="pca", scale_input=True)
>>> proj = DimensionalityProjector(cfg)
>>> X = np.random.randn(300, 50)
>>> result = proj.fit_transform(X)
>>> result.coordinates.shape
(300, 3)
>>> 0.0 < result.cumulative_variance <= 1.0
True

config#

supports_inverse_transform: bool = False#

is_fitted: bool = False#

fit(X: numpy.ndarray) → DimensionalityProjector[source]#

Fit the projection on training data.

Parameters: X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix. Will be standardised internally if config.scale_input is True.
Returns: self – The fitted projector instance, supporting method chaining.
Return type: DimensionalityProjector

transform(X: numpy.ndarray) → numpy.ndarray[source]#

Project X using the already-fitted transform.

Parameters: X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix; must have the same number of features as the data used for fitting.
Returns: X_proj – 3-D projected coordinates.
Return type: np.ndarray of shape (n_samples, 3)
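The fit-once, transform-many pattern can be illustrated with a small stand-in class. `TinyPCAProjector` below is hypothetical and exists only for this sketch; it mimics the documented `fit()` / `transform()` / `is_fitted` behaviour with NumPy and omits input scaling for brevity.

```python
import numpy as np


class TinyPCAProjector:
    """Hypothetical stand-in for illustration; not geolatent's class."""

    def __init__(self, n_components=3):
        self.n_components = n_components
        self.is_fitted = False

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        # Principal axes are the right singular vectors of the centred data.
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[: self.n_components]
        self.is_fitted = True
        return self  # returning self allows method chaining

    def transform(self, X):
        if not self.is_fitted:
            raise RuntimeError("call fit() before transform()")
        X = np.asarray(X, dtype=float)
        if X.shape[1] != self.mean_.shape[0]:
            raise ValueError("X has a different number of features than the fitted data")
        return (X - self.mean_) @ self.components_.T


rng = np.random.default_rng(1)
X_train = rng.normal(size=(120, 8))
X_test = rng.normal(size=(30, 8))

# Fit the basis on training data only, then reuse it for the test set.
proj = TinyPCAProjector().fit(X_train)
Z_train = proj.transform(X_train)
Z_test = proj.transform(X_test)
```

Because the test set is centred and rotated with the training statistics, both sets land in the same coordinate system and can be plotted together.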

fit_transform(X: numpy.ndarray, predict_fn=None, feature_names: List[str] | None = None) → ProjectionResult[source]#

Fit the projection and immediately transform the input data.

Parameters:

X (np.ndarray of shape (n_samples, n_features)) – Input feature matrix.
predict_fn (callable, optional) – Required when config.method == "sensitivity". Any callable that accepts an (n, n_features) array and returns predictions of shape (n,) or (n, n_classes). Models from scikit-learn, PyTorch, XGBoost, or any custom function can be wrapped to satisfy this interface.
feature_names (list of str, optional) – Names of the input features, used for axis labels.

Returns:

result – Projected coordinates with associated diagnostic metadata.

Return type:

ProjectionResult
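To make the predict_fn contract concrete, here are two callables that satisfy the documented shapes, plus a small shape check. Everything in this sketch is hypothetical: the weight matrix `W`, the function names, and the `check_predict_fn` helper are illustrative and not part of geolatent.

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(5, 3))  # hypothetical weights: 5 features, 3 classes


def predict_scores(X):
    """Regression-style callable: returns shape (n,)."""
    return np.asarray(X, dtype=float) @ W[:, 0]


def predict_proba(X):
    """Classifier-style callable: returns shape (n, n_classes)."""
    logits = np.asarray(X, dtype=float) @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def check_predict_fn(predict_fn, n_features, n_probe=4):
    """Hypothetical helper: call predict_fn on a probe batch and verify
    that the output has shape (n,) or (n, n_classes)."""
    probe = np.zeros((n_probe, n_features))
    out = np.asarray(predict_fn(probe))
    if out.ndim not in (1, 2) or out.shape[0] != n_probe:
        raise ValueError(f"unexpected prediction shape {out.shape}")
    return out.shape


shape_scores = check_predict_fn(predict_scores, n_features=5)
shape_proba = check_predict_fn(predict_proba, n_features=5)
```

A framework model would typically be adapted with a thin wrapper of the same form, converting the incoming array to the framework's tensor type and the result back to a NumPy array.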

inverse_transform(X_proj: numpy.ndarray) → numpy.ndarray[source]#

Map projected coordinates back to the original feature space.

This method is only meaningful for PCA and the identity projection. For t-SNE / UMAP, it raises NotImplementedError.

Parameters:

X_proj (np.ndarray of shape (n_points, 3)) – Points in the 3-D projected space.

Returns:

X_original – Best-rank-3 approximation of the corresponding feature vectors in the original (possibly high-dimensional) feature space.

Return type:

np.ndarray of shape (n_points, n_features)

Raises:

NotImplementedError – When the projector was fitted with t-SNE or UMAP.
RuntimeError – When the projector has not been fitted yet.