Optimization

Grid search optimization over dimensionality reduction (DR) hyperparameters.

Parameter Classes

cdr_bench.optimization.params.DimReducerParams dataclass

cdr_bench.optimization.params.PCAParams dataclass

cdr_bench.optimization.params.UMAPParams dataclass

cdr_bench.optimization.params.TSNEParams dataclass

cdr_bench.optimization.params.GTMParams dataclass

Bases: DimReducerParams

Class for keeping track of GTM-specific parameters, inheriting from DimReducerParams.

default_params(n_components) staticmethod

Returns default parameters for GTM based on the number of components.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| n_components | int | Number of components for GTM. | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] | A dictionary of default parameters. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If n_components is not 2 or 3. |
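
The documented contract of default_params (a static method that accepts only 2 or 3 components and otherwise raises ValueError) can be sketched as follows. This is a minimal illustration, not the library's code: the class name GTMParamsSketch is hypothetical, and the returned dictionary holds a single placeholder key because this page does not list the real default keys or values.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class GTMParamsSketch:
    """Hypothetical stand-in for GTMParams; not the real class."""

    n_components: int = 2

    @staticmethod
    def default_params(n_components: int) -> Dict[str, Any]:
        # Mirror the documented behavior: only 2 or 3 components are accepted.
        if n_components not in (2, 3):
            raise ValueError(f"n_components must be 2 or 3, got {n_components}")
        # Placeholder content: the real defaults are not given on this page.
        return {"n_components": n_components}


defaults = GTMParamsSketch.default_params(2)
```

Because the method is static, it can be called on the class itself without first building an instance.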

cdr_bench.optimization.params.ScoringParams dataclass

A data class to store parameters for scoring.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| ambient_dim_indices | List | Indices of the high-dimensional data. |
| n_neighbors | int | Number of neighbors to consider. |
| other_param | Any | Other optional parameters. |
| split | Any | Split parameter for compatibility with scikit-learn. |
| low_dim_metric | str | Metric for the low-dimensional space. |
| normalize | Any | Flag to normalize the score. |

from_neighbors_indices(k_neighbors, indices) classmethod

Create ScoringParams from nearest neighbors indices.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| k_neighbors | int | Number of nearest neighbors to consider. | required |
| indices | ndarray | Indices of the nearest neighbors. | required |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| ScoringParams | ScoringParams | An instance of ScoringParams with the calculated parameters. |
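
As an illustration of the alternate-constructor pattern described above, the sketch below builds a dataclass with the documented attribute names and a from_neighbors_indices classmethod. Everything beyond the attribute names is an assumption: the class name ScoringParamsSketch is hypothetical, the default values are invented for the example, and the sketch simply stores the indices unchanged, whereas the real method may derive further parameters from them.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass
class ScoringParamsSketch:
    """Hypothetical stand-in for ScoringParams; defaults are assumptions."""

    ambient_dim_indices: List[Any]
    n_neighbors: int
    other_param: Any = None
    split: Any = None
    low_dim_metric: str = "euclidean"  # assumed default, not from the docs
    normalize: Any = True              # assumed default, not from the docs

    @classmethod
    def from_neighbors_indices(
        cls, k_neighbors: int, indices: Sequence[Sequence[int]]
    ) -> "ScoringParamsSketch":
        # Assumption: the neighbor indices are stored as given.
        rows = [list(row) for row in indices]
        return cls(ambient_dim_indices=rows, n_neighbors=k_neighbors)


params = ScoringParamsSketch.from_neighbors_indices(2, [[1, 2], [0, 2], [0, 1]])
```

A classmethod constructor of this kind keeps the caller from having to know which fields are derived and which are left at their defaults.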

cdr_bench.optimization.params.OptimizerParams dataclass

Optimizer

cdr_bench.optimization.optimization.Optimizer

__init__(params)

Perform grid search over parameter grid, fit estimator to data, and evaluate performance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | ndarray | Training data. | required |
| y | ndarray | Target data. Defaults to None. | None |

Returns:

| Type | Description |
| --- | --- |
| None | None |
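
The grid search described above (enumerate every parameter combination, fit and score each one, keep the best) can be sketched with the standard library alone. This is a generic illustration under stated assumptions, not the Optimizer implementation: grid_search, fit_and_score, and toy_score are hypothetical names, and the parameter names n_neighbors and min_dist are only example grid keys.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Sequence, Tuple

Params = Dict[str, Any]


def grid_search(
    param_grid: Dict[str, Sequence[Any]],
    fit_and_score: Callable[[Params], float],
) -> Tuple[Params, float, List[Tuple[Params, float]]]:
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    all_scores: List[Tuple[Params, float]] = []
    # Cartesian product of the value lists gives every grid point.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        all_scores.append((params, fit_and_score(params)))
    # Higher score is treated as better; ties keep the first grid point.
    best_params, best_score = max(all_scores, key=lambda item: item[1])
    return best_params, best_score, all_scores


def toy_score(params: Params) -> float:
    # Hypothetical scoring function standing in for "fit, embed, evaluate".
    return -abs(params["n_neighbors"] - 15) - params["min_dist"]


best_params, best_score, all_scores = grid_search(
    {"n_neighbors": [5, 15, 30], "min_dist": [0.1, 0.5]},
    toy_score,
)
```

Keeping the full all_scores list alongside the best result is what makes a later conversion to a table of scores possible.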

convert_scores_to_dataframe()

Convert the all_scores list to a DataFrame.

save_results()

Save optimization results and the best model to files.

Helper Functions

cdr_bench.optimization.optimization.perform_optimization(method, method_grid, method_param, X_transformed, y_transformed, scoring_params, dataset_output_dir)

Perform optimization using a specific dimensionality reduction method.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| method | str | Dimensionality reduction method to use. | required |
| method_grid | Any | Parameter grid for the method. | required |
| method_param | Any | Parameters for the method. | required |
| X_transformed | ndarray | Transformed high-dimensional data. | required |
| y_transformed | Optional[ndarray] | Transformed reference data. | required |
| scoring_params | Any | Scoring parameters for optimization. | required |

Returns:

| Type | Description |
| --- | --- |
| Optimizer | Tuple[Any, np.ndarray, int]: Optimizer, coordinates, and nearest neighbors overlap score. |
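
To make the notion of a nearest neighbors overlap score concrete, the sketch below compares each point's k nearest neighbors in the high-dimensional data with its k nearest neighbors in the low-dimensional coordinates. This is one common way to define such a score and is an assumption here: the function names knn_sets and neighbor_overlap are hypothetical, Euclidean distance is assumed in both spaces, and the result is reported as a fraction between 0 and 1, which may differ from how the library scales it.

```python
import math
from typing import List, Sequence, Set

Point = Sequence[float]


def knn_sets(points: Sequence[Point], k: int) -> List[Set[int]]:
    """Return, for each point, the index set of its k nearest neighbors."""
    neighbors: List[Set[int]] = []
    for i, p in enumerate(points):
        # Rank all other points by Euclidean distance to p.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(order[:k]))
    return neighbors


def neighbor_overlap(high: Sequence[Point], low: Sequence[Point], k: int) -> float:
    """Fraction of k-nearest neighbors shared between the two spaces."""
    high_nn = knn_sets(high, k)
    low_nn = knn_sets(low, k)
    shared = sum(len(h & l) for h, l in zip(high_nn, low_nn))
    return shared / (k * len(high))


high_dim = [(0, 0, 0), (1, 0, 0), (5, 0, 0), (6, 0, 0)]
faithful = [(0, 0), (1, 0), (5, 0), (6, 0)]
scrambled = [(0, 0), (5, 0), (1, 0), (6, 0)]
score_faithful = neighbor_overlap(high_dim, faithful, k=1)
score_scrambled = neighbor_overlap(high_dim, scrambled, k=1)
```

The brute-force neighbor search is quadratic in the number of points, which is fine for a demonstration but not for large datasets.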

cdr_bench.optimization.optimization.create_param_grid(data_shape, n_components, method='UMAP', add_dim=False, test=False)

Create a parameter grid for the specified dimensionality reduction method.