Optimization
Grid search optimization over dimensionality reduction (DR) hyperparameters.
Parameter Classes
cdr_bench.optimization.params.DimReducerParams (dataclass)
cdr_bench.optimization.params.PCAParams (dataclass)
Bases: DimReducerParams
cdr_bench.optimization.params.UMAPParams (dataclass)
Bases: DimReducerParams
cdr_bench.optimization.params.TSNEParams (dataclass)
Bases: DimReducerParams
cdr_bench.optimization.params.GTMParams (dataclass)
Bases: DimReducerParams
Class for keeping track of GTM-specific parameters, inheriting from DimReducerParams.
default_params(n_components) (staticmethod)
Returns default parameters for GTM based on the number of components.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n_components | int | Number of components for GTM. | required |
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | A dictionary of default parameters. |
Raises:
| Type | Description |
|---|---|
| ValueError | If n_components is not 2 or 3. |
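The contract described above (a dataclass that inherits from a shared base and exposes a static `default_params` that rejects anything other than 2 or 3 components) can be sketched as follows. This is a minimal illustration, not the library's code: the class names `SketchReducerParams` and `SketchGTMParams`, the fields `num_nodes` and `reg_coeff`, and every default value are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class SketchReducerParams:
    """Hypothetical stand-in for a shared base such as DimReducerParams."""
    n_components: int = 2


@dataclass
class SketchGTMParams(SketchReducerParams):
    """Hypothetical GTM-style parameters; field names are illustrative only."""
    num_nodes: int = 225
    reg_coeff: float = 1.0

    @staticmethod
    def default_params(n_components: int) -> Dict[str, Any]:
        # Mirror the documented contract: only 2 or 3 components are accepted.
        if n_components not in (2, 3):
            raise ValueError(f"n_components must be 2 or 3, got {n_components}")
        # Placeholder values (assumed): a square grid in 2D, a cubic grid in 3D.
        nodes_per_axis = 15 if n_components == 2 else 6
        return {
            "n_components": n_components,
            "num_nodes": nodes_per_axis ** n_components,
            "reg_coeff": 1.0,
        }


defaults_2d = SketchGTMParams.default_params(2)
defaults_3d = SketchGTMParams.default_params(3)
```

Because the method is static, the defaults can be requested without constructing a parameter object first.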
cdr_bench.optimization.params.ScoringParams (dataclass)
A data class to store parameters for scoring.
Attributes:
| Name | Type | Description |
|---|---|---|
| ambient_dim_indices | List | Indices of the high-dimensional data. |
| n_neighbors | int | Number of neighbors to consider. |
| other_param | Any | Other optional parameters. |
| split | Any | Split parameter for compatibility with scikit-learn. |
| low_dim_metric | str | Metric for the low-dimensional space. |
| normalize | Any | Flag to normalize the score. |
from_neighbors_indices(k_neighbors, indices) (classmethod)
Create ScoringParams from nearest neighbors indices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| k_neighbors | int | Number of nearest neighbors to consider. | required |
| indices | ndarray | Indices of the nearest neighbors. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| ScoringParams | ScoringParams | An instance of ScoringParams with the calculated parameters. |
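An alternative constructor of this kind could look roughly like the sketch below. The attribute names follow the table above, but the class name `SketchScoringParams`, the default values, and the decision to trim each row to `k_neighbors` entries are assumptions made for illustration; plain lists stand in for the `ndarray` the real method accepts.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class SketchScoringParams:
    """Hypothetical mirror of the documented attributes; defaults are assumed."""
    ambient_dim_indices: Optional[List[List[int]]] = None
    n_neighbors: int = 20
    other_param: Any = None
    split: Any = None
    low_dim_metric: str = "euclidean"
    normalize: Any = True

    @classmethod
    def from_neighbors_indices(
        cls, k_neighbors: int, indices: Sequence[Sequence[int]]
    ) -> "SketchScoringParams":
        # Assumed behaviour: keep only the first k_neighbors entries of each row.
        trimmed = [list(row[:k_neighbors]) for row in indices]
        return cls(ambient_dim_indices=trimmed, n_neighbors=k_neighbors)


# Neighbor indices for two samples, three neighbors each (toy data).
scoring = SketchScoringParams.from_neighbors_indices(2, [[1, 2, 3], [0, 2, 3]])
```

Using a classmethod keeps the neighbor count and the stored indices consistent with each other, since both are derived from the same call.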
cdr_bench.optimization.params.OptimizerParams (dataclass)
Optimizer
cdr_bench.optimization.optimization.Optimizer
__init__(params)
grid_search(X, y=None)
Perform grid search over parameter grid, fit estimator to data, and evaluate performance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Training data. | required |
| y | ndarray | Target data. Defaults to None. | None |
Returns:
| Type | Description |
|---|---|
| None | None |
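The general shape of such a search (enumerate every parameter combination, fit, score, record, keep the best) can be sketched with the standard library alone. The function name `grid_search_sketch` and the `fit_and_score` callback are hypothetical, and unlike the documented method, which returns None and presumably keeps its results on the Optimizer instance, this sketch returns them so the flow is easy to follow.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Sequence, Tuple


def grid_search_sketch(
    param_grid: Dict[str, Sequence[Any]],
    fit_and_score: Callable[[Dict[str, Any]], float],
) -> Tuple[Dict[str, Any], float, List[Dict[str, Any]]]:
    """Evaluate every combination in param_grid and keep the best one."""
    names = list(param_grid)
    all_scores: List[Dict[str, Any]] = []
    best_params: Dict[str, Any] = {}
    best_score = float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = fit_and_score(params)  # stands in for: fit reducer, score embedding
        all_scores.append({**params, "score": score})
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, all_scores


def toy_score(params: Dict[str, Any]) -> float:
    # Toy objective that peaks at n_neighbors=15, min_dist=0.1.
    return -abs(params["n_neighbors"] - 15) - 10 * abs(params["min_dist"] - 0.1)


grid = {"n_neighbors": [5, 15, 50], "min_dist": [0.0, 0.1, 0.5]}
best_params, best_score, all_scores = grid_search_sketch(grid, toy_score)
```

Each entry in `all_scores` is a flat dictionary of parameters plus a score, a layout that converts directly into one table row per evaluated combination.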
convert_scores_to_dataframe()
Convert the all_scores list to a DataFrame.
save_results()
Save optimization results and the best model to files.
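The reference does not state which files or formats are written, so the following is only a sketch of one plausible layout: scores as JSON and the best model as a pickle. The function name `save_results_sketch` and the file names `all_scores.json` and `best_model.pkl` are assumptions.

```python
import json
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def save_results_sketch(
    output_dir: str, all_scores: List[Dict[str, Any]], best_model: Any
) -> Dict[str, Path]:
    """Write scores as JSON and the best model as a pickle (assumed formats)."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    scores_path = out / "all_scores.json"
    model_path = out / "best_model.pkl"
    scores_path.write_text(json.dumps(all_scores, indent=2))
    with model_path.open("wb") as handle:
        pickle.dump(best_model, handle)
    return {"scores": scores_path, "model": model_path}


# Round trip in a temporary directory; a plain dict stands in for a fitted model.
with tempfile.TemporaryDirectory() as tmp:
    paths = save_results_sketch(
        tmp, [{"n_neighbors": 15, "score": 0.8}], {"name": "toy-model"}
    )
    reloaded_scores = json.loads(paths["scores"].read_text())
    with paths["model"].open("rb") as handle:
        reloaded_model = pickle.load(handle)
```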
Helper Functions
cdr_bench.optimization.optimization.perform_optimization(method, method_grid, method_param, X_transformed, y_transformed, scoring_params, dataset_output_dir)
Perform optimization using a specific dimensionality reduction method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | Dimensionality reduction method to use. | required |
| method_grid | Any | Parameter grid for the method. | required |
| method_param | Any | Parameters for the method. | required |
| X_transformed | ndarray | Transformed high-dimensional data. | required |
| y_transformed | Optional[ndarray] | Transformed reference data. | required |
| scoring_params | Any | Scoring parameters for optimization. | required |
Returns:
| Type | Description |
|---|---|
| Optimizer | Tuple[Any, np.ndarray, int]: Optimizer, coordinates, and nearest neighbors overlap score. |
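The exact definition of the nearest neighbors overlap score is not given here. One common reading, sketched below under that assumption, is the average fraction of each point's k nearest neighbors in the original space that are also among its k nearest neighbors in the embedding. The helper names `knn_indices` and `neighbor_overlap` are hypothetical, and the brute-force search is meant for toy inputs only.

```python
from math import dist
from typing import List, Sequence


def knn_indices(points: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Brute-force k nearest neighbors (Euclidean), excluding the point itself."""
    neighbours = []
    for i, point in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        ranked = sorted(others, key=lambda j: dist(point, points[j]))
        neighbours.append(ranked[:k])
    return neighbours


def neighbor_overlap(
    high: Sequence[Sequence[float]], low: Sequence[Sequence[float]], k: int
) -> float:
    """Mean fraction of shared k-nearest neighbors between two spaces."""
    high_nn = knn_indices(high, k)
    low_nn = knn_indices(low, k)
    shared = sum(len(set(h) & set(l)) for h, l in zip(high_nn, low_nn))
    return shared / (k * len(high))


high_dim = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
faithful = [(0.0,), (1.0,), (3.0,), (7.0,)]   # preserves the ordering
scrambled = [(0.0,), (7.0,), (3.0,), (1.0,)]  # swaps two of the points

score_faithful = neighbor_overlap(high_dim, faithful, k=1)
score_scrambled = neighbor_overlap(high_dim, scrambled, k=1)
```

A score of 1.0 means every neighborhood is preserved, and 0.0 means none of the original neighbors survive in the embedding.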
cdr_bench.optimization.optimization.create_param_grid(data_shape, n_components, method='UMAP', add_dim=False, test=False)
Create a parameter grid for the specified dimensionality reduction method.
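The grid contents are not listed here, so the sketch below only illustrates how such a helper might use the data shape and the `test` flag. The function name `create_param_grid_sketch`, all candidate values, the rule that filters `n_neighbors` by sample count, and the interpretation of `test` as "shrink the grid" are assumptions. The `add_dim` argument is omitted because its meaning is not documented.

```python
from typing import Any, Dict, List, Tuple


def create_param_grid_sketch(
    data_shape: Tuple[int, int],
    n_components: int,
    method: str = "UMAP",
    test: bool = False,
) -> Dict[str, List[Any]]:
    """Build a small, hypothetical parameter grid for the given method."""
    n_samples, _n_features = data_shape
    if method == "UMAP":
        candidates = [5, 15, 50, 200]
        grid: Dict[str, List[Any]] = {
            "n_components": [n_components],
            # A point cannot have more neighbors than there are other samples.
            "n_neighbors": [k for k in candidates if k < n_samples],
            "min_dist": [0.0, 0.1, 0.5],
        }
    elif method == "PCA":
        grid = {"n_components": [n_components]}
    else:
        raise ValueError(f"Unsupported method in this sketch: {method}")
    if test:
        # Assumed meaning of test: one value per parameter for a quick smoke run.
        grid = {name: values[:1] for name, values in grid.items()}
    return grid


umap_grid = create_param_grid_sketch((100, 30), n_components=2)
quick_grid = create_param_grid_sketch((100, 30), n_components=2, test=True)
pca_grid = create_param_grid_sketch((100, 30), n_components=2, method="PCA")
```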