optrade.exp


optrade.exp.forecasting

class Experiment(log_dir='logs', logging='offline', seed=42, ablation_id=None, exp_id='MY7GZu', neptune_project_name=None, neptune_api_token=None, download_only=False)[source]

Bases: object

Parameters:
  • log_dir (str | Path)

  • logging (str)

  • seed (int)

  • ablation_id (int | None)

  • exp_id (str)

  • neptune_project_name (str | None)

  • neptune_api_token (str | None)

  • download_only (bool)

__init__(log_dir='logs', logging='offline', seed=42, ablation_id=None, exp_id='MY7GZu', neptune_project_name=None, neptune_api_token=None, download_only=False)[source]

Experiment class for training and evaluating forecasting models in PyTorch.

Parameters:
  • log_dir (str | Path) – The directory to save logs.

  • logging (str) – The logging method to use. Options: {“offline”, “neptune”}.

  • seed (int) – The random seed to use for reproducibility.

  • ablation_id (int | None) – The ablation ID.

  • exp_id (str) – The experiment ID.

  • neptune_project_name (str | None) – The Neptune project name.

  • neptune_api_token (str | None) – The Neptune API token.

  • download_only (bool) – Whether to only download the data without running an experiment.

Returns:

None

Return type:

None

save_logs()[source]

init_device(gpu_id=0, mps=False)[source]

Initialize CUDA (or MPS) devices.

Parameters:
  • gpu_id (int) – The GPU ID to use.

  • mps (bool) – Whether to use MPS (Metal Performance Shaders) for macOS.

Returns:

None

Return type:

None

init_logger(exp_id)[source]

Initialize the logger.

Parameters:
  • logdir – The directory to save logs.

  • ablation_id – The ablation ID.

  • exp_id (str) – The experiment ID.

Returns:

None

Return type:

None

init_loaders(root, start_date, end_date, contract_stride, interval_min, right, target_tte, tte_tolerance, moneyness, train_split, val_split, seq_len, pred_len, scaling=False, dtype='float32', core_feats=['option_returns'], tte_feats=None, datetime_feats=None, vol_feats=None, rolling_volatility_range=None, keep_datetime=False, target_channels=None, target_type='multistep', strike_band=0.05, volatility_type='period', volatility_scaled=False, volatility_scalar=1.0, batch_size=32, shuffle=True, drop_last=False, num_workers=4, prefetch_factor=2, pin_memory=False, persistent_workers=True, clean_up=False, offline=False, save_dir=None, verbose=False, validate_contracts=True, modify_contracts=False, dev_mode=False, download_only=False)[source]

Initializes the data loaders for training, validation, and testing.

Parameters:
  • root (str) – The root directory containing the data.

  • start_date (str) – The start date for the data in YYYYMMDD format.

  • end_date (str) – The end date for the data in YYYYMMDD format.

  • contract_stride (int) – The stride for the contracts.

  • interval_min (int) – The interval in minutes for the data.

  • right (str) – The option type (C for call, P for put).

  • target_tte (int) – The target time to expiration in minutes.

  • moneyness (str) – The moneyness type (ATM, OTM, ITM).

  • train_split (float) – The fraction of data to use for training.

  • val_split (float) – The fraction of data to use for validation.

  • strike_band (float | None) – The band to select the strike price for OTM or ITM options.

  • volatility_type (str | None) – The type of historical volatility to use.

  • volatility_scaled (bool) – Whether to scale strike selection by the volatility.

  • volatility_scalar (float | None) – The scalar to multiply the volatility.

  • validate_contracts (bool) – Whether to validate contracts by requesting the data from ThetaData API.

  • modify_contracts (bool) – Whether to overwrite the contracts .pkl files if certain contracts are invalid.

  • seq_len (int) – The sequence length of the lookback window.

  • pred_len (int) – The prediction length of the forecast window.

  • scaling (bool) – Whether to apply normalization.

  • core_feats (List[str]) – The core features to use for the model.

  • tte_feats (List[str] | None) – The time-to-expiration features to use for the model.

  • datetime_feats (List[str] | None) – The datetime features to use for the model.

  • keep_datetime (bool) – Whether to keep the datetime column in the dataset.

  • target_channels (List[str] | None) – The target channels to use for the model.

  • target_type (str) – The type of target. Options: “multistep”, “average”, or “average_direction”.

  • batch_size (int) – The batch size for the data loaders.

  • shuffle (bool) – Whether to shuffle the data.

  • drop_last (bool) – Whether to drop the last batch if it is smaller than the batch size.

  • num_workers (int) – The number of workers for the data loaders.

  • prefetch_factor (int | None) – The number of batches to prefetch.

  • pin_memory (bool) – Whether to pin memory for the data loaders.

  • persistent_workers (bool) – Whether to use persistent workers for the data loaders.

  • clean_up (bool) – Whether to clean up the data after loading.

  • offline (bool) – Whether to use offline mode.

  • save_dir (str | None) – The directory to save the data.

  • verbose (bool) – Whether to print verbose output.

  • dev_mode (bool) – Whether to run in development mode.

  • download_only (bool) – Whether to only download the data without running an experiment.

  • tte_tolerance (Tuple[int, int])

  • dtype (str)

  • vol_feats (List[str] | None)

  • rolling_volatility_range (List[int] | None)

Returns:

None

Return type:

None
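
Several of the arguments above have documented formats: YYYYMMDD dates, `C` or `P` for `right`, and `ATM`, `OTM`, or `ITM` for `moneyness`. As a rough sketch, a subset of them can be collected in a dictionary and sanity-checked before being passed to `init_loaders`. The helper `check_loader_args`, the example values, and the assumption that the test fraction is whatever remains after `train_split` and `val_split` are illustrative and not taken from optrade.

```python
from datetime import datetime


def check_loader_args(args: dict) -> float:
    """Check documented formats; return the leftover (assumed test) fraction."""
    for key in ("start_date", "end_date"):
        value = args[key]
        # Dates are documented as YYYYMMDD strings.
        if len(value) != 8 or not value.isdigit():
            raise ValueError(f"{key} must be in YYYYMMDD format")
        datetime.strptime(value, "%Y%m%d")  # raises ValueError on impossible dates
    # YYYYMMDD strings sort chronologically, so string comparison is enough.
    if args["start_date"] >= args["end_date"]:
        raise ValueError("start_date must precede end_date")
    if args["right"] not in {"C", "P"}:
        raise ValueError("right must be 'C' (call) or 'P' (put)")
    if args["moneyness"] not in {"ATM", "OTM", "ITM"}:
        raise ValueError("moneyness must be 'ATM', 'OTM', or 'ITM'")
    train, val = args["train_split"], args["val_split"]
    # Assumption: the test set is the remainder after train and validation.
    remainder = 1.0 - train - val
    if train <= 0 or val < 0 or remainder < -1e-9:
        raise ValueError("train_split and val_split must sum to at most 1")
    return max(remainder, 0.0)


# Hypothetical values; only the formats come from the parameter list above.
example_args = {
    "start_date": "20230103",
    "end_date": "20230630",
    "right": "C",
    "moneyness": "ATM",
    "train_split": 0.7,
    "val_split": 0.1,
}
test_fraction = check_loader_args(example_args)
```

Checking these values up front can surface a malformed date or option type before any data is requested.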

set_seed()[source]

Fixes a seed for reproducibility purposes.

Return type:

None
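
To illustrate why a fixed seed matters, the sketch below seeds only Python's standard `random` module and shows that two runs with the same seed produce the same draws. This is not optrade's implementation: `set_seed` takes no arguments and presumably uses the `seed` passed to the constructor, and a PyTorch experiment would typically also seed NumPy and PyTorch, but neither detail is confirmed on this page. The helper name `draw_with_seed` is hypothetical.

```python
import random


def draw_with_seed(seed: int, n: int = 3) -> list:
    """Return n pseudo-random floats from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator, leaves global state alone
    return [rng.random() for _ in range(n)]


first = draw_with_seed(42)
second = draw_with_seed(42)     # same seed, same sequence
different = draw_with_seed(43)  # different seed, different sequence
```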

init_earlystopping(path, patience)[source]

Parameters:
  • path (str)

  • patience (int)

Return type:

None
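
This page does not describe what `init_earlystopping` creates. As a rough illustration of patience-based early stopping in general, the sketch below stops once the validation loss has failed to improve for `patience` consecutive epochs. The class `PatienceStopper` is hypothetical; it stores `path` but does not write a checkpoint, and it is not optrade's implementation.

```python
class PatienceStopper:
    """Hypothetical patience-based early stopping; not optrade's class."""

    def __init__(self, path: str, patience: int):
        self.path = path          # where a real implementation would checkpoint
        self.patience = patience  # epochs to wait without improvement
        self.best = float("inf")
        self.bad_epochs = 0
        self.should_stop = False

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if it improved."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
            return True
        self.bad_epochs += 1
        if self.bad_epochs >= self.patience:
            self.should_stop = True
        return False


stopper = PatienceStopper(path="best_model.pt", patience=2)
epochs_run = 0
for loss in [0.9, 0.7, 0.8, 0.75, 0.6]:
    epochs_run += 1
    stopper.step(loss)
    if stopper.should_stop:
        break  # two epochs in a row failed to beat the best loss
```

With `patience=2`, the loop ends before the final, better loss is ever seen, which is the usual trade-off when choosing a small patience.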

get_sklearn_data(train_loader=None, val_loader=None, test_loader=None, has_datetime=False)[source]

Parameters:
  • train_loader (DataLoader | None)

  • val_loader (DataLoader | None)

  • test_loader (DataLoader | None)

  • has_datetime (bool)

Return type:

None

train_sklearn(model, param_dict, tuning_method='grid', n_splits=5, verbose=1, n_jobs=-1, n_iter=100, train_x=None, train_y=None, target_type='multistep', best_model_path=None, early_stopping=False, patience=None)[source]

Parameters:
  • model (BaseEstimator)

  • param_dict (dict)

  • tuning_method (str)

  • n_splits (int)

  • verbose (int)

  • n_jobs (int)

  • n_iter (int)

  • train_x (ndarray | None)

  • train_y (ndarray | None)

  • target_type (str)

  • best_model_path (str | None)

  • early_stopping (bool)

  • patience (int | None)

Return type:

None

test_sklearn(metrics, target_type='multistep', train_x=None, train_y=None, test_x=None, test_y=None, best_model=None)[source]

Parameters:
  • metrics (List[str])

  • target_type (str)

  • train_x (ndarray | None)

  • train_y (ndarray | None)

  • test_x (ndarray | None)

  • test_y (ndarray | None)

  • best_model (BaseEstimator | None)

Return type:

None

train_torch(model, optimizer, criterion, num_epochs, device=None, train_loader=None, val_loader=None, metrics=['mse'], best_model_metric='mse', best_model_path=None, early_stopping=False, patience=None, scheduler=None, target_type='multistep')[source]

Trains a model.

Parameters:
  • model (Module | BaseEstimator) – The model to train.

  • optimizer (Optimizer) – The optimizer to use.

  • criterion (Module) – The loss function.

  • train_loader (DataLoader | None) – The training data loader.

  • num_epochs (int) – The number of epochs to train.

  • val_loader (DataLoader | None) – The validation data loader.

  • metrics (List[str]) – The evaluation metrics to track.

  • best_model_path (str | None) – The path to save the best model.

  • early_stopping (bool) – Whether to use early stopping.

  • patience (int | None) – The number of epochs to wait before stopping.

  • scheduler (_LRScheduler | None) – The learning rate scheduler.

  • best_model_metric (str)

  • target_type (str)

Return type:

Module

validate_torch(model, val_loader, criterion, device, best_model_metric='mse', metrics=['mse'], epoch=None, best_model_path=None, early_stopping=False, target_type='multistep')[source]

Validates the model.

Parameters:
  • model (Module) – The model to validate.

  • val_loader (DataLoader) – The validation data loader.

  • criterion (Module) – The loss function.

  • best_model_metric (str) – The metric to use for the best model.

  • metrics (List[str]) – The evaluation metrics to track.

  • epoch (int | None) – The current epoch.

  • best_model_path (str | None) – The path to save the best model.

  • early_stopping (bool | None) – Whether to use early stopping.

  • target_type (str)

Returns:

None

Return type:

None

test_torch(model, criterion, device=None, test_loader=None, metrics=['mse'], target_type='multistep')[source]

Tests the model.

Parameters:
  • model (Module | BaseEstimator) – The model to test.

  • test_loader (DataLoader | None) – The test data loader.

  • criterion (Module) – The loss function.

  • metrics (List[str]) – The evaluation metrics to track.

  • target_type (str)

Returns:

None

Return type:

None
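
The signatures above suggest a typical call order, sketched below as a plain function that takes an already-constructed `Experiment` (or any object exposing the same methods). The order of calls, the helper name `run_torch_workflow`, and the reliance on `device`, `train_loader`, `val_loader`, and `test_loader` being left at their `None` defaults are assumptions drawn from the signatures on this page, not confirmed behavior.

```python
def run_torch_workflow(exp, model, optimizer, criterion, loader_kwargs, num_epochs=10):
    """Assumed call order using only methods and keyword names documented above."""
    exp.set_seed()
    exp.init_device(gpu_id=0, mps=False)
    # loader_kwargs holds the init_loaders arguments (root, start_date, ...).
    exp.init_loaders(**loader_kwargs)
    # train_torch is documented to return a Module.
    trained = exp.train_torch(
        model=model,
        optimizer=optimizer,
        criterion=criterion,
        num_epochs=num_epochs,
        metrics=["mse"],
        best_model_metric="mse",
    )
    exp.test_torch(model=trained, criterion=criterion, metrics=["mse"])
    exp.save_logs()
    return trained
```

Keeping the sequence in one function makes it easy to rerun the same steps under a different `seed` or `ablation_id`.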

evaluate(model, loader, criterion, device, metrics=['mse'], target_type='multistep')[source]

Evaluates the model and returns metrics. Used in validation and testing.

Parameters:
  • model (Module | BaseEstimator) – The model to evaluate.

  • loader (DataLoader) – The data loader.

  • criterion (Module) – The loss function.

  • metrics (List[str]) – The evaluation metrics to track.

  • target_type (str)

Returns:

stats – A dictionary of evaluation metrics.

Return type:

dict
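
As a minimal sketch of what such a dictionary might look like, the function below computes metrics by name over plain Python lists. The helper `compute_stats`, the `'mae'` key, and the flat `{name: value}` layout are assumptions; only `'mse'` appears on this page (as the default metric), and the actual structure of `stats` is not specified here.

```python
def compute_stats(preds, targets, metrics=("mse",)):
    """Return {metric_name: value} for the requested metrics (assumed layout)."""
    if not preds or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equal in length")
    errors = [p - t for p, t in zip(preds, targets)]
    available = {
        "mse": lambda: sum(e * e for e in errors) / len(errors),
        "mae": lambda: sum(abs(e) for e in errors) / len(errors),
    }
    stats = {}
    for name in metrics:
        if name not in available:
            raise ValueError(f"unknown metric: {name}")
        stats[name] = available[name]()
    return stats


# Errors are 0, -1, and 2, so mse = 5/3 and mae = 1.0.
stats = compute_stats([1.0, 2.0, 4.0], [1.0, 3.0, 2.0], metrics=["mse", "mae"])
```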

log_universe(market_metrics, root, parent)[source]

Logs the universe to the logger.

Parameters:
  • root (str) – The root directory containing the data.

  • universe – The Universe object.

  • parent (str) – The parent experiment ID.

Returns:

None

Return type:

None

log_stats(stats, metrics, mode)[source]

Parameters:
  • stats (dict)

  • metrics (List[str])

  • mode (str)

epoch_logger(logger, key, value)[source]

Parameters:
  • key (str)

  • value (str)

Return type:

None

print_master(message)[source]

Prints statements to the rank 0 node.

Parameters:
  • message (str)
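
A rank-0 print guard is commonly written as below. How optrade determines the rank is not shown on this page; this sketch assumes the `RANK` environment variable that `torchrun` sets for each process, and treats a missing variable as a single-process run. The function name `print_rank_zero` is hypothetical.

```python
import os


def print_rank_zero(message: str) -> bool:
    """Print only on rank 0; return True if the message was printed."""
    # Assumption: the process rank is exposed via the RANK environment variable.
    rank = int(os.environ.get("RANK", "0"))
    if rank == 0:
        print(message)
        return True
    return False
```

Guarding output this way avoids one copy of every message per process in a multi-process run.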

Module contents