optrade.exp
optrade.exp.forecasting
- class Experiment(log_dir='logs', logging='offline', seed=42, ablation_id=None, exp_id='MY7GZu', neptune_project_name=None, neptune_api_token=None, download_only=False)[source]
Bases: object
- Parameters:
log_dir (str | Path)
logging (str)
seed (int)
ablation_id (int | None)
exp_id (str)
neptune_project_name (str | None)
neptune_api_token (str | None)
download_only (bool)
- __init__(log_dir='logs', logging='offline', seed=42, ablation_id=None, exp_id='MY7GZu', neptune_project_name=None, neptune_api_token=None, download_only=False)[source]
Experiment class for training and evaluating forecasting models in PyTorch.
- Parameters:
log_dir (str | Path) – The directory to save logs.
logging (str) – The logging method to use. Options: {“offline”, “neptune”}.
seed (int) – The random seed to use for reproducibility.
ablation_id (int | None) – The ablation ID.
exp_id (str) – The experiment ID.
neptune_project_name (str | None) – The Neptune project name.
neptune_api_token (str | None) – The Neptune API token.
download_only (bool) – Whether to only download the data without running an experiment.
- Returns:
None
- Return type:
None
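The constructor arguments can be checked before an Experiment is created. The following is a minimal sketch, not part of optrade: `build_experiment_kwargs` is a hypothetical helper, and the rule that Neptune logging needs both a project name and an API token is an assumption rather than documented behavior. Only the two logging options come from the parameter list above.

```python
from pathlib import Path

# The two logging options listed in the parameter description.
VALID_LOGGING = {"offline", "neptune"}


def build_experiment_kwargs(log_dir="logs", logging="offline", seed=42,
                            neptune_project_name=None, neptune_api_token=None):
    """Hypothetical helper: validate arguments before passing them on."""
    if logging not in VALID_LOGGING:
        raise ValueError(
            f"logging must be one of {sorted(VALID_LOGGING)}, got {logging!r}"
        )
    # Assumption: Neptune logging needs both a project name and an API token.
    if logging == "neptune" and not (neptune_project_name and neptune_api_token):
        raise ValueError("neptune logging requires a project name and an API token")
    return {
        "log_dir": Path(log_dir),
        "logging": logging,
        "seed": seed,
        "neptune_project_name": neptune_project_name,
        "neptune_api_token": neptune_api_token,
    }


kwargs = build_experiment_kwargs(log_dir="logs", logging="offline", seed=42)
```

With optrade installed, the resulting dictionary could presumably be passed on as `Experiment(**kwargs)`.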
- init_device(gpu_id=0, mps=False)[source]
Initialize CUDA (or MPS) devices.
- Parameters:
gpu_id (int) – The GPU ID to use.
mps (bool) – Whether to use MPS (Metal Performance Shaders) for macOS.
- Returns:
None
- Return type:
None
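The device choice implied by `gpu_id` and `mps` can be illustrated without PyTorch. This is a sketch under stated assumptions: `pick_device` is a hypothetical function, and the precedence (MPS when requested and available, then CUDA, then CPU) is assumed, not taken from the optrade source. In practice the two availability flags would typically come from `torch.backends.mps.is_available()` and `torch.cuda.is_available()`.

```python
def pick_device(gpu_id=0, mps=False, cuda_available=False, mps_available=False):
    """Hypothetical helper returning a PyTorch-style device string."""
    # Prefer MPS only when the caller asked for it and it is available.
    if mps and mps_available:
        return "mps"
    # Otherwise use the requested CUDA device if CUDA is available.
    if cuda_available:
        return f"cuda:{gpu_id}"
    # Assumed fallback when no accelerator can be used.
    return "cpu"


device_name = pick_device(gpu_id=1, cuda_available=True)
```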
- init_logger(exp_id)[source]
Initialize the logger.
- Parameters:
logdir – The directory to save logs.
ablation_id – The ablation ID.
exp_id (str) – The experiment ID.
- Returns:
None
- Return type:
None
- init_loaders(root, start_date, end_date, contract_stride, interval_min, right, target_tte, tte_tolerance, moneyness, train_split, val_split, seq_len, pred_len, scaling=False, dtype='float32', core_feats=['option_returns'], tte_feats=None, datetime_feats=None, vol_feats=None, rolling_volatility_range=None, keep_datetime=False, target_channels=None, target_type='multistep', strike_band=0.05, volatility_type='period', volatility_scaled=False, volatility_scalar=1.0, batch_size=32, shuffle=True, drop_last=False, num_workers=4, prefetch_factor=2, pin_memory=False, persistent_workers=True, clean_up=False, offline=False, save_dir=None, verbose=False, validate_contracts=True, modify_contracts=False, dev_mode=False, download_only=False)[source]
Initializes the data loaders for training, validation, and testing.
- Parameters:
root (str) – The root directory containing the data.
start_date (str) – The start date for the data in YYYYMMDD format.
end_date (str) – The end date for the data in YYYYMMDD format.
contract_stride (int) – The stride for the contracts.
interval_min (int) – The interval in minutes for the data.
right (str) – The option type (C for call, P for put).
target_tte (int) – The target time to expiration in minutes.
moneyness (str) – The moneyness type (ATM, OTM, ITM).
train_split (float) – The fraction of data to use for training.
val_split (float) – The fraction of data to use for validation.
strike_band (float | None) – The band to select the strike price for OTM or ITM options.
volatility_type (str | None) – The type of historical volatility to use.
volatility_scaled (bool) – Whether to scale strike selection by the volatility.
volatility_scalar (float | None) – The scalar to multiply the volatility.
validate_contracts (bool) – Whether to validate contracts by requesting the data from ThetaData API.
modify_contracts (bool) – Whether to overwrite the contracts .pkl files if certain contracts are invalid.
seq_len (int) – The sequence length of the lookback window.
pred_len (int) – The prediction length of the forecast window.
scaling (bool) – Whether to apply normalization.
core_feats (List[str]) – The core features to use for the model.
tte_feats (List[str] | None) – The time-to-expiration features to use for the model.
datetime_feats (List[str] | None) – The datetime features to use for the model.
keep_datetime (bool) – Whether to keep the datetime column in the dataset.
target_channels (List[str] | None) – The target channels to use for the model.
target_type (str) – The type of target. Options: “multistep”, “average”, or “average_direction”.
batch_size (int) – The batch size for the data loaders.
shuffle (bool) – Whether to shuffle the data.
drop_last (bool) – Whether to drop the last batch if it is smaller than the batch size.
num_workers (int) – The number of workers for the data loaders.
prefetch_factor (int | None) – The number of batches to prefetch.
pin_memory (bool) – Whether to pin memory for the data loaders.
persistent_workers (bool) – Whether to use persistent workers for the data loaders.
clean_up (bool) – Whether to clean up the data after loading.
offline (bool) – Whether to use offline mode.
save_dir (str | None) – The directory to save the data.
verbose (bool) – Whether to print verbose output.
dev_mode (bool) – Whether to run in development mode.
download_only (bool) – Whether to only download the data without running an experiment.
tte_tolerance (Tuple[int, int])
dtype (str)
vol_feats (List[str] | None)
rolling_volatility_range (List[int] | None)
- Returns:
None
- Return type:
None
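Several of these parameters interact: the dates are YYYYMMDD strings, the splits are fractions, and `seq_len` and `pred_len` define a lookback window followed by a forecast window. The sketch below illustrates those relationships with the standard library only. Both functions are hypothetical and not part of optrade. Treating the remaining fraction as the test split and using a window stride of 1 are assumptions.

```python
from datetime import datetime


def check_loader_args(start_date, end_date, train_split, val_split):
    """Hypothetical pre-flight check; returns the fraction left for testing."""
    # Dates are documented as YYYYMMDD strings.
    start = datetime.strptime(start_date, "%Y%m%d")
    end = datetime.strptime(end_date, "%Y%m%d")
    if start >= end:
        raise ValueError("start_date must be earlier than end_date")
    # Assumption: data not used for training or validation forms the test set.
    test_split = round(1.0 - train_split - val_split, 10)
    if train_split <= 0 or val_split < 0 or test_split < 0:
        raise ValueError(
            "train_split must be positive, val_split non-negative, "
            "and their sum at most 1"
        )
    return test_split


def sliding_windows(series, seq_len, pred_len):
    """Split a sequence into (lookback, forecast) pairs with stride 1."""
    pairs = []
    for i in range(len(series) - seq_len - pred_len + 1):
        lookback = series[i:i + seq_len]
        forecast = series[i + seq_len:i + seq_len + pred_len]
        pairs.append((lookback, forecast))
    return pairs


test_fraction = check_loader_args("20230101", "20230630", 0.7, 0.1)
pairs = sliding_windows([0, 1, 2, 3, 4, 5], seq_len=3, pred_len=2)
# pairs == [([0, 1, 2], [3, 4]), ([1, 2, 3], [4, 5])]
```

A series of length n yields n - seq_len - pred_len + 1 such pairs, so a series shorter than seq_len + pred_len produces none.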
- get_sklearn_data(train_loader=None, val_loader=None, test_loader=None, has_datetime=False)[source]
- Parameters:
train_loader (DataLoader | None)
val_loader (DataLoader | None)
test_loader (DataLoader | None)
has_datetime (bool)
- Return type:
None
- train_sklearn(model, param_dict, tuning_method='grid', n_splits=5, verbose=1, n_jobs=-1, n_iter=100, train_x=None, train_y=None, target_type='multistep', best_model_path=None, early_stopping=False, patience=None)[source]
- Parameters:
model (BaseEstimator)
param_dict (dict)
tuning_method (str)
n_splits (int)
verbose (int)
n_jobs (int)
n_iter (int)
train_x (ndarray | None)
train_y (ndarray | None)
target_type (str)
best_model_path (str | None)
early_stopping (bool)
patience (int | None)
- Return type:
None
- test_sklearn(metrics, target_type='multistep', train_x=None, train_y=None, test_x=None, test_y=None, best_model=None)[source]
- Parameters:
metrics (List[str])
target_type (str)
train_x (ndarray | None)
train_y (ndarray | None)
test_x (ndarray | None)
test_y (ndarray | None)
best_model (BaseEstimator | None)
- Return type:
None
- train_torch(model, optimizer, criterion, num_epochs, device=None, train_loader=None, val_loader=None, metrics=['mse'], best_model_metric='mse', best_model_path=None, early_stopping=False, patience=None, scheduler=None, target_type='multistep')[source]
Trains a model.
- Parameters:
model (Module | BaseEstimator) – The model to train.
optimizer (Optimizer) – The optimizer to use.
criterion (Module) – The loss function.
train_loader (DataLoader | None) – The training data loader.
num_epochs (int) – The number of epochs to train.
val_loader (DataLoader | None) – The validation data loader.
metrics (List[str]) – The evaluation metrics to track.
best_model_path (str | None) – The path to save the best model.
early_stopping (bool) – Whether to use early stopping.
patience (int | None) – The number of epochs to wait before stopping.
scheduler (_LRScheduler | None) – The learning rate scheduler.
best_model_metric (str)
target_type (str)
- Return type:
Module
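The interaction between `early_stopping`, `patience`, and the best-model checkpoint can be sketched as a generic patience loop. This illustrates the common technique only and is not the optrade implementation: `run_early_stopping` is a hypothetical function that replays a list of per-epoch validation losses, and lower-is-better is assumed for the tracked metric (as it would be for "mse").

```python
def run_early_stopping(val_losses, patience):
    """Return (best_epoch, last_epoch_run) for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: a real trainer would save a checkpoint here.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            # Stop once `patience` consecutive epochs bring no improvement.
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1


losses = [1.0, 0.8, 0.9, 0.85, 0.95, 0.7]
best_epoch, last_epoch = run_early_stopping(losses, patience=2)
# best_epoch == 1, last_epoch == 3: training stops before the 0.7 epoch
```

With `patience=2` the run stops at epoch 3 and never reaches the lower loss at epoch 5, which shows the trade-off a small patience value makes.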
- validate_torch(model, val_loader, criterion, device, best_model_metric='mse', metrics=['mse'], epoch=None, best_model_path=None, early_stopping=False, target_type='multistep')[source]
Validates the model.
- Parameters:
model (Module) – The model to validate.
val_loader (DataLoader) – The validation data loader.
criterion (Module) – The loss function.
best_model_metric (str) – The metric to use for the best model.
metrics (List[str]) – The evaluation metrics to track.
epoch (int | None) – The current epoch.
best_model_path (str | None) – The path to save the best model.
early_stopping (bool | None) – Whether to use early stopping.
target_type (str)
- Returns:
None
- Return type:
None
- test_torch(model, criterion, device=None, test_loader=None, metrics=['mse'], target_type='multistep')[source]
Tests the model.
- Parameters:
model (Module | BaseEstimator) – The model to test.
test_loader (DataLoader | None) – The test data loader.
criterion (Module) – The loss function.
metrics (List[str]) – The evaluation metrics to track.
target_type (str)
- Returns:
None
- Return type:
None
- evaluate(model, loader, criterion, device, metrics=['mse'], target_type='multistep')[source]
Evaluates the model and returns metrics. Used in validation and testing.
- Parameters:
model (Module | BaseEstimator) – The model to evaluate.
loader (DataLoader) – The data loader.
criterion (Module) – The loss function.
metrics (List[str]) – The evaluation metrics to track.
target_type (str)
- Returns:
stats – A dictionary of evaluation metrics.
- Return type:
dict
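The returned dictionary maps metric names to values. The following is a plain-Python sketch of that shape, not the optrade implementation: `compute_metrics` is a hypothetical function, and "mae" is an assumed metric name used for illustration (only "mse" appears in the signatures above).

```python
def compute_metrics(preds, targets, metrics=("mse",)):
    """Hypothetical helper: return a dict of metric name -> value."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    n = len(preds)
    # Registry of supported metrics; "mae" is an assumption for illustration.
    registry = {
        "mse": lambda: sum((p - t) ** 2 for p, t in zip(preds, targets)) / n,
        "mae": lambda: sum(abs(p - t) for p, t in zip(preds, targets)) / n,
    }
    stats = {}
    for name in metrics:
        if name not in registry:
            raise ValueError(f"unknown metric: {name!r}")
        stats[name] = registry[name]()
    return stats


stats = compute_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], metrics=["mse", "mae"])
```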