optrade.exp

optrade.exp.forecasting

class Experiment(log_dir='logs', logging='offline', seed=42, ablation_id=None, exp_id='MY7GZu', neptune_project_name=None, neptune_api_token=None, download_only=False)[source]

Bases: object

Parameters:

log_dir (str | Path)
logging (str)
seed (int)
ablation_id (int | None)
exp_id (str)
neptune_project_name (str | None)
neptune_api_token (str | None)
download_only (bool)

__init__(log_dir='logs', logging='offline', seed=42, ablation_id=None, exp_id='MY7GZu', neptune_project_name=None, neptune_api_token=None, download_only=False)[source]

Experiment class for training and evaluating forecasting models in PyTorch.

Parameters:

log_dir (str | Path) – The directory to save logs.
logging (str) – The logging method to use. Options: {“offline”, “neptune”}.
seed (int) – The random seed to use for reproducibility.
ablation_id (int | None) – The ablation ID.
exp_id (str) – The experiment ID.
neptune_project_name (str | None) – The Neptune project name.
neptune_api_token (str | None) – The Neptune API token.
download_only (bool) – Whether to only download the data without running an experiment.

Returns:

None

Return type:

None
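
As a rough illustration of the constructor arguments, the sketch below collects keyword arguments for Experiment and checks the logging option against the two documented values. The helper name build_experiment_kwargs is hypothetical, and the rule that Neptune logging needs both a project name and an API token is an assumption rather than something this reference states.

```python
from pathlib import Path
from typing import Any, Dict, Optional

# The two logging options listed in the parameter description.
LOGGING_OPTIONS = {"offline", "neptune"}


def build_experiment_kwargs(
    log_dir: str = "logs",
    logging: str = "offline",
    seed: int = 42,
    neptune_project_name: Optional[str] = None,
    neptune_api_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect and sanity-check Experiment arguments."""
    if logging not in LOGGING_OPTIONS:
        raise ValueError(f"logging must be one of {sorted(LOGGING_OPTIONS)}")
    # Assumption: Neptune logging needs credentials; the docs do not say so.
    if logging == "neptune" and not (neptune_project_name and neptune_api_token):
        raise ValueError("neptune logging needs a project name and an API token")
    return {
        "log_dir": Path(log_dir),
        "logging": logging,
        "seed": seed,
        "neptune_project_name": neptune_project_name,
        "neptune_api_token": neptune_api_token,
    }


kwargs = build_experiment_kwargs(log_dir="logs", logging="offline", seed=42)
# With optrade installed, these could be passed on as Experiment(**kwargs).
```

Validating the arguments before construction keeps a misspelled logging mode from surfacing only after data has been downloaded.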

save_logs()[source]

init_device(gpu_id=0, mps=False)[source]

Initialize CUDA (or MPS) devices.

Parameters:

gpu_id (int) – The GPU ID to use.
mps (bool) – Whether to use MPS (Metal Performance Shaders) for macOS.

Returns:

None

Return type:

None
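
The interplay of gpu_id and mps can be pictured with a small device-selection sketch. The function pick_device and its two availability flags are hypothetical, and the priority order (MPS when requested, then CUDA, then CPU) is an assumption; init_device may resolve devices differently.

```python
def pick_device(
    gpu_id: int = 0,
    mps: bool = False,
    cuda_available: bool = False,
    mps_available: bool = False,
) -> str:
    """Return a device string in the form PyTorch accepts."""
    # Assumed priority: honor the mps flag first, then CUDA, then CPU.
    if mps and mps_available:
        return "mps"
    if cuda_available:
        return f"cuda:{gpu_id}"
    return "cpu"


device_name = pick_device(gpu_id=1, cuda_available=True)
```

With PyTorch installed, the availability flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().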

init_logger(exp_id)[source]

Initialize the logger.

Parameters:

log_dir – The directory to save logs.
ablation_id – The ablation ID.
exp_id (str) – The experiment ID.

Returns:

None

Return type:

None

init_loaders(root, start_date, end_date, contract_stride, interval_min, right, target_tte, tte_tolerance, moneyness, train_split, val_split, seq_len, pred_len, scaling=False, dtype='float32', core_feats=['option_returns'], tte_feats=None, datetime_feats=None, vol_feats=None, rolling_volatility_range=None, keep_datetime=False, target_channels=None, target_type='multistep', strike_band=0.05, volatility_type='period', volatility_scaled=False, volatility_scalar=1.0, batch_size=32, shuffle=True, drop_last=False, num_workers=4, prefetch_factor=2, pin_memory=False, persistent_workers=True, clean_up=False, offline=False, save_dir=None, verbose=False, validate_contracts=True, modify_contracts=False, dev_mode=False, download_only=False)[source]

Initializes the data loaders for training, validation, and testing.

Parameters:

root (str) – The root directory containing the data.
start_date (str) – The start date for the data in YYYYMMDD format.
end_date (str) – The end date for the data in YYYYMMDD format.
contract_stride (int) – The stride for the contracts.
interval_min (int) – The interval in minutes for the data.
right (str) – The option type (C for call, P for put).
target_tte (int) – The target time to expiration in minutes.
moneyness (str) – The moneyness type (ATM, OTM, ITM).
train_split (float) – The fraction of data to use for training.
val_split (float) – The fraction of data to use for validation.
strike_band (float | None) – The band to select the strike price for OTM or ITM options.
volatility_type (str | None) – The type of historical volatility to use.
volatility_scaled (bool) – Whether to scale strike selection by the volatility.
volatility_scalar (float | None) – The scalar to multiply the volatility.
validate_contracts (bool) – Whether to validate contracts by requesting the data from ThetaData API.
modify_contracts (bool) – Whether to overwrite contracts .pkl files if certain contracts are invalid.
seq_len (int) – The sequence length of the lookback window.
pred_len (int) – The prediction length of the forecast window.
scaling (bool) – Whether to apply normalization.
core_feats (List[str]) – The core features to use for the model.
tte_feats (List[str] | None) – The time-to-expiration features to use for the model.
datetime_feats (List[str] | None) – The datetime features to use for the model.
keep_datetime (bool) – Whether to keep the datetime column in the dataset.
target_channels (List[str] | None) – The target channels to use for the model.
target_type (str) – The type of target. Options: “multistep”, “average”, or “average_direction”.
batch_size (int) – The batch size for the data loaders.
shuffle (bool) – Whether to shuffle the data.
drop_last (bool) – Whether to drop the last batch if it is smaller than the batch size.
num_workers (int) – The number of workers for the data loaders.
prefetch_factor (int | None) – The number of batches to prefetch.
pin_memory (bool) – Whether to pin memory for the data loaders.
persistent_workers (bool) – Whether to use persistent workers for the data loaders.
clean_up (bool) – Whether to clean up the data after loading.
offline (bool) – Whether to use offline mode.
save_dir (str | None) – The directory to save the data.
verbose (bool) – Whether to print verbose output.
dev_mode (bool) – Whether to run in development mode.
download_only (bool) – Whether to only download the data without running an experiment.
tte_tolerance (Tuple[int, int])
dtype (str)
vol_feats (List[str] | None)
rolling_volatility_range (List[int] | None)

Returns:

None

Return type:

None
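
To make train_split, val_split, seq_len, and pred_len more concrete, here is a minimal sketch of how such values could partition a series and how many lookback/forecast windows a segment yields. Both helper names are hypothetical. The sketch assumes a chronological split with the test set as the remainder and a window stride of one step; the reference does not confirm that init_loaders works this way.

```python
from typing import Tuple


def split_bounds(n: int, train_split: float, val_split: float) -> Tuple[int, int]:
    """Return the indices where the training and validation segments end."""
    if not (0 < train_split < 1 and 0 <= val_split < 1):
        raise ValueError("split fractions must lie in [0, 1)")
    if train_split + val_split >= 1:
        raise ValueError("train_split + val_split must leave room for a test set")
    train_end = int(n * train_split)
    val_end = train_end + int(n * val_split)
    return train_end, val_end


def num_windows(n: int, seq_len: int, pred_len: int) -> int:
    """Count stride-1 windows of seq_len inputs followed by pred_len targets."""
    return max(0, n - seq_len - pred_len + 1)


n = 1000
train_end, val_end = split_bounds(n, train_split=0.7, val_split=0.1)
# Rows [0, train_end) train, [train_end, val_end) validate, the rest test.
train_windows = num_windows(train_end, seq_len=30, pred_len=10)
```

A segment shorter than seq_len + pred_len produces no windows at all, which is worth checking before choosing a small val_split.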

set_seed()[source]

Fixes a seed for reproducibility purposes.

Return type:

None
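
For orientation, seeding in a PyTorch project often looks like the sketch below. The function seed_everything is a hypothetical stand-in, not the body of set_seed, and the set of libraries it seeds is an assumption. NumPy and PyTorch are imported lazily so the snippet also runs where they are absent.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed the common random number generators (illustrative only)."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed.
    try:
        import torch

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed.


seed_everything(42)
first_draw = random.random()
seed_everything(42)
second_draw = random.random()  # Repeats first_draw because the seed was reset.
```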

init_earlystopping(path, patience)[source]

Parameters:

path (str)
patience (int)

Return type:

None
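
The role of patience can be shown with a small early-stopping sketch. The EarlyStopper class and its min_delta argument are hypothetical. The path argument of init_earlystopping presumably points at a checkpoint location, which is an assumption, so the sketch leaves checkpointing out.

```python
class EarlyStopper:
    """Stop once the validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience: int, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0  # Improvement resets the counter.
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


stopper = EarlyStopper(patience=2)
losses = [1.0, 0.8, 0.9, 0.85, 0.7]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

In this run the loop stops before the final, better loss is seen, which shows the trade-off of a small patience value.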

get_sklearn_data(train_loader=None, val_loader=None, test_loader=None, has_datetime=False)[source]

Parameters:

train_loader (DataLoader | None)
val_loader (DataLoader | None)
test_loader (DataLoader | None)
has_datetime (bool)

Return type:

None

train_sklearn(model, param_dict, tuning_method='grid', n_splits=5, verbose=1, n_jobs=-1, n_iter=100, train_x=None, train_y=None, target_type='multistep', best_model_path=None, early_stopping=False, patience=None)[source]

Parameters:

model (BaseEstimator)
param_dict (dict)
tuning_method (str)
n_splits (int)
verbose (int)
n_jobs (int)
n_iter (int)
train_x (ndarray | None)
train_y (ndarray | None)
target_type (str)
best_model_path (str | None)
early_stopping (bool)
patience (int | None)

Return type:

None

test_sklearn(metrics, target_type='multistep', train_x=None, train_y=None, test_x=None, test_y=None, best_model=None)[source]

Parameters:

metrics (List[str])
target_type (str)
train_x (ndarray | None)
train_y (ndarray | None)
test_x (ndarray | None)
test_y (ndarray | None)
best_model (BaseEstimator | None)

Return type:

None

train_torch(model, optimizer, criterion, num_epochs, device=None, train_loader=None, val_loader=None, metrics=['mse'], best_model_metric='mse', best_model_path=None, early_stopping=False, patience=None, scheduler=None, target_type='multistep')[source]

Trains a model.

Parameters:

model (Module | BaseEstimator) – The model to train.
optimizer (Optimizer) – The optimizer to use.
criterion (Module) – The loss function.
train_loader (DataLoader | None) – The training data loader.
num_epochs (int) – The number of epochs to train.
val_loader (DataLoader | None) – The validation data loader.
metrics (List[str]) – The evaluation metrics to track.
best_model_path (str | None) – The path to save the best model.
early_stopping (bool) – Whether to use early stopping.
patience (int | None) – The number of epochs to wait before stopping.
scheduler (_LRScheduler | None) – The learning rate scheduler.
best_model_metric (str)
target_type (str)

Return type:

Module

validate_torch(model, val_loader, criterion, device, best_model_metric='mse', metrics=['mse'], epoch=None, best_model_path=None, early_stopping=False, target_type='multistep')[source]

Validates the model.

Parameters:

model (Module) – The model to validate.
val_loader (DataLoader) – The validation data loader.
criterion (Module) – The loss function.
best_model_metric (str) – The metric to use for the best model.
metrics (List[str]) – The evaluation metrics to track.
epoch (int | None) – The current epoch.
best_model_path (str | None) – The path to save the best model.
early_stopping (bool | None) – Whether to use early stopping.
target_type (str)

Returns:

None

Return type:

None

test_torch(model, criterion, device=None, test_loader=None, metrics=['mse'], target_type='multistep')[source]

Tests the model.

Parameters:

model (Module | BaseEstimator) – The model to test.
test_loader (DataLoader | None) – The test data loader.
criterion (Module) – The loss function.
metrics (List[str]) – The evaluation metrics to track.
target_type (str)

Returns:

None

Return type:

None

evaluate(model, loader, criterion, device, metrics=['mse'], target_type='multistep')[source]

Evaluates the model and returns metrics. Used in validation and testing.

Parameters:

model (Module | BaseEstimator) – The model to evaluate.
loader (DataLoader) – The data loader.
criterion (Module) – The loss function.
metrics (List[str]) – The evaluation metrics to track.
target_type (str)

Returns:

stats – A dictionary of evaluation metrics.

Return type:

dict
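
The shape of the returned dictionary can be sketched as a mapping from metric name to value. The helper compute_stats and the "mae" entry are assumptions for illustration; only "mse" appears in the signatures above, and the real evaluate works on a model and a data loader rather than plain lists.

```python
from typing import Callable, Dict, List, Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error over paired values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error over paired values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical registry mapping metric names to functions.
METRIC_FNS: Dict[str, Callable[[Sequence[float], Sequence[float]], float]] = {
    "mse": mse,
    "mae": mae,
}


def compute_stats(
    y_true: Sequence[float], y_pred: Sequence[float], metrics: List[str]
) -> Dict[str, float]:
    """Return one entry per requested metric name."""
    return {name: METRIC_FNS[name](y_true, y_pred) for name in metrics}


stats = compute_stats([1.0, 2.0, 3.0], [1.0, 2.0, 5.0], metrics=["mse", "mae"])
```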

log_universe(market_metrics, root, parent)[source]

Logs the universe to the logger.

Parameters:

root (str) – The root directory containing the data.
universe – The Universe object.
parent (str) – The parent experiment ID.

Returns:

None

Return type:

None

log_stats(stats, metrics, mode)[source]

Parameters:

stats (dict)
metrics (List[str])
mode (str)

epoch_logger(logger, key, value)[source]

Parameters:

key (str)
value (str)

Return type:

None

print_master(message)[source]

Prints a message from the rank 0 process only.

Parameters:

message (str)
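
A rank-gated print is commonly written as in the sketch below. The function print_on_rank_zero is a hypothetical stand-in. Reading the rank from the RANK environment variable (as set by launchers such as torchrun) is an assumption about how the rank is determined here.

```python
import os


def print_on_rank_zero(message: str) -> bool:
    """Print only on rank 0; return True if the message was printed."""
    # Assumption: the process rank is exposed through the RANK variable.
    rank = int(os.environ.get("RANK", "0"))
    if rank == 0:
        print(message)
        return True
    return False


printed = print_on_rank_zero("starting experiment")
```

Gating output this way keeps multi-process runs from repeating every log line once per worker.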
