optrade.analysis
optrade.analysis.factors
- get_factor_exposures(root, start_date, end_date, mode='ff3')
Calculate factor model exposures for a stock over the specified period. Supports Fama-French 3-factor (ff3), Fama-French 5-factor (ff5), and Carhart 4-factor (c4) models.
- Parameters:
root (str) – Root symbol of the underlying security
start_date (str) – Start date in YYYYMMDD format
end_date (str) – End date in YYYYMMDD format
mode (str) – Mode for the factor model. Options: “ff3” (Fama-French 3 factor), “ff5” (Fama-French 5 factor), or “c4” (Carhart 4 factor).
- Returns:
Dictionary containing the factor betas –
market_beta: Market excess return sensitivity
size_beta: Small Minus Big (SMB) factor exposure
value_beta: High Minus Low (HML) book-to-market factor exposure
momentum_beta: Winners Minus Losers (WML) momentum factor (Carhart model only)
profitability_beta: Robust Minus Weak (RMW) profitability factor (5-factor only)
investment_beta: Conservative Minus Aggressive (CMA) investment factor (5-factor only)
r_squared: Proportion of return variation explained by the factors
- Return type:
Dict[str, Any]
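The betas listed above are regression coefficients of the stock's excess returns on the factor returns. The following is a minimal sketch of that idea using ordinary least squares on synthetic data. The helper name `estimate_factor_betas` and the synthetic inputs are illustrative assumptions, not part of optrade, and the library's own estimation procedure may differ.

```python
import numpy as np

def estimate_factor_betas(excess_returns, factor_returns, factor_names):
    """Hypothetical helper: OLS of excess returns on factor returns."""
    y = np.asarray(excess_returns, dtype=float)
    X = np.asarray(factor_returns, dtype=float)
    # Prepend a column of ones so an intercept (alpha) is estimated as well.
    design = np.column_stack([np.ones(len(y)), X])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coefs
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    # coefs[0] is the intercept; the remaining entries are the factor betas.
    result = {name: float(b) for name, b in zip(factor_names, coefs[1:])}
    result["r_squared"] = float(1.0 - ss_res / ss_tot)
    return result

# Synthetic three-factor data with known betas (assumed values for illustration).
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.01, size=(250, 3))
true_betas = np.array([1.2, 0.4, -0.3])
returns = factors @ true_betas + rng.normal(0.0, 0.001, size=250)

betas = estimate_factor_betas(
    returns, factors, ["market_beta", "size_beta", "value_beta"]
)
print(betas)
```

The keys mirror the `ff3` entries of the returned dictionary; `ff5` or `c4` would add columns to the factor matrix in the same way.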
- factor_categorization(factors, mode='ff3')
Categorize stocks based on their factor model exposures using percentiles.
- Parameters:
factors (Dict[str, Dict[str, float]]) – Nested dictionary where the outer key is the root symbol, the inner key is the factor type, and the value is the factor beta
mode (str) – Factor model type (“ff3”, “ff5”, or “c4”)
- Returns:
Nested dictionary with categorizations for each stock and factor
- Return type:
Dict[str, Dict[str, str]]
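As an illustration of percentile-based categorization, the sketch below labels each stock's beta relative to the cross-section of the universe. The cut-offs (33rd and 67th percentiles), the labels ("low", "neutral", "high"), and the helper name `categorize_by_percentile` are assumptions made for this example; the thresholds and labels optrade actually uses are not documented here.

```python
import numpy as np

def categorize_by_percentile(factors, low_pct=33.0, high_pct=67.0):
    """Hypothetical helper: label each beta by its cross-sectional percentile."""
    roots = list(factors)
    factor_names = list(factors[roots[0]])
    labels = {root: {} for root in roots}
    for name in factor_names:
        values = np.array([factors[root][name] for root in roots], dtype=float)
        # Cut-offs are computed across all stocks for this one factor.
        low_cut, high_cut = np.percentile(values, [low_pct, high_pct])
        for root, value in zip(roots, values):
            if value <= low_cut:
                labels[root][name] = "low"
            elif value >= high_cut:
                labels[root][name] = "high"
            else:
                labels[root][name] = "neutral"
    return labels

# Input has the same nested shape as the `factors` parameter above.
example = {
    "AAA": {"market_beta": 0.6, "size_beta": -0.2},
    "BBB": {"market_beta": 1.0, "size_beta": 0.1},
    "CCC": {"market_beta": 1.5, "size_beta": 0.8},
}
labels = categorize_by_percentile(example)
print(labels)
```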
- get_universe_factor_exposures(roots, start_date, end_date, mode='ff3')
Calculate factor model exposures for multiple stocks over the specified period.
- Parameters:
roots (List[str]) – List of stock roots to analyze
start_date (str) – Start date in YYYYMMDD format
end_date (str) – End date in YYYYMMDD format
mode (str) – Factor model to use (“ff3”, “ff5”, or “c4”)
- Returns:
Nested dictionary where –
Outer key is the root symbol
Inner key is the factor type
Value is the factor beta
- Return type:
Dict[str, Dict[str, float]]
optrade.analysis.visualizer
- class Analyzer
Bases: object
Comprehensive analysis tool for model forecast performance evaluation.
- period_visualize(period, period_interval, model, dataset, metrics, batch_size=128, x_axis='Time of Day', y_axis='Normalized Error', title=None, figsize=(12, 6), normalize=False, use_secondary_axis=False, dpi=300, output_format='png', save=False)
Visualize metrics aggregated by specific time periods across multiple days.
- Parameters:
period (str) – Type of period to group by. Options: “daily”. Other periods are not yet implemented.
period_interval (int) – If period=“daily”, the number of minutes to group by
model (Module | BaseEstimator) – Model object (PyTorch or scikit-learn)
dataset (ForecastingDataset | ndarray) – ForecastingDataset or numpy array of time series data
metrics (List[str]) – List of metrics to calculate (“mse”, “mae”, “rmse”, “mape”, “r2”)
batch_size (int) – Batch size for DataLoader
x_axis (str) – Label for x-axis
y_axis (str) – Label for y-axis
title (str | None) – Plot title
figsize (Tuple[int, int]) – Figure size (width, height)
normalize (bool) – Whether to normalize metrics to [0,1] range for comparison
use_secondary_axis (bool) – Use a secondary y-axis for the second metric
dpi (int) – Dots per inch for image resolution
output_format (str) – Output format for saving the image
save (bool) – Save the plot to disk
- Returns:
The matplotlib figure object
- Return type:
Figure
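To make the "daily" grouping concrete, the sketch below buckets absolute errors by time of day in `interval_minutes` steps and averages each bucket across days. This is an assumed interpretation of `period_interval`; the helper `mae_by_time_of_day` and the sample timestamps are hypothetical and do not reproduce the method's plotting or model inference.

```python
import numpy as np
import pandas as pd

def mae_by_time_of_day(timestamps, predictions, targets, interval_minutes=30):
    """Hypothetical helper: mean absolute error per time-of-day bucket."""
    idx = pd.DatetimeIndex(timestamps)
    abs_err = pd.Series(
        np.abs(np.asarray(predictions, dtype=float) - np.asarray(targets, dtype=float)),
        index=idx,
    )
    # Minutes since midnight, floored to the start of each bucket.
    minutes = idx.hour * 60 + idx.minute
    bucket = (minutes // interval_minutes) * interval_minutes
    # Grouping by bucket pools the same time of day across all dates.
    return abs_err.groupby(np.asarray(bucket)).mean()

timestamps = pd.to_datetime([
    "2024-01-02 09:30", "2024-01-02 09:45", "2024-01-02 10:00",
    "2024-01-03 09:30", "2024-01-03 09:45", "2024-01-03 10:00",
])
preds = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
targets = [1.5, 2.0, 2.0, 0.5, 2.0, 4.0]

profile = mae_by_time_of_day(timestamps, preds, targets, interval_minutes=30)
print(profile)  # index is minutes since midnight (570 = 09:30, 600 = 10:00)
```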
- information_coefficient_analysis(forward_periods=[1, 5, 10, 20], rolling_window=20)
Calculate the Information Coefficient (IC) for different forward periods.
Mathematical explanation:
- Forward period: number of time steps ahead that the prediction is targeting
- IC: Spearman rank correlation between predictions and actual future values
IC = corr_spearman(prediction_t, actual_{t+forward_period})
- IC IR (Information Coefficient Information Ratio): ratio of the mean IC to the standard deviation of IC over time
IC IR = mean(IC) / std(IC)
Why it’s relevant for forecasting and alpha research:
- IC measures how well your model ranks outcomes (essential for relative value strategies)
- IC across different horizons shows the decay pattern of your signal
- IC IR quantifies signal-to-noise ratio; a high IC IR indicates consistent predictive power
- IC > 0.05 is often considered meaningful in practice for daily forecasts
- IC analysis helps determine optimal holding periods and trading frequency
- Parameters:
forward_periods (List[int]) – List of forward periods to analyze
rolling_window (int) – Window size for calculating IC IR
- Returns:
DataFrame with IC metrics by forward period
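The two formulas above can be sketched with pandas as follows. Spearman correlation is computed here as the Pearson correlation of ranks. Splitting the sample into non-overlapping blocks of `window` observations to obtain a series of ICs is an assumption made for this example; the method's actual rolling scheme may differ, and the helper names and synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd

def spearman_ic(predictions, actuals):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    p = pd.Series(predictions).rank()
    a = pd.Series(actuals).rank()
    return float(p.corr(a))

def ic_table(predictions, actuals, forward_periods=(1, 5), window=20):
    """Hypothetical helper: IC and IC IR for each forward period."""
    preds = pd.Series(np.asarray(predictions, dtype=float))
    acts = pd.Series(np.asarray(actuals, dtype=float))
    rows = {}
    for h in forward_periods:
        # shift(-h) aligns actual_{t+h} with prediction_t.
        frame = pd.DataFrame({"pred": preds, "future": acts.shift(-h)}).dropna()
        block_ids = np.arange(len(frame)) // window
        ics = np.array([
            spearman_ic(block["pred"], block["future"])
            for _, block in frame.groupby(block_ids)
            if len(block) == window  # ignore a trailing partial block
        ])
        rows[h] = {
            "ic": spearman_ic(frame["pred"], frame["future"]),
            "ic_ir": float(ics.mean() / ics.std(ddof=1)),
        }
    return pd.DataFrame.from_dict(rows, orient="index")

# Synthetic signal that is informative about the next step only.
rng = np.random.default_rng(1)
actuals = rng.normal(size=400)
predictions = np.roll(actuals, -1) + rng.normal(scale=0.5, size=400)

table = ic_table(predictions, actuals, forward_periods=(1, 5), window=20)
print(table)
```

In this synthetic setup the IC is high at a forward period of 1 and close to zero at 5, which is the kind of decay pattern the analysis is meant to expose.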
- error_autocorrelation_analysis(lags=20, plot=True)
Analyze autocorrelation in prediction errors
- Parameters:
lags – Number of lags to analyze
plot – Whether to generate visualization
- Returns:
Series with autocorrelation values by lag
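For reference, the sample autocorrelation of a prediction-error series can be sketched as below. This uses the standard estimator (lagged cross-products of the demeaned series divided by the total sum of squares); whether the method uses this exact estimator is an assumption, and `error_autocorrelation` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def error_autocorrelation(errors, lags=20):
    """Hypothetical helper: sample autocorrelation for lags 1..lags."""
    e = np.asarray(errors, dtype=float)
    e = e - e.mean()
    denom = np.sum(e ** 2)
    acf = {
        k: float(np.sum(e[k:] * e[:-k]) / denom)
        for k in range(1, lags + 1)
    }
    return pd.Series(acf, name="autocorrelation")

# Compare white-noise errors with persistent AR(1) errors (synthetic data).
rng = np.random.default_rng(2)
noise = rng.normal(size=2000)
ar1 = np.zeros(2000)
for t in range(1, 2000):
    ar1[t] = 0.8 * ar1[t - 1] + noise[t]

acf_white = error_autocorrelation(noise, lags=5)
acf_ar1 = error_autocorrelation(ar1, lags=5)
print(acf_white.round(3))
print(acf_ar1.round(3))
```

Autocorrelations near zero suggest the errors carry little exploitable structure, while large values at short lags suggest the model is missing persistent dynamics.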
- event_study_analysis(event_dates, window=5)
Analyze model performance around specific market events
- Parameters:
event_dates – List of event dates
window – Window size around events to analyze
- Returns:
DataFrame with performance metrics around events
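One way to picture an event study on forecast errors is to average the error at each offset within `window` steps of every event. The sketch below does that for a date-indexed series of absolute errors. The input layout, the output column name, and the helper `event_window_errors` are assumptions for illustration, not the method's documented behavior.

```python
import numpy as np
import pandas as pd

def event_window_errors(abs_errors, event_dates, window=5):
    """Hypothetical helper: mean absolute error by offset from each event."""
    positions = abs_errors.index.get_indexer(pd.DatetimeIndex(event_dates))
    windows = []
    for pos in positions:
        # Skip events absent from the index (-1) or too close to either edge.
        if pos < window or pos + window >= len(abs_errors):
            continue
        windows.append(abs_errors.iloc[pos - window : pos + window + 1].to_numpy())
    if not windows:
        raise ValueError("no event has a complete window inside the series")
    offsets = pd.Index(range(-window, window + 1), name="offset")
    return pd.DataFrame({"mean_abs_error": np.mean(windows, axis=0)}, index=offsets)

# Synthetic errors that spike on the two event dates.
dates = pd.date_range("2024-01-01", periods=60, freq="D")
errors = pd.Series(1.0, index=dates)
events = [dates[20], dates[40]]
errors.loc[events] = 3.0

study = event_window_errors(errors, events, window=2)
print(study)
```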
- analyze_forecast_features(features=None, n_bins=10, plot=True)
Analyze model performance conditional on input feature values. This is especially useful in alpha research for understanding which market conditions lead to better or worse predictions.
- Parameters:
features – List of feature columns to analyze, if None use all available columns
n_bins – Number of quantile bins to divide feature values into
plot – Whether to generate visualizations
- Returns:
Dict of DataFrames with performance metrics by feature quantiles
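The conditional analysis can be sketched for a single feature by binning its values into quantiles with `pandas.qcut` and summarizing the error inside each bin. The choice of mean absolute error as the metric, the helper name `error_by_feature_quantile`, and the synthetic "volatility" feature are assumptions for this example.

```python
import numpy as np
import pandas as pd

def error_by_feature_quantile(feature, abs_errors, n_bins=10):
    """Hypothetical helper: mean absolute error per feature quantile bin."""
    frame = pd.DataFrame({
        "feature": np.asarray(feature, dtype=float),
        "abs_error": np.asarray(abs_errors, dtype=float),
    })
    # labels=False returns integer bin codes: 0 is the lowest quantile.
    frame["bin"] = pd.qcut(frame["feature"], q=n_bins, labels=False, duplicates="drop")
    return frame.groupby("bin")["abs_error"].agg(["mean", "count"])

# Synthetic case where errors grow with the feature value.
rng = np.random.default_rng(3)
volatility = rng.uniform(0.0, 1.0, size=500)
abs_errors = 2.0 * volatility + np.abs(rng.normal(0.0, 0.05, size=500))

by_bin = error_by_feature_quantile(volatility, abs_errors, n_bins=5)
print(by_bin)
```

Repeating this per feature and collecting the results in a dictionary keyed by feature name would give a structure similar to the documented return value.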
- get_torch_preds(model, dataset, batch_size=128, channel=0, device='cpu')
Generate predictions from a PyTorch model and ForecastingDataset.
- Parameters:
model (Module) – PyTorch model
dataset (Any) – ForecastingDataset object
batch_size (int) – Batch size for DataLoader
channel (int)
device (str) – Device to run the model on (‘cuda’ or ‘cpu’)
- Returns:
(predictions, targets, target_datetimes)
- Return type:
Tuple[ndarray, ndarray, ndarray]