optrade.analysis


optrade.analysis.factors

get_factor_exposures(root, start_date, end_date, mode='ff3')

Calculate factor model exposures for a stock over the specified period. Supports Fama-French 3-factor (ff3), Fama-French 5-factor (ff5), and Carhart 4-factor (c4) models.

Parameters:
  • root (str) – Root symbol of the underlying security

  • start_date (str) – Start date in YYYYMMDD format

  • end_date (str) – End date in YYYYMMDD format

  • mode (str) – Mode for the factor model. Options: “ff3” (Fama-French 3 factor), “ff5” (Fama-French 5 factor), or “c4” (Carhart 4 factor).

Returns:

Dictionary containing the factor betas

  • market_beta: Market excess return sensitivity

  • size_beta: Small Minus Big (SMB) factor exposure

  • value_beta: High Minus Low (HML) book-to-market factor exposure

  • momentum_beta: Winners Minus Losers (WML) momentum factor (Carhart model only)

  • profitability_beta: Robust Minus Weak (RMW) profitability factor (5-factor only)

  • investment_beta: Conservative Minus Aggressive (CMA) investment factor (5-factor only)

  • r_squared: Proportion of return variation explained by the factors

Return type:

Dict[str, Any]
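The returned betas are the kind of quantity a time-series regression of excess returns on factor returns produces. The sketch below is a minimal, standard-library illustration of that idea and is not taken from optrade's source. The helper names (solve_linear_system, ols_factor_betas) and the short factor keys "market", "size", and "value" are assumptions made for this example.

```python
from typing import Dict, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_factor_betas(
    excess_returns: Sequence[float],
    factors: Dict[str, Sequence[float]],
) -> Dict[str, float]:
    """Regress excess returns on factor returns (with an intercept)."""
    names = list(factors)
    n_obs = len(excess_returns)
    # Design matrix rows: [1, factor_1, factor_2, ...].
    rows = [[1.0] + [factors[name][t] for name in names] for t in range(n_obs)]
    k = len(names) + 1
    # Normal equations: (X'X) coef = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, excess_returns)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    # R^2 = 1 - residual sum of squares / total sum of squares.
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(excess_returns) / n_obs
    ss_res = sum((y - f) ** 2 for y, f in zip(excess_returns, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in excess_returns)
    # The intercept (coef[0]) is estimated but not returned here.
    result = {f"{name}_beta": c for name, c in zip(names, coef[1:])}
    result["r_squared"] = 1.0 - ss_res / ss_tot
    return result
```

Calling ols_factor_betas(returns, {"market": mkt, "size": smb, "value": hml}) yields keys shaped like the ones listed above (market_beta, size_beta, value_beta, r_squared). How optrade sources the factor returns and aligns dates is not covered by this sketch.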

factor_categorization(factors, mode='ff3')

Categorize stocks based on their factor model exposures using percentiles.

Parameters:
  • factors (Dict[str, Dict[str, float]]) – Nested dictionary where the outer key is the root symbol, the inner key is the factor type, and the value is the factor beta

  • mode (str) – Factor model type (“ff3”, “ff5”, or “c4”)

Returns:

Nested dictionary with categorizations for each stock and factor

Return type:

Dict[str, Dict[str, str]]
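Percentile-based categorization can be pictured as ranking each stock's beta against the rest of the universe, factor by factor. The sketch below assumes tercile cutoffs and the labels "low", "neutral", and "high". Both are illustrative choices, as is the function name categorize_by_percentile. The thresholds and labels that factor_categorization actually uses are not documented here.

```python
from typing import Dict


def categorize_by_percentile(
    factors: Dict[str, Dict[str, float]],
    low_pct: float = 1 / 3,   # assumed cutoff, not optrade's
    high_pct: float = 2 / 3,  # assumed cutoff, not optrade's
) -> Dict[str, Dict[str, str]]:
    """Label each stock's beta as low / neutral / high within the universe."""
    result: Dict[str, Dict[str, str]] = {root: {} for root in factors}
    factor_names = {name for betas in factors.values() for name in betas}
    for name in factor_names:
        # Betas of every stock that reports this factor.
        values = [betas[name] for betas in factors.values() if name in betas]
        for root, betas in factors.items():
            if name not in betas:
                continue
            # Percentile rank: share of the universe with a beta <= this one.
            pct = sum(v <= betas[name] for v in values) / len(values)
            if pct <= low_pct:
                result[root][name] = "low"
            elif pct > high_pct:
                result[root][name] = "high"
            else:
                result[root][name] = "neutral"
    return result
```

The input has the same nested shape that get_universe_factor_exposures returns, so the two steps chain naturally.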

get_universe_factor_exposures(roots, start_date, end_date, mode='ff3')

Calculate factor model exposures for multiple stocks over the specified period.

Parameters:
  • roots (List[str]) – List of stock roots to analyze

  • start_date (str) – Start date in YYYYMMDD format

  • end_date (str) – End date in YYYYMMDD format

  • mode (str) – Factor model to use (“ff3”, “ff5”, or “c4”)

Returns:

Nested dictionary where

  • Outer key is the root symbol

  • Inner key is the factor type

  • Value is the factor beta

Return type:

Dict[str, Dict[str, float]]

optrade.analysis.visualizer

class Analyzer

Bases: object

Comprehensive analysis tool for model forecast performance evaluation.

__init__()

period_visualize(period, period_interval, model, dataset, metrics, batch_size=128, x_axis='Time of Day', y_axis='Normalized Error', title=None, figsize=(12, 6), normalize=False, use_secondary_axis=False, dpi=300, output_format='png', save=False)

Visualize metrics aggregated by specific time periods across multiple days.

Parameters:
  • period (str) – Type of period to group by. Options: “daily”. Other periods are not yet implemented.

  • period_interval (int) – If period=“daily”, the number of minutes to group by

  • model (Module | BaseEstimator) – Model object (PyTorch or scikit-learn)

  • dataset (ForecastingDataset | ndarray) – ForecastingDataset or numpy array of time series data

  • metrics (List[str]) – List of metrics to calculate (“mse”, “mae”, “rmse”, “mape”, “r2”)

  • batch_size (int) – Batch size for DataLoader

  • x_axis (str) – Label for x-axis

  • y_axis (str) – Label for y-axis

  • title (str | None) – Plot title

  • figsize (Tuple[int, int]) – Figure size (width, height)

  • normalize (bool) – Whether to normalize metrics to [0,1] range for comparison

  • use_secondary_axis (bool) – Use a secondary y-axis for the second metric

  • dpi (int) – Dots per inch for image resolution

  • output_format (str) – Output format for saving the image

  • save (bool) – Save the plot to disk

Returns:

plt.Figure – The matplotlib figure object

Return type:

Figure
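To make the role of period_interval concrete, the sketch below groups absolute errors into fixed-width time-of-day buckets and averages them across days. It is a standard-library illustration of the grouping step only, not optrade's implementation. The function name mae_by_time_of_day and the choice to anchor buckets at midnight are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Sequence


def mae_by_time_of_day(
    timestamps: Sequence[datetime],
    predictions: Sequence[float],
    targets: Sequence[float],
    period_interval: int = 30,
) -> Dict[str, float]:
    """Mean absolute error per time-of-day bucket, pooled across days."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for ts, pred, actual in zip(timestamps, predictions, targets):
        # Minutes since midnight, floored to the start of the bucket.
        minutes = ts.hour * 60 + ts.minute
        start = (minutes // period_interval) * period_interval
        label = f"{start // 60:02d}:{start % 60:02d}"
        buckets[label].append(abs(pred - actual))
    # Zero-padded HH:MM labels sort chronologically.
    return {label: sum(errs) / len(errs) for label, errs in sorted(buckets.items())}
```

With period_interval=30, observations at 09:35 and 09:50 on any day land in the same "09:30" bucket, which is what lets the plot show one point per time of day rather than one per timestamp.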

information_coefficient_analysis(forward_periods=[1, 5, 10, 20], rolling_window=20)

Calculate IC (Information Coefficient) for different forward periods

Mathematical explanation:

  • Forward Period: Number of time steps ahead that the prediction is targeting

  • IC: Spearman rank correlation between predictions and actual future values

    IC = corr_spearman(prediction_t, actual_{t+forward_period})

  • IC IR (Information Coefficient Information Ratio): The ratio of mean IC to standard deviation of IC over time

    IC IR = mean(IC) / std(IC)

Why it’s relevant for forecasting and alpha research:

  • IC measures how well your model ranks outcomes (essential for relative value strategies)

  • IC across different horizons shows the decay pattern of your signal

  • IC IR quantifies the signal-to-noise ratio; a high IC IR indicates consistent predictive power

  • IC > 0.05 is often considered meaningful in practice for daily forecasts

  • IC analysis helps determine optimal holding periods and trading frequency

Parameters:
  • forward_periods (List[int]) – List of forward periods to analyze

  • rolling_window (int) – Window size for calculating IC IR

Returns:

DataFrame with IC metrics by forward period
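The IC and IC IR definitions above can be sketched with the standard library alone. The code below assumes that predictions[t] is the signal at time t and actuals[t] is the realized value at time t, that ties receive average ranks, and that IC IR uses the sample standard deviation. All function names are hypothetical, and the method's own alignment and windowing may differ.

```python
import statistics
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_ic(predictions: Sequence[float], actuals: Sequence[float]) -> float:
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(predictions), average_ranks(actuals))


def rolling_ic(
    predictions: Sequence[float],
    actuals: Sequence[float],
    forward_period: int,
    window: int,
) -> List[float]:
    """IC over each rolling window, pairing prediction_t with actual_{t+h}."""
    preds = list(predictions[: len(predictions) - forward_period])
    future = list(actuals[forward_period:])
    return [
        spearman_ic(preds[s : s + window], future[s : s + window])
        for s in range(len(preds) - window + 1)
    ]


def ic_information_ratio(ics: Sequence[float]) -> float:
    """IC IR = mean(IC) / std(IC), using the sample standard deviation."""
    return statistics.fmean(ics) / statistics.stdev(ics)
```

Running rolling_ic once per entry in forward_periods and comparing the mean IC across horizons gives the decay pattern described above.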

error_autocorrelation_analysis(lags=20, plot=True)

Analyze autocorrelation in prediction errors

Parameters:
  • lags – Number of lags to analyze

  • plot – Whether to generate visualization

Returns:

Series with autocorrelation values by lag
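For readers unfamiliar with the quantity being reported, the sketch below computes the usual sample autocorrelation of an error series at lags 1 through lags. It is an illustration under that textbook definition. The function name error_autocorrelation is hypothetical, and the method may use a different estimator.

```python
from typing import Dict, Sequence


def error_autocorrelation(errors: Sequence[float], lags: int = 20) -> Dict[int, float]:
    """Sample autocorrelation of prediction errors for lags 1..lags."""
    n = len(errors)
    mean = sum(errors) / n
    centered = [e - mean for e in errors]
    # Lag-0 autocovariance (times n) normalizes every lag to [-1, 1].
    denom = sum(c * c for c in centered)
    return {
        lag: sum(centered[t] * centered[t - lag] for t in range(lag, n)) / denom
        for lag in range(1, lags + 1)
    }
```

Values near zero at every lag suggest the errors carry no leftover structure, while a large value at some lag points to a pattern the model is missing at that horizon.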

event_study_analysis(event_dates, window=5)

Analyze model performance around specific market events

Parameters:
  • event_dates – List of event dates

  • window – Window size around events to analyze

Returns:

DataFrame with performance metrics around events
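An event study of this kind can be sketched as averaging an error metric at each offset relative to the event. The example below assumes that window counts observations rather than calendar days and that events missing from the sample are skipped. Both are assumptions, as is the function name event_window_errors.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def event_window_errors(
    dates: Sequence[str],
    abs_errors: Sequence[float],
    event_dates: Sequence[str],
    window: int = 5,
) -> Dict[int, float]:
    """Mean absolute error at each offset in [-window, +window] around events."""
    position = {d: i for i, d in enumerate(dates)}
    by_offset: Dict[int, List[float]] = defaultdict(list)
    for event in event_dates:
        if event not in position:
            continue  # event falls outside the evaluated sample
        center = position[event]
        for offset in range(-window, window + 1):
            idx = center + offset
            # Drop offsets that run past either end of the sample.
            if 0 <= idx < len(dates):
                by_offset[offset].append(abs_errors[idx])
    return {offset: sum(v) / len(v) for offset, v in sorted(by_offset.items())}
```

Offset 0 is the event itself, negative offsets are the observations before it, and positive offsets are the observations after it.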

analyze_forecast_features(features=None, n_bins=10, plot=True)

Analyze model performance conditional on input feature values. This is especially useful in alpha research for understanding which market conditions lead to better or worse predictions.

Parameters:
  • features – List of feature columns to analyze, if None use all available columns

  • n_bins – Number of quantile bins to divide feature values into

  • plot – Whether to generate visualizations

Returns:

Dict of DataFrames with performance metrics by feature quantiles
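Conditioning performance on a feature amounts to sorting observations by that feature, cutting them into n_bins groups, and summarizing the error in each group. The sketch below uses equal-count bins and plain dictionaries. The function name error_by_feature_quantile is hypothetical, and unlike a quantile cut on distinct edges, tied feature values may straddle two bins here.

```python
from typing import Dict, List, Sequence


def error_by_feature_quantile(
    feature_values: Sequence[float],
    abs_errors: Sequence[float],
    n_bins: int = 10,
) -> List[Dict[str, float]]:
    """Mean absolute error within equal-count bins of a feature."""
    # Indices of observations, ordered by feature value.
    order = sorted(range(len(feature_values)), key=lambda i: feature_values[i])
    n = len(order)
    rows: List[Dict[str, float]] = []
    for b in range(n_bins):
        members = order[b * n // n_bins : (b + 1) * n // n_bins]
        if not members:
            continue  # more bins than observations
        rows.append(
            {
                "bin": b,
                "feature_min": feature_values[members[0]],
                "feature_max": feature_values[members[-1]],
                "mean_abs_error": sum(abs_errors[i] for i in members) / len(members),
            }
        )
    return rows
```

A mean error that rises or falls steadily across bins indicates that the feature marks conditions under which the forecasts are systematically worse or better.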

get_torch_preds(model, dataset, batch_size=128, channel=0, device='cpu')

Generate predictions from a PyTorch model and ForecastingDataset.

Parameters:
  • model (Module) – PyTorch model

  • dataset (Any) – ForecastingDataset object

  • batch_size (int) – Batch size for DataLoader

  • channel (int)

  • device (str) – Device to run model on (‘cuda’ or ‘cpu’)

Returns:

(predictions, targets, target_datetimes)

Return type:

Tuple[ndarray, ndarray, ndarray]
