
Period Month Feature Set

Usage

config.yaml
...
features:
  PeriodMonthFeatureConfig:
    calculation_cols:
      - revenue
      - quantity
    deviation_from_mean_months:
      - 6
      - 12
    calculate_deltas: True
    calculate_percentage_change: True
    resolve_divide_by_zero: True

Features Generated

| Feature Type | Description | Lagged Aggregation | Lagged Periods (Months) |
|---|---|---|---|
| Sum | Sum of values for specified columns | Yes (as a period, includes all months) | P3, P6 |
| Mean | Mean (average) of values for specified columns | Yes (as a period, includes all months) | P3, P6 |
| Deviation From Mean | Deviation from the mean for specified columns | Yes (as a period, includes all months, Custom Logic) | User defined in deviation_from_mean_months |
| Deltas | Calculate deltas between pairs of lagged columns | Partially (Uses the ratio between lags) | P6_P3 |
| Ratios | Calculates the ratios between all above fields | Partially (Uses the ratio between lags) | P6_P3 |
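
To make "as a period, includes all months" concrete, the following is a minimal sketch in plain Python. It assumes that a P3 feature aggregates the three most recent months together, rather than looking at the single month three months back. The helper names, the output key names, and the sample values are illustrative assumptions, not the library's implementation.

```python
from statistics import mean


def period_sum(values, months):
    """Sum of the most recent `months` monthly values (a P<months> window)."""
    return sum(values[-months:])


def period_mean(values, months):
    """Mean of the most recent `months` monthly values (a P<months> window)."""
    return mean(values[-months:])


# Hypothetical monthly revenue for a single key, ordered oldest first.
revenue = [100, 120, 90, 110, 130, 150]

# Build one sum and one mean per lagged period. The key names here are
# made up for the example and may not match the generated column names.
features = {}
for months in (3, 6):
    features[f"sum_P{months}"] = period_sum(revenue, months)
    features[f"mean_P{months}"] = period_mean(revenue, months)

print(features)
```

With this reading, the P6 window contains the P3 window, which is why a delta or ratio between the two (P6_P3) can be informative.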

Feature Configuration

PeriodMonthFeatureConfig Class

PeriodMonthFeatureConfig is a configuration class for generating features based on a period of time.

Attributes
  • lag_months: List of integers representing the number of months to lag. Defaults to [1, 3, 6].
  • calculation_cols: List of column names to generate calculation features for.
  • deviation_from_mean_months: List of integers representing the number of months to calculate deviation from mean for.
  • calculate_deltas: Boolean indicating whether or not to calculate deltas between specified columns.
  • calculate_percentage_change: Boolean indicating whether or not to calculate percentage change for specified columns.
  • resolve_divide_by_zero: Boolean indicating whether or not to resolve divide-by-zero errors. Can only be True if both calculate_percentage_change and calculate_deltas are True.
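
As a rough illustration of what the three boolean options control, here is a small sketch of a delta and a percentage change between two lagged aggregates. It makes two assumptions that the page does not confirm: that the delta is the longer period minus the shorter one, and that "resolving" a zero denominator means returning 0.0 instead of failing. The function names are hypothetical.

```python
def delta(longer, shorter):
    """Difference between two lagged aggregates, e.g. P6 minus P3 (assumed order)."""
    return longer - shorter


def percentage_change(new, old, resolve_divide_by_zero=True):
    """Relative change from `old` to `new`.

    Assumption: resolving a zero denominator means returning 0.0;
    the real feature set may use a different fallback value.
    """
    if old == 0:
        if resolve_divide_by_zero:
            return 0.0
        raise ZeroDivisionError("cannot compute percentage change from zero")
    return (new - old) / old


# Hypothetical P3 and P6 sums for one key.
sum_p3, sum_p6 = 390, 700
print(delta(sum_p6, sum_p3))
print(percentage_change(sum_p6, sum_p3))
print(percentage_change(sum_p6, 0))  # zero denominator, resolved to 0.0
```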

Config Methods
  • __attrs_post_init__(): Post-initialization hook that attrs calls after the generated __init__ has run.
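
A post-initialization hook is a natural place to enforce the constraint on resolve_divide_by_zero described in the attributes. The sketch below uses the standard library's dataclasses and its __post_init__ hook as a stand-in for attrs and __attrs_post_init__. The class name, the boolean defaults, and the validation itself are assumptions made for the example; only the lag_months default comes from this page.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PeriodMonthFeatureConfigSketch:
    """Hypothetical stand-in for PeriodMonthFeatureConfig."""

    calculation_cols: List[str]
    lag_months: List[int] = field(default_factory=lambda: [1, 3, 6])
    deviation_from_mean_months: List[int] = field(default_factory=list)
    # The boolean defaults below are assumptions, not documented values.
    calculate_deltas: bool = False
    calculate_percentage_change: bool = False
    resolve_divide_by_zero: bool = False

    def __post_init__(self):
        # dataclasses counterpart of attrs' __attrs_post_init__:
        # runs once, right after the generated __init__.
        if self.resolve_divide_by_zero and not (
            self.calculate_deltas and self.calculate_percentage_change
        ):
            raise ValueError(
                "resolve_divide_by_zero requires calculate_deltas and "
                "calculate_percentage_change to both be True"
            )


config = PeriodMonthFeatureConfigSketch(
    calculation_cols=["revenue", "quantity"],
    deviation_from_mean_months=[6, 12],
    calculate_deltas=True,
    calculate_percentage_change=True,
    resolve_divide_by_zero=True,
)
print(config.lag_months)
```

Validating in the hook means an inconsistent configuration fails at load time rather than partway through feature generation.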

Feature Generation

PeriodMonthFeatureSet Class

PeriodMonthFeatureSet is a feature set that generates features based on a period of data.

Feature Set Methods
  • calculate(df, dataset_config, feature_config): Calculate the features for the given DataFrame, DatasetConfig, and PeriodMonthFeatureConfig.

Feature Functions

  • sum_(df, key_cols, agg_col): Calculate the sum of the given aggregation column for the given key columns.
  • mean_(df, key_cols, agg_col): Calculate the mean of the given aggregation column for the given key columns.
  • stddev_(df, key_cols, agg_col): Calculate the standard deviation of the given aggregation column for the given key columns.
  • deviation_from_mean(df, key_cols, agg_col, mean_months): Calculate the deviation from the mean of a given column for each group of key columns.
  • _calculate_deltas(df, key_cols): Calculate deltas between specified columns.
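
The grouped aggregations listed above can be sketched without a DataFrame library by treating each row as a dictionary. This is only an illustration of the idea: the group_* names are hypothetical, rows are assumed to be ordered oldest first within each key, the standard deviation is the sample standard deviation, and the deviation is assumed to be the latest value minus the mean of the most recent mean_months values. The actual "Custom Logic" used by deviation_from_mean is not described on this page and may differ.

```python
from collections import defaultdict
from statistics import mean, stdev


def _group(rows, key_cols, agg_col):
    """Collect agg_col values per key tuple, preserving row order."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        groups[key].append(row[agg_col])
    return groups


def group_sum(rows, key_cols, agg_col):
    return {k: sum(v) for k, v in _group(rows, key_cols, agg_col).items()}


def group_mean(rows, key_cols, agg_col):
    return {k: mean(v) for k, v in _group(rows, key_cols, agg_col).items()}


def group_stddev(rows, key_cols, agg_col):
    # Sample standard deviation; each group needs at least two values.
    return {k: stdev(v) for k, v in _group(rows, key_cols, agg_col).items()}


def group_deviation_from_mean(rows, key_cols, agg_col, mean_months):
    """Latest value minus the mean of the last `mean_months` values (assumed logic)."""
    result = {}
    for key, values in _group(rows, key_cols, agg_col).items():
        window = values[-mean_months:]
        result[key] = values[-1] - mean(window)
    return result


# Hypothetical monthly rows for two customers, oldest first.
rows = [
    {"customer": "a", "month": 1, "revenue": 100},
    {"customer": "a", "month": 2, "revenue": 120},
    {"customer": "a", "month": 3, "revenue": 140},
    {"customer": "b", "month": 1, "revenue": 50},
    {"customer": "b", "month": 2, "revenue": 70},
    {"customer": "b", "month": 3, "revenue": 30},
]

print(group_sum(rows, ["customer"], "revenue"))
print(group_deviation_from_mean(rows, ["customer"], "revenue", mean_months=2))
```

Keys are returned as tuples so that the same helpers work for one or several key columns.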