Period Month Feature Set
Usage
config.yaml
...
features:
  PeriodMonthFeatureConfig:
    calculation_cols:
      - revenue
      - quantity
    deviation_from_mean_months:
      - 6
      - 12
    calculate_deltas: True
    calculate_percentage_change: True
    resolve_divide_by_zero: True
Features Generated
| Feature Type | Description | Lagged Aggregation | Lagged Periods (Months) |
|---|---|---|---|
| Sum | Sum of values for specified columns | Yes (as a period, includes all months) | P3, P6 |
| Mean | Mean (average) of values for specified columns | Yes (as a period, includes all months) | P3, P6 |
| Deviation From Mean | Deviation from the mean for specified columns | Yes (as a period, includes all months, Custom Logic) | User defined in deviation_from_mean_months |
| Deltas | Differences between pairs of lagged columns | Partially (uses the difference between lags) | P6_P3 |
| Ratios | Calculates the ratios between all above fields | Partially (Uses the ratio between lags) | P6_P3 |
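The arithmetic behind the table can be sketched in plain Python. This is an illustration only, not the package's implementation: `period_features` and `safe_ratio` are hypothetical helpers, the output column names are made up, and it is an assumption that `P6_P3` means "P6 minus P3" for deltas and "P6 divided by P3" for ratios, with a zero denominator resolved to `0.0`.

```python
from typing import Dict, List, Sequence


def period_features(monthly: List[float], periods: Sequence[int] = (3, 6)) -> Dict[str, float]:
    """Sum and mean over the most recent N months, for each N in `periods`.

    `monthly` is assumed to be non-empty and ordered oldest to newest.
    """
    out: Dict[str, float] = {}
    for n in periods:
        window = monthly[-n:]  # every month in the period, newest last
        out[f"sum_P{n}"] = sum(window)
        out[f"mean_P{n}"] = sum(window) / len(window)
    return out


def safe_ratio(numerator: float, denominator: float, fallback: float = 0.0) -> float:
    """Return `fallback` instead of raising when the denominator is zero (assumed behaviour)."""
    return fallback if denominator == 0 else numerator / denominator


# Six months of made-up revenue for a single entity.
revenue = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
feats = period_features(revenue)

# Assumed meaning of P6_P3: compare the 6-month period with the 3-month period.
feats["sum_delta_P6_P3"] = feats["sum_P6"] - feats["sum_P3"]
feats["sum_ratio_P6_P3"] = safe_ratio(feats["sum_P6"], feats["sum_P3"])
```

Guarding the division in one helper keeps the zero-denominator policy in a single place, which is one plausible reading of what `resolve_divide_by_zero` controls.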
Feature Configuration
PeriodMonthFeatureConfig Class
PeriodMonthFeatureConfig is a configuration class for generating features based on a period of time.
Attributes
- lag_months: List of integers representing the number of months to lag (default: [1, 3, 6]).
- calculation_cols: List of column names to generate calculation features for.
- deviation_from_mean_months: List of integers representing the number of months to calculate deviation from mean for.
- calculate_deltas: Boolean indicating whether or not to calculate deltas between specified columns.
- calculate_percentage_change: Boolean indicating whether or not to calculate percentage change for specified columns.
- resolve_divide_by_zero: Boolean indicating whether or not to resolve divide by zero errors. Can only be True if calculate_percentage_change and calculate_deltas are True.
Config Methods
__attrs_post_init__(): Post-initialization hook.
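A minimal sketch of how such a configuration class and its post-initialization check might look. It uses the standard library's `dataclasses` (whose `__post_init__` plays the role of attrs' `__attrs_post_init__`) rather than attrs, so it is a stand-in and not the real class. The class name `PeriodMonthFeatureConfigSketch`, the `False` defaults for the boolean flags, and the error message are assumptions; only the `lag_months` default and the `resolve_divide_by_zero` rule come from the attribute descriptions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PeriodMonthFeatureConfigSketch:
    """Hypothetical stand-in for PeriodMonthFeatureConfig (not the real class)."""

    calculation_cols: List[str]
    lag_months: List[int] = field(default_factory=lambda: [1, 3, 6])
    deviation_from_mean_months: List[int] = field(default_factory=list)
    calculate_deltas: bool = False  # assumed default
    calculate_percentage_change: bool = False  # assumed default
    resolve_divide_by_zero: bool = False  # assumed default

    def __post_init__(self) -> None:
        # Documented rule: resolve_divide_by_zero can only be True when both
        # calculate_percentage_change and calculate_deltas are True.
        if self.resolve_divide_by_zero and not (
            self.calculate_deltas and self.calculate_percentage_change
        ):
            raise ValueError(
                "resolve_divide_by_zero requires calculate_deltas and "
                "calculate_percentage_change to be True"
            )


# Dict mirroring the PeriodMonthFeatureConfig block of config.yaml.
raw = {
    "calculation_cols": ["revenue", "quantity"],
    "deviation_from_mean_months": [6, 12],
    "calculate_deltas": True,
    "calculate_percentage_change": True,
    "resolve_divide_by_zero": True,
}
config = PeriodMonthFeatureConfigSketch(**raw)
```

Validating in the post-init hook means an inconsistent combination of flags fails when the configuration is built, before any features are calculated.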
Feature Generation
PeriodMonthFeatureSet Class
PeriodMonthFeatureSet is a feature set that generates features based on a period of data.
Feature Set Methods
calculate(df, dataset_config, feature_config): Calculate the features for the given DataFrame, DatasetConfig, and PeriodMonthFeatureConfig.
Feature Functions
- sum_(df, key_cols, agg_col): Calculate the sum of the given aggregation column for the given key columns.
- mean_(df, key_cols, agg_col): Calculate the mean of the given aggregation column for the given key columns.
- stddev_(df, key_cols, agg_col): Calculate the standard deviation of the given aggregation column for the given key columns.
- deviation_from_mean(df, key_cols, agg_col, mean_months): Calculate the deviation from the mean of a given column for each group of key columns.
- _calculate_deltas(df, key_cols): Calculate deltas between specified columns.
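To make the grouped calculation concrete, here is a plain-Python sketch of a deviation-from-mean feature. It does not use the package's DataFrame type: rows are ordinary dicts, `deviation_from_mean_sketch` is a hypothetical name, and the definition used (latest value minus the mean of the most recent `mean_months` values in each key group) is an assumption, since the source only mentions "custom logic".

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def deviation_from_mean_sketch(
    rows: List[Row], key_cols: Sequence[str], agg_col: str, mean_months: int
) -> Dict[Tuple[object, ...], float]:
    """Per key group: latest value minus the mean of the last `mean_months` values.

    Rows are assumed to hold one value per month, sorted oldest to newest.
    """
    groups: Dict[Tuple[object, ...], List[float]] = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        groups[key].append(float(row[agg_col]))

    result: Dict[Tuple[object, ...], float] = {}
    for key, values in groups.items():
        window = values[-mean_months:]  # most recent months, newest last
        result[key] = values[-1] - sum(window) / len(window)
    return result


# Made-up monthly revenue for two customers.
rows = [
    {"customer": "A", "revenue": 10},
    {"customer": "B", "revenue": 5},
    {"customer": "A", "revenue": 20},
    {"customer": "B", "revenue": 5},
    {"customer": "A", "revenue": 30},
    {"customer": "B", "revenue": 5},
]
deviations = deviation_from_mean_sketch(rows, ["customer"], "revenue", mean_months=3)
```

With this input, customer A's latest value sits above its three-month mean, while customer B's constant series gives a deviation of zero.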