Period Week Feature Set
Complete Weeks Only
Period week features automatically exclude the current week (the week containing the calculation date) to ensure only complete historical weeks are included. This provides consistent weekly analysis regardless of which day of the week the calculation runs.
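As a rough illustration of the rule above, the sketch below finds the start of the week containing the calculation date and keeps only rows dated before it. It assumes weeks start on Monday, which this page does not state, and the helper names `current_week_start` and `complete_weeks_only` are hypothetical rather than part of the library.

```python
from datetime import date, timedelta


def current_week_start(calculation_date: date) -> date:
    """Return the Monday of the week containing calculation_date (assumed week start)."""
    return calculation_date - timedelta(days=calculation_date.weekday())


def complete_weeks_only(rows, calculation_date):
    """Keep (row_date, value) pairs dated strictly before the current week."""
    cutoff = current_week_start(calculation_date)
    return [(d, v) for d, v in rows if d < cutoff]


rows = [
    (date(2022, 1, 23), 5),  # Sunday of the previous week: kept
    (date(2022, 1, 31), 7),  # Monday of the current week: dropped
    (date(2022, 2, 2), 3),   # Wednesday of the current week: dropped
]

# Running on Monday or on Friday of the same week yields the same cutoff,
# so the set of included rows does not depend on the run day.
kept_monday = complete_weeks_only(rows, date(2022, 1, 31))
kept_friday = complete_weeks_only(rows, date(2022, 2, 4))
```

Because the cutoff depends only on which week the calculation date falls in, both calls keep the same single row from the previous week.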
Usage
...
features:
  PeriodWeekFeatureConfig:
    lag_weeks:
      - 2
      - 4
      - 8
    calculation_columns:
      - <column_name>
    deviation_from_mean_weeks:
      - 2
      - 4
    calculate_deltas: True
    calculate_percentage_change: True
    resolve_divide_by_zero: True
Features Generated
| Feature Type | Description | Lagged Aggregation | Lagged Periods (Weeks) |
|---|---|---|---|
| Sum | Sum of values for specified columns | Yes (Period aggregations) | PW2, PW4, PW8 |
| Mean | Mean (average) of values for specified columns | Yes (Period aggregations) | PW2, PW4, PW8 |
| Stddev | Standard deviation of values for specified columns | Yes (Period aggregations) | PW2, PW4, PW8 |
| Deviation From Mean | Deviation of the last complete week's sum from the mean for specified columns | Yes (Custom logic: last complete week minus mean) | DEV_ |
| Deltas | Difference between pairs of lagged features | Yes (Between PW columns) | PW4-PW2, PW8-PW4, etc. |
| Ratios | Percentage change between pairs of lag columns | Optional (when calculate_percentage_change=True) | PW4/PW2, PW8/PW4, etc. |
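To make the Deltas and Ratios rows concrete, here is a minimal sketch that derives both from a set of period values. It is not the library's implementation: the function name `pairwise_features` is hypothetical, the ratio is assumed to be the longer lag divided by the shorter lag (following the PW4/PW2 notation in the table), and a zero denominator is assumed to resolve to 0.0 when `resolve_divide_by_zero` is set. The actual formula and fallback value may differ.

```python
from itertools import combinations


def pairwise_features(pw, calculate_percentage_change=True, resolve_divide_by_zero=True):
    """Build delta and ratio features from a {lag_weeks: value} mapping.

    Assumption: each pair is ordered (shorter lag, longer lag), matching the
    PW4-PW2 and PW4/PW2 notation used in the table.
    """
    out = {}
    for short, long_ in combinations(sorted(pw), 2):
        out[f"PW{long_}-PW{short}"] = pw[long_] - pw[short]
        if calculate_percentage_change:
            denom = pw[short]
            if denom == 0 and resolve_divide_by_zero:
                # Assumed fallback value for an undefined ratio.
                ratio = 0.0
            else:
                # Raises ZeroDivisionError when denom is 0 and the flag is off.
                ratio = pw[long_] / denom
            out[f"PW{long_}/PW{short}"] = ratio
    return out


features = pairwise_features({2: 10.0, 4: 25.0, 8: 40.0})
zero_case = pairwise_features({2: 0.0, 4: 5.0})
```

With three lag values the sketch emits three deltas and three ratios, one for each pair of PW columns.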
Feature Configuration
PeriodWeekFeatureConfig Class
Configuration class for period-based weekly feature calculation.
Attributes
- lag_weeks (List[int]): List of integers representing the number of weeks to lag (default: [2, 4, 8, 12]).
- calculation_columns (List[str]): Columns to generate period features for.
- deviation_from_mean_weeks (List[int]): Weeks for deviation-from-mean calculation.
- calculate_deltas (bool): Flag to calculate deltas between lag columns.
- calculate_percentage_change (bool): Flag to calculate percentage changes.
- resolve_divide_by_zero (bool): Flag to handle divide-by-zero in percentage change calculations.
Feature Generation
PeriodWeekFeatureSet Class
Class for generating period-week lagged features.
Methods
- calculate(df, dataset_config, feature_config, calculation_date) -> DataFrame: Applies weekly date conversion, computes period features, then deviation, deltas, and percentage changes as configured.
- deviation_from_mean(df, key_cols, agg_col, mean_weeks) -> DataFrame: Computes deviation of last week's sum from the mean over mean_weeks.
- _calculate_deltas(df, key_cols) -> DataFrame: Computes differences between all pairs of PW lag columns.
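The deviation-from-mean step can be sketched on a plain list of weekly sums. This is an illustration only: `deviation_sketch` is a hypothetical name, it works on a list rather than a DataFrame, and it assumes the averaging window includes the last complete week itself, which the description above does not specify.

```python
from statistics import mean


def deviation_sketch(weekly_sums, mean_weeks):
    """Last complete week's sum minus the mean of the most recent mean_weeks sums.

    weekly_sums is ordered oldest to newest and excludes the current week.
    Assumption: the averaging window includes the last complete week.
    """
    if mean_weeks < 1 or len(weekly_sums) < mean_weeks:
        raise ValueError("need at least mean_weeks complete weeks")
    window = weekly_sums[-mean_weeks:]
    return weekly_sums[-1] - mean(window)


weekly_sums = [10, 20, 30, 60]
dev_2 = deviation_sketch(weekly_sums, 2)  # 60 - mean([30, 60])
dev_4 = deviation_sketch(weekly_sums, 4)  # 60 - mean([10, 20, 30, 60])
```

A positive result means the last complete week was above its recent average; a longer window gives a smoother baseline.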
Example Usage
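from datetime import datetime

from amee_utils.feature_generator.feature_set.period_week import PeriodWeekFeatureSet, PeriodWeekFeatureConfig
from amee_utils.feature_generator.config import DatasetConfig

# Feature settings: 2- and 4-week lags on the "orders" column.
config = PeriodWeekFeatureConfig(
    lag_weeks=[2, 4],
    calculation_columns=["orders"],
    deviation_from_mean_weeks=[2],
    calculate_deltas=True,
    calculate_percentage_change=True,
    resolve_divide_by_zero=True,
)

# Dataset settings: key columns and the date column used for weekly conversion.
dataset = DatasetConfig(key_cols=["customer_id"], date_col="order_date")

# orders_df is an existing DataFrame with customer_id, order_date, and orders columns.
# The week containing calculation_date is treated as the current week and excluded.
features = PeriodWeekFeatureSet().calculate(
    df=orders_df,
    dataset_config=dataset,
    feature_config=config,
    calculation_date=datetime(2022, 1, 31),
)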