Single Week Feature Set
Complete Weeks Only
Single-week features automatically exclude the current week (the week containing the calculation date), so only complete historical weeks are included. The results are therefore consistent regardless of which day of the week the calculation runs.
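The exclusion rule can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper name `single_week_window` is hypothetical, and it assumes weeks run Monday to Sunday, which the library may or may not do.

```python
from datetime import date, timedelta


def single_week_window(calculation_date: date, lag: int) -> tuple[date, date]:
    """Return (start, end) of the complete week `lag` weeks before the current one.

    Assumption: weeks run Monday to Sunday; the library may use another convention.
    """
    if lag < 1:
        raise ValueError("lag must be >= 1; lag 0 would be the incomplete current week")
    # Monday of the week that contains the calculation date (this week is excluded).
    current_week_start = calculation_date - timedelta(days=calculation_date.weekday())
    start = current_week_start - timedelta(weeks=lag)
    return start, start + timedelta(days=6)


# The same complete week is selected whichever day of the week the job runs.
monday_run = single_week_window(date(2022, 1, 31), lag=1)
friday_run = single_week_window(date(2022, 2, 4), lag=1)
```

Both runs fall in the week starting 2022-01-31, so lag 1 resolves to the week of 2022-01-24 in either case.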
Usage
config.yaml
...
features:
  SingleWeekFeatureConfig:
    lag_weeks:
      - 1
      - 2
      - 4
      - 8
    count_columns:
      - <column_name>
    sum_columns:
      - <column_name>
    mean_columns:
      - <column_name>
    count_if_one_columns:
      - <column_name>
    calculate_percentage_change: True  # optional
    resolve_divide_by_zero: True  # optional, requires calculate_percentage_change
    count_includes_missing: False  # optional
Features Generated
| Feature Type | Description | Lagged Aggregation | Lagged Periods (Weeks) |
|---|---|---|---|
| Count | Count of occurrences for specified columns | Yes (Lagged by weeks) | W1, W2, W4, W8 |
| Sum | Sum of values for specified columns | Yes (Lagged by weeks) | W1, W2, W4, W8 |
| Mean | Mean (average) of values for specified columns | Yes (Lagged by weeks) | W1, W2, W4, W8 |
| Count If Equals One | Count of rows where a specified column equals 1 | Yes (Lagged by weeks) | W1, W2, W4, W8 |
| Ratios | Calculates percentage change between lags | Optional (calculate_percentage_change) | (Pairs based on lag_weeks) |
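As a rough illustration of the four aggregations in the table, the sketch below computes them for one column, one key, and one lagged week in plain Python. The function name and output keys are hypothetical, and the reading of `count_includes_missing` (whether rows with a missing value are counted) is an assumption rather than documented behavior.

```python
from statistics import mean


def single_week_aggregates(values, count_includes_missing=False):
    """Aggregate one column's values for one key within one lagged week.

    `values` holds the column values for that week; None marks a missing value.
    """
    present = [v for v in values if v is not None]
    return {
        # Assumed meaning of count_includes_missing: count rows with None too.
        "count": len(values) if count_includes_missing else len(present),
        "sum": sum(present),
        "mean": mean(present) if present else None,
        "count_if_one": sum(1 for v in present if v == 1),
    }


week_1 = single_week_aggregates([1, None, 3, 1])
week_1_with_missing = single_week_aggregates([1, None, 3, 1], count_includes_missing=True)
```

In the library, each of these values is produced once per entry in lag_weeks, giving the W1, W2, W4, and W8 variants listed above.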
Feature Configuration
SingleWeekFeatureConfig Class
Configuration class for single-week feature calculation.
Attributes
- lag_weeks: List[int]. Number of weeks to lag (default: [1, 2, 4, 8]).
- count_columns: List[str]
- sum_columns: List[str]
- mean_columns: List[str]
- count_if_one_columns: List[str]
- calculate_percentage_change: bool (optional)
- resolve_divide_by_zero: bool (optional; only applies when calculate_percentage_change=True)
- count_includes_missing: bool (optional)
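The interaction between `calculate_percentage_change` and `resolve_divide_by_zero` might look like the sketch below. Several details are assumptions, since this page does not specify them: that change is computed as a fraction rather than multiplied by 100, that a zero denominator is replaced by 0.0 when resolution is enabled, and that lags are paired consecutively. The function name `percentage_change` is hypothetical.

```python
def percentage_change(recent, earlier, resolve_divide_by_zero=False):
    """Relative change from `earlier` to `recent`, e.g. W1 compared with W2."""
    if earlier == 0:
        # Assumed fallback: report 0.0 instead of leaving the ratio undefined.
        return 0.0 if resolve_divide_by_zero else None
    return (recent - earlier) / earlier


lag_weeks = [1, 2, 4, 8]
# Assumed pairing: each lag compared with the next longer one.
lag_pairs = list(zip(lag_weeks, lag_weeks[1:]))

change = percentage_change(12, 10)
unresolved = percentage_change(5, 0)
resolved = percentage_change(5, 0, resolve_divide_by_zero=True)
```

Leaving the undefined case as `None` keeps "no baseline" distinguishable from "no change", which is one reason a resolution flag like this is usually optional.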
Feature Generation
SingleWeekFeatureSet Class
Class for calculating single-week lagged features.
Methods
- calculate(df, dataset_config, feature_config, calculation_date) -> DataFrame
  - Converts the date column to a week-based column using create_week_date_col.
  - Collects distinct key_cols into the result DataFrame.
  - For each aggregation strategy and specified column:
    - Instantiates LaggedAggregation(periods_list=lag_weeks, time_col="week", lag_type="single_week").
    - Applies the lagged aggregation to generate weekly lag features.
    - Optionally calculates percentage change if calculate_percentage_change is True.
    - Joins the features back to the base DataFrame.
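The steps of `calculate` can be approximated in plain Python for a single sum column. This is a minimal sketch under stated assumptions, not the library's code: it uses tuples instead of a DataFrame, assumes Monday-based weeks, fills weeks with no rows with 0.0, and uses hypothetical feature names of the form `sum_w<lag>`.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_sum_features(rows, calculation_date, lag_weeks):
    """rows: iterable of (key, row_date, value). Returns {key: {feature_name: sum}}."""
    current_week = calculation_date - timedelta(days=calculation_date.weekday())
    keys = set()
    weekly = defaultdict(float)
    for key, row_date, value in rows:
        # Collect distinct keys so every key appears in the result.
        keys.add(key)
        # Map the row's date to the Monday of its week (the "week" column).
        week = row_date - timedelta(days=row_date.weekday())
        weekly[(key, week)] += value
    # One feature per lag, joined back onto the full key set.
    result = {}
    for key in sorted(keys):
        result[key] = {
            f"sum_w{lag}": weekly.get((key, current_week - timedelta(weeks=lag)), 0.0)
            for lag in lag_weeks
        }
    return result


rows = [
    ("a", date(2022, 1, 25), 10.0),
    ("a", date(2022, 1, 18), 4.0),
    ("a", date(2022, 2, 1), 99.0),  # current week, so it is never picked up
    ("b", date(2022, 1, 19), 7.0),
]
features = weekly_sum_features(rows, date(2022, 2, 2), lag_weeks=[1, 2])
```

Because each lag selects exactly one week rather than a cumulative window, the row dated 2022-02-01 contributes to no feature, and key "b" still receives a `sum_w1` entry even though it has no rows in that week.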
from datetime import datetime

from amee_utils.feature_generator.feature_set.single_week import SingleWeekFeatureSet, SingleWeekFeatureConfig
from amee_utils.feature_generator.config import DatasetConfig

# Example usage
config = SingleWeekFeatureConfig(
    lag_weeks=[1, 2],
    sum_columns=["orders"],
    calculate_percentage_change=True,
    resolve_divide_by_zero=True,
)
dataset = DatasetConfig(key_cols=["customer_id"], date_col="order_date")

# orders_df is an existing DataFrame with customer_id, order_date, and orders columns.
features = SingleWeekFeatureSet().calculate(
    df=orders_df,
    dataset_config=dataset,
    feature_config=config,
    calculation_date=datetime(2022, 1, 31),
)