
Single Week Feature Set

Complete Weeks Only

Single-week features automatically exclude the current week (the week containing the calculation date), so only complete historical weeks are included. The results are therefore the same regardless of which day of the week the calculation runs.
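To make the exclusion rule concrete, here is a minimal sketch of how a lag could be mapped to a complete week. It assumes weeks start on Monday and that lag 1 means the most recent complete week; neither is stated above, and week_start and lagged_week_start are hypothetical helpers, not part of the library.

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Monday of the week containing d (assumption: Monday-based weeks)."""
    return d - timedelta(days=d.weekday())


def lagged_week_start(calculation_date: date, lag: int) -> date:
    """Start of the complete week `lag` weeks before the current, partial week."""
    if lag < 1:
        raise ValueError("lag must be >= 1 so the current week is never included")
    return week_start(calculation_date) - timedelta(weeks=lag)


# Monday 2022-01-31 and Sunday 2022-02-06 fall in the same week,
# so both resolve lag 1 to the week starting 2022-01-24.
print(lagged_week_start(date(2022, 1, 31), 1))
print(lagged_week_start(date(2022, 2, 6), 1))
```

Under these assumptions, running the calculation on any day of a given week selects the same historical weeks.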

Usage

config.yaml
...
features:
  SingleWeekFeatureConfig:
    lag_weeks:
      - 1
      - 2
      - 4
      - 8
    count_columns:
      - <column_name>
    sum_columns:
      - <column_name>
    mean_columns:
      - <column_name>
    count_if_one_columns:
      - <column_name>
    calculate_percentage_change: True  # optional
    resolve_divide_by_zero: True      # optional, requires calculate_percentage_change
    count_includes_missing: False     # optional

Features Generated

| Feature Type | Description | Lagged Aggregation | Lagged Periods (Weeks) |
| Count | Count of occurrences for specified columns | Yes (lagged by weeks) | W1, W2, W4, W8 |
| Sum | Sum of values for specified columns | Yes (lagged by weeks) | W1, W2, W4, W8 |
| Mean | Mean (average) of values for specified columns | Yes (lagged by weeks) | W1, W2, W4, W8 |
| Count If Equals One | Count of rows where a specified column equals 1 | Yes (lagged by weeks) | W1, W2, W4, W8 |
| Ratios | Percentage change between lags | Optional (calculate_percentage_change) | Pairs based on lag_weeks |
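The Ratios row can be illustrated with a short sketch. How the library pairs lags, the exact formula, and what resolve_divide_by_zero substitutes are not specified here, so the code below assumes consecutive pairs of sorted lags, the usual (recent - older) / older formula, and a fallback of 0.0. lag_pairs and percentage_change are hypothetical names.

```python
from typing import List, Optional, Tuple


def lag_pairs(lag_weeks: List[int]) -> List[Tuple[int, int]]:
    """Pair each lag with the next larger one (assumed pairing rule)."""
    ordered = sorted(lag_weeks)
    return list(zip(ordered, ordered[1:]))


def percentage_change(
    recent: float, older: float, resolve_divide_by_zero: bool = False
) -> Optional[float]:
    """Relative change from the older week to the more recent one.

    When the older value is 0 the ratio is undefined: return 0.0 if
    resolve_divide_by_zero is set (assumed fallback), otherwise None.
    """
    if older == 0:
        return 0.0 if resolve_divide_by_zero else None
    return (recent - older) / older


print(lag_pairs([1, 2, 4, 8]))
print(percentage_change(12, 10))
print(percentage_change(5, 0))
print(percentage_change(5, 0, resolve_divide_by_zero=True))
```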

Feature Configuration

SingleWeekFeatureConfig Class

Configuration class for single-week feature calculation.

Attributes

  • lag_weeks: List[int]. Numbers of weeks to lag (default: [1, 2, 4, 8]).
  • count_columns: List[str]. Columns for which occurrences are counted.
  • sum_columns: List[str]. Columns whose values are summed.
  • mean_columns: List[str]. Columns whose values are averaged.
  • count_if_one_columns: List[str]. Columns for which rows equal to 1 are counted.
  • calculate_percentage_change: bool (optional). Adds percentage-change features between lags.
  • resolve_divide_by_zero: bool (optional). Only applies when calculate_percentage_change=True.
  • count_includes_missing: bool (optional)
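The dependency between resolve_divide_by_zero and calculate_percentage_change could be enforced at construction time. The dataclass below is a hypothetical stand-in, not the library's SingleWeekFeatureConfig; the False defaults for the boolean flags and the validation behaviour are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchSingleWeekConfig:
    """Hypothetical stand-in for SingleWeekFeatureConfig (not the library class)."""

    lag_weeks: List[int] = field(default_factory=lambda: [1, 2, 4, 8])
    count_columns: List[str] = field(default_factory=list)
    sum_columns: List[str] = field(default_factory=list)
    mean_columns: List[str] = field(default_factory=list)
    count_if_one_columns: List[str] = field(default_factory=list)
    calculate_percentage_change: bool = False  # assumed default
    resolve_divide_by_zero: bool = False       # assumed default
    count_includes_missing: bool = False       # assumed default

    def __post_init__(self) -> None:
        # A lag below 1 would point at the current, incomplete week.
        if any(lag < 1 for lag in self.lag_weeks):
            raise ValueError("lag_weeks must contain integers >= 1")
        # resolve_divide_by_zero only has meaning for percentage-change features.
        if self.resolve_divide_by_zero and not self.calculate_percentage_change:
            raise ValueError(
                "resolve_divide_by_zero requires calculate_percentage_change=True"
            )


config = SketchSingleWeekConfig(sum_columns=["orders"], calculate_percentage_change=True)
print(config.lag_weeks)
```

Failing early on an inconsistent flag combination keeps a misconfigured YAML file from silently producing features without the expected zero handling.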

Feature Generation

SingleWeekFeatureSet Class

Class for calculating single-week lagged features.

Methods

  • calculate(df, dataset_config, feature_config, calculation_date) -> DataFrame
    • Converts the date column to a week-based column using create_week_date_col.
    • Collects distinct key_cols into the result DataFrame.
    • For each aggregation strategy and specified column:
      • Instantiates LaggedAggregation(periods_list=lag_weeks, time_col="week", lag_type="single_week").
      • Applies the lagged aggregation to generate weekly lag features.
      • Optionally calculates percentage change if calculate_percentage_change is True.
      • Joins the features back to the base DataFrame.

from datetime import datetime

from amee_utils.feature_generator.feature_set.single_week import SingleWeekFeatureSet, SingleWeekFeatureConfig
from amee_utils.feature_generator.config import DatasetConfig

# Example usage: sum of orders for each of the last two complete weeks,
# plus the percentage change between them.
config = SingleWeekFeatureConfig(
    lag_weeks=[1, 2],
    sum_columns=["orders"],
    calculate_percentage_change=True,
    resolve_divide_by_zero=True
)

dataset = DatasetConfig(key_cols=["customer_id"], date_col="order_date")

# orders_df is assumed to be an existing DataFrame with
# customer_id, order_date, and orders columns.
features = SingleWeekFeatureSet().calculate(
    df=orders_df,
    dataset_config=dataset,
    feature_config=config,
    calculation_date=datetime(2022, 1, 31)
)
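
The steps that calculate performs can be approximated in plain Python for a single sum column. This is only a sketch of the idea, not the library's implementation: it assumes Monday-based weeks, fills weeks without data with 0, and uses made-up feature names of the form <column>_sum_w<lag>. single_week_sums is a hypothetical function.

```python
from datetime import date, timedelta
from typing import Any, Dict, List


def week_start(d: date) -> date:
    """Monday of the week containing d (assumption: Monday-based weeks)."""
    return d - timedelta(days=d.weekday())


def single_week_sums(
    rows: List[Dict[str, Any]],
    key_col: str,
    date_col: str,
    value_col: str,
    lag_weeks: List[int],
    calculation_date: date,
) -> Dict[Any, Dict[str, float]]:
    """Sum value_col per key for each single lagged week, skipping the current week."""
    current_week = week_start(calculation_date)
    # Map the start date of each lagged week to its lag number.
    targets = {current_week - timedelta(weeks=lag): lag for lag in lag_weeks}
    # Start from the distinct keys so every key gets a row, even without data.
    features = {
        row[key_col]: {f"{value_col}_sum_w{lag}": 0 for lag in lag_weeks}
        for row in rows
    }
    for row in rows:
        lag = targets.get(week_start(row[date_col]))
        if lag is not None:  # rows in the current or an unrequested week are ignored
            features[row[key_col]][f"{value_col}_sum_w{lag}"] += row[value_col]
    return features


rows = [
    {"customer_id": "a", "order_date": date(2022, 1, 25), "orders": 3},  # lag 1
    {"customer_id": "a", "order_date": date(2022, 1, 27), "orders": 2},  # lag 1
    {"customer_id": "a", "order_date": date(2022, 1, 18), "orders": 4},  # lag 2
    {"customer_id": "b", "order_date": date(2022, 1, 31), "orders": 9},  # current week
]
result = single_week_sums(
    rows, "customer_id", "order_date", "orders", [1, 2], date(2022, 1, 31)
)
print(result)
```

Customer b only has an order in the current week, so both of its lagged sums stay at 0 in this sketch.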