Base Feature Set

Base class for FeatureSets.

FeatureSet

Bases: ABC, Generic[FeatureConfigType]

A base class for FeatureSets to be inherited from.

Ensures that the correct API is developed for usage in the FeatureCalculator class. These are tightly coupled.

Methods:

| Name | Description |
|------|-------------|
| calculate | Calculate the features for the given DataFrame. |
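The coupling described above rests on standard `abc` behaviour: a subclass that does not override `calculate` cannot be instantiated. The sketch below illustrates this with a local mirror of the base class. `DataFrame` and `DatasetConfig` are stand-in aliases (the page does not show where the real types come from), and `IncompleteFeatures`, `CompleteFeatures`, and `can_instantiate` are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any, Generic, TypeVar

# Stand-ins for illustration only (assumption): the real DataFrame,
# DatasetConfig and FeatureConfigType are defined elsewhere in the library.
DataFrame = list[dict[str, Any]]
DatasetConfig = dict[str, Any]
FeatureConfigType = TypeVar("FeatureConfigType")


class FeatureSet(ABC, Generic[FeatureConfigType]):
    """Local mirror of the documented base class."""

    @abstractmethod
    def calculate(
        self,
        df: DataFrame,
        dataset_config: DatasetConfig,
        feature_config: FeatureConfigType,
        calculation_date: datetime,
    ) -> DataFrame:
        raise NotImplementedError


class IncompleteFeatures(FeatureSet[dict]):
    """Hypothetical subclass that forgets to implement calculate()."""


class CompleteFeatures(FeatureSet[dict]):
    """Hypothetical subclass that satisfies the contract."""

    def calculate(self, df, dataset_config, feature_config, calculation_date):
        # Pass-through, just to satisfy the abstract method.
        return df


def can_instantiate(cls: type) -> bool:
    """Return True if cls() succeeds, False if the ABC machinery rejects it."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Here only `CompleteFeatures` can be constructed; the base class itself and `IncompleteFeatures` both raise `TypeError` at instantiation time, before any feature calculation is attempted.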

Source code in amee_utils/feature_generator/feature_set/base.py
class FeatureSet(ABC, Generic[FeatureConfigType]):
    """A base class for FeatureSets to be inherited from.

    Ensures that the correct API is developed for usage in the
    `FeatureCalculator` class. These are tightly coupled.

    Methods
    -------
    calculate() -> DataFrame
        Calculate the features for the given DataFrame.
    """

    @abstractmethod
    def calculate(
        self,
        df: DataFrame,
        dataset_config: DatasetConfig,
        feature_config: FeatureConfigType,
        calculation_date: datetime,
    ) -> DataFrame:
        """
        Calculate the features for the given DataFrame.

        This function must return a set of features per customer.

        Parameters
        ----------
        df : DataFrame
            The DataFrame to calculate the features for.
        dataset_config : DatasetConfig
            The dataset configuration.
        feature_config : FeatureConfigType
            The feature configuration.
        calculation_date : datetime
            The calculation date.

        Returns
        -------
        DataFrame
            The DataFrame with the calculated features.

        Example
        -------

        The aggregate keys that will be joined on are `CUST_CODE` and
        `CUST_ID`, and the resulting DataFrame should look as follows:

        | CUST_CODE | CUST_ID | FEATURE 1 | FEATURE 2 |
        |-----------|---------|-----------|-----------|
        |     A     |    1    |    420    |    TRUE   |
        |     B     |    2    |    69     |    FALSE  |
        """
        raise NotImplementedError

calculate(df, dataset_config, feature_config, calculation_date) abstractmethod

Calculate the features for the given DataFrame.

This function must return a set of features per customer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| df | DataFrame | The DataFrame to calculate the features for. | required |
| dataset_config | DatasetConfig | The dataset configuration. | required |
| feature_config | FeatureConfigType | The feature configuration. | required |
| calculation_date | datetime | The calculation date. | required |

Returns:

| Type | Description |
|------|-------------|
| DataFrame | The DataFrame with the calculated features. |
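Because `calculate` must return a set of features per customer, one way to sanity-check an implementation is to confirm that no customer key appears more than once in the result. The helper below is a hypothetical sketch, not part of the library: it treats the result as a list of row dictionaries (a stand-in for the real DataFrame type, which this page does not specify) and uses the `CUST_CODE` and `CUST_ID` key columns named in the docstring.

```python
from collections import Counter
from typing import Any

# Stand-in for the real DataFrame type (assumption): a list of row dicts.
DataFrame = list[dict[str, Any]]

# Key columns taken from the docstring example.
KEY_COLUMNS = ("CUST_CODE", "CUST_ID")


def duplicated_customer_keys(
    features: DataFrame,
    key_columns: tuple[str, ...] = KEY_COLUMNS,
) -> list[tuple[Any, ...]]:
    """Return every customer key that occurs in more than one row."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in features)
    return [key for key, n in counts.items() if n > 1]


good = [
    {"CUST_CODE": "A", "CUST_ID": 1, "FEATURE 1": 420},
    {"CUST_CODE": "B", "CUST_ID": 2, "FEATURE 1": 69},
]
# Same rows plus a repeated customer, which violates the per-customer contract.
bad = good + [{"CUST_CODE": "A", "CUST_ID": 1, "FEATURE 1": 7}]
```

An empty list from `duplicated_customer_keys(good)` indicates one row per customer; for `bad` the helper reports the repeated key `("A", 1)`.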

Example

The aggregate keys that will be joined on are CUST_CODE and CUST_ID, and the resulting DataFrame should look as follows:

| CUST_CODE | CUST_ID | FEATURE 1 | FEATURE 2 |
|-----------|---------|-----------|-----------|
|     A     |    1    |    420    |    TRUE   |
|     B     |    2    |    69     |    FALSE  |
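A concrete subclass producing a table of this shape might look like the sketch below. It is illustrative only and rests on several assumptions: the base class is mirrored locally, `DataFrame` is a list of row dictionaries rather than the library's real type, `SpendConfig`, `SpendFeatures`, and the `DATE` and `AMOUNT` columns are hypothetical, and using `calculation_date` as a cut-off for input rows is a guess at its purpose.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Generic, TypeVar

# Stand-ins for illustration only (assumption).
DataFrame = list[dict[str, Any]]
DatasetConfig = dict[str, Any]
FeatureConfigType = TypeVar("FeatureConfigType")


class FeatureSet(ABC, Generic[FeatureConfigType]):
    """Local mirror of the documented base class."""

    @abstractmethod
    def calculate(self, df: DataFrame, dataset_config: DatasetConfig,
                  feature_config: FeatureConfigType,
                  calculation_date: datetime) -> DataFrame:
        raise NotImplementedError


@dataclass
class SpendConfig:
    """Hypothetical feature configuration."""
    high_spend_threshold: float


class SpendFeatures(FeatureSet[SpendConfig]):
    """Hypothetical feature set: total spend and a high-spender flag."""

    def calculate(self, df, dataset_config, feature_config, calculation_date):
        totals: dict[tuple[str, int], float] = {}
        for row in df:
            # Assumption: rows dated after the calculation date are ignored.
            if row["DATE"] > calculation_date:
                continue
            key = (row["CUST_CODE"], row["CUST_ID"])
            totals[key] = totals.get(key, 0.0) + row["AMOUNT"]
        # One output row per (CUST_CODE, CUST_ID) key.
        return [
            {
                "CUST_CODE": code,
                "CUST_ID": cust_id,
                "TOTAL_SPEND": total,
                "HIGH_SPENDER": total >= feature_config.high_spend_threshold,
            }
            for (code, cust_id), total in sorted(totals.items())
        ]


transactions = [
    {"CUST_CODE": "A", "CUST_ID": 1, "AMOUNT": 100.0, "DATE": datetime(2024, 1, 5)},
    {"CUST_CODE": "A", "CUST_ID": 1, "AMOUNT": 50.0, "DATE": datetime(2024, 1, 20)},
    {"CUST_CODE": "B", "CUST_ID": 2, "AMOUNT": 30.0, "DATE": datetime(2024, 1, 10)},
    {"CUST_CODE": "A", "CUST_ID": 1, "AMOUNT": 999.0, "DATE": datetime(2024, 3, 1)},
]

features = SpendFeatures().calculate(
    transactions, {}, SpendConfig(high_spend_threshold=100.0), datetime(2024, 2, 1)
)
```

Parameterising the base class as `FeatureSet[SpendConfig]` lets a type checker tie `feature_config` to the matching configuration class, which appears to be the purpose of the `Generic[FeatureConfigType]` base.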