
Contributing additional Feature Sets

Contributing a Feature Set to the framework involves the following key steps:

  1. Create your Feature Config
  2. Create your FeatureSet
  3. Write tests for your FeatureSet
  4. Add your FeatureSet to the FEATURE_CONFIG_MAP
  5. Document your FeatureSet

Creating a Feature Config

FeatureSets and FeatureConfigs belong together in the same file. To create a FeatureConfig:

  1. Create a file for your features in the amee_utils/feature_generator/feature_set directory
  2. Decorate your class with attrs.define and inherit from the base FeatureConfig in amee_utils.feature_generator.config, for example:

    import attrs
    
    from amee_utils.feature_generator.config import FeatureConfig
    
    @attrs.define
    class MyFeatureConfig(FeatureConfig):
        # input your feature configuration
        ...
    

    Note

    The name of your FeatureConfig class corresponds directly to the key used for that feature config in the user's configuration YAML file.

    For example, a user's YAML file will look as follows:

    features:
        MyFeatureConfig:
            ...
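
To make the name mapping concrete, here is a minimal sketch of a config with two fields being built from the values under its YAML key. Everything in it is an assumption for illustration: the fields amount_column and window_months are hypothetical, the FeatureConfig base is a local stand-in, and stdlib dataclasses replace attrs.define so that the snippet runs without the framework installed. How the framework itself constructs the object is not covered by this guide.

```python
from dataclasses import dataclass


@dataclass
class FeatureConfig:
    """Local stand-in for amee_utils.feature_generator.config.FeatureConfig."""


@dataclass
class MyFeatureConfig(FeatureConfig):
    # Hypothetical configuration items, invented for this illustration.
    amount_column: str
    window_months: int = 3


# What a YAML loader would return for the user's file, written as a dict
# here so that no YAML library is needed.
parsed_yaml = {
    "features": {
        "MyFeatureConfig": {"amount_column": "spend", "window_months": 6},
    }
}

# The key under `features` matches the class name; its values become fields.
config = MyFeatureConfig(**parsed_yaml["features"]["MyFeatureConfig"])
print(config)
```

A misspelled or unknown field in the YAML would surface here as a TypeError from the constructor, which is one reason to declare every configuration item explicitly on the class.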
    

Creating a FeatureSet

In the same file as your FeatureConfig, create a FeatureSet:

  1. Inherit from the base FeatureSet in amee_utils.feature_generator.feature_set
  2. How you implement the rest of your FeatureSet is up to you.

    Note

    Your class must follow the FeatureSet interface, which means it must implement the calculate method.

    All the columns and configuration items your FeatureSet needs should come from the FeatureConfig that you created, so make sure they are defined in your FeatureConfig before you use them in your FeatureSet.
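
The interface requirement can be sketched as follows. This is an illustration under stated assumptions, not the framework's code: the FeatureSet base below is a local stand-in modelled as an abstract base class (the real base may enforce the interface differently), amount_column is a hypothetical config item, and a list of dicts stands in for a Spark DataFrame so the snippet runs on its own. The keyword arguments of calculate follow the call shown in the test snippet in the Writing Tests section.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime


class FeatureSet(ABC):
    """Local stand-in for the framework's base FeatureSet."""

    @abstractmethod
    def calculate(self, df, feature_config, dataset_config, calculation_date):
        ...


@dataclass
class MyFeatureConfig:
    amount_column: str  # hypothetical configuration item


class MyFeatureSet(FeatureSet):
    def calculate(self, df, feature_config, dataset_config, calculation_date):
        # Read the column name from the config instead of hard-coding it.
        col = feature_config.amount_column
        total = sum(row[col] for row in df if row["month"] <= calculation_date)
        return [{f"{col}_total": total}]


class IncompleteFeatureSet(FeatureSet):
    pass  # calculate is missing, so this class cannot be instantiated


rows = [
    {"month": datetime(2022, 9, 1), "spend": 10},
    {"month": datetime(2022, 10, 1), "spend": 5},
    {"month": datetime(2022, 11, 1), "spend": 99},  # after the calculation date
]
result = MyFeatureSet().calculate(
    df=rows,
    feature_config=MyFeatureConfig(amount_column="spend"),
    dataset_config=None,
    calculation_date=datetime(2022, 10, 1),
)

try:
    IncompleteFeatureSet()
    incomplete_error = None
except TypeError as exc:
    incomplete_error = exc  # raised because calculate is not implemented
```

Because the column name lives in the config, the same FeatureSet can be reused on datasets whose columns are named differently.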

Writing Tests

You only need to write tests for the FeatureSet you've created. The FeatureConfig will be tested automatically, provided you use it in your FeatureSet implementation.

Example

A full example of a test can be found in the tests/feature_generator/test_single_month.py file. A snippet from such a test may look like this:

    # ... (earlier part of the test, where fc, dc and expected_df are defined, is not shown)

    df = spark.read.csv(
        "tests/feature_generator/fixtures/test_single_month.csv",
        header=True,
    ).withColumn("month", F.to_date(F.col("month"), "yyyy-MM-dd"))

    result = SingleMonthFeatureSet().calculate(
        df=df,
        feature_config=fc,
        dataset_config=dc,
        calculation_date=datetime(2022, 10, 1),
    )

    assert_df_equality(
        result, expected_df, ignore_row_order=True, ignore_column_order=True
    )


    @pytest.mark.parametrize(
        "count_includes_missing, expected_values",
        [
            (False, [(1, 2, None, None, 3, None, None)]),
            (True, [(1, 3, None, None, 3, None, None)]),
            # ... (snippet truncated here)

Adding your FeatureSet to the Config Map

For users to be able to add your FeatureSet to their configuration, you need to add it to the FEATURE_CONFIG_MAP in amee_utils/feature_generator/__init__.py. This dictionary maps the name of each FeatureConfig to its FeatureConfig class.
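
A minimal sketch of such a registration is shown below. The dictionary shape (name to class) follows the description above; everything else is an assumption for illustration, including ExistingFeatureConfig, the amount_column field, and the resolve_config_class helper, none of which are taken from the framework. Plain dataclasses are used so the snippet runs without the framework installed.

```python
from dataclasses import dataclass


@dataclass
class ExistingFeatureConfig:
    """Hypothetical config that is already registered."""


@dataclass
class MyFeatureConfig:
    """The new config being contributed."""

    amount_column: str = "spend"  # hypothetical field


# Name of the FeatureConfig -> FeatureConfig class.
FEATURE_CONFIG_MAP = {
    "ExistingFeatureConfig": ExistingFeatureConfig,
    "MyFeatureConfig": MyFeatureConfig,  # the new entry
}


def resolve_config_class(name):
    """Hypothetical helper: look up the class for a key from the user's YAML."""
    try:
        return FEATURE_CONFIG_MAP[name]
    except KeyError:
        known = ", ".join(sorted(FEATURE_CONFIG_MAP))
        raise ValueError(f"Unknown feature config {name!r}; known: {known}") from None


config_cls = resolve_config_class("MyFeatureConfig")
```

If the entry is missing from the map, a lookup like this fails for the user's YAML key, which is why registration is a separate step in the checklist.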

Documenting your FeatureSet

The final step in the development is writing the necessary documentation. This includes:

  1. Writing the documentation for the FeatureSet so that users know how to set it up in their config.yaml file and which features it produces.

    Example

    You can use the documentation at docs/tutorials/features/single_month.md as an example.

  2. Adding a file for your FeatureSet to the API Reference.
  3. Ensuring that the mkdocs.yml file is updated.

    Note

    You can test out your documentation by running poetry run mkdocs serve and navigating to http://127.0.0.1:8000

Pull Request Template

You can use the template below for your pull request:

## Description

Please include a summary of the changes you are proposing. This could include a brief overview of the Feature Set you are adding, as well as any relevant context or background information.

## Checklist

Please ensure that you have completed the following steps before submitting your pull request:

- [ ] Created a Feature Config
- [ ] Created a FeatureSet
- [ ] Written tests for your FeatureSet
- [ ] Added your FeatureSet to the FEATURE_CONFIG_MAP
- [ ] Documented your FeatureSet