nandataframefilter

Module for NanDataframeFilter

Classes

NanDataframeFilter

NanDataframeFilter()

Bases: DataframeFilter

DataframeFilter that removes rows with NaN values in the feature columns

Filter to filter data of a dataframe

Source code in niceml/data/datafilters/dataframefilter.py
def __init__(self):
    """Filter to filter data of a dataframe"""

    self.data_description = None
Functions
filter
filter(data)

Removes every row of the data that has a NaN value in at least one of the columns listed as inputs or targets in self.data_description. Columns that are not listed there are ignored, so NaN values in them do not cause a row to be dropped. Before filtering, self.data_description is checked to be a RegDataDescription.

Parameters:

  • data (DataFrame) –

    The dataframe to filter

Returns:

  • DataFrame

    A dataframe without the rows that have at least one NaN value in the input or target columns

Source code in niceml/data/datafilters/nandataframefilter.py
def filter(self, data: pd.DataFrame) -> pd.DataFrame:
    """
    The filter function is used to remove rows from the data that have NaN values
    in any of the columns that are specified as inputs or targets of the
    `self.data_description`. This is done by dropping all rows where there are NaN
    values in any of these columns.

    Args:
        data: The dataframe to filter

    Returns:
        A dataframe without the rows that have at least one NaN value in the
        columns specified by filter_columns

    """
    self.data_description: RegDataDescription = check_instance(
        self.data_description, RegDataDescription
    )
    filter_columns: List[str] = []

    for column in self.data_description.inputs + self.data_description.targets:
        filter_columns.append(column["key"])

    return data.dropna(axis=0, subset=filter_columns)
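
The column selection and row removal shown above can be tried out with pandas alone. The following is a minimal sketch, not the library's own test or usage code: the `inputs` and `targets` lists are hypothetical stand-ins for `self.data_description.inputs` and `self.data_description.targets`, assuming only that each entry exposes a `"key"` naming a dataframe column, as the source above suggests. The column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for data_description.inputs and .targets:
# each entry is a mapping whose "key" names a dataframe column.
inputs = [{"key": "height"}, {"key": "width"}]
targets = [{"key": "price"}]

data = pd.DataFrame(
    {
        "height": [1.0, None, 3.0, 4.0],
        "width": [2.0, 2.5, 3.5, 4.5],
        "price": [10.0, 20.0, None, 40.0],
        # Not an input or target, so NaN values here are ignored.
        "comment": [None, "ok", "ok", None],
    }
)

# Collect the column names to check, mirroring the loop in filter().
filter_columns = [column["key"] for column in inputs + targets]

# Drop every row with a NaN in at least one of the selected columns.
filtered = data.dropna(axis=0, subset=filter_columns)

print(filtered)
```

Rows 1 and 2 are dropped because `height` and `price` are missing there, while rows 0 and 3 are kept even though `comment` is missing. `dropna` returns a new dataframe and keeps the original index labels, so a caller that needs a contiguous index would have to call `reset_index(drop=True)` afterwards.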
