probabilityclassselector

Module for probability selection

Classes

ProbabilityClassSelector

ProbabilityClassSelector(
    data, class_col, prob_col, min_delta=0.1
)

Selector bound to a specific DataFrame. It selects rows based on a class column and a prediction-probability column.

Parameters:

  • data (DataFrame) –

    The data to investigate

  • class_col (str) –

    Name of the column containing the classes

  • prob_col (str) –

    Name of the column containing the prediction probabilities

  • min_delta (float, default: 0.1 ) –

    The minimum width of the probability range reported by get_selection_info. If the difference between the maximum and minimum probability in the data is smaller, the reported maximum is raised to minimum + min_delta.

Source code in niceml/utilities/filtering/probabilityclassselector.py
def __init__(
    self,
    data: pd.DataFrame,
    class_col: str,
    prob_col: str,
    min_delta: float = 0.1,
):
    self.data = data
    self.class_col = class_col
    self.prob_col = prob_col
    self.min_delta = min_delta
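
Functions

get_selected_data

get_selected_data(selection)

Filter and sort the data according to the selection. The result contains only the rows whose class equals selection.class_name and whose probability is at least selection.prob_value, sorted by probability in ascending order.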

Source code in niceml/utilities/filtering/probabilityclassselector.py
def get_selected_data(self, selection: Selection):
    """Filter and sort the data due to the selection"""
    selected_data: pd.DataFrame = self.data[
        self.data[self.class_col] == selection.class_name
    ]
    selected_data = selected_data.sort_values(by=[self.prob_col])
    selected_data = selected_data[
        selected_data[self.prob_col] >= selection.prob_value
    ]
    return selected_data
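
The three steps above (filter by class, sort by probability, apply the threshold) can be sketched without niceml installed. This is only an illustration: the `Selection` dataclass below is a hypothetical stand-in that carries just the two attributes the method reads, and the column names `label` and `prob` are made up for the example.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Selection:
    """Hypothetical stand-in with only the attributes used by the method."""

    class_name: str
    prob_value: float


def select(
    data: pd.DataFrame, class_col: str, prob_col: str, selection: Selection
) -> pd.DataFrame:
    """Mirror the steps of get_selected_data on a plain DataFrame."""
    # 1. Keep only the rows of the requested class.
    selected = data[data[class_col] == selection.class_name]
    # 2. Sort ascending by probability.
    selected = selected.sort_values(by=[prob_col])
    # 3. Keep rows at or above the threshold (the comparison is inclusive).
    return selected[selected[prob_col] >= selection.prob_value]


df = pd.DataFrame(
    {
        "label": ["cat", "dog", "cat", "cat", "dog"],
        "prob": [0.9, 0.8, 0.3, 0.6, 0.2],
    }
)
result = select(df, "label", "prob", Selection(class_name="cat", prob_value=0.5))
print(result)
```

The returned frame keeps the original row index, so the rows appear in probability order rather than index order. Calling `reset_index(drop=True)` on the result is one way to renumber them if needed.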
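
get_selection_info

get_selection_info()

Returns info about the possible selections: the distinct classes found in class_col and the minimum and maximum probability found in prob_col. If the maximum minus the minimum is smaller than min_delta, the returned maximum is min_prob + min_delta instead.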

Source code in niceml/utilities/filtering/probabilityclassselector.py
def get_selection_info(self) -> SelectionInfo:
    """Returns info about the possible selections"""
    class_list = list(self.data[self.class_col].unique())
    min_prob = float(self.data[self.prob_col].min())
    max_prob = float(self.data[self.prob_col].max())
    if max_prob - min_prob < self.min_delta:
        max_prob = min_prob + self.min_delta
    return SelectionInfo(class_list, min_prob, max_prob)
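
The effect of min_delta on the reported range can be isolated in a few lines of plain Python. The helper name `widen_range` is hypothetical and not part of niceml; it only reproduces the comparison and adjustment shown in the source above.

```python
from typing import Tuple


def widen_range(
    min_prob: float, max_prob: float, min_delta: float = 0.1
) -> Tuple[float, float]:
    """Raise max_prob so the range is at least min_delta wide."""
    if max_prob - min_prob < min_delta:
        max_prob = min_prob + min_delta
    return min_prob, max_prob


# A range that is already wide enough is returned unchanged.
wide = widen_range(0.0, 1.0)

# All probabilities identical: the upper bound is pushed up by min_delta.
flat = widen_range(0.25, 0.25, min_delta=0.5)

# Near the top of the scale the adjusted maximum can exceed 1.0.
high = widen_range(0.95, 0.97)

print(wide, flat, high)
```

Only the upper bound is adjusted and it is not clamped, so callers that expect probabilities in [0, 1] may want to handle a reported maximum above 1.0.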

Selection dataclass

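A specific selection within the options described by SelectionInfo. It provides the class_name and prob_value attributes read by get_selected_data.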

SelectionInfo dataclass

Describes which classes and which probability range are present in the data.