Distribution Metrics
These metrics compare two distributions. The LOC curve, which uses the length of the curve to quantify separation, is the generalized case of the traditional ROC curve, which uses the area under the curve (computed with numpy.trapz).
Note
Both functions require 1-dimensional arrays as input.
This simply keeps the functions completely generalizable. One can
make any N-dimensional array 1-dimensional by calling numpy.ndarray.ravel().
The functions will sort the bins and handle the rest.
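As a minimal sketch of the flattening step described in the note, the example below bins two toy 2-dimensional samples on a shared grid and ravels the counts. The sample data, the bin edges, and the names `counts1`, `counts2`, `flat1`, and `flat2` are illustrative assumptions, not part of MiLoMerge.

```python
import numpy as np

# Toy 2-D samples (illustrative data only).
rng = np.random.default_rng(0)
x1, y1 = rng.normal(0.0, 1.0, 1000), rng.normal(0.0, 1.0, 1000)
x2, y2 = rng.normal(1.0, 1.0, 1000), rng.normal(1.0, 1.0, 1000)

# Use the same bin edges for both samples so that bin i means the
# same region of space in each histogram.
edges = np.linspace(-4.0, 4.0, 9)  # 8 bins per axis
counts1, _, _ = np.histogram2d(x1, y1, bins=(edges, edges))
counts2, _, _ = np.histogram2d(x2, y2, bins=(edges, edges))

# Flatten the (8, 8) grids into 1-dimensional arrays of 64 bins.
flat1 = counts1.ravel()
flat2 = counts2.ravel()

# flat1 and flat2 now have the 1-dimensional shape that
# MiLoMerge.ROC_curve and MiLoMerge.LOC_curve require.
```

Because both histograms share the same edges, raveling preserves the bin-to-bin correspondence between the two arrays.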
- MiLoMerge.ROC_curve(sample1: Iterable[float], sample2: Iterable[float])
Calculates the classical ROC curve given two distributions.
- Parameters:
- Returns:
Two arrays of the same size as sample1 giving the per-bin True Positive Rate (TPR) and False Positive Rate (FPR), as well as the Area Under the Curve (AUC).
- Return type:
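To illustrate the idea behind a binned ROC curve, here is a rough sketch in plain NumPy. It is not MiLoMerge's implementation: the helper name `roc_sketch`, the choice to sort bins by the ratio sample1/sample2, and the return order are all assumptions made for this example. The area is computed with the trapezoidal rule written out by hand, which is the same rule numpy.trapz applies.

```python
import numpy as np

def roc_sketch(sample1, sample2):
    """Hypothetical helper: binned ROC curve and its area (illustration only)."""
    p = np.asarray(sample1, dtype=float)
    q = np.asarray(sample2, dtype=float)

    # Normalize each histogram to unit total.
    p = p / p.sum()
    q = q / q.sum()

    # Assumed ordering: most sample1-like bins first. Bins that are
    # empty in sample2 get an infinite ratio and sort to the front.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(q > 0, p / q, np.inf)
    order = np.argsort(-ratio, kind="stable")

    # Cumulative fractions per bin, same size as the inputs.
    tpr = np.cumsum(p[order])
    fpr = np.cumsum(q[order])

    # Trapezoidal rule, with the origin prepended so the curve
    # starts at (0, 0).
    x = np.concatenate(([0.0], fpr))
    y = np.concatenate(([0.0], tpr))
    auc = float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))
    return tpr, fpr, auc

# Identical histograms give the diagonal; disjoint ones give a step.
_, _, auc_same = roc_sketch([1, 2, 3, 4], [1, 2, 3, 4])
_, _, auc_split = roc_sketch([1, 1, 0, 0], [0, 0, 1, 1])
```

Under these assumptions, identical distributions yield an area of 0.5 and fully separated distributions yield 1.0.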
- MiLoMerge.LOC_curve(sample1: Iterable[float], sample2: Iterable[float])
Calculates the LOC curve described in (ARXIV LINK) given two distributions.
- Parameters:
- Returns:
Two arrays of the same size as sample1 giving the per-bin True Positive Rate (TPR) and False Positive Rate (FPR), as well as the Length of the Curve (LoC).
- Return type:
- Raises:
ValueError – If neither sample is wholly positive. At least one sample must be completely positive.
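The length-based metric can be pictured with a short sketch, assuming the LoC is the arc length of the piecewise-linear curve through the (FPR, TPR) points starting at the origin. This is an illustration, not MiLoMerge's implementation: the helper name `loc_sketch`, the ratio-based bin ordering, and the reading of the positivity check (zero-count bins are accepted, only negative bins count against a sample) are all assumptions.

```python
import numpy as np

def loc_sketch(sample1, sample2):
    """Hypothetical helper: curve length of a binned ROC-like curve."""
    p = np.asarray(sample1, dtype=float)
    q = np.asarray(sample2, dtype=float)

    # Assumed reading of the documented check: raise only when
    # both samples contain negative bins.
    if (p < 0).any() and (q < 0).any():
        raise ValueError("at least one sample must have no negative bins")

    # Normalize each histogram to unit total.
    p = p / p.sum()
    q = q / q.sum()

    # Assumed ordering: sort bins by the ratio sample1/sample2.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(q != 0, p / q, np.inf)
    order = np.argsort(-ratio, kind="stable")

    tpr = np.cumsum(p[order])
    fpr = np.cumsum(q[order])

    # Sum the Euclidean lengths of the segments, starting from (0, 0).
    dx = np.diff(np.concatenate(([0.0], fpr)))
    dy = np.diff(np.concatenate(([0.0], tpr)))
    length = float(np.sum(np.hypot(dx, dy)))
    return tpr, fpr, length

_, _, len_same = loc_sketch([1, 2, 3, 4], [1, 2, 3, 4])
_, _, len_split = loc_sketch([1, 1, 0, 0], [0, 0, 1, 1])
```

In this sketch, identical distributions trace the diagonal, giving a length of the square root of 2, while fully separated distributions trace two sides of the unit square, giving a length of 2.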