Mergers

Merging is done via these classes, which can also optionally output mappings that allow raw data to be placed within the merged bins. These mappings can be used to bin raw data through the functions listed in Placing Events After Merging.

class MiLoMerge.MergerLocal(bin_edges: Iterable[float], *counts: Iterable[float], weights: Iterable[float] | None = None, comp_to_first: bool = False, map_at: Iterable[int] | None = None, file_path: str = './', file_name: str = '')[source]

Bases: Merger

A merger that merges bins locally. This will not change the physical ordering of the bin edges.
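As a minimal sketch of what "locally" means here, merging two neighbouring bins amounts to dropping the edge they share and summing their counts, so the remaining edges stay in their original order. The helper `merge_adjacent` is hypothetical and written only for illustration; it is not part of MiLoMerge and says nothing about how the library chooses which bins to merge.

```python
def merge_adjacent(bin_edges, counts, i):
    """Merge bin i with bin i + 1 by dropping the edge they share."""
    if not 0 <= i < len(counts) - 1:
        raise IndexError("bin i needs a right-hand neighbour")
    # Edge i + 1 separates bin i from bin i + 1, so it is the one removed.
    new_edges = bin_edges[: i + 1] + bin_edges[i + 2 :]
    new_counts = counts[:i] + [counts[i] + counts[i + 1]] + counts[i + 2 :]
    return new_edges, new_counts


edges = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = [5, 3, 7, 2]
new_edges, new_counts = merge_adjacent(edges, counts, 1)
print(new_edges)   # [0.0, 1.0, 3.0, 4.0]
print(new_counts)  # [5, 10, 2]
```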

The initializer for the local merging class

Parameters:
  • bin_edges (numpy.ndarray) – These are the edges of your histogram that correspond to physical quantities

  • counts (numpy.ndarray) – A series of arrays that correspond to the number of events between your bin edges.

  • weights (numpy.ndarray, optional) – An array of the weights associated with each of the counts. If none are provided, every weight is 1. By default None

  • comp_to_first (bool, optional) – Whether you would like to compare all samples to the first one provided, as opposed to all of them to each other, by default False

  • map_at (list, optional) – A list of bin numbers at which you would like the mapping from the original sample to be recorded, by default None

  • file_path (str, optional) – The directory in which to place “_tracker.hdf5” if a mapping is requested, by default “./”

  • file_name (str, optional) – The file prefix before “_tracker.hdf5” to identify this mapping, by default “”

Raises:
  • ValueError – Raised if the bin edges are not 1-dimensional

  • NotADirectoryError – Raised if file_path is not a valid directory
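The comp_to_first option can be pictured as a choice between two sets of sample pairs. The sketch below only enumerates the pairs that the parameter description implies; `sample_pairs` is a hypothetical helper, and how MiLoMerge actually organises its comparisons internally is not stated on this page.

```python
from itertools import combinations


def sample_pairs(n_samples, comp_to_first=False):
    """List the (i, j) index pairs of samples that would be compared."""
    if comp_to_first:
        # Every other sample is compared against sample 0 only.
        return [(0, k) for k in range(1, n_samples)]
    # Otherwise every sample is compared with every other sample once.
    return list(combinations(range(n_samples), 2))


print(sample_pairs(4))
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(sample_pairs(4, comp_to_first=True))
# [(0, 1), (0, 2), (0, 3)]
```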

run(target_bin_number: int, return_counts: bool = False)[source]

Runs the merger

Parameters:
  • target_bin_number (int) – The number of bins you would like to merge down to

  • return_counts (bool, optional) – Whether the returned value will have the bin counts returned alongside the bin edges, by default False

Returns:

Returns the 1-d bin edges corresponding to the best binning for the requested number of bins. If return_counts is True, the final internal bin counts, which have shape (# samples, target_bin_number), are returned as well.

Return type:

numpy.ndarray or tuple(numpy.ndarray, numpy.ndarray)
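A hedged usage sketch based only on the signatures documented above: two samples that share one set of bin edges are passed as positional counts arrays, and run is asked for both edges and counts. The toy numbers and the wrapper name `merge_locally` are assumptions made for illustration.

```python
# Hypothetical toy inputs: 10 equal-width bins on [0, 1] and two samples.
bin_edges = [i / 10 for i in range(11)]
signal = [1, 2, 5, 9, 14, 14, 9, 5, 2, 1]
background = [12, 10, 9, 7, 6, 5, 4, 3, 3, 2]


def merge_locally(edges, samples, target_bin_number):
    """Merge `samples` down to `target_bin_number` bins with MergerLocal."""
    # Imported here so the toy inputs can be inspected without the packages.
    import numpy as np
    import MiLoMerge

    merger = MiLoMerge.MergerLocal(
        np.asarray(edges, dtype=float),
        *(np.asarray(s, dtype=float) for s in samples),
    )
    # With return_counts=True the documented return is (bin_edges, counts).
    new_edges, new_counts = merger.run(target_bin_number, return_counts=True)
    return new_edges, new_counts
```

Going by the return description, `merge_locally(bin_edges, [signal, background], 4)` should give five bin edges and a counts array of shape (2, 4).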

class MiLoMerge.MergerNonlocal(bin_edges: Iterable[float], *counts: Iterable[float], weights: Iterable[float] | None = None, comp_to_first: bool = False, map_at: Iterable[int] | None = None, file_path: str = './', file_name: str = '')[source]

Bases: Merger

A merger that merges bins non-locally. Bin edges are irrelevant here.
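To illustrate why bin edges stop being meaningful, the sketch below merges two bins that are not neighbours. The merged bin no longer covers a single interval, so only the counts are worth keeping. `merge_any` is a hypothetical helper for illustration, not the library's merging criterion.

```python
def merge_any(counts, i, j):
    """Merge bins i and j (not necessarily neighbours) into one bin."""
    if i == j:
        raise ValueError("need two distinct bins")
    lo, hi = sorted((i, j))
    merged = list(counts)
    # Remove the higher-indexed bin and add its content to the lower one.
    moved = merged.pop(hi)
    merged[lo] += moved
    return merged


# Two samples over the same four bins; bins 0 and 3 are merged in both.
samples = [[5, 3, 7, 2], [1, 4, 4, 6]]
merged_samples = [merge_any(s, 0, 3) for s in samples]
print(merged_samples)  # [[7, 3, 7], [7, 4, 4]]
```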

The initializer for the non-local merging class

Parameters:
  • bin_edges (numpy.ndarray) – These are the edges of your histogram that correspond to physical quantities

  • counts (numpy.ndarray) – A series of arrays that correspond to the number of events between your bin edges

  • weights (numpy.ndarray, optional) – An array of the weights associated with each of the counts. If none are provided, every weight is 1. By default None

  • comp_to_first (bool, optional) – Whether you would like to compare all samples to the first one provided, as opposed to all of them to each other, by default False

  • map_at (list, optional) – A list of bin numbers at which you would like the mapping from the original sample to be recorded, by default None

  • file_path (str, optional) – The directory in which to place “_tracker.hdf5” if a mapping is requested, by default “./”

  • file_name (str, optional) – The file prefix before “_tracker.hdf5” to identify this mapping, by default “”

Raises:

ValueError – Raised if the bin edges are not 1-dimensional

run(target_bin_number: int)[source]

Runs the merger

Parameters:

target_bin_number (int) – The number of bins you would like to merge down to

Returns:

Returns an array containing the new counts for every sample. No bin edges are returned, since they are no longer meaningful after non-local merging. The array has shape (# samples, target_bin_number).

Return type:

numpy.ndarray
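A hedged usage sketch for the non-local case, again relying only on the signatures documented above. Unlike the local merger, run here returns a single counts array, so there are no edges to unpack. The toy inputs and the wrapper name `merge_nonlocally` are assumptions made for illustration.

```python
# Hypothetical toy inputs: 6 bins and two samples sharing the same edges.
bin_edges = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sample_a = [8, 1, 1, 7, 2, 9]
sample_b = [1, 6, 7, 2, 8, 1]


def merge_nonlocally(edges, samples, target_bin_number):
    """Merge `samples` down to `target_bin_number` bins with MergerNonlocal."""
    # Imported here so the toy inputs can be inspected without the packages.
    import numpy as np
    import MiLoMerge

    merger = MiLoMerge.MergerNonlocal(
        np.asarray(edges, dtype=float),
        *(np.asarray(s, dtype=float) for s in samples),
    )
    # Only counts come back, documented as shape (# samples, target_bin_number).
    return merger.run(target_bin_number)
```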