vocalpy.metrics.segmentation.ir.find_hits
- vocalpy.metrics.segmentation.ir.find_hits(hypothesis: ndarray[tuple[Any, ...], dtype[_ScalarT]], reference: ndarray[tuple[Any, ...], dtype[_ScalarT]], tolerance: float | int | None = None, decimals: int | None = None) → tuple[ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]], ndarray[tuple[Any, ...], dtype[_ScalarT]]]
Find hits in arrays of event times.
This is a helper function used to compute information retrieval metrics. Specifically, this function is called by precision_recall_fscore().
An element in hypothesis is considered a hit if its value \(t_h\) falls within an interval around any value \(t_0\) in reference, plus or minus tolerance \(\Delta t\): \(t_0 - \Delta t < t_h < t_0 + \Delta t\).
This function allows zero or one hit for each element in reference, but not more than one. If the condition \(|ref_i - hyp_j| < tolerance\) is true for multiple values \(hyp_j\) in hypothesis, then the value with the smallest difference from \(ref_i\) is considered the hit.
Both hypothesis and reference must be 1-dimensional arrays of non-negative, strictly increasing values. If you have two arrays, onsets and offsets, you can concatenate them into a single valid array of boundary times using concat_starts_and_stops(), and then pass that array to this function.
- Parameters:
- hypothesis: numpy.ndarray
Boundaries, e.g., onsets or offsets of segments, as computed by some method.
- reference: numpy.ndarray
Ground truth boundaries that the hypothesized boundaries in hypothesis are compared to.
- tolerance: float or int
Tolerance, in seconds. An element in hypothesis is considered a true positive if it is within a time interval around any reference boundary \(t_0\) in reference, plus or minus the tolerance, i.e., if a hypothesized boundary \(t_h\) is within the interval \(t_0 - \Delta t < t_h < t_0 + \Delta t\). Default is None, in which case it is set to 0 (either float or int, depending on the dtype of hypothesis and reference). See notes for more detail.
- decimals: int
The number of decimal places to round both hypothesis and reference to, using numpy.round(). This mitigates inflated error rates due to floating point error. Rounding is only applied if both hypothesis and reference are floating point values. To avoid rounding, e.g. to compute strict precision and recall, pass in the value False. Default is 3, which assumes that the values are in seconds and should be rounded to milliseconds.
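As a rough illustration of why rounding with decimals can matter, the sketch below shows two boundary times that are nominally equal but differ by floating point error, and how numpy.round() removes the difference. The array values are made up for this example, and the code only mimics the rounding step described above; it is not taken from the library's implementation.

```python
import numpy as np

# Hypothetical boundary times: 0.1 + 0.2 is stored as 0.30000000000000004
hypothesis = np.array([0.1 + 0.2])
reference = np.array([0.3])

# Without rounding, the two boundaries are not exactly equal,
# so a strict comparison would not count this as a match.
print(hypothesis[0] == reference[0])  # False

# Round both arrays to 3 decimal places (milliseconds, if values are in seconds).
decimals = 3
hyp_rounded = np.round(hypothesis, decimals=decimals)
ref_rounded = np.round(reference, decimals=decimals)
print(hyp_rounded[0] == ref_rounded[0])  # True
```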
- Returns:
- hits_ref: numpy.ndarray
The indices of hits in reference.
- hits_hyp: numpy.ndarray
The indices of hits in hypothesis.
- diffs: numpy.ndarray
Absolute differences \(|hit^{ref}_i - hit^{hyp}_i|\), i.e., np.abs(reference[hits_ref] - hypothesis[hits_hyp]).
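The matching rule described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the library's code: the function name find_hits_sketch and the example values are hypothetical, the strict inequality is taken from the description above, rounding is omitted, and the sketch does not address the case where one hypothesis value is the closest match for two reference values.

```python
import numpy as np


def find_hits_sketch(hypothesis, reference, tolerance):
    """Hypothetical sketch: at most one hit per reference boundary."""
    hits_ref, hits_hyp = [], []
    for ref_ind, t_0 in enumerate(reference):
        abs_diffs = np.abs(hypothesis - t_0)
        # Indices of hypothesis values strictly inside the tolerance window
        within = np.flatnonzero(abs_diffs < tolerance)
        if within.size == 0:
            continue  # no hit for this reference boundary
        # If several values qualify, keep the one closest to the reference
        closest = within[np.argmin(abs_diffs[within])]
        hits_ref.append(ref_ind)
        hits_hyp.append(closest)
    hits_ref = np.array(hits_ref, dtype=int)
    hits_hyp = np.array(hits_hyp, dtype=int)
    diffs = np.abs(reference[hits_ref] - hypothesis[hits_hyp])
    return hits_ref, hits_hyp, diffs


# Made-up boundary times in seconds, strictly increasing
reference = np.array([1.0, 2.0, 3.0])
hypothesis = np.array([0.992, 1.004, 2.5, 3.008])

hits_ref, hits_hyp, diffs = find_hits_sketch(hypothesis, reference, tolerance=0.01)
print(hits_ref)  # [0 2]
print(hits_hyp)  # [1 3]
print(diffs)     # approximately [0.004 0.008]
```

In this example both 0.992 and 1.004 fall within the tolerance of the reference boundary at 1.0, but only 1.004 is kept because it is closer. The boundary at 2.0 has no hit, since 2.5 lies outside the tolerance window.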