vocalpy.metrics.segmentation.ir.precision
- vocalpy.metrics.segmentation.ir.precision(hypothesis: ndarray[tuple[Any, ...], dtype[_ScalarT]], reference: ndarray[tuple[Any, ...], dtype[_ScalarT]], tolerance: float | int | None = None, decimals: int | bool | None = None) -> tuple[float, int, IRMetricData]
Compute precision \(P\) for a segmentation.
Computes the metric from a hypothesized vector of boundaries, hypothesis, returned by a segmentation algorithm and a reference vector of boundaries, reference, e.g., boundaries cleaned by a human expert or boundaries from a benchmark dataset.
Precision is defined as the number of true positives (\(T_p\)) over the number of true positives plus the number of false positives (\(F_p\)).
\(P = \frac{T_p}{T_p+F_p}\).
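As a worked instance of this formula, the following minimal sketch computes precision from raw counts. The helper name precision_from_counts is hypothetical and not part of vocalpy, and returning 0.0 when there are no hypothesized boundaries is an assumption made here only to avoid division by zero. Since every hypothesized boundary is either a true positive or a false positive, the denominator equals the number of hypothesized boundaries.

```python
def precision_from_counts(n_tp: int, n_fp: int) -> float:
    """Return P = T_p / (T_p + F_p) for the given counts.

    Hypothetical helper for illustration, not part of vocalpy.
    """
    total = n_tp + n_fp
    if total == 0:
        # Assumption: define precision as 0.0 when nothing was hypothesized.
        return 0.0
    return n_tp / total


# Three of four hypothesized boundaries match a reference boundary.
print(precision_from_counts(n_tp=3, n_fp=1))  # 0.75
```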
The number of true positives, n_tp, is computed by calling vocalpy.metrics.segmentation.ir.find_hits(). This function then computes the precision as precision = n_tp / hypothesis.size.
Both hypothesis and reference must be 1-dimensional arrays of non-negative, strictly increasing values. If you have two arrays, onsets and offsets, you can concatenate them into a single valid array of boundary times using concat_starts_and_stops(), which you can then pass to this function.
- Parameters:
- hypothesis : numpy.ndarray
Boundaries, e.g., onsets or offsets of segments, as computed by some method.
- reference : numpy.ndarray
Ground truth boundaries that the hypothesized boundaries in hypothesis are compared to.
- tolerance : float or int
Tolerance, in seconds. An element of hypothesis is considered a true positive if it falls within a time interval around any reference boundary \(t_0\) in reference, plus or minus the tolerance \(\Delta t\), i.e., if a hypothesized boundary \(t_h\) is within the interval \(t_0 - \Delta t < t_h < t_0 + \Delta t\). Default is None, in which case it is set to 0 (either float or int, depending on the dtype of hypothesis and reference).
- decimals : int
The number of decimal places to round both hypothesis and reference to, using numpy.round(). This mitigates inflated error rates due to floating point error. Rounding is only applied if both hypothesis and reference are floating point values. To avoid rounding, e.g., to compute strict precision and recall, pass in the value False. Default is 3, which assumes that the values are in seconds and should be rounded to milliseconds.
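To illustrate why rounding matters when the tolerance is zero, here is a minimal sketch that uses only the Python standard library. It uses the built-in round() as a stand-in for numpy.round(), and the boundary values are made up for illustration.

```python
# A boundary time produced by arithmetic can differ from the "same"
# reference time by floating point error.
t_hyp = 0.1 + 0.2  # stored as 0.30000000000000004
t_ref = 0.3

# With a tolerance of 0, the raw values do not match.
print(t_hyp == t_ref)  # False

# Rounding both values to 3 decimal places (milliseconds, if the times
# are in seconds) removes the spurious difference.
decimals = 3
print(round(t_hyp, decimals) == round(t_ref, decimals))  # True
```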
- Returns:
- precision : float
Value for precision, computed as described above.
- n_tp : int
The number of true positives.
- metric_data : IRMetricData
Instance of IRMetricData with indices of hits in both hypothesis and reference, and the absolute difference between times in hypothesis and reference for the hits.
Notes
The addition of a tolerance parameter is based on [1]. This is also sometimes known as a “collar” [2] or “forgiveness collar” [3]. The value for the tolerance can be determined by visual inspection of the distribution; see for example [4].
References
[1] Kemp, T., Schmidt, M., Westphal, M., & Waibel, A. (2000, June). Strategies for automatic segmentation of audio data. In 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No. 00CH37100) (Vol. 3, pp. 1423-1426). IEEE.
[2] Jordán, P. G., & Giménez, A. O. (2023). Advances in Binary and Multiclass Sound Segmentation with Deep Learning Techniques.
[3] NIST. (2009). The 2009 (RT-09) Rich Transcription Meeting Recognition Evaluation Plan. https://web.archive.org/web/20100606041157if_/http://www.itl.nist.gov/iad/mig/tests/rt/2009/docs/rt09-meeting-eval-plan-v2.pdf
[4] Du, P., & Troyer, T. W. (2006). A segmentation algorithm for zebra finch song at the note level. Neurocomputing, 69(10-12), 1375-1379.
Examples
>>> import numpy as np
>>> import vocalpy
>>> hypothesis = np.array([1, 6, 10, 16])
>>> reference = np.array([0, 5, 10, 15])
>>> prec, n_tp, ir_metric_data = vocalpy.metrics.segmentation.ir.precision(hypothesis, reference, tolerance=0)
>>> print(prec)
0.25
>>> print(ir_metric_data.hits_hyp)
np.array([2])

>>> hypothesis = np.array([0, 1, 5, 10])
>>> reference = np.array([0, 5, 10])
>>> prec, n_tp, ir_metric_data = vocalpy.metrics.segmentation.ir.precision(hypothesis, reference, tolerance=1)
>>> print(prec)
0.75
>>> print(ir_metric_data.hits_hyp)
np.array([0, 2, 3])
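For intuition about how a tolerance turns boundaries into hits, the sketch below counts hits with a simple greedy one-to-one matching over sorted boundaries and then divides by the number of hypothesized boundaries. This is an illustration under stated assumptions, not the implementation of vocalpy.metrics.segmentation.ir.find_hits(): the function name sketch_precision, the greedy matching strategy, and the closed interval (|t_h - t_0| <= tolerance) are all choices made here. On the two inputs from the examples above, it yields the same precision values and hit indices.

```python
def sketch_precision(hypothesis, reference, tolerance=0):
    """Greedy one-to-one matching of two sorted boundary sequences.

    Hypothetical illustration; vocalpy's find_hits() may match differently.
    Returns (precision, n_tp, hits_hyp), where hits_hyp holds the indices
    of hypothesized boundaries counted as true positives.
    """
    hits_hyp = []
    i = j = 0
    while i < len(hypothesis) and j < len(reference):
        diff = hypothesis[i] - reference[j]
        if abs(diff) <= tolerance:
            # Within the tolerance window: count a hit and consume both
            # boundaries, so each reference boundary is matched at most once.
            hits_hyp.append(i)
            i += 1
            j += 1
        elif diff < 0:
            # Hypothesized boundary is too early for this reference boundary.
            i += 1
        else:
            # Reference boundary is too early for this hypothesized boundary.
            j += 1
    n_tp = len(hits_hyp)
    # Assumption: precision is 0.0 when there are no hypothesized boundaries.
    precision = n_tp / len(hypothesis) if len(hypothesis) else 0.0
    return precision, n_tp, hits_hyp


print(sketch_precision([1, 6, 10, 16], [0, 5, 10, 15], tolerance=0))
# (0.25, 1, [2])
print(sketch_precision([0, 1, 5, 10], [0, 5, 10], tolerance=1))
# (0.75, 3, [0, 2, 3])
```

Greedy matching can pair boundaries differently from a closest-match strategy when several hypothesized boundaries fall inside the same tolerance window, so treat the hit indices from this sketch as illustrative only.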