vocalpy.Segments
- class vocalpy.Segments(start_inds: ndarray[tuple[Any, ...], dtype[_ScalarT]], lengths: ndarray[tuple[Any, ...], dtype[_ScalarT]], samplerate: int, labels: list[str] | None = None)
Bases: object
Class that represents a set of line segments returned by a segmenting algorithm.
This class represents the result of algorithms that segment a signal into a series of consecutive, non-overlapping 2-D line segments \(S\). Each segment \(s_i\) in a Segments instance has an integer start index and length. The start index is computed by the segmenting algorithm. For algorithms that find segments by thresholding energy, the length will be equal to the stop index computed by the algorithm minus the start index, plus one (to account for how Python indexes). The stop index is the last index above threshold for a segment. For a list of such algorithms, call vocalpy.segment.line.list(). For algorithms that segment spectrograms into boxes, see Boxes.
- Attributes:
- start_inds: numpy.ndarray
- lengths: numpy.ndarray
- labels: list, optional
A list of strings, where each string is the label for the corresponding segment.
- sound: vocalpy.Sound
The sound that was segmented to produce this set of line segments.
Methods
from_csv(csv_path, samplerate[, ...]): Create a Segments instance from a csv file.
from_json(path): Load Segments from a json file.
to_json(path): Save Segments to a json file.
See also
Boxes
Examples
Segments are returned by the segmenting algorithms that return a set of line segments (as opposed to segmenting algorithms that return a set of boxes).
>>> bfsongrepo = voc.example('bfsongrepo')
>>> sound = bfsongrepo.sounds[0]
>>> segments = voc.segment.meansquared(sound, threshold=1500, min_dur=0.2, min_silent_dur=0.02)
>>> segments
Segments(start_inds=array([ 22293...4425, 220495]), lengths=array([ 8012,... 6935, 7896]), samplerate=32000, labels=['', '', '', '', '', '', ...])
Because audio data is a digital signal with discrete samples, segments are defined in terms of start indices and lengths. Thus, the start index of each segment is the index of the sample where it starts–also known as a “boundary”–and the length is given in number of samples.
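As a toy illustration of the start-index-and-length convention, the sketch below finds runs of samples above a threshold in a plain Python list. The function name find_segments and the input values are hypothetical, and this is not the implementation used by any vocalpy segmenting algorithm; it only shows how a length follows from "stop index minus start index, plus one" when the stop index is the last sample above threshold.

```python
def find_segments(energy, threshold):
    """Return (start_inds, lengths) for runs of values above threshold.

    Hypothetical helper for illustration only, not part of vocalpy.
    """
    start_inds, lengths = [], []
    start = None
    for ind, value in enumerate(energy):
        above = value > threshold
        if above and start is None:
            start = ind  # first sample above threshold
        elif not above and start is not None:
            stop = ind - 1  # last sample above threshold
            start_inds.append(start)
            lengths.append(stop - start + 1)
            start = None
    if start is not None:  # signal ended while still above threshold
        stop = len(energy) - 1
        start_inds.append(start)
        lengths.append(stop - start + 1)
    return start_inds, lengths


energy = [0, 0, 5, 6, 7, 0, 0, 9, 9, 0]
start_inds, lengths = find_segments(energy, threshold=1)
print(start_inds)  # [2, 7]
print(lengths)     # [3, 2]
```

The first run covers indices 2 through 4, so its length is 4 - 2 + 1 = 3 samples.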
However, we often want to think of segment times in terms of seconds. We can get the start times of segments in seconds with the start_times property, and we can get the duration of segments in seconds with the durations property.
>>> segments.start_times
array([0.69665625, 1.801375 , 2.26390625, 2.7535625 , 3.5885 , 6.38828125, 6.89046875])
>>> segments.durations
array([0.250375 , 0.33278125, 0.31 , 0.23625 , 0.308625 , 0.21671875, 0.24675 ])
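The arithmetic behind these values can be checked by hand. The sketch below uses only the first and last start indices and lengths that are visible in the truncated repr above, together with the samplerate of 32000, and divides by the sampling rate in plain Python. It illustrates the conversion and is not the library's code.

```python
samplerate = 32000  # samples per second, from the repr above

# First and last values visible in the truncated repr above
start_inds = [22293, 220495]
lengths = [8012, 7896]

# Dividing a sample count by the sampling rate gives seconds
start_times = [ind / samplerate for ind in start_inds]
durations = [length / samplerate for length in lengths]

print(start_times)  # [0.69665625, 6.89046875]
print(durations)    # [0.250375, 0.24675]
```

These match the first and last entries of the start_times and durations arrays shown above.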
This conversion is possible because each set of Segments has a samplerate attribute that can be used to convert from sample numbers to seconds. This attribute is taken from the vocalpy.Sound that was segmented to produce the Segments in the first place.
Depending on the segmenting algorithm, the start of one segment may not be the same as the end of the segment that precedes it. In this case we may want to find where the segments stop. We can do so with the stop_inds and stop_times properties.
To actually get a Sound for every segment in a set of Segments, we can pass the Segments into the vocalpy.Sound.segment() method.
>>> segment_sounds = sound.segment(segments)
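Conceptually, cutting a signal into one clip per segment amounts to slicing the samples with each start index and length. The sketch below does this with a plain list standing in for the audio samples. The values are made up, and it is only an assumption-laden illustration of the indexing, not how vocalpy.Sound.segment() is implemented.

```python
# A plain list stands in for the audio samples (made-up values)
samples = list(range(20))
start_inds = [2, 10]
lengths = [3, 4]

# Stop index computed as start index plus length
stop_inds = [start + length for start, length in zip(start_inds, lengths)]

# Python slices exclude the stop, so each clip has exactly `length` samples
clips = [samples[start:stop] for start, stop in zip(start_inds, stop_inds)]

print(stop_inds)  # [5, 14]
print(clips)      # [[2, 3, 4], [10, 11, 12, 13]]
```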
This might seem verbose, but it has a couple of advantages. The first is that the Segments can be saved in a json file, so they can be loaded again and used to segment a sound without needing to re-run the segmentation. You can use a naming convention so that each sound file has a segments file paired with it: e.g., if the sound file is named "mouse1-day1-bout1.wav", then the json file could be named "mouse1-day1-bout1.segments.json".
>>> segments.to_json(path='mouse1-day1-bout1.segments.json')
A set of Segments is then loaded with the from_json() method.
>>> segments = voc.Segments.from_json(path='mouse1-day1-bout1.segments.json')
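The naming convention can be automated with pathlib. In the sketch below, segments_path_for is a hypothetical helper, and the dictionary written to disk is only an assumed stand-in for a saved segmentation; the actual schema written by to_json() is not documented here and may differ.

```python
import json
import tempfile
from pathlib import Path


def segments_path_for(sound_path):
    """Hypothetical helper: swap the sound file extension for '.segments.json'."""
    return Path(sound_path).with_suffix(".segments.json")


# Assumed stand-in for a saved segmentation, NOT the real to_json() schema
record = {
    "start_inds": [22293, 220495],
    "lengths": [8012, 7896],
    "samplerate": 32000,
    "labels": ["", ""],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = segments_path_for(Path(tmp_dir) / "mouse1-day1-bout1.wav")
    json_path.write_text(json.dumps(record))    # save
    loaded = json.loads(json_path.read_text())  # load again

json_name = json_path.name
print(json_name)  # mouse1-day1-bout1.segments.json
```

Because the samplerate travels with the indices in the same file, the loaded record can be converted to seconds without consulting the original sound.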
The second advantage of representing Segments separately is that they can then be used to compute metrics for segmentation. Note that here we are using the all_times property, which gives us all the boundary times in seconds.
>>> sounds = voc.example('bfsongrepo', return_type='sound')
>>> segments = voc.segment.meansquared(sounds[0], threshold=1500, min_dur=0.2, min_silent_dur=0.02)
>>> annots = voc.example('bfsongrepo', return_type='annotation')
>>> ref = np.sort(np.concatenate((annots[0].seq.onsets, annots[0].seq.offsets)))
>>> hyp = segments.all_times
>>> prec, _ = voc.metrics.segmentation.ir.precision(reference=ref, hypothesis=hyp)
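To give a rough sense of what a boundary-based precision measures, the sketch below counts the fraction of hypothesized boundaries that fall within a tolerance of any reference boundary. The function boundary_precision, the tolerance value, and the boundary times are all hypothetical. This simplified version is not the implementation of voc.metrics.segmentation.ir.precision, which may match boundaries differently (for example, one-to-one).

```python
def boundary_precision(reference, hypothesis, tolerance):
    """Fraction of hypothesized boundaries near any reference boundary.

    Hypothetical, simplified helper for illustration only.
    """
    if not hypothesis:
        return 0.0
    hits = sum(
        1
        for hyp in hypothesis
        if any(abs(hyp - ref) <= tolerance for ref in reference)
    )
    return hits / len(hypothesis)


# Made-up boundary times in seconds
reference = [0.70, 0.95, 1.80, 2.13]
hypothesis = [0.695, 0.948, 1.5, 2.13]

precision = boundary_precision(reference, hypothesis, tolerance=0.01)
print(precision)  # 0.75
```

Three of the four hypothesized boundaries lie within 10 ms of a reference boundary, so the precision is 0.75.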
- __init__(start_inds: ndarray[tuple[Any, ...], dtype[_ScalarT]], lengths: ndarray[tuple[Any, ...], dtype[_ScalarT]], samplerate: int, labels: list[str] | None = None) -> None
Methods
__init__(start_inds, lengths, samplerate[, ...])
from_csv(csv_path, samplerate[, ...]): Create a Segments instance from a csv file.
from_json(path): Load Segments from a json file.
to_json(path): Save Segments to a json file.
Attributes
VALID_COLUMNS_MAP_VALUES
all_inds: Start and stop indices of segments.
all_times
durations: Durations of segments.
start_times: Start times of segments.
stop_inds: Indices of where segments stop.
stop_times: Stop times of segments.
- property all_inds
Start and stop indices of segments.
Returns the following:
- property durations
Durations of segments.
Returns self.lengths / self.samplerate.
- classmethod from_csv(csv_path: str | Path, samplerate: int, columns_map: dict | None = None, default_label: str | None = None, read_csv_kwargs: dict | None = None)
Create a Segments instance from a csv file.
The csv file can either have the column names {"start_ind", "length", "label"}, which will be used directly as the Segments attributes start_inds, lengths, and labels, respectively, or it can have the column names {"start_s", "stop_s", "label"}, where "start_s" and "stop_s" refer to times in seconds. The label column is not required, and if it is not found, the labels will default to empty strings. You can change this behavior by specifying a default_label that will be used for all the segments if no label column is found, instead of an empty string. If neither of these sets of columns ({"start_ind", "length"} or {"start_s", "stop_s"}) is found in the csv, then an error will be raised. You can have the vocalpy.Segments.from_csv() method rename columns for you after it loads the csv file into a pandas.DataFrame using the columns_map argument; see example below. All other columns are ignored; you do not need to remove them to load the file.
- Parameters:
- csv_path: string or pathlib.Path
Path to csv file.
- samplerate: int
The sampling rate of the audio signal that was segmented to produce these segments.
- columns_map: dict, optional
Mapping that will be used to rename columns in the csv.
- default_label: str, optional
String, a default that is assigned as the label to all segments.
- read_csv_kwargs: dict, optional
Keyword arguments to pass to the pandas.read_csv() function.
- Returns:
- segments: vocalpy.Segments
Notes
This method is provided as a convenience for the case where you have a segmentation saved in a csv file, e.g., from a pandas.DataFrame, that was created by another library or script. If you are working mainly with vocalpy, you should prefer to load a set of segments with from_json(), and to save the set of segments with to_json(), since this avoids needing to keep track of the samplerate value separately.
Examples
The main use of this method is to load a set of line segments from a csv file created by another library or a script.
If the column names in the csv do not match the column names that vocalpy.Segments expects, you can have the vocalpy.Segments.from_csv method rename the columns for you after loading the csv, using the columns_map argument.
Here is an example of renaming columns to the expected names “start_s” and “stop_s”. After renaming, the values in these columns are then converted to the starting indices and lengths of segments using the samplerate.
>>> jourjine = voc.example("jourjine-et-al-2023", return_path=True)
>>> sound = voc.Sound.read(jourjine.sound)
>>> csv_path = jourjine.segments
>>> columns_map = {"start_seconds": "start_s", "stop_seconds": "stop_s"}
>>> segments = voc.Segments.from_csv(csv_path, samplerate=sound.samplerate, columns_map=columns_map)
>>> print(segments)
Segments(start_inds=array([ 131... 149767168]), lengths=array([40447,...29696, 25087]), samplerate=250000, labels=['', '', '', '', '', '', ...])
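The renaming and conversion steps described above can be sketched with the standard library alone. The csv content, the sampling rate of 1000, the rounding rule, and the choice to compute each length as stop index minus start index are all assumptions made for illustration; the exact conversion that from_csv() performs is not specified here and may differ.

```python
import csv
import io

# Made-up csv with non-standard column names and one extra column
csv_text = "start_seconds,stop_seconds,extra\n0.5,0.75,a\n1.0,1.5,b\n"
columns_map = {"start_seconds": "start_s", "stop_seconds": "stop_s"}
samplerate = 1000  # assumed sampling rate, in Hz

# Rename columns according to columns_map; unmapped columns keep their names
rows = [
    {columns_map.get(name, name): value for name, value in row.items()}
    for row in csv.DictReader(io.StringIO(csv_text))
]

# Fail early if the expected columns are missing after renaming
if not {"start_s", "stop_s"} <= set(rows[0]):
    raise ValueError("csv must have 'start_s' and 'stop_s' columns")

# Assumption: seconds are converted to sample indices by rounding
start_inds = [round(float(row["start_s"]) * samplerate) for row in rows]
stop_inds = [round(float(row["stop_s"]) * samplerate) for row in rows]
lengths = [stop - start for start, stop in zip(start_inds, stop_inds)]

# No 'label' column here, so labels fall back to empty strings
labels = [row.get("label", "") for row in rows]

print(start_inds)  # [500, 1000]
print(lengths)     # [250, 500]
print(labels)      # ['', '']
```

The extra column is simply never read, which mirrors the statement that other columns are ignored.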
- classmethod from_json(path: str | Path) -> Segments
Load Segments from a json file.
- Parameters:
- path: str, pathlib.Path
The path to the json file to load the Segments from.
- Returns:
- segments: Segments
- property start_times
Start times of segments.
Returns self.start_inds / self.samplerate.
- property stop_inds
Indices of where segments stop.
Returns self.start_inds + self.lengths.
- property stop_times
Stop times of segments.
Returns self.start_times + self.durations.
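The two stop properties are consistent with each other: adding a duration to a start time gives the same result as converting the stop index to seconds. The sketch below checks this for a single segment, using the first start index and length from the class-level example (22293 and 8012 at 32000 Hz). It is plain arithmetic and not the library's code.

```python
import math

samplerate = 32000
start_ind, length = 22293, 8012

# Stop index, following "start_inds + lengths"
stop_ind = start_ind + length

# Stop time, following "start_times + durations"
start_time = start_ind / samplerate
duration = length / samplerate
stop_time = start_time + duration

# Both routes to the stop time agree, up to floating-point error
same = math.isclose(stop_time, stop_ind / samplerate)

print(stop_ind)  # 30305
print(same)      # True
```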