vocalpy.Segments#

class vocalpy.Segments(start_inds: ndarray[Any, dtype[_ScalarType_co]], lengths: ndarray[Any, dtype[_ScalarType_co]], samplerate: int, labels: list[str] | None = None)[source]#

Bases: object

Class that represents a set of line segments returned by a segmenting algorithm.

This class represents the result of algorithms that segment a signal into a series of consecutive, non-overlapping line segments \(S\). Each segment \(s_i\) in a Segments instance has an integer start index and length. The start index is computed by the segmenting algorithm. For algorithms that find segments by thresholding energy, the stop index is the last index above threshold for a segment, and the length is equal to the stop index minus the start index, plus one (since both the start and stop indices belong to the segment). For a list of such algorithms, call vocalpy.segment.line.list(). For algorithms that segment spectrograms into boxes, see Boxes.
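
A Segments can also be created directly from numpy arrays, for example if you computed a segmentation yourself. A minimal sketch, using made-up start indices and lengths (in samples) for a sound sampled at 32000 Hz:

>>> import numpy as np
>>> import vocalpy as voc
>>> start_inds = np.array([22293, 57571, 72447])  # sample index where each segment starts
>>> lengths = np.array([8012, 10649, 9920])  # length of each segment, in samples
>>> segments = voc.Segments(start_inds=start_inds, lengths=lengths, samplerate=32000)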

Attributes:
start_inds : numpy.ndarray

Indices of the samples where segments start, one per segment.

lengths : numpy.ndarray

Lengths of segments, in number of samples.

samplerate : int

The sampling rate of the sound that was segmented.

labels : list, optional

A list of strings, where each string is the label for each segment.

sound : vocalpy.Sound

The sound that was segmented to produce this set of line segments.

Methods

from_csv(csv_path, samplerate[, ...])

Create a Segments instance from a csv file.

from_json(path)

Load Segments from a json file.

to_json(path)

Save Segments to a json file.

See also

Boxes

Examples

Segments are returned by the segmenting algorithms that return a set of line segments (as opposed to segmenting algorithms that return a set of boxes).

>>> bfsongrep = voc.example('bfsongrepo')
>>> sound = bfsongrepo.sounds[0]
>>> segments = voc.segment.meansquared(sound, threshold=1500, min_dur=0.2, min_silent_dur=0.02)
>>> segments
Segments(start_inds=array([ 22293...4425, 220495]), lengths=array([ 8012,... 6935,  7896]), samplerate=32000, labels=['', '', '', '', '', '', ...])  # noqa

Because audio data is a digital signal with discrete samples, segments are defined in terms of start indices and lengths. Thus, the start index of each segment is the index of the sample where it starts (also known as a "boundary"), and the length is given in number of samples.

However, we often want to think of segment times in terms of seconds. We can get the start times of segments in seconds with the start_times property, and we can get the durations of segments in seconds with the durations property.

>>> segments.start_times
array([0.69665625, 1.801375  , 2.26390625, 2.7535625 , 3.5885    ,
       6.38828125, 6.89046875])
>>> segments.durations
array([0.250375  , 0.33278125, 0.31      , 0.23625   , 0.308625  ,
       0.21671875, 0.24675   ])

This is possible because each set of Segments has a samplerate attribute that can be used to convert from sample numbers to seconds. This attribute is taken from the vocalpy.Sound that was segmented to produce the Segments in the first place.
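
Since these properties are derived from the start indices, lengths, and samplerate, we can check the conversion directly; a quick sketch, reusing the segments from above:

>>> import numpy as np
>>> np.allclose(segments.start_times, segments.start_inds / segments.samplerate)
True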

Depending on the segmenting algorithm, the start of one segment may not be the same as the end of the segment that precedes it. In this case we may want to find where the segments stop. We can do so with the stop_inds and stop_times properties.
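
For example, the silent gap between consecutive segments is the start time of each segment minus the stop time of the segment before it; a short sketch, reusing the segments from above:

>>> gaps = segments.start_times[1:] - segments.stop_times[:-1]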

To actually get a Sound for every segment in a set of Segments, we can pass the Segments into the vocalpy.Sound.segment() method.

>>> segment_sounds = sound.segment(segments)
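
This gives us one Sound per segment. As a rough sketch (assuming segment() returns a list of vocalpy.Sound instances with data and samplerate attributes), we can count the segments and recover the duration of a segment from its audio data:

>>> n_segments = len(segments.start_inds)
>>> first = segment_sounds[0]
>>> first_dur = first.data.shape[-1] / first.samplerate  # duration in seconds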

This might seem verbose, but it has a couple of advantages. The first is that the Segments can be saved in a json file, so they can be loaded again and used to segment a sound without needing to re-run the segmentation. You can use a naming convention so that each sound file has a segments file paired with it: e.g., if the sound file is named "mouse1-day1-bout1.wav", then the json file could be named "mouse1-day1-bout1.segments.json".

>>> segments.to_json(path='mouse1-day1-bout1.segments.json')

A set of Segments is then loaded with the from_json() method.

>>> segments = voc.Segments.from_json(path='mouse1-day1-bout1.segments.json')
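
One way to follow the naming convention described above is to build the path to the json file from the path to the sound file; a sketch, with a hypothetical file name:

>>> from pathlib import Path
>>> sound_path = Path('mouse1-day1-bout1.wav')  # hypothetical path to a sound file
>>> segments_path = sound_path.parent / (sound_path.stem + '.segments.json')
>>> segments.to_json(path=segments_path)
>>> segments = voc.Segments.from_json(path=segments_path)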

The second advantage of representing Segments separately is that they can then be used to compute metrics for segmentation. Note that here we are using the all_times property, which gives us all the boundary times in seconds.

>>> import numpy as np
>>> sounds = voc.example('bfsongrepo', return_type='sound')
>>> segments = voc.segment.meansquared(sounds[0], threshold=1500, min_dur=0.2, min_silent_dur=0.02)
>>> annots = voc.example('bfsongrepo', return_type='annotation')
>>> ref = np.sort(np.concatenate((annots[0].seq.onsets, annots[0].seq.offsets)))
>>> hyp = segments.all_times
>>> prec, _ = voc.metrics.segmentation.ir.precision(reference=ref, hypothesis=hyp)

__init__(start_inds: ndarray[Any, dtype[_ScalarType_co]], lengths: ndarray[Any, dtype[_ScalarType_co]], samplerate: int, labels: list[str] | None = None) None[source]#

Methods

__init__(start_inds, lengths, samplerate[, ...])

from_csv(csv_path, samplerate[, ...])

Create a Segments instance from a csv file.

from_json(path)

Load Segments from a json file.

to_json(path)

Save Segments to a json file.

Attributes

VALID_COLUMNS_MAP_VALUES

all_inds

Start and stop indices of segments.

all_times

All boundary times of segments (start and stop times), in seconds.

durations

Durations of segments.

start_times

Start times of segments.

stop_inds

Indices of where segments stop.

stop_times

Stop times of segments.

property all_inds#

Start and stop indices of segments.

property durations#

Durations of segments.

Returns self.lengths / self.sound.samplerate.

classmethod from_csv(csv_path: str | Path, samplerate: int, columns_map: dict | None = None, default_label: str | None = None, read_csv_kwargs: dict | None = None)[source]#

Create a Segments instance from a csv file.

The csv file can either have the column names {"start_ind", "length", "label"}, which will be used directly as the Segments attributes start_inds, lengths, and labels, respectively, or it can have the column names {"start_s", "stop_s", "label"}, where "start_s" and "stop_s" refer to times in seconds. If neither set of columns ({"start_ind", "length"} or {"start_s", "stop_s"}) is found in the csv, an error will be raised.

The label column is not required; if it is not found, the labels will default to empty strings. You can change this behavior by specifying a default_label that will be used for all the segments if no labels column is found, instead of an empty string.

You can have the vocalpy.Segments.from_csv() method rename columns for you after it loads the csv file into a pandas.DataFrame, using the columns_map argument; see the example below. All other columns are ignored; you do not need to remove them to load the file.

Parameters:
csv_path : string or pathlib.Path

Path to csv file.

samplerate : int

The sampling rate of the audio signal that was segmented to produce these segments.

columns_map : dict, optional

Mapping that will be used to rename columns in the csv.

default_label : str, optional

String, a default that is assigned as the label to all segments.

read_csv_kwargs : dict, optional

Keyword arguments to pass to the pandas.read_csv() function.

Returns:
segments : vocalpy.Segments

Notes

This method is provided as a convenience for the case where you have a segmentation saved in a csv file, e.g., from a pandas.DataFrame, that was created by another library or script. If you are working mainly with vocalpy, you should prefer to load a set of segments with from_json(), and to save the set of segments with to_json(), since this avoids needing to keep track of the samplerate value separately.

Examples

The main use of this method is to load a set of line segments from a csv file created by another library or a script.

If the column names in the csv do not match the column names that vocalpy.Segments expects, you can have the vocalpy.Segments.from_csv method rename the columns for you after loading the csv, using the columns_map argument.

Here is an example of renaming columns to the expected names "start_s" and "stop_s". After renaming, the values in these columns are converted to the start indices and lengths of segments, using the samplerate.

>>> jourjine = voc.example("jourjine-et-al-2023", return_path=True)
>>> sound = voc.Sound.read(jourjine.sound)
>>> csv_path = jourjine.segments
>>> columns_map = {"start_seconds": "start_s", "stop_seconds": "stop_s"}
>>> segments = voc.Segments.from_csv(csv_path, samplerate=sound.samplerate, columns_map=columns_map)
>>> print(segments)
Segments(start_inds=array([   131...   149767168]), lengths=array([40447,...29696, 25087]),
samplerate=250000, labels=['', '', '', '', '', '', ...])
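
If you are creating the csv yourself, e.g. with pandas, another option is to write the expected column names directly, so that no columns_map is needed. A sketch with made-up values (the file name here is hypothetical):

>>> import pandas as pd
>>> df = pd.DataFrame({"start_ind": [22293, 57571], "length": [8012, 10649], "label": ["a", "b"]})
>>> df.to_csv("segments.csv", index=False)
>>> segments = voc.Segments.from_csv("segments.csv", samplerate=32000)
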
classmethod from_json(path: str | Path) Segments[source]#

Load Segments from a json file.

Parameters:
path : str or pathlib.Path

The path to the json file to load the Segments from.

Returns:
segments : Segments

property start_times#

Start times of segments.

Returns self.start_inds / self.sound.samplerate.

property stop_inds#

Indices of where segments stop.

Returns self.start_inds + self.lengths.

property stop_times#

Stop times of segments.

Returns self.start_times + self.durations.

to_json(path: str | Path) None[source]#

Save Segments to a json file.

Parameters:
path : str or pathlib.Path

The path where these Segments should be saved as a json file.