vocalpy.Sound
- class vocalpy.Sound(data: ndarray[tuple[Any, ...], dtype[_ScalarT]], samplerate: int)
Bases: object
Class that represents a sound.
- Attributes:
- data : numpy.ndarray
The audio signal as a numpy.ndarray, where the dimensions are (channels, samples).
- samplerate : int
The sampling rate the audio signal was acquired at, in Hertz.
- channels : int
The number of channels in the audio signal. Determined from the first dimension of data.
- samples : int
The number of samples in the audio signal. Determined from the last dimension of data.
- duration : float
Duration of the sound in seconds. Determined from the last dimension of data and the samplerate.
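The relationship between the derived attributes and the array can be sketched with plain NumPy. This is only an illustration of the description above, not the library's code; the array `data` and its shape are made-up stand-ins for `Sound.data`.

```python
import numpy as np

# Made-up stand-in for Sound.data: 2 channels, 48000 samples
samplerate = 32000
data = np.zeros((2, 48000))

channels = data.shape[0]         # first dimension of data
samples = data.shape[-1]         # last dimension of data
duration = samples / samplerate  # seconds

print(channels, samples, duration)
```

With these made-up values, 48000 samples at 32000 Hz correspond to 1.5 seconds of audio.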
Methods
- clip([start, stop]): Make a clip from this Sound that starts at time start in seconds and ends at time stop.
- read(path[, dtype]): Read audio from path.
- segment(segments): Segment a sound, using a set of line Segments.
- to_mono(): Convert a Sound to mono by averaging samples across channels.
- write(path, **kwargs): Write audio data to a file.
Examples
A Sound is read from a file.
>>> import vocalpy as voc
>>> sound_path = voc.example("bl26lb16.wav", return_path=True)
>>> sound = voc.Sound.read(sound_path)
>>> sound
vocalpy.Sound(data=array([[-0.00... 0.00912476]]), samplerate=32000)
The Sound class is designed as a domain-specific data container with attributes that help us avoid cluttering up code with variables that track the sampling rate, number of channels, and duration of the file.
>>> sound = voc.example("bl26lb16.wav")
>>> print(sound.samplerate)
32000
>>> print(sound.channels)
1
>>> print(sound.duration)
5.76446875
You can print() a Sound to see all the properties that are derived from the sampling rate and the shape of the underlying data array: the number of channels, the number of samples, and the duration in seconds.
>>> sound = voc.example("bl26lb16.wav")
>>> print(sound)
vocalpy.Sound(data=array([[-0.00... 0.00912476]]), samplerate=32000), channels=1, samples=184463, duration=5.764)
The vocalpy package tries to provide functions that take Sound instances as inputs, and return other domain-specific types as outputs, such as Segments, Spectrogram, and Features. If instead you need to work with the digital audio signal directly as a numpy array, you can access it through the data attribute.
>>> sound = voc.example("bl26lb16.wav")
>>> sound_arr = sound.data
A Sound can be written to a file as well, in any format supported by soundfile.
>>> sound = voc.example("bl26lb16.wav")
>>> sound.write("bl26lb16-copy.wav")
We can clip a sound to an arbitrary duration using the clip() method. This is useful if there are long, relatively silent periods before or after the animal sounds that we are interested in.
>>> sound = voc.example("bl26lb16.wav")
>>> sound_clip = sound.clip(0.1, 1.5)
>>> print(sound_clip.duration)
1.4
If we want to clip from a start time to the end of the sound, we can just specify a time for start.
>>> sound = voc.example("bl26lb16.wav")
>>> sound_clip = sound.clip(0.5)
>>> print(sound_clip.duration)
5.26446875
Likewise, if we want to clip from the start of the sound, we can just specify a time for stop. Notice that we need to use a keyword argument here, since start is the first argument to clip().
>>> sound = voc.example("bl26lb16.wav")
>>> sound_clip = sound.clip(stop=0.5)
>>> print(sound_clip.duration)
0.5
If we want to segment an audio file into periods of animal sounds and periods of background, we can do that with one of the algorithms in vocalpy.segment. This will give us a Segments instance that we can then pass into the segment() method to get back a list of Sound instances, one for each segment.
>>> sound = voc.example("bl26lb16.wav")
>>> segments = voc.segment.meansquared(sound, threshold=1000, min_dur=0.0002, min_silent_dur=0.004)
>>> syllables = sound.segment(segments)
>>> len(syllables)
26
You can also index a Sound as you would a numpy.ndarray, and this will give you back a new Sound. One place where this is useful is when you have multi-channel audio, and you only want one channel, or you want to iterate over the channels.
>>> sound = voc.example("fruitfly-song-multichannel.wav")
>>> a_channel = sound[0, :]
>>> print(a_channel)
vocalpy.Sound(data=array([[-0.00...-0.00723267]]), samplerate=10000), channels=1, samples=15000, duration=1.500)
>>> for channel in sound:
...     print(channel)
vocalpy.Sound(data=array([[-0.00...-0.00723267]]), samplerate=10000), channels=1, samples=15000, duration=1.500)
vocalpy.Sound(data=array([[ 0.01... 0.00268555]]), samplerate=10000), channels=1, samples=15000, duration=1.500)
vocalpy.Sound(data=array([[ 0.00...-0.00100708]]), samplerate=10000), channels=1, samples=15000, duration=1.500)
This works with other methods of indexing, as shown below.
>>> sound = voc.example("bl26lb16.wav")
>>> print(sound.data.shape)
>>> decimated = sound[:, ::10]  # keep every 10th sample -- not true downsampling, we don't change the sampling rate
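One way to picture the indexing behavior shown in the examples above is a container that hands the index to its NumPy array and wraps the result in a new container with the same sampling rate. The sketch below is a guess at the general pattern, not vocalpy's actual implementation; `MiniSound` is a hypothetical class invented for this illustration.

```python
import numpy as np


class MiniSound:
    """Hypothetical stand-in for vocalpy.Sound, for illustration only."""

    def __init__(self, data, samplerate):
        data = np.asarray(data)
        if data.ndim == 1:
            # keep a "channel" dimension, so data is always (channels, samples)
            data = data[np.newaxis, :]
        self.data = data
        self.samplerate = samplerate

    @property
    def duration(self):
        # seconds, from the last dimension of data and the sampling rate
        return self.data.shape[-1] / self.samplerate

    def __getitem__(self, key):
        # pass the key straight to numpy, then re-wrap with the same rate
        return MiniSound(self.data[key], self.samplerate)


sound = MiniSound(np.zeros((3, 15000)), samplerate=10000)
a_channel = sound[0, :]     # one channel, still 2-D after re-wrapping
decimated = sound[:, ::10]  # every 10th sample, sampling rate unchanged

print(a_channel.data.shape, decimated.data.shape, decimated.duration)
```

In this sketch the decimated object reports a duration ten times shorter than the original, because the sampling rate is carried over unchanged. That is why keeping every 10th sample is not the same as true downsampling.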
Note that we are just passing indexing directly to the underlying numpy.ndarray, not re-implementing the API.
Methods
- __init__(data, samplerate)
- clip([start, stop]): Make a clip from this Sound that starts at time start in seconds and ends at time stop.
- read(path[, dtype]): Read audio from path.
- segment(segments): Segment a sound, using a set of line Segments.
- to_mono(): Convert a Sound to mono by averaging samples across channels.
- write(path, **kwargs): Write audio data to a file.
Attributes
- channels
- duration
- samples
- clip(start: float = 0.0, stop: float | None = None) -> Sound
Make a clip from this Sound that starts at time start in seconds and ends at time stop.
- Parameters:
- start : float
Start time for clip, in seconds. Default is 0.
- stop : float, optional
Stop time for clip, in seconds. Default is None, in which case the value will be set to the duration of this Sound.
- Returns:
- clip : vocalpy.Sound
A new Sound with duration stop - start.
Notes
The clip() method is used to clip a Sound at arbitrary times. If you need to segment an audio file into periods of animal sounds and periods of background, use one of the functions in vocalpy.segment to get an instance of Segments that you can then use with the Sound.segment() method.
Examples
>>> sound = voc.example('bl26lb16.wav')
>>> clip = sound.clip(1.5, 2.5)
>>> clip.duration
1.0
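Clipping by time comes down to converting seconds into sample indices and slicing the last dimension of the array. The following is a minimal sketch under that assumption; `clip_array` is a hypothetical helper, and rounding to the nearest sample is a choice made for this illustration rather than documented behavior of clip().

```python
import numpy as np


def clip_array(data, samplerate, start=0.0, stop=None):
    """Hypothetical helper: slice a (channels, samples) array by time in seconds."""
    n_samples = data.shape[-1]
    if stop is None:
        # default to the full duration, as described for the stop parameter
        stop = n_samples / samplerate
    if not 0.0 <= start < stop:
        raise ValueError("need 0 <= start < stop")
    # rounding to the nearest sample is an assumption of this sketch
    start_ind = int(round(start * samplerate))
    stop_ind = min(int(round(stop * samplerate)), n_samples)
    return data[:, start_ind:stop_ind]


samplerate = 32000
data = np.zeros((1, 184463))  # same shape as the bl26lb16.wav example above
clipped = clip_array(data, samplerate, start=1.5, stop=2.5)

print(clipped.shape[-1] / samplerate)
```

Leaving out `stop` keeps everything from `start` to the end of the array, and leaving out `start` keeps everything from the beginning up to `stop`.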
- classmethod read(path: str | pathlib.Path, dtype: npt.DTypeLike = <class 'numpy.float64'>, **kwargs) -> Self
Read audio from path.
- Parameters:
- path : str, pathlib.Path
Path to file from which audio data should be read.
- **kwargs : dict, optional
Other arguments to soundfile.read(); refer to the soundfile documentation for details. Note that vocalpy.Sound.read() passes in the argument always_2d=True, because we require Sound.data to always have a "channel" dimension.
- Returns:
- sound : vocalpy.Sound
A vocalpy.Sound instance with data read from path.
- segment(segments: Segments) -> list[Sound]
Segment a sound, using a set of line Segments.
- Parameters:
- segments : vocalpy.Segments
A Segments instance, the output of a segmenting function in vocalpy.segment.
- Returns:
- list[Sound]
A list of Sound instances, one for each segment.
Notes
The Sound.segment() method is used with the output of functions from vocalpy.segment, an instance of Segments. If you need to clip a Sound at arbitrary times, use the clip() method.
Examples
>>> sound = voc.example("bells.wav")
>>> segments = voc.segment.meansquared(sound)
>>> syllables = sound.segment(segments)
>>> len(syllables)
10
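Conceptually, segmenting produces one slice of the audio array per segment. The sketch below assumes each segment can be described by a starting sample index and a length in samples; the helper `split_by_segments` and the `start_inds` and `lengths` values are hypothetical, chosen only to show the slicing pattern, and are not taken from the Segments API.

```python
import numpy as np


def split_by_segments(data, start_inds, lengths):
    """Hypothetical helper: one (channels, samples) slice per segment."""
    return [
        data[:, start:start + length]
        for start, length in zip(start_inds, lengths)
    ]


data = np.arange(20).reshape(1, 20)  # toy single-channel "audio"
start_inds = [2, 10]                 # made-up sample indices where segments begin
lengths = [3, 5]                     # made-up segment lengths, in samples
pieces = split_by_segments(data, start_inds, lengths)

print([piece.shape for piece in pieces])
```

Each slice keeps the channel dimension, so every piece could be wrapped again as its own sound with the original sampling rate.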
- to_mono()
Convert a Sound to mono by averaging samples across channels.
Notes
This method uses the librosa.to_mono() function.
Examples
>>> sound = voc.example("WhiLbl0010")
>>> print(sound.channels)
2
>>> sound_mono = sound.to_mono()
>>> print(sound_mono.channels)
1
Note that feature extraction functions operate on channels independently, so it may speed up your analysis to convert multi-channel audio to mono, if you do not need to consider channels independently.
>>> import timeit
>>> import numpy as np
>>> sound = voc.example("WhiLbl0010")
>>> sound_mono = sound.to_mono()
>>> np.mean(timeit.repeat("voc.feature.biosound(sound)", number=5, globals=globals()))
np.float64(19.713963174959645)
>>> np.mean(timeit.repeat("voc.feature.biosound(sound_mono)", number=5, globals=globals()))
np.float64(9.917085491772742)
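Averaging samples across channels can be illustrated directly with NumPy. This is a sketch of the idea only: the method itself is documented as using librosa.to_mono(), and keeping a leading channel dimension of size 1 is an assumption here, based on the statement that Sound.data always has a channel dimension.

```python
import numpy as np

# Toy stereo signal with shape (channels, samples) = (2, 3)
stereo = np.array([[1.0, 2.0, 3.0],
                   [3.0, 4.0, 5.0]])

# Average across the channel axis; keepdims retains a channel dimension of 1
mono = stereo.mean(axis=0, keepdims=True)

print(mono.shape)
print(mono)
```

Each mono sample is the mean of the corresponding samples in the two channels, so the result has half as many values for a downstream function to process.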
- write(path: str | Path, **kwargs) -> AudioFile
Write audio data to a file.
- Parameters:
- path : str, pathlib.Path
Path to file that audio data should be saved in.
- **kwargs : dict, optional
Extra arguments to soundfile.write(). Refer to the soundfile documentation for details.