vocalpy.spectral.soundsig_spectro

vocalpy.spectral.soundsig_spectro(sound: Sound, spec_sample_rate: int = 1000, freq_spacing: int = 50, min_freq: int = 0, max_freq: int = 10000, nstd: int = 6, scale: bool = True, scale_val: int | float = 32768, scale_dtype: npt.DTypeLike = <class 'numpy.int16'>) → Spectrogram

Compute a dB-scaled spectrogram using a Gaussian window.

Replicates the result of the method soundsig.BioSound.spectroCalc().

Parameters:
sound : vocalpy.Sound

Sound loaded from a file. Multi-channel sounds are supported.

spec_sample_rate : int

Sampling rate for the output spectrogram, in Hz. Sets the overlap for the windows in the STFT.

freq_spacing : int

The time-frequency scale for the spectrogram, in Hz. Determines the width of the Gaussian window.

min_freq : int

The minimum frequency to analyze, in Hz. The returned Spectrogram will only contain frequencies \(\geq\) min_freq.

max_freq : int

The maximum frequency to analyze, in Hz. The returned Spectrogram will only contain frequencies \(\leq\) max_freq.

nstd : int

Number of standard deviations of the Gaussian in one window.

scale : bool

If True, scale sound.data before computing the spectrogram. Default is True. Scaling is needed to replicate the behavior of soundsig, which assumes the audio data is loaded as 16-bit integers. Because vocalpy.Sound loads sounds with a NumPy dtype of float64 by default, this function by default multiplies sound.data by 2**15 and then casts the result to the int16 dtype, which replicates what soundsig would compute from the same audio loaded as 16-bit integers. If you have loaded a sound with a dtype of int16, set this to False.

scale_val : int or float

Value to multiply sound.data by when scaling. Default is 2**15 (32768). Only used if scale is True. This is needed to replicate the behavior of soundsig, which assumes the audio data is loaded as 16-bit integers.

scale_dtype : numpy.dtype

NumPy dtype to cast sound.data to after scaling. Default is np.int16. Only used if scale is True. This is needed to replicate the behavior of soundsig, which assumes the audio data is loaded as 16-bit integers.
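
The scaling controlled by scale, scale_val, and scale_dtype can be illustrated with plain NumPy. This is a minimal sketch of the arithmetic described above, not vocalpy's own code; the data array is a hypothetical stand-in for sound.data.

```python
import numpy as np

# Hypothetical float64 samples standing in for sound.data.
# A sample of exactly +1.0 is avoided here: 1.0 * 2**15 = 32768,
# which is outside the int16 range (maximum 32767).
data = np.array([0.0, 0.5, -0.5, -1.0], dtype=np.float64)

scale_val = 2**15        # the documented default, 32768
scale_dtype = np.int16   # the documented default

# Multiply by scale_val, then cast to scale_dtype.
scaled = (data * scale_val).astype(scale_dtype)
```

With the defaults, a float sample of 0.5 becomes the 16-bit integer 16384, which is the value soundsig would expect for the same audio loaded as int16.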

Returns:
spectSpectrogram

A dB-scaled Spectrogram from an STFT computed with a Gaussian window.
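
The effect of min_freq and max_freq on the returned frequencies can be sketched with NumPy boolean masking. This only illustrates the band selection described in the parameter list; the freqs and spect arrays are hypothetical placeholders, and the sketch is not taken from the vocalpy or soundsig implementation.

```python
import numpy as np

# Hypothetical frequency axis in Hz and a placeholder spectrogram
# matrix with shape (frequency bins, time bins).
freqs = np.arange(0, 20001, 50)
spect = np.zeros((freqs.size, 10))

min_freq, max_freq = 0, 10000  # the documented defaults

# Keep only the bins with min_freq <= frequency <= max_freq.
keep = (freqs >= min_freq) & (freqs <= max_freq)
freqs_kept = freqs[keep]
spect_kept = spect[keep, :]
```

Both bounds are inclusive, matching the \(\geq\) and \(\leq\) conditions stated for min_freq and max_freq.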