pytsmod.utils package

Submodules

pytsmod.utils.stft module

pytsmod.utils.stft.istft(spec, syn_hop=2048, win_type='hann', win_size=4096, zero_pad=0, num_iter=1, original_length=-1, fft_shift=False, restore_energy=False)

Inverse Short-Time Fourier Transform (ISTFT) to recover the audio signal from the spectrogram. This function is used by the phase vocoder.

Parameters:
spec : numpy.ndarray [shape=(num_bins, num_frames)]

the input audio complex spectrogram.

syn_hop : int > 0 [scalar]

the hop size of the synthesis window.

win_type : str

type of the window function for the ISTFT. hann and sin are available.

win_size : int > 0 [scalar]

size of the window function.

zero_pad : int >= 0 [scalar]

the size of the zero pad in the window function.

num_iter : int > 0 [scalar]

the number of iterations the algorithm should perform to adapt the phase.

original_length : int > 0 [scalar]

original length of the audio signal.

fft_shift : bool

apply circular shift to ISTFT.

restore_energy : bool

tries to compensate for potential energy loss.

Returns:
y : numpy.ndarray [shape=(original_length)]

the output audio sequence.

pytsmod.utils.stft.lsee_mstft(X, syn_hop, win_type, win_size, zero_pad, fft_shift, restore_energy)

Least Squares Error Estimation from the MSTFT (Modified STFT). Uses the Griffin-Lim procedure to estimate the audio signal from the modified STFT.

Parameters:
X : numpy.ndarray [shape=(num_bins, num_frames)]

the input audio complex spectrogram.

syn_hop : int > 0 [scalar]

the hop size of the synthesis window.

win_type : str

type of the window function for the ISTFT. hann and sin are available.

win_size : int > 0 [scalar]

size of the window function.

zero_pad : int >= 0 [scalar]

the size of the zero pad in the window function.

fft_shift : bool

apply circular shift to ISTFT.

restore_energy : bool

tries to compensate for potential energy loss.

Returns:
x : numpy.ndarray [shape=(num_samples)]

the output audio sequence estimated by LSEE-MSTFT.
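The least-squares estimate of Griffin and Lim can be sketched in a few lines of NumPy: each frame is inverse-transformed, multiplied by the synthesis window, overlap-added at multiples of syn_hop, and the sum is divided by the overlap-added squared window. The function name lsee_mstft_sketch is hypothetical, the window is assumed to be a periodic Hann window, and zero_pad, fft_shift and restore_energy are omitted, so this is an illustration of the formula rather than the library's code.

```python
import numpy as np

def lsee_mstft_sketch(X, syn_hop, win_size):
    # Periodic Hann synthesis window (an assumption for this sketch).
    n = np.arange(win_size)
    window = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / win_size))

    num_frames = X.shape[1]
    out_len = (num_frames - 1) * syn_hop + win_size

    # Time-domain frames, one per column.
    frames = np.fft.irfft(X, n=win_size, axis=0)

    numerator = np.zeros(out_len)    # sum_m w(n - m*hop) * y_m(n - m*hop)
    denominator = np.zeros(out_len)  # sum_m w(n - m*hop) ** 2
    for m in range(num_frames):
        start = m * syn_hop
        numerator[start:start + win_size] += frames[:, m] * window
        denominator[start:start + win_size] += window ** 2

    # Near the edges the window sum is close to zero; skip the division there.
    denominator[denominator < 1e-3] = 1.0
    return numerator / denominator
```

When X is the unmodified STFT of a signal computed with the same window and hop, the interior of the output matches that signal; the first and last few samples do not, because the window sum vanishes there.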

pytsmod.utils.stft.stft(x, ana_hop=2048, win_type='hann', win_size=4096, zero_pad=0, sr=44100, fft_shift=0, time_frequency_out=False)

Short-Time Fourier Transform (STFT) of the audio signal. This function is used by the phase vocoder.

Parameters:
x : numpy.ndarray [shape=(num_samples)]

the input audio sequence. Should be a single channel.

ana_hop : int > 0 [scalar] or numpy.ndarray [shape=(num_frames)]

either an analysis hop size (scalar) or the analysis window positions (array).

win_type : str

type of the window function for the STFT. hann and sin are available.

win_size : int > 0 [scalar]

size of the window function.

zero_pad : int >= 0 [scalar]

the size of the zero pad in the window function.

sr : int > 0 [scalar]

the sample rate of the audio sequence. Only used for time_frequency_out.

fft_shift : bool

apply circular shift to STFT.

time_frequency_out : bool

if True, returns the time and frequency axes along with the spectrogram, as (spec, t, f).

Returns:
spec : numpy.ndarray [shape=(win_size // 2 + 1, num_frames)]

the STFT result of the input audio sequence.

t : numpy.ndarray [shape=num_frames]

timestamp of each frame of the output result.

f : numpy.ndarray [shape=win_size // 2 + 1]

frequency value for each frequency bin of the output result.
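The shapes of spec, t and f follow from the framing, which a minimal NumPy sketch makes concrete. The function stft_sketch is hypothetical and covers only a scalar ana_hop with a periodic Hann window; centring the first frame on sample 0 by padding half a window on each side is an assumption, as is reporting t in seconds and f in Hz.

```python
import numpy as np

def stft_sketch(x, ana_hop=2048, win_size=4096, sr=44100):
    # Periodic Hann analysis window (an assumption for this sketch).
    n = np.arange(win_size)
    window = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / win_size))

    # Pad half a window on both sides so the first frame is centred on sample 0.
    half = win_size // 2
    padded = np.pad(x, (half, half))

    num_frames = 1 + (len(padded) - win_size) // ana_hop
    starts = np.arange(num_frames) * ana_hop

    # One windowed frame per column, then a real FFT along the frame axis.
    frames = np.stack([padded[s:s + win_size] * window for s in starts], axis=1)
    spec = np.fft.rfft(frames, axis=0)  # shape (win_size // 2 + 1, num_frames)

    t = starts / sr                                   # frame times in seconds
    f = np.arange(win_size // 2 + 1) * sr / win_size  # bin frequencies in Hz
    return spec, t, f
```

For a one-second 1 kHz sine sampled at 8 kHz with win_size=512, the bin spacing is sr / win_size = 15.625 Hz, so the magnitude peak falls in bin 64.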

pytsmod.utils.validate module

pytsmod.utils.win module

pytsmod.utils.win.win(win_type='hann', win_size=4096, zero_pad=0)

Generate a window function of the given type.

Parameters:
win_type : str

the type of window function. Currently, hann and sin are supported.

win_size : int > 0 [scalar]

the size of the window function. It does not include the length of the zero padding.

zero_pad : int >= 0 [scalar]

the total length of the zero padding. The zeros are distributed equally to the left and right of the window.

Returns:
win : numpy.ndarray [shape=(win_size)]

the window function generated.
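How win_size and zero_pad interact can be shown with a small NumPy sketch. The function make_window is a hypothetical stand-in, not the library's code: the periodic form of both windows, the handling of an odd zero_pad, and the returned length of win_size + zero_pad are all assumptions made for illustration.

```python
import numpy as np

def make_window(win_type="hann", win_size=4096, zero_pad=0):
    n = np.arange(win_size)
    if win_type == "hann":
        # Periodic Hann window: 0.5 * (1 - cos(2*pi*n / N)).
        w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / win_size))
    elif win_type == "sin":
        # Sine window: sin(pi * n / N).
        w = np.sin(np.pi * n / win_size)
    else:
        raise ValueError(f"unsupported window type: {win_type!r}")

    # Split the zero padding between both sides; if zero_pad is odd,
    # the extra zero goes on the right (an arbitrary choice for this sketch).
    left = zero_pad // 2
    right = zero_pad - left
    return np.pad(w, (left, right))
```

For example, make_window("hann", 8, 4) yields 12 samples: two zeros, the 8-point window, then two more zeros.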

Module contents