pytsmod

pytsmod.ola(x, s, win_type='hann', win_size=1024, syn_hop_size=512)

Modify the length of the audio sequence using the OLA (overlap-add) algorithm. WSOLA with a tolerance of zero behaves the same as OLA.

Parameters

x (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the input audio sequence to modify.
s (number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]): the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample points of the input signal in the first row and the corresponding sample points of the output signal in the second row.
win_type (str): type of the window function. 'hann' and 'sin' are available.
win_size (int > 0 [scalar]): size of the window function.
syn_hop_size (int > 0 [scalar]): hop size of the synthesis window. Usually half of the window size.

Returns

y (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the modified output audio sequence.
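The anchor-point form of s can be read as a piecewise time map between output and input positions. The sketch below illustrates that reading with plain Python lists; the helper name input_position and the use of linear interpolation between anchors are assumptions made for illustration, not a description of pytsmod's internals.

```python
from bisect import bisect_right

def input_position(anchors, out_pos):
    """Map an output sample position to an input position by linear
    interpolation between anchor points (hypothetical helper).

    anchors: [input_points, output_points], two increasing lists of equal
    length, laid out like the 2 x n array described for `s`.
    """
    in_pts, out_pts = anchors
    # Index of the segment containing out_pos, clamped to the valid range.
    i = bisect_right(out_pts, out_pos) - 1
    i = max(0, min(i, len(out_pts) - 2))
    frac = (out_pos - out_pts[i]) / (out_pts[i + 1] - out_pts[i])
    return in_pts[i] + frac * (in_pts[i + 1] - in_pts[i])

# The first 44100 input samples are stretched to 88200 output samples;
# the next 44100 input samples keep their original duration.
anchors = [[0, 44100, 88200],     # sample points in the input signal
           [0, 88200, 132300]]    # sample points in the output signal

print(input_position(anchors, 44100))    # 22050.0, halfway through the stretched part
print(input_position(anchors, 110250))   # 66150.0, halfway through the unchanged part
```

To pass such anchors to the functions on this page, the nested list would be converted to a numpy.ndarray of shape (2, num_points).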

pytsmod.wsola(x, s, win_type='hann', win_size=1024, syn_hop_size=512, tolerance=512)

Modify the length of the audio sequence using the WSOLA (waveform similarity overlap-add) algorithm.

Parameters

x (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the input audio sequence to modify.
s (number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]): the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample points of the input signal in the first row and the corresponding sample points of the output signal in the second row.
win_type (str): type of the window function. 'hann' and 'sin' are available.
win_size (int > 0 [scalar]): size of the window function.
syn_hop_size (int > 0 [scalar]): hop size of the synthesis window. Usually half of the window size.
tolerance (int >= 0 [scalar]): maximum number of samples by which each window position in the input signal may be shifted to avoid phase discontinuities when the windows are overlap-added to form the output signal.

Returns

y (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the modified output audio sequence.
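To make the role of tolerance concrete, here is a minimal, dependency-free sketch of the textbook WSOLA idea for a constant stretching factor: each analysis window may move by up to tolerance samples so that it best matches, by cross-correlation, the natural continuation of the previously copied window. The function names, the small default sizes, and the normalisation by the summed window are assumptions of this sketch; it is not pytsmod's implementation. With tolerance=0 the search has a single candidate and the procedure reduces to plain OLA.

```python
import math

def hann(n):
    # Periodic Hann window; copies shifted by n // 2 sum to one.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def frame(x, start, size):
    # Samples x[start:start + size], treating positions outside x as zero.
    return [x[i] if 0 <= i < len(x) else 0.0 for i in range(start, start + size)]

def wsola_sketch(x, alpha, win_size=64, syn_hop=32, tolerance=16):
    """Time-stretch the list x by the constant factor alpha (illustrative only)."""
    win = hann(win_size)
    out_len = math.ceil(alpha * len(x))
    y = [0.0] * (out_len + win_size)
    norm = [0.0] * (out_len + win_size)
    # Try small shifts first so that ties are resolved in favour of no shift.
    shifts = sorted(range(-tolerance, tolerance + 1), key=abs)
    delta = 0
    for syn_pos in range(0, out_len, syn_hop):
        # Nominal analysis position plus the shift chosen for this frame.
        ana_pos = round(syn_pos / alpha) + delta
        seg = frame(x, ana_pos, win_size)
        for i in range(win_size):
            y[syn_pos + i] += win[i] * seg[i]
            norm[syn_pos + i] += win[i]
        # Natural continuation of the frame that was just copied.
        target = frame(x, ana_pos + syn_hop, win_size)
        next_ana = round((syn_pos + syn_hop) / alpha)
        best = float("-inf")
        for d in shifts:
            cand = frame(x, next_ana + d, win_size)
            score = sum(a * b for a, b in zip(target, cand))
            if score > best:
                best, delta = score, d
    # Divide by the summed window to undo the amplitude modulation.
    return [v / n if n > 1e-3 else v for v, n in zip(y[:out_len], norm[:out_len])]

x = [math.sin(2 * math.pi * 0.02 * n) for n in range(1000)]
y = wsola_sketch(x, alpha=1.5)
print(len(x), len(y))   # 1000 1500
```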

pytsmod.phase_vocoder(x, s, win_type='sin', win_size=2048, syn_hop_size=512, zero_pad=0, restore_energy=False, fft_shift=False, phase_lock=False)

Modify the length of the audio sequence using the phase vocoder algorithm.

Parameters

x (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the input audio sequence to modify.
s (number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]): the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample points of the input signal in the first row and the corresponding sample points of the output signal in the second row.
win_type (str): type of the window function for the STFT. 'hann' and 'sin' are available.
win_size (int > 0 [scalar]): size of the window function.
syn_hop_size (int > 0 [scalar]): hop size of the synthesis window. The default is a quarter of the default window size.
zero_pad (int >= 0 [scalar]): the size of the zero padding applied to the window function.
restore_energy (bool): if True, tries to compensate for potential energy loss.
fft_shift (bool): if True, applies a circular shift in the STFT and ISTFT.
phase_lock (bool): if True, applies phase locking.

Returns

y (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the modified output audio sequence.
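The core step of a phase vocoder is phase propagation: for each frequency bin, the phase change between two analysis frames gives an estimate of the true frequency, and the synthesis phase is then advanced by that frequency over the synthesis hop. The sketch below shows this standard step for a single bin using only the standard library. The function names are hypothetical, and options such as phase locking or energy restoration are not modelled; it is not a description of pytsmod's code.

```python
import math

def princarg(phase):
    # Wrap a phase to the interval [-pi, pi).
    return (phase + math.pi) % (2 * math.pi) - math.pi

def instantaneous_freq(phase_prev, phase_cur, k, n_fft, ana_hop):
    """Estimate the frequency (cycles per sample) seen in bin k from the
    phases of two analysis frames that are ana_hop samples apart."""
    bin_freq = k / n_fft
    # Phase advance expected if the component sat exactly on the bin centre.
    expected = 2 * math.pi * bin_freq * ana_hop
    deviation = princarg(phase_cur - phase_prev - expected)
    return bin_freq + deviation / (2 * math.pi * ana_hop)

def next_synthesis_phase(phase_syn_prev, freq, syn_hop):
    # Advance the output phase by the estimated frequency over one synthesis hop.
    return princarg(phase_syn_prev + 2 * math.pi * freq * syn_hop)

# A sinusoid at 0.0123 cycles per sample, observed in bin 25 of a 2048-point FFT.
f_true, n_fft, ana_hop, syn_hop = 0.0123, 2048, 256, 512
phase_a = 0.3
phase_b = princarg(phase_a + 2 * math.pi * f_true * ana_hop)

f_est = instantaneous_freq(phase_a, phase_b, 25, n_fft, ana_hop)
print(f_est)   # close to 0.0123
print(next_synthesis_phase(phase_a, f_est, syn_hop))
```

Here the synthesis hop is twice the analysis hop, which corresponds to stretching by a factor of 2. The estimate is only unambiguous while the component stays within half a cycle per analysis hop of the bin centre.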

pytsmod.phase_vocoder_int(x, s, win_type='hann', win_size=2048, syn_hop_size=512, zero_pad=None, restore_energy=False, fft_shift=True)

Modify the length of the audio sequence using the phase vocoder algorithm. Works especially well for integer stretching factors.

Parameters

x (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the input audio sequence to modify.
s (int > 0 [scalar]): the time stretching factor (alpha). Only an integer value greater than 0 is allowed.
win_type (str): type of the window function for the STFT. 'hann' and 'sin' are available.
win_size (int > 0 [scalar]): size of the window function.
syn_hop_size (int > 0 [scalar]): hop size of the synthesis window. The default is a quarter of the default window size.
zero_pad (int > 0 [scalar] or None): the size of the zero padding applied to the window function.
restore_energy (bool): if True, tries to compensate for potential energy loss.
fft_shift (bool): if True, applies a circular shift in the STFT and ISTFT.

Returns

y (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the modified output audio sequence.
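One plausible reason integer factors are easier to handle is that scaling a wrapped phase by an integer stays consistent modulo 2π, whereas scaling by a non-integer does not. The sketch below demonstrates only this arithmetic fact. Whether phase_vocoder_int scales phases in exactly this way is an assumption, and scale_phase is a hypothetical helper.

```python
import cmath

def scale_phase(z, factor):
    """Keep the magnitude of z and multiply its (wrapped) phase by factor."""
    return abs(z) * cmath.exp(1j * factor * cmath.phase(z))

true_phase = 7.0                    # unwrapped phase in radians, more than one full turn
z = cmath.exp(1j * true_phase)      # a spectral value only retains the wrapped phase

for factor in (2, 1.5):
    got = scale_phase(z, factor)
    want = cmath.exp(1j * factor * true_phase)
    print(factor, abs(got - want) < 1e-9)
# 2 True
# 1.5 False
```

For the integer factor, the lost multiples of 2π are themselves multiplied by an integer and vanish again, so no phase unwrapping is needed. For the factor 1.5 the result is off by half a turn.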

pytsmod.hptsm(x, s, hp_len_harm=10, hp_len_perc=10, hp_mask_mode='binary', hp_win_type='hann', hp_win_size=1024, hp_hop_size=256, hp_zero_pad=0, hp_fft_shift=False, pv_win_type='hann', pv_win_size=2048, pv_syn_hop_size=512, pv_zero_pad=0, pv_restore_energy=False, pv_fft_shift=False, pv_phase_lock=True, ola_win_type='hann', ola_win_size=256, ola_syn_hop_size=128)

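Modify the length of the audio sequence using both the phase vocoder and OLA. The signal is first split by harmonic-percussive source separation (HPSS), for which a median-filter-based algorithm is used. The phase vocoder is then applied to the harmonic component and OLA to the percussive component.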

Parameters

x (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the input audio sequence to modify.
s (number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]): the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample points of the input signal in the first row and the corresponding sample points of the output signal in the second row.
hp_*: parameters for HPSS.
pv_*: parameters for the phase vocoder.
ola_*: parameters for OLA.

Returns

y (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the modified output audio sequence.
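As an illustration of median-filter-based HPSS, the sketch below builds binary masks from a tiny magnitude spectrogram: median filtering along time keeps horizontal (harmonic) structure, and median filtering along frequency keeps vertical (percussive) structure. The function names, the clamped edge handling, and the rule that ties go to the harmonic mask are assumptions of this sketch; how the hp_ parameters are mapped onto these steps inside pytsmod is not shown here.

```python
from statistics import median

def median_filter(values, length):
    """Running median over a window of `length` samples, shortened at the edges."""
    half = length // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

def binary_masks(mag, len_harm=3, len_perc=3):
    """mag[f][t] is a magnitude spectrogram (rows: frequency, columns: time)."""
    n_freq, n_time = len(mag), len(mag[0])
    # Harmonic content is smooth along time: filter each frequency row.
    harm = [median_filter(row, len_harm) for row in mag]
    # Percussive content is smooth along frequency: filter each time column.
    cols = [median_filter([mag[f][t] for f in range(n_freq)], len_perc)
            for t in range(n_time)]
    perc = [[cols[t][f] for t in range(n_time)] for f in range(n_freq)]
    # Binary decision per bin; ties are assigned to the harmonic component.
    mask_h = [[1 if harm[f][t] >= perc[f][t] else 0 for t in range(n_time)]
              for f in range(n_freq)]
    mask_p = [[1 - m for m in row] for row in mask_h]
    return mask_h, mask_p

# A sustained tone in frequency row 2 and a click in time column 2.
mag = [[1.0 if f == 2 or t == 2 else 0.0 for t in range(5)] for f in range(5)]
mask_h, mask_p = binary_masks(mag)
print(mask_h[2])   # [1, 1, 1, 1, 1]  the tone is kept as harmonic
print(mask_p[0])   # [0, 0, 1, 0, 0]  the click is kept as percussive
```

Multiplying the complex spectrogram by each mask and inverting the STFT would give the two components that are then stretched separately.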

pytsmod.tdpsola(x, sr, src_f0, tgt_f0=None, alpha=1, beta=None, win_type='hann', p_hop_size=441, p_win_size=1470)

Modify the length and pitch of the audio sequence using the TD-PSOLA algorithm.

Parameters

x (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the input audio sequence to modify.
sr (int > 0 [scalar]): sample rate of the input audio sequence.
src_f0 (numpy.ndarray [shape=(channel, num_freqs) or (num_freqs)]): the fundamental frequency contour of the input audio sequence.
tgt_f0 (numpy.ndarray [shape=(channel, num_freqs) or (num_freqs)]): the target fundamental frequency contour for the output audio sequence. Should not be used together with beta.
alpha (number > 0 [scalar]): the time stretching factor.
beta (number > 0 [scalar]): the pitch shifting factor. Should not be used together with tgt_f0.
win_type (str): type of the window function. 'hann' and 'sin' are available.
p_hop_size (int > 0 [scalar]): the hop size of src_f0 (in samples).
p_win_size (int > 0 [scalar]): the window size of the pitch tracking algorithm used to obtain src_f0 (in samples).

Returns

y (numpy.ndarray [shape=(channel, num_samples) or (num_samples)]): the modified output audio sequence.
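Because p_hop_size and p_win_size are given in samples while pitch trackers often report their settings in seconds, a small conversion is usually needed before calling tdpsola. The sketch below shows that conversion together with one reading of beta as a uniform scaling of the source contour. Both helper names are hypothetical, and the equivalence between beta and a scaled tgt_f0 is an assumption rather than documented behaviour.

```python
def seconds_to_samples(seconds, sr):
    # Round to the nearest whole sample.
    return int(round(seconds * sr))

def scaled_contour(src_f0, beta):
    """Scale every voiced f0 value by beta; zeros (unvoiced frames) stay zero."""
    return [beta * f for f in src_f0]

sr = 44100
print(seconds_to_samples(0.010, sr))    # 441, matches the default p_hop_size at 44.1 kHz
print(seconds_to_samples(1 / 30, sr))   # 1470, matches the default p_win_size at 44.1 kHz

src_f0 = [220.0, 0.0, 440.0]            # Hz, one value per pitch-tracking frame
print(scaled_contour(src_f0, 2))        # [440.0, 0.0, 880.0], one octave up
```

If the audio uses a different sample rate or the pitch tracker a different hop, the defaults would not describe the contour and the converted values should be passed explicitly.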