pytsmod

pytsmod.ola(x, s, win_type='hann', win_size=1024, syn_hop_size=512)

Modify the length of the audio sequence using the OLA (overlap-add) algorithm. WSOLA with zero tolerance behaves the same as OLA.

Parameters
x : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the input audio sequence to modify.

s : number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]

the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample positions in the input signal in the first row and the corresponding sample positions in the output signal in the second row.

win_type : str

type of the window function. 'hann' and 'sin' are available.

win_size : int > 0 [scalar]

size of the window function.

syn_hop_size : int > 0 [scalar]

hop size of the synthesis window. Usually half of the window size.

Returns
y : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the modified output audio sequence.
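
Example: the overlap-add idea behind this function can be sketched in plain NumPy. This is an illustrative sketch under simplifying assumptions (mono input, constant stretching factor, Hann window), not pytsmod's actual implementation; the name ola_sketch is hypothetical.

```python
import numpy as np

def ola_sketch(x, alpha, win_size=1024, syn_hop_size=512):
    """Stretch a mono signal by a constant factor with plain overlap-add."""
    half = win_size // 2
    win = np.hanning(win_size)
    out_len = int(np.ceil(len(x) * alpha))

    # Output frames sit on a fixed grid; input frames are read at the
    # correspondingly scaled positions (analysis hop = syn_hop_size / alpha).
    syn_pos = np.arange(0, out_len + half, syn_hop_size)
    ana_pos = np.round(syn_pos / alpha).astype(int)

    # Zero-pad so that every frame can be read in full.
    x_pad = np.pad(x, (half, ana_pos[-1] + win_size))

    y = np.zeros(syn_pos[-1] + win_size)
    norm = np.zeros_like(y)
    for a, s in zip(ana_pos, syn_pos):
        y[s:s + win_size] += x_pad[a:a + win_size] * win
        norm[s:s + win_size] += win

    # Undo the amplitude modulation caused by the overlapping windows.
    norm[norm < 1e-3] = 1.0
    y /= norm
    return y[half:half + out_len]

sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220 * t)   # one second of a 220 Hz tone
y = ola_sketch(x, 1.5)            # 1.5 times as many samples
```

Because frames are copied without any alignment, plain OLA tends to introduce phase discontinuities on tonal material, which is the problem the tolerance parameter of WSOLA addresses.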

pytsmod.wsola(x, s, win_type='hann', win_size=1024, syn_hop_size=512, tolerance=512)

Modify the length of the audio sequence using the WSOLA (waveform similarity overlap-add) algorithm.

Parameters
x : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the input audio sequence to modify.

s : number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]

the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample positions in the input signal in the first row and the corresponding sample positions in the output signal in the second row.

win_type : str

type of the window function. 'hann' and 'sin' are available.

win_size : int > 0 [scalar]

size of the window function.

syn_hop_size : int > 0 [scalar]

hop size of the synthesis window. Usually half of the window size.

tolerance : int >= 0 [scalar]

maximum number of samples by which each window position in the input signal may be shifted to avoid phase discontinuities when the windows are overlap-added to form the output signal.

Returns
y : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the modified output audio sequence.
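
Example: the role of tolerance can be illustrated with a small cross-correlation search. The sketch below is an assumption about how such a search is commonly written, not pytsmod's code; best_shift is a hypothetical helper, and target stands for the segment that would naturally continue the previously copied frame.

```python
import numpy as np

def best_shift(x, frame_start, target, tolerance):
    """Return the shift in [-tolerance, tolerance] whose frame best matches target."""
    n = len(target)
    lo = frame_start - tolerance  # assumed to be >= 0 in this sketch
    region = x[lo:lo + n + 2 * tolerance]
    # One correlation value per candidate shift (2 * tolerance + 1 in total).
    corr = np.correlate(region, target, mode="valid")
    return int(np.argmax(corr)) - tolerance

rng = np.random.default_rng(0)
x = rng.standard_normal(4000)
target = x[1037:1037 + 256]       # the best match lies 37 samples after 1000
shift = best_shift(x, 1000, target, tolerance=64)
no_search = best_shift(x, 1000, target, tolerance=0)
```

With tolerance=0 there is only one candidate position, so the shift is always 0 and the procedure reduces to plain OLA, as noted in the description of pytsmod.ola.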

pytsmod.phase_vocoder(x, s, win_type='sin', win_size=2048, syn_hop_size=512, zero_pad=0, restore_energy=False, fft_shift=False, phase_lock=False)

Modify the length of the audio sequence using the phase vocoder algorithm.

Parameters
x : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the input audio sequence to modify.

s : number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]

the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample positions in the input signal in the first row and the corresponding sample positions in the output signal in the second row.

win_type : str

type of the window function for the STFT. 'hann' and 'sin' are available.

win_size : int > 0 [scalar]

size of the window function.

syn_hop_size : int > 0 [scalar]

hop size of the synthesis window. Usually half of the window size.

zero_pad : int >= 0 [scalar]

the size of the zero pad in the window function.

restore_energy : bool

tries to compensate for potential energy loss.

fft_shift : bool

apply circular shift to STFT and ISTFT.

phase_lock : bool

apply phase locking.

Returns
y : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the modified output audio sequence.
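
Example: the core of a phase vocoder is the phase propagation step, which estimates each bin's instantaneous frequency from two analysis frames and advances the output phase by the synthesis hop. The sketch below shows the textbook form of that step; it is not taken from pytsmod, and propagate_phase and its arguments are hypothetical names.

```python
import numpy as np

def propagate_phase(prev_in, cur_in, prev_out, ana_hop, syn_hop, n_fft):
    """Advance the output phase of every rfft bin by one synthesis hop."""
    k = np.arange(n_fft // 2 + 1)
    omega = 2 * np.pi * k / n_fft              # bin centre frequency, rad/sample
    # Deviation of the measured phase advance from the expected one.
    delta = cur_in - prev_in - omega * ana_hop
    delta -= 2 * np.pi * np.round(delta / (2 * np.pi))   # wrap to [-pi, pi]
    inst_freq = omega + delta / ana_hop        # instantaneous frequency
    return prev_out + inst_freq * syn_hop

n_fft, ana_hop, syn_hop = 1024, 128, 256       # stretching factor of 2
f = 2 * np.pi * 10.3 / n_fft                   # sinusoid between bins 10 and 11
bins = n_fft // 2 + 1
prev_in = np.zeros(bins)
cur_in = np.full(bins, np.angle(np.exp(1j * f * ana_hop)))
out = propagate_phase(prev_in, cur_in, np.zeros(bins), ana_hop, syn_hop, n_fft)
```

In bin 10 the estimated frequency equals f, so the output phase advances by f * syn_hop even though the measured input phases were wrapped.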

pytsmod.phase_vocoder_int(x, s, win_type='hann', win_size=2048, syn_hop_size=512, zero_pad=None, restore_energy=False, fft_shift=True)

Modify the length of the audio sequence using the phase vocoder algorithm. Works especially well for integer stretching factors.

Parameters
x : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the input audio sequence to modify.

s : int > 0 [scalar]

the time stretching factor (alpha). Only an integer value greater than 0 is allowed.

win_type : str

type of the window function for the STFT. 'hann' and 'sin' are available.

win_size : int > 0 [scalar]

size of the window function.

syn_hop_size : int > 0 [scalar]

hop size of the synthesis window. Usually half of the window size.

zero_pad : int > 0 [scalar]

the size of the zero pad in the window function.

restore_energy : bool

tries to compensate for potential energy loss.

fft_shift : bool

apply circular shift to STFT and ISTFT.

Returns
y : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the modified output audio sequence.
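
Example: one reason integer factors are convenient is that a wrapped phase can be multiplied by an integer without unwrapping it first, because the unknown multiple of 2*pi stays a multiple of 2*pi. The sketch below demonstrates that property only; whether pytsmod uses exactly this shortcut is an assumption, and scale_phase is a hypothetical helper.

```python
import numpy as np

def scale_phase(spec, alpha):
    """Keep the magnitude of spec and multiply its (wrapped) phase by alpha."""
    return np.abs(spec) * np.exp(1j * alpha * np.angle(spec))

true_phase = 7.5                          # unwrapped phase in radians
spec = 2.0 * np.exp(1j * true_phase)      # np.angle only sees the wrapped value

int_ok = np.isclose(scale_phase(spec, 3), 2.0 * np.exp(1j * 3 * true_phase))
frac_ok = np.isclose(scale_phase(spec, 1.5), 2.0 * np.exp(1j * 1.5 * true_phase))
```

Here int_ok is True while frac_ok is False: for the non-integer factor the lost multiple of 2*pi turns into a phase error, which is why the general phase vocoder has to track phase advances between frames instead.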

pytsmod.hptsm(x, s, hp_len_harm=10, hp_len_perc=10, hp_mask_mode='binary', hp_win_type='hann', hp_win_size=1024, hp_hop_size=256, hp_zero_pad=0, hp_fft_shift=False, pv_win_type='hann', pv_win_size=2048, pv_syn_hop_size=512, pv_zero_pad=0, pv_restore_energy=False, pv_fft_shift=False, pv_phase_lock=True, ola_win_type='hann', ola_win_size=256, ola_syn_hop_size=128)

Modify the length of the audio sequence using both the phase vocoder and OLA. The phase vocoder is applied to the harmonic component of the signal and OLA to the percussive component. The harmonic-percussive source separation (HPSS) uses a median-filter-based algorithm.

Parameters
x : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the input audio sequence to modify.

s : number > 0 [scalar] or numpy.ndarray [shape=(2, num_points)]

the time stretching factor. Either a constant value (alpha) or a 2 x n array of anchor points, with the sample positions in the input signal in the first row and the corresponding sample positions in the output signal in the second row.

hp_*

parameters for HPSS.

pv_*

parameters for the phase vocoder.

ola_*

parameters for OLA.

Returns
y : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the modified output audio sequence.
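
Example: median-filter-based HPSS can be sketched on a magnitude spectrogram as follows. This is a minimal illustration of the general technique with a binary mask, not pytsmod's implementation; the helper names are hypothetical, and it is an assumption that hp_len_harm and hp_len_perc correspond to filter lengths in frames and frequency bins.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_1d(mag, length, axis):
    """Median-filter a 2-D array along one axis, keeping its shape."""
    pad = length // 2
    widths = [(0, 0), (0, 0)]
    widths[axis] = (pad, length - 1 - pad)
    padded = np.pad(mag, widths, mode="reflect")
    windows = sliding_window_view(padded, length, axis=axis)
    return np.median(windows, axis=-1)

def binary_hp_masks(mag, len_harm=10, len_perc=10):
    """Return (harmonic, percussive) binary masks for mag[freq, time]."""
    harm = median_filter_1d(mag, len_harm, axis=1)   # smooth along time
    perc = median_filter_1d(mag, len_perc, axis=0)   # smooth along frequency
    mask_harm = harm > perc
    return mask_harm, ~mask_harm

# Toy spectrogram: a sustained tone (row 5) and a click (column 20).
mag = np.zeros((32, 40))
mag[5, :] = 1.0
mag[:, 20] = 1.0
mask_h, mask_p = binary_hp_masks(mag)
```

The sustained tone survives filtering along time and ends up in the harmonic mask, while the click survives filtering along frequency and ends up in the percussive mask. In the scheme described above, the two masked spectrograms would then be resynthesized and stretched with the phase vocoder and OLA respectively.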

pytsmod.tdpsola(x, sr, src_f0, tgt_f0=None, alpha=1, beta=None, win_type='hann', p_hop_size=441, p_win_size=1470)

Modify the length and pitch of the audio sequence using the TD-PSOLA algorithm.

Parameters
x : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the input audio sequence to modify.

sr : int > 0 [scalar]

sample rate of the input audio sequence.

src_f0 : numpy.ndarray [shape=(channel, num_freqs) or (num_freqs)]

the fundamental frequency contour of the input audio sequence.

tgt_f0 : numpy.ndarray [shape=(channel, num_freqs) or (num_freqs)]

the target fundamental frequency contour that the output audio sequence should follow. Should not be used together with beta.

alpha : number > 0 [scalar]

the time stretching factor.

beta : number > 0 [scalar]

the pitch shifting factor. Should not be used together with tgt_f0.

win_type : str

type of the window function. 'hann' and 'sin' are available.

p_hop_size : int > 0 [scalar]

the hop size of src_f0 (in samples).

p_win_size : int > 0 [scalar]

the window size of the pitch tracking algorithm used to obtain src_f0 (in samples).

Returns
y : numpy.ndarray [shape=(channel, num_samples) or (num_samples)]

the modified output audio sequence.
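
Example: the relationship between src_f0, tgt_f0 and beta, and the conversion of the pitch tracker settings to samples, can be sketched as follows. The helper resolve_target_f0 is hypothetical and only illustrates the documented constraint that tgt_f0 and beta are alternatives. That the defaults 441 and 1470 were chosen for a 44.1 kHz sample rate is an assumption; they correspond to 10 ms and 1/30 s at that rate.

```python
import numpy as np

def resolve_target_f0(src_f0, tgt_f0=None, beta=None):
    """Return the target f0 contour implied by either tgt_f0 or beta."""
    if tgt_f0 is not None and beta is not None:
        raise ValueError("pass either tgt_f0 or beta, not both")
    if tgt_f0 is not None:
        return np.asarray(tgt_f0, dtype=float)
    if beta is None:
        beta = 1.0                      # no pitch change
    return np.asarray(src_f0, dtype=float) * beta

sr = 44100
p_hop_size = sr // 100                  # 10 ms hop of the pitch tracker
p_win_size = sr // 30                   # 1/30 s analysis window

src_f0 = np.array([100.0, 0.0, 200.0])  # 0 marks an unvoiced frame here
octave_up = resolve_target_f0(src_f0, beta=2.0)
```

If the pitch tracker ran at a different sample rate or hop, p_hop_size and p_win_size should be recomputed in samples accordingly so that src_f0 lines up with the audio.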