time-stretch

Time stretching and pitch shifting.

Algorithm	Domain	Quality	CPU cost	Best for
wsola	time	★★★	low	speech, real-time
psola	time	★★★★	medium	speech / monophonic instruments
vocoder	freq	★★	medium	educational baseline
vocoder `{ lock }`	freq	★★★★	medium	general music
vocoder `{ transients }`	freq	★★★★★	medium	music with percussion
paulstretch	freq	—	medium	extreme stretch (ambient, drones)
sms	sinusoidal	★★★★	high	harmonic / tonal material

For voice pitch shift with formant preservation, use the pitch-shift package.

Usage

npm install time-stretch

import { vocoder, pitchShift } from 'time-stretch'

let slower = vocoder(samples, { factor: 2, transients: true })  // 2× slower, same pitch
let higher = pitchShift(samples, { semitones: 5 })               // pitch up, same speed

let write = vocoder({ factor: 1.5, transients: true })           // real-time streaming
write(block1)
write(block2)
write()                                                           // → remaining samples

Time domain

`wsola`

Waveform Similarity Overlap-Add. Divides the signal into overlapping frames and places them at new synthesis positions. Before placing each frame, it searches ±delta samples for the position that maximizes cross-correlation with the preceding output, which avoids the phase cancellation (flanging) of plain OLA. No FFT overhead.

import { wsola } from 'time-stretch'

wsola(data, { factor: 1.5 })
wsola(data, { factor: 0.5, delta: 512 })

Param	Default	Description
`factor`	`1`	Time stretch ratio
`frameSize`	`1024`	Window size
`hopSize`	`frameSize/4`	Hop between frames
`delta`	`frameSize/4`	Search range (±samples)

Use when: Speech, real-time with tight CPU budgets, moderate ratios (0.5–2×).
Not for: Polyphonic music with sustained tones — frequency-domain methods handle harmonics better.
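
The similarity search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: `bestOffset` and its parameters are hypothetical names, and it uses a plain (unnormalized) cross-correlation for brevity.

```javascript
// Hypothetical helper illustrating the WSOLA search step; not part of the time-stretch API.
// Returns the offset d in [-delta, +delta] for which input[start + d ...] best matches
// `target` (the natural continuation of the previously placed frame).
function bestOffset(target, input, start, delta) {
  let best = 0, bestScore = -Infinity
  for (let d = -delta; d <= delta; d++) {
    let pos = start + d
    if (pos < 0 || pos + target.length > input.length) continue // candidate out of range
    let score = 0
    for (let i = 0; i < target.length; i++) score += target[i] * input[pos + i]
    if (score > bestScore) { bestScore = score; best = d }
  }
  return best
}

// A sine with a 100-sample period. The target was cut at sample 300,
// but the nominal read position is 290, so the search has to shift by +10.
let input = Float32Array.from({ length: 600 }, (_, i) => Math.sin(2 * Math.PI * i / 100))
let target = input.slice(300, 400)
let offset = bestOffset(target, input, 290, 20)
```

A larger `delta` widens the search and costs proportionally more multiplications per frame, which is the trade-off behind the `delta` option.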

`psola`

Pitch-Synchronous Overlap-Add. Detects pitch period via autocorrelation, then windows grains at pitch cycle boundaries. Because grains align with the pitch cycle there are no phase discontinuities at overlaps — cleaner results than WSOLA for monophonic pitched signals.

import { psola } from 'time-stretch'

psola(data, { factor: 1.5 })
psola(data, { factor: 0.75, sampleRate: 48000 })
psola(data, { factor: 2, minFreq: 100, maxFreq: 400 })  // male voice range

Param	Default	Description
`factor`	`1`	Time stretch ratio
`sampleRate`	`44100`	For pitch detection frequency range
`minFreq`	`80`	Lowest expected pitch (Hz)
`maxFreq`	`500`	Highest expected pitch (Hz)

Use when: Speech, solo vocals, monophonic instruments, factors 0.5–2×.
Not for: Polyphonic material — autocorrelation finds one pitch period so chords get mangled. Extreme ratios (>2×) cause gaps.
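
To make the role of `sampleRate`, `minFreq` and `maxFreq` concrete, here is a rough sketch of autocorrelation-based period detection. It is an illustration only: `detectPeriod` is a hypothetical name, and the package may normalize or refine the estimate differently.

```javascript
// Hypothetical sketch of autocorrelation pitch-period detection; not the package's code.
// Returns the lag (in samples) with the strongest autocorrelation inside the lag range
// implied by [minFreq, maxFreq], or 0 if the frame is too short for that range.
function detectPeriod(frame, { sampleRate = 44100, minFreq = 80, maxFreq = 500 } = {}) {
  let minLag = Math.floor(sampleRate / maxFreq) // shortest period considered
  let maxLag = Math.ceil(sampleRate / minFreq)  // longest period considered
  let n = frame.length - maxLag                 // same number of terms for every lag
  if (n <= 0) return 0
  let bestLag = 0, bestScore = -Infinity
  for (let lag = minLag; lag <= maxLag; lag++) {
    let score = 0
    for (let i = 0; i < n; i++) score += frame[i] * frame[i + lag]
    if (score > bestScore) { bestScore = score; bestLag = lag }
  }
  return bestLag
}

// A 220.5 Hz sine at 44.1 kHz has a period of exactly 200 samples.
let frame = Float32Array.from({ length: 2294 }, (_, i) => Math.sin(2 * Math.PI * i / 200))
let period = detectPeriod(frame, { minFreq: 150, maxFreq: 500 })
```

For a perfectly periodic signal, multiples of the true period correlate equally well, so an overly wide frequency range invites octave errors. Narrowing `minFreq` and `maxFreq` to the expected source keeps the search on the right peak.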

Frequency domain

`vocoder`

Phase vocoder with three quality modes, controlled by lock and transients options.

Plain — each bin's phase advances at its instantaneous frequency independently. Magnitudes are preserved but incoherent inter-harmonic phase relationships give complex signals a diffuse, "underwater" quality.

{ lock: true } — after propagating phases, locks non-peak bins to their nearest spectral peak's rotation (Laroche & Dolson, 1999). Restores harmonic phase coherence, eliminating phasiness.

{ transients: true } — phase-locked vocoder that also measures spectral flux between frames (Röbel, 2003). When a sharp onset is detected it resets to the original analysis phase instead of propagating it, preserving attack sharpness on drums and plucks. Implies lock.
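
The phase propagation that all three modes build on can be sketched for a single bin. This is the textbook phase-vocoder update, written with hypothetical function and parameter names; the package's internals may be organized differently.

```javascript
const TWO_PI = 2 * Math.PI

// Wrap a phase value into the range [-π, π].
function wrapPhase(p) {
  return p - TWO_PI * Math.round(p / TWO_PI)
}

// Hypothetical single-bin update: estimate the bin's instantaneous frequency from the
// phase difference between two analysis frames, then advance the synthesis phase
// by that frequency over the (different) synthesis hop.
function advancePhase(prevPhase, currPhase, prevSynthPhase, bin, frameSize, analysisHop, synthesisHop) {
  let expected = TWO_PI * bin * analysisHop / frameSize        // advance of the bin-center frequency
  let deviation = wrapPhase(currPhase - prevPhase - expected)  // how far the real partial is off-center
  let radiansPerSample = (expected + deviation) / analysisHop  // instantaneous frequency
  return prevSynthPhase + radiansPerSample * synthesisHop
}

// A partial sitting exactly on bin 8 of a 64-point frame, stretched 2× (hop 16 → 32).
let nextPhase = advancePhase(0.3, 0.3, 0.3, 8, 64, 16, 32)
```

Phase locking and transient handling change what happens after this step (which bins keep their own advance, and when the phase is reset), not the estimate itself.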

import { vocoder } from 'time-stretch'

vocoder(data, { factor: 2 })                                      // plain — educational baseline
vocoder(data, { factor: 2, lock: true })                          // phase-locked — general music
vocoder(data, { factor: 2, transients: true })                    // transient-aware — best quality
vocoder(data, { factor: 1.5, transientThreshold: 2.0 })           // less sensitive detection

Param	Default	Description
`factor`	`1`	Time stretch ratio
`frameSize`	`2048`	FFT size (power of 2)
`hopSize`	`frameSize/4`	Hop between frames
`lock`	`false`	Phase locking (Laroche & Dolson, 1999)
`transients`	`false`	Transient detection, implies `lock` (Röbel, 2003)
`transientThreshold`	`1.5`	Spectral flux threshold (higher = fewer resets)

Use when: Most material. `transients: true` is the right default for music with percussion and mixed sources.
Use `lock: true` for purely tonal or ambient material where transient resets aren't needed.
Use plain mode for educational purposes or simple tonal signals only.
Not for: Voice/speech — use psola. Extreme stretch — use paulstretch.
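
As a rough picture of what `transientThreshold` controls, the sketch below computes spectral flux between two magnitude frames and flags an onset when it exceeds a multiple of the recent average. The function names and the comparison against a running mean are assumptions for illustration; the README does not document how the package normalizes flux.

```javascript
// Half-wave rectified spectral flux: sum of magnitude increases between two frames.
function spectralFlux(prevMag, currMag) {
  let flux = 0
  for (let k = 0; k < currMag.length; k++) {
    let rise = currMag[k] - prevMag[k]
    if (rise > 0) flux += rise // only energy that appears counts, decays are ignored
  }
  return flux
}

// Hypothetical decision rule: an onset is a flux value well above the recent average.
// A higher threshold means fewer frames are treated as transients.
function isTransient(flux, recentFlux, threshold = 1.5) {
  if (recentFlux.length === 0) return false
  let mean = recentFlux.reduce((a, b) => a + b, 0) / recentFlux.length
  return flux > threshold * mean
}

let flux = spectralFlux([1, 1, 1], [1, 3, 0.5]) // one bin rises by 2, one falls
let onset = isTransient(flux, [0.5, 0.5])
let steady = isTransient(0.6, [0.5, 0.5])
```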

`paulstretch`

Extreme time stretching via phase randomization (Nasca, 2006). Preserves magnitudes but replaces all phases with random values, producing smooth, dreamlike textures. Designed for large factors.

import { paulstretch } from 'time-stretch'

paulstretch(data, { factor: 8 })
paulstretch(data, { factor: 100, frameSize: 8192 })

Param	Default	Description
`factor`	`8`	Time stretch ratio (best >2×)
`frameSize`	`4096`	FFT size (larger = smoother)
`seed`	`0x12345678`	PRNG seed for phase randomization (deterministic output)

Use when: Ambient music, sound design, drone generation, 8×–1000× stretch.
Not for: Small ratios (<2×) — sounds washed out. Not for preserving rhythm or transients.
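
The core idea, keeping each bin's magnitude while drawing a fresh random phase from a seeded generator, can be sketched as follows. The PRNG shown is the well-known mulberry32 generator, chosen here as an assumption; the README only states that a `seed` makes output deterministic, not which generator the package uses.

```javascript
// mulberry32: a small seeded PRNG returning floats in [0, 1).
// Stand-in for whatever generator the package actually uses.
function mulberry32(seed) {
  let a = seed >>> 0
  return function () {
    a = (a + 0x6D2B79F5) | 0
    let t = Math.imul(a ^ (a >>> 15), 1 | a)
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296
  }
}

// Hypothetical helper: keep magnitudes, replace every phase with a random one.
function randomizePhases(mag, rand) {
  let re = new Float64Array(mag.length)
  let im = new Float64Array(mag.length)
  for (let k = 0; k < mag.length; k++) {
    let phase = 2 * Math.PI * rand()
    re[k] = mag[k] * Math.cos(phase)
    im[k] = mag[k] * Math.sin(phase)
  }
  return { re, im }
}

let mag = [0.5, 1, 0.25, 0.75]
let first = randomizePhases(mag, mulberry32(0x12345678))
let second = randomizePhases(mag, mulberry32(0x12345678)) // same seed, same spectrum
```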

Sinusoidal

`sms`

Sinusoidal Modeling Synthesis (Serra 1989, McAulay-Quatieri 1986). Decomposes audio into individually tracked sinusoidal partials and resynthesizes at the new time rate. Each partial's frequency and magnitude are interpolated independently — no phase spreading or bin-by-bin artifacts.

import { sms } from 'time-stretch'

sms(data, { factor: 2 })
sms(data, { factor: 0.5, maxTracks: 80 })
sms(data, { factor: 3, frameSize: 4096 })

Param	Default	Description
`factor`	`1`	Time stretch ratio
`frameSize`	`2048`	FFT frame size
`hopSize`	`frameSize/4`	Hop between frames
`maxTracks`	`60`	Max simultaneous sinusoidal tracks
`minMag`	`1e-4`	Peak detection threshold (linear)
`freqDev`	`3`	Max frequency deviation (bins) for track continuation
`residualMix`	`1`	Stochastic residual blended into the sinusoidal output

Use when: Harmonic / tonal content — instruments, chords, vocals — where the vocoder introduces smearing. Default residualMix=1 blends breath, noise, and transient energy back in alongside the sinusoidal model.
Not for: Noise-dominated material.
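
To illustrate what `freqDev` governs, here is a greedy sketch of track continuation: each existing track claims the nearest unclaimed peak within `freqDev` bins, tracks with no candidate end, and leftover peaks start new tracks. The function name and the data shapes are assumptions, not the package's internal representation.

```javascript
// Hypothetical greedy partial tracking.
// tracks: [{ id, bin, mag }] from the previous frame
// peaks:  [{ bin, mag }] detected in the current frame
function continueTracks(tracks, peaks, freqDev = 3) {
  let used = new Set()
  let continued = []
  for (let track of tracks) {
    let best = -1, bestDist = Infinity
    for (let i = 0; i < peaks.length; i++) {
      let dist = Math.abs(peaks[i].bin - track.bin)
      if (!used.has(i) && dist <= freqDev && dist < bestDist) { best = i; bestDist = dist }
    }
    if (best >= 0) {
      used.add(best) // each peak can extend at most one track
      continued.push({ id: track.id, bin: peaks[best].bin, mag: peaks[best].mag })
    }
  }
  let born = peaks.filter((_, i) => !used.has(i)) // unclaimed peaks become new tracks
  return { continued, born }
}

let result = continueTracks(
  [{ id: 1, bin: 10, mag: 0.5 }, { id: 2, bin: 50, mag: 0.2 }],
  [{ bin: 11.2, mag: 0.45 }, { bin: 58, mag: 0.3 }]
)
```

In this example track 1 continues at bin 11.2, track 2 ends because its nearest peak is 8 bins away, and the peak at bin 58 starts a new track.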

Pitch shift

`pitchShift`

Pitch shifting via time-stretch + resample: stretches by the pitch ratio (ratio = 2^(semitones/12)), then resamples back to original length. Output length equals input length.

import { pitchShift } from 'time-stretch'

pitchShift(data, { semitones: 7 })    // perfect fifth up
pitchShift(data, { semitones: -12 })  // octave down
pitchShift(data, { ratio: 1.5 })      // direct ratio

Param	Default	Description
`semitones`	`0`	Pitch shift in semitones
`ratio`	from semitones	Direct frequency ratio
`frameSize`	`2048`	Passed to stretch method
`hopSize`	`frameSize/4`	Passed to stretch method
`transientThreshold`	`1.5`	Transient sensitivity

Use when: Pitch correction, harmonizing, creative effects.
Not for: Voice without formant preservation — will sound chipmunk/giant. Use the pitch-shift package instead. For content-aware algorithm selection (voice → psola, tonal → sms) call those functions directly.
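
The arithmetic behind the description above is small enough to sketch. The helpers below convert semitones to a frequency ratio and resample a buffer back to a target length with linear interpolation. Both names are hypothetical, and the package's resampler may use a different interpolation.

```javascript
// Equal-tempered semitones to frequency ratio: ratio = 2^(semitones / 12).
function semitonesToRatio(semitones) {
  return Math.pow(2, semitones / 12)
}

// Hypothetical linear-interpolation resampler that maps `data` onto `outLength` samples.
function resampleLinear(data, outLength) {
  let out = new Float32Array(outLength)
  if (outLength === 0 || data.length === 0) return out
  let step = outLength > 1 ? (data.length - 1) / (outLength - 1) : 0
  for (let i = 0; i < outLength; i++) {
    let pos = i * step
    let i0 = Math.floor(pos)
    let i1 = Math.min(i0 + 1, data.length - 1)
    let frac = pos - i0
    out[i] = data[i0] * (1 - frac) + data[i1] * frac
  }
  return out
}

let fifth = semitonesToRatio(7)   // roughly 1.498
let octave = semitonesToRatio(12) // 2
// Pitch up: a buffer stretched to `ratio` times its length is read back faster.
let shortened = resampleLinear([0, 1, 2, 3, 4], 3)
```

Stretching by the ratio and then resampling to the original length is what keeps the output length equal to the input length.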

Integration

Streaming

All algorithms support block-by-block streaming. Call with options only (no data) to get a writer:

let write = vocoder({ factor: 1.5, transients: true })

// in your audio callback:
let output = write(inputBlock)    // → Float32Array (may be empty while buffering)

// when done:
let tail = write()                // → remaining buffered samples

Feed ordered Float32Array chunks. Output sizes are variable — small or empty early chunks are normal.
Call write() exactly once at the end to flush.
Use one writer per channel for stereo or multichannel material.

wsola({ factor })
vocoder({ factor, lock, transients, transientThreshold })
paulstretch({ factor })
psola({ factor, sampleRate, minFreq, maxFreq })
sms({ factor, maxTracks, minMag, freqDev })
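
Because output chunk sizes vary, offline use of a writer usually means collecting chunks and joining them. The sketch below shows one way to do that. `runStream` is a hypothetical helper, and `delayWriter` is only a stand-in that mimics the documented writer contract (a block in, a possibly empty `Float32Array` out, and a final `write()` to flush) so the example runs without the package.

```javascript
// Hypothetical helper: feed blocks to a writer, flush once, and join the output.
function runStream(write, blocks) {
  let chunks = blocks.map(block => write(block))
  chunks.push(write()) // flush exactly once at the end
  let total = chunks.reduce((n, chunk) => n + chunk.length, 0)
  let out = new Float32Array(total)
  let offset = 0
  for (let chunk of chunks) {
    out.set(chunk, offset)
    offset += chunk.length
  }
  return out
}

// Stand-in writer with one block of latency: the first call returns an empty array,
// later calls return the previous block, and write() returns what is still buffered.
function delayWriter() {
  let held = new Float32Array(0)
  return function write(block) {
    let out = held
    held = block ? Float32Array.from(block) : new Float32Array(0)
    return out
  }
}

let joined = runStream(delayWriter(), [Float32Array.of(1, 2), Float32Array.of(3)])
```

With the real package, the first argument would be a writer such as the one returned by `vocoder({ factor: 1.5, transients: true })`.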

One-shot buffer

import { vocoder } from 'time-stretch'

let src = audioBuffer.getChannelData(0)
let out = vocoder(new Float32Array(src), { factor: 1.25, transients: true })

Stereo / multi-channel

All algorithms process mono Float32Array. For stereo, split channels and process independently:

let L = vocoder(left,  { factor: 2, transients: true })
let R = vocoder(right, { factor: 2, transients: true })

// Streaming:
let wL = vocoder({ factor: 2, transients: true })
let wR = vocoder({ factor: 2, transients: true })
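
Audio often arrives interleaved (L, R, L, R, ...), so splitting channels is a separate step before any mono processing. The helpers below are a hypothetical sketch of that step and are not part of the package.

```javascript
// Split interleaved samples into one Float32Array per channel.
function deinterleave(interleaved, channels = 2) {
  let frames = Math.floor(interleaved.length / channels)
  let out = Array.from({ length: channels }, () => new Float32Array(frames))
  for (let i = 0; i < frames; i++)
    for (let c = 0; c < channels; c++) out[c][i] = interleaved[i * channels + c]
  return out
}

// Merge per-channel arrays back into interleaved order.
// Trims to the shortest channel in case processed lengths differ.
function interleave(chans) {
  let frames = Math.min(...chans.map(c => c.length))
  let out = new Float32Array(frames * chans.length)
  for (let i = 0; i < frames; i++)
    for (let c = 0; c < chans.length; c++) out[i * chans.length + c] = chans[c][i]
  return out
}

let [left, right] = deinterleave(Float32Array.of(1, -1, 2, -2, 3, -3))
// left and right would each go through a stretch function here
let merged = interleave([left, right])
```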

Research & comparison

Command	What it does
`node scripts/compare.js`	writes `compare.html` — interactive waveforms, playback, internal-vs-external comparisons
`node scripts/bench.js`	throughput and ×realtime numbers for batch and streaming
`node scripts/diagnose.js`	targeted diagnostics for specific algorithm behaviors

The demo provides a lightweight browser listening matrix; use scripts/compare.js for deeper analysis.

References

Verhelst, W. & Roelands, M. (1993). "An overlap-add technique based on waveform similarity (WSOLA)." ICASSP.
Laroche, J. & Dolson, M. (1999). "Improved phase vocoder time-scale modification of audio." IEEE Trans. Speech Audio Processing.
Röbel, A. (2003). "A new approach to transient processing in the phase vocoder." DAFx.
Nasca, P. (2006). "PaulStretch — extreme time stretching." paulnasca.com.
Moulines, E. & Charpentier, F. (1990). "Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones." Speech Communication, 9(5-6).
Driedger, J. & Müller, M. (2016). "A review of time-scale modification of music signals." Applied Sciences, 6(2).
Serra, X. (1989). "A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic plus Stochastic Decomposition." PhD thesis, Stanford.
McAulay, R.J. & Quatieri, T.F. (1986). "Speech analysis/synthesis based on a sinusoidal representation." IEEE Trans. ASSP, 34(4).

MIT ॐ
