Combining Files

Sample files (download into a samples/ directory inside your working directory, where the setup commands below expect them):

⬇ music.wav — “Erase Data” by Koi-discovery (CC0)

⬇ voice.wav (CC0)

Setup:

sox -n a.wav synth 3 sine 440 gain -6
sox -n b.wav synth 3 sine 660 gain -6
# normalise samples to a common format for mixing
sox samples/music.wav -c 1 -r 44100 music.wav
sox samples/voice.wav -r 44100 voice.wav

Per-input format flags

With multiple inputs, input-section format flags repeat independently for each input file — place them immediately before the file they describe:

sox [input-a] infile_a [input-b] infile_b [output] outfile [effects]

Any input flag works this way: -v, -r, -b, -c, -t, -e. The most common use is -v for per-input volume (shown below), and format flags when combining files of different types or encodings.

-v takes a linear multiplier only — there is no dB form. Common conversions: −6 dB ≈ 0.5, −12 dB ≈ 0.25, −20 dB = 0.1.
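
For dB values not in the list above, the multiplier follows from the amplitude relation 10^(dB/20). A minimal sketch using awk; db_to_linear is a hypothetical helper name, not something sox provides:

```shell
# Convert a dB value to the linear amplitude multiplier that -v expects.
# db_to_linear is a hypothetical helper for illustration, not part of sox.
db_to_linear() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", 10 ^ (db / 20) }'
}

db_to_linear -6     # prints 0.501
db_to_linear -12    # prints 0.251
db_to_linear -20    # prints 0.100

# Possible use with sox (assumes a.wav exists):
#   sox -v "$(db_to_linear -9)" a.wav quieter.wav
```

The result is rounded to three decimals, which is more precision than a volume setting normally needs.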

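# write b.wav out as headerless raw audio so there is a b.raw to read back
sox b.wav -r 48000 -b 32 -c 1 -e signed-integer b.raw
sox -v 0.8 a.wav -t raw -r 48000 -b 32 -c 1 -e signed-integer -v 0.5 b.raw out.wav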
play out.wav

Concatenation — A then B

List multiple inputs before the output:

sox a.wav b.wav combined.wav
play combined.wav

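Files must have identical sample rates and channel counts; sox hard-fails if they differ. Convert the odd one out in a separate sox run first: rate fixes a sample-rate mismatch, and channels or remix fixes a channel-count mismatch.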
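
One way to guard against a mismatch is to compare the files with soxi (installed alongside sox; -r prints the sample rate, -c the channel count) and convert the second file only when needed. This is a sketch under those assumptions; fmt_of and concat_matched are hypothetical helper names:

```shell
# Print "rate channels" for a file, e.g. "44100 1".
fmt_of() {
    printf '%s %s\n' "$(soxi -r "$1")" "$(soxi -c "$1")"
}

# Hypothetical helper: concatenate A then B into OUT, first converting B
# to A's sample rate and channel count if the two differ.
# The ( ... ) body runs in a subshell so its variables stay local.
concat_matched() (
    a=$1 b=$2 out=$3
    if [ "$(fmt_of "$a")" = "$(fmt_of "$b")" ]; then
        sox "$a" "$b" "$out"
    else
        tmp=$(mktemp -d)
        # Output format flags make sox insert the needed rate/channels effects.
        sox "$b" -r "$(soxi -r "$a")" -c "$(soxi -c "$a")" "$tmp/b.wav"
        sox "$a" "$tmp/b.wav" "$out"
        rm -rf "$tmp"
    fi
)

# Possible use:
#   concat_matched a.wav voice.wav combined.wav
```

The first file sets the target format here; that is a design choice for the sketch, not a sox rule.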

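To smooth the join, add the splice effect and give it the position where the first file ends (a.wav is 3 seconds long). The default crossfade is very short, because the excess parameter defaults to 0.005 seconds; a longer one is written as position,excess.

sox a.wav b.wav out.wav splice 3    # crossfade at the 3-second join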
play out.wav

Mixing — A over B

The -m global flag sums inputs together rather than concatenating:

play -m music.wav voice.wav

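When no -v is given, sox scales each mixed input by 1/n (n being the number of inputs) so that the sum cannot clip. The mix therefore comes out quieter than its sources; norm raises the peak back to a chosen level, here -3 dB:

play -m music.wav voice.wav norm -3

Set per-file volume with -v immediately before each input. Giving -v for any input turns off the automatic 1/n scaling, so multipliers that sum to more than 1 can clip during the mix, and norm cannot repair that afterward:

play -m -v 0.3 music.wav -v 1.0 voice.wav norm -3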
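
Mixing adds the scaled inputs sample by sample, so the mix cannot exceed full scale as long as the -v multipliers sum to at most 1. A rough sketch that rescales a set of relative weights to sum to 1; mix_gains is a hypothetical helper name and assumes positive weights:

```shell
# Hypothetical helper: scale relative weights so they sum to 1.
# The awk program has only a BEGIN block, so the arguments are read
# from ARGV as numbers and never opened as files.
mix_gains() {
    awk 'BEGIN {
        for (i = 1; i < ARGC; i++) total += ARGV[i]
        for (i = 1; i < ARGC; i++)
            printf "%s%.3f", (i > 1 ? " " : ""), ARGV[i] / total
        print ""
    }' "$@"
}

mix_gains 0.3 1.0    # prints 0.231 0.769

# Possible use (assumes music.wav and voice.wav from the setup):
#   set -- $(mix_gains 0.3 1.0)
#   play -m -v "$1" music.wav -v "$2" voice.wav
```

This keeps the balance between the inputs and trades some loudness for guaranteed headroom.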

Merging channels — A and B side by side

-M puts channels from each file side by side. Two mono files become one stereo file:

sox -M a.wav b.wav stereo.wav    # a.wav becomes the left channel, b.wav the right
play stereo.wav
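
A merged file has as many channels as its inputs combined, which is easy to confirm with soxi -c. A small sketch; merge_report is a hypothetical helper name, and the inputs are assumed to share a sample rate:

```shell
# Hypothetical helper: merge the inputs with -M into OUT (first argument)
# and print the channel count of the result.
merge_report() (
    out=$1
    shift
    sox -M "$@" "$out"
    soxi -c "$out"
)

# Possible use (assumes a.wav and b.wav from the setup, both mono):
#   merge_report stereo.wav a.wav b.wav      # two mono inputs: 2 channels
#   merge_report three.wav stereo.wav a.wav  # stereo plus mono: 3 channels
```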

remix — channel routing

Where -c uses sox’s default averaging/duplication, remix gives explicit control. Each argument describes one output channel by naming the input channel(s) that feed it.

play stereo.wav remix 2 1       # swap L and R
play stereo.wav remix -         # average all channels to mono
play stereo.wav remix 1         # keep left channel only, drop right
play stereo.wav remix 1,2 1,2   # both output channels = L+R mix

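The - argument averages all input channels into one output channel, which is equivalent to -c 1 but written as an explicit effect. A comma-separated list such as 1,2 mixes those channels into one output channel; by default remix scales each of them by 1/n (here 1/2), so the result is their average, not a raw sum.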
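
Because each remix argument names the input channel that feeds one output channel, a single argument extracts one channel. The sketch below uses that to split a file into one mono file per channel; split_channels and the output naming scheme are hypothetical:

```shell
# Hypothetical helper: write each channel of IN to its own mono file,
# named <prefix>1.wav, <prefix>2.wav, and so on.
split_channels() (
    in=$1 prefix=$2
    n=$(soxi -c "$in")          # number of channels in the input
    i=1
    while [ "$i" -le "$n" ]; do
        sox "$in" "$prefix$i.wav" remix "$i"   # keep only channel i
        i=$((i + 1))
    done
)

# Possible use (assumes stereo.wav from the merge step):
#   split_channels stereo.wav ch     # writes ch1.wav and ch2.wav
```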