How do I check if a 2-track WAV file is "really" in stereo?

linux audio duplicate wav stereo


Solution 1

This answer has now been expanded to cover three different ways of achieving this, from the simplest (no code required, just listen) to more complex examples that could be used for bulk testing.

Simplest method

Flip the phase of one side & sum the outputs to mono.
If the result is silence, then it was mono; if not, it was stereo.
Even in stereo some parts will have been panned centre - vocals, bass, a lot of the drums etc. - but you will hear an overwhelming difference between "some bits are missing" and "almost total silence".
If you just hear odd little tinny, crackly bits of the track, or just periodic fizzes, crackles & thumps, put this down to poor encoding; it's still 'mono' to all intents & purposes.

This relies on the physics of sound. In its simplest form, if you add two identical waveforms together, the result will have twice the amplitude. If you invert one, they will cancel each other out & always add up to 'zero'… silence. This principle is used in, for example, noise-cancelling headphones & background noise reduction in your phone's microphone.
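The cancellation principle can be checked numerically. The following is a minimal Python sketch, not part of the original answer; the helper names (sine, mix, invert, peak), the test frequencies and the 8 kHz sample rate are all made up for illustration.

```python
import math

def sine(freq_hz, n_samples, rate=8000):
    """Return n_samples of a unit-amplitude sine wave as floats."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n_samples)]

def mix(a, b):
    """Sum two equal-length signals sample by sample (a mono mix-down)."""
    return [x + y for x, y in zip(a, b)]

def invert(a):
    """Flip the polarity ("phase") of a signal."""
    return [-x for x in a]

def peak(a):
    """Largest absolute sample value."""
    return max(abs(x) for x in a)

tone = sine(440, 800)    # stands in for the left channel
other = sine(660, 800)   # stands in for a genuinely different right channel

print(peak(mix(tone, tone)))            # identical channels summed: twice the peak
print(peak(mix(tone, invert(tone))))    # one channel inverted: 0.0, i.e. silence
print(peak(mix(tone, invert(other))))   # different channels: clearly non-zero
```

Only the second case cancels completely, which is what the listening test relies on.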

Method
From the Audacity manual…

Effect > Invert
There is no effect dialog containing parameters for this effect; Invert operates directly on the selected audio. If the inversion takes an appreciable time, a progress dialog will appear.

Usage Examples

1. Use the Audio Track Dropdown Menu and choose Split Stereo to Mono.
2. Select one channel but not the other, apply Invert and then Play. The vocals in each track will cancel each other out, leaving just the instrumentals.

Find out how different the stereo channels are: Use the same steps 1 and 2 above on any stereo track. If the audio is just as loud after the steps as before, the channels are very different. If the result is silence, the track is not really stereo but dual mono, where both left and right contain completely identical audio.

Simple method

Load (import) the (allegedly stereo) file in Audacity. From the top bar menu select Effect, Nyquist Prompt…. Paste the following:

(diff (aref *track* 0) (aref *track* 1))

and hit OK. This will compute the difference between the two tracks.

A completely silent result means the tracks were identical.
A very quiet result, or one that is mostly noise, means the tracks were almost identical.
A result that resembles the original audio, at least for some fragment(s), means the tracks were probably different.

"Probably", because the tracks may have been identical but opposite in phase. Then diff will increase the amplitude instead of bringing it to zero, and the result will be significantly louder than the original. To rule this possibility out, go back to the original tracks (Edit, Undo Nyquist Prompt) and compute the sum instead of the diff:

(sum (aref *track* 0) (aref *track* 1))

Completely silent result means the tracks were identical but opposite in phase.

These simple tests will fail if the two tracks are similar but shifted in phase, or similar but with different volumes. A formula able to spot similarities in such cases may exist, but I'm not familiar enough with the Audacity Nyquist Prompt to help you further.
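One candidate for such a formula is the Pearson correlation coefficient between the two channels. It ignores a constant volume difference and reports inverted polarity as a negative value, but it does not cope with a time shift between the channels. The sketch below is in Python rather than Nyquist, assumes a 2-channel, 16-bit PCM WAV file, and uses hypothetical function names.

```python
import array
import math
import sys
import wave

def read_stereo_16bit(path):
    """Return (left, right) sample lists from a 2-channel, 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected a 2-channel, 16-bit PCM WAV file")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples[0::2]), list(samples[1::2])

def channel_correlation(left, right):
    """Pearson correlation of two channels, or None if one of them is constant."""
    n = len(left)
    if n == 0 or n != len(right):
        raise ValueError("channels must be non-empty and of equal length")
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    if var_l == 0 or var_r == 0:
        return None  # a silent or constant channel has no defined correlation
    return cov / math.sqrt(var_l * var_r)

# In-memory demonstration; for a file, pass the lists from read_stereo_16bit().
demo = [0, 1200, -800, 3000, -2500, 400]
print(channel_correlation(demo, [s // 2 for s in demo]))  # quieter copy: close to +1
print(channel_correlation(demo, [-s for s in demo]))      # inverted copy: close to -1
```

A value very close to +1 or -1 suggests duplicated mono (possibly attenuated or inverted). Where to put the cutoff for "genuine stereo" is a judgment call that this sketch does not settle.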

This answer took a lot from the following Audacity Forum thread: Arithmetic track mix operations.

Not so simple method

Use the following code to create a .png image from your .wav. It runs ffmpeg and convert (from ImageMagick).

#!/bin/sh

for input do

ffmpeg -nostdin -i "$input" -lavfi \
 '[0:a] channelsplit=channel_layout=stereo [left][right];
  [left] loudnorm [L];
  [right] loudnorm [R];
  [L][R] join=inputs=2:channel_layout=stereo [a];
  [a] showspectrumpic=s=800x600:mode=combined:color=channel:legend=no [out]' \
  -f apng -map '[out]' - \
| convert - -colorspace RGB -color-matrix \
' 20   0 -20   0   0   0
   0   0   0   0   0   0
   0  20 -20   0   0   0
   0   0   0   0   0   0
   0   0   0   0   0   0
   0   0   0   0   0   0
' "$input".png

done

Name it spect and make it executable (chmod +x spect). Provide one or more allegedly stereo .wav files as command-line arguments. Example:

./spect foo.wav /path/to/bar.wav

This will generate foo.wav.png and /path/to/bar.wav.png. By examining these files you will be able to tell if the input files were really in stereo.

What the script does:

(ffmpeg) It normalizes the left and right channels independently. This is in case a fake stereo file was created by duplicating mono with different amplification.
(still ffmpeg) It visualizes the spectrum as an image in which the two channels are represented by different colors. This makes the method immune to phase shifts, because it is amplitude that matters when creating a spectrum like this, not phase. The red and green components correspond to the two channels; the blue component encodes what is common to the two channels (it will be useful in a moment).
(convert) It processes the image:
- The "left" and "right" color components are reduced by the "common" component. This emphasizes fragments where the two channels differ.
- The result is amplified by a factor of 20 (you can tweak this).
- Colors are remapped from red/green to red/blue. This is only because I wanted the solution to be more colorblind-friendly.

I will analyze some example results below. From them you can learn how to tell whether stereo is genuine.

Notes:

The code assumes there are two channels. It was only tested with .wav files having two channels.
In the pictures time flows from left to right, frequency rises from bottom to top.
You may prefer not to normalize. In that case showspectrumpic is the only filter you need in ffmpeg.
I used 800x600 in this answer. Adjust the resolution to your needs.
The top half of each picture is black. I guess it spans to 48 kHz (?) while 22.05 kHz would be enough. My ffmpeg seems not to support the stop option for showspectrumpic; most likely this option would help. There are other methods to deal with this "issue" but I decided not to obfuscate the code. It's an inconvenience, not really an issue.
spect can be used with find -exec or find | xargs.
Further automatic processing is possible, ultimately to a point where the script tells you I'm X% certain it's genuine stereo, I'm Y% certain it's fake stereo. In this answer I won't go this far. Look at pictures and apply heuristics. Learn from the examples below.

Examples – song 1

This is the original .wav of song 1 processed by spect:

You can see there are columns of red, columns of blue. This is where (when) one of the channels dominates. This indicates it's genuine stereo.

Queen – Bohemian Rhapsody

The same song 1 with one channel opposite in phase looks virtually identical:

The same song 1 mixed to mono and presented as stereo (two identical channels), fake stereo:

The result is virtually all black. In theory it should be perfectly black. TBH I don't know where exactly the artifacts come from. The important thing is there is no detailed "structure" the original song had. The diff method from way above would generate silence for this one.

The same song 1 mixed to mono and presented as stereo (two identical channels), fake stereo, but with one channel opposite in phase:

This one would "fool" the diff method, you would need the sum method. spect works well regardless.

The same song 1 mixed to mono and presented as stereo, fake stereo, but with one channel reduced in volume by 10 dB:

You can see artifacts, but again the picture looks very different from that of the original song. Neither diff nor sum would generate silence.

The same song 1 mixed to mono and presented as stereo, fake stereo, but with one channel reduced in volume by 10 dB and opposite in phase:

It should now be clear opposite phase doesn't matter to spect. The rest of this answer treats this issue as solved.

For comparison: original song 1 with one channel reduced in volume by 10 dB:

Thanks to normalizing channels separately, the detailed "structure" the original song had is still visible.

The same song 1 with one channel completely silent:

The above results one next to the other. From left to right:

genuine stereo
genuine stereo, unbalanced
one channel silent
fake stereo
fake stereo, unbalanced

Notes:

If I manipulated the other channel, the blue or red artifacts might be of the other color. Details matter, not the color.
"Genuine stereo, unbalanced" is still genuine stereo. "Unbalanced" means one channel is not as loud as the other. Here I manipulated the original file to achieve this. In general it may be the original recording was like this. It does not mean somebody tampered with the file.

Examples – song 2

This is the original .wav of song 2 processed by spect:

This song does not separate channels as clearly as the first one; there are no columns of red or blue. Still, some frequencies are more red than blue. The characteristics change a few times as the song progresses. This indicates it's genuine stereo.

Counting Crows – Mr. Jones

Different results one next to the other. From left to right:

genuine stereo
genuine stereo, unbalanced
one channel silent
fake stereo
fake stereo, unbalanced

As with song 1, you can tell genuine stereo by spotting detailed "structure".

Examples – song 3

This song is in fact monophonic. The mono signal had been recorded to (I suspect) a stereo tape. It was then ripped as stereo from the tape, along with tape noise that differs between the channels.

There is no detailed "structure", just noise. This indicates the difference between the channels is basically just noise. The result from the diff method would not be silent, although for this exact .wav file the method would still work, because I could play the result and hear that it is just noise.

With unbalanced input the diff/sum method may work if you normalize first. Our spect does this automatically. For the record, this is what unbalanced song 3 processed by spect looks like:

Final notes

A long .wav "compressed" to a .png where 800 pixels cover the entire duration may look like noise. A reasonable approach is to improve spect so that it retrieves the duration beforehand and adjusts the horizontal resolution accordingly.
If your input is noise then the output from spect will be noise. You may still be able to tell something from its intensity, but since the method is based on spotting detailed "structure", it will not give you results as obvious as those for genuine stereo in our example songs 1 and 2.
Experiment. :)
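The first note above, scaling the horizontal resolution with the duration, could start from something like the following Python sketch. It is only an illustration: the function names, the 10 pixels per second and the 16000-pixel cap are arbitrary assumptions, and the result would still have to be substituted for 800 in the s=800x600 option of the script.

```python
import wave

def width_for_duration(seconds, pixels_per_second=10, min_width=800, max_width=16000):
    """Pick a picture width proportional to the duration, clamped to a sane range."""
    wanted = int(round(seconds * pixels_per_second))
    return max(min_width, min(max_width, wanted))

def width_for_wav(path, **options):
    """Read the duration of a PCM WAV file and return a suggested picture width."""
    with wave.open(path, "rb") as wav:
        seconds = wav.getnframes() / wav.getframerate()
    return width_for_duration(seconds, **options)

print(width_for_duration(30))    # short clip: stays at the 800-pixel minimum
print(width_for_duration(240))   # four minutes at 10 px/s: 2400
```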

Solution 2

An alternative and, in my opinion, easier way to calculate the difference between the left and right tracks:

Click on the track, and then "Split Stereo Track"

Click on the second track, and then "Effect/Invert"

Set the panning of both tracks to center, select everything, and click on "Tracks/Mix/Mix and Render"

The result is the difference of both tracks. If it is zero, then it's the same track on the left and right sides. In this case, it's not.

Solution 3

Here's a solution with sox, which makes more sense for this task than ffmpeg, in my opinion.

Sox has the oops effect, aka karaoke filter:

Out Of Phase Stereo effect. Mixes stereo to twin-mono where each mono channel contains the difference between the left and right stereo channels.

And if both channels are the same, the result should be all zero.
We can use sox's stat effect to check this.

We can chain both effects and have one simple command:

sox infile.wav -n oops stat

Which has this result for a "fake stereo" file, i.e. l/r channels are identical:

...
Maximum amplitude:     0.000000
Minimum amplitude:     0.000000 
...

For a file whose channels are almost identical it looks like this:

Maximum amplitude:     0.000397

In contrast, for a random song I picked:

Maximum amplitude:     0.950149
Minimum amplitude:    -1.000000
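For readers without sox, the same idea (look at the extremes of left minus right) can be sketched in Python. This is not a reimplementation of oops or stat: the function names are hypothetical, only 2-channel 16-bit PCM WAV input is assumed, and the scaling (full scale = 1.0, no extra gain) may differ from what sox reports.

```python
import array
import sys
import wave

FULL_SCALE = 32768.0  # magnitude of the most negative 16-bit sample

def read_stereo_16bit(path):
    """Return (left, right) sample lists from a 2-channel, 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected a 2-channel, 16-bit PCM WAV file")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples[0::2]), list(samples[1::2])

def difference_extremes(left, right):
    """Return (maximum, minimum) of left - right, scaled so full scale is 1.0."""
    diff = [(l - r) / FULL_SCALE for l, r in zip(left, right)]
    return max(diff), min(diff)

# In-memory demonstration; for a file, pass the lists from read_stereo_16bit().
mono = [0, 1000, -2000, 3000]
print(difference_extremes(mono, mono))                    # identical: (0.0, 0.0)
print(difference_extremes(mono, [0, 900, -2000, 3100]))   # slightly different
```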

You could go even further and compare the channels at bit level, by diffing the two channels:

# check the -b/-e params with: soxi in.wav
sox in.wav -b 16 -e signed -c 1 in.l.raw remix 1
sox in.wav -b 16 -e signed -c 1 in.r.raw remix 2
diff in.l.raw in.r.raw

Which will output

Binary files in.l.raw and in.r.raw differ

if they differ.

I'm sure you could also condense this into one line with subshells.
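As an alternative to temporary raw files, the bit-level comparison can be done in one pass over the interleaved frames. The following Python sketch assumes an uncompressed PCM WAV file with exactly two channels; the function names are made up for this example.

```python
import wave

def frames_match(frames, sample_width):
    """True if, in every 2-channel frame, the left bytes equal the right bytes."""
    frame_size = 2 * sample_width
    for start in range(0, len(frames), frame_size):
        middle = start + sample_width
        if frames[start:middle] != frames[middle:start + frame_size]:
            return False
    return True

def channels_bit_identical(path):
    """Compare the two channels of a 2-channel PCM WAV file byte for byte."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected exactly two channels")
        width = wav.getsampwidth()
        frames = wav.readframes(wav.getnframes())
    return frames_match(frames, width)

# Two 16-bit little-endian frames: (5, 5), (-7, -7) and then (5, 5), (-7, -6).
print(frames_match(b"\x05\x00\x05\x00\xf9\xff\xf9\xff", 2))  # True
print(frames_match(b"\x05\x00\x05\x00\xf9\xff\xfa\xff", 2))  # False
```

Like diff on the raw files, this only answers "perfectly duplicate or not"; a single differing bit makes it return False.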

Solution 4

You can achieve this with FFmpeg, using the pan filter to generate the delta between both audio channels, followed by the astats filter to print the overall RMS signal level.

ffmpeg -i "$INPUT_FILE" -filter:a "pan=1c|c0=c0-c1,astats=measure_perchannel=none:measure_overall=RMS_level" -f null /dev/null

Example output:

[Parsed_astats_1 @ 0x3ad1340] Channel: 1
[Parsed_astats_1 @ 0x3ad1340] Overall
[Parsed_astats_1 @ 0x3ad1340] RMS level dB: -16.015304

The lower the displayed RMS level, the smaller the difference between the two audio channels. If they are perfectly equal, the displayed value will be "-inf".
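To make the number easier to interpret, here is a rough Python sketch of the same measurement on already-decoded samples. It assumes signed 16-bit sample values and measures in dB relative to full scale; the function names are hypothetical, and relative_difference_db is an extra that compares the difference with the louder channel, so that a quiet recording is not mistaken for a mono one.

```python
import math

FULL_SCALE = 32768.0  # assumed 16-bit samples

def rms_db(samples):
    """RMS level in dB relative to full scale; -inf for pure digital silence."""
    if not samples:
        return float("-inf")
    mean_square = sum((s / FULL_SCALE) ** 2 for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square)

def difference_rms_db(left, right):
    """RMS level of left - right, the quantity the pan filter above produces."""
    return rms_db([l - r for l, r in zip(left, right)])

def relative_difference_db(left, right):
    """Difference level minus the level of the louder channel, or None if silent."""
    reference = max(rms_db(left), rms_db(right))
    if reference == float("-inf"):
        return None  # both channels are silent; nothing to compare
    return difference_rms_db(left, right) - reference

mono = [0, 1000, -2000, 3000]
print(difference_rms_db(mono, mono))                       # -inf
print(relative_difference_db(mono, [0, 900, -2000, 3100]))
```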

Solution 5

If you have installed ffmpeg and have ffprobe, use this command:

ffprobe -i file.wav -show_streams -select_streams a:0

This will give you an output where the important part is:

[STREAM]
...
channels=2
channel_layout=stereo

Note the channels=2 and channel_layout=stereo, which you can pass through grep to check.

A simpler command uses -show_entries to specify what you want, applies print formatting via -of to strip everything else as well and sets verbosity to 0 to not print the usual starting information:

ffprobe -i yourFile.mp4 -show_entries stream=channels -select_streams a:0 -of compact=p=0:nk=1 -v 0

This will return "2" for a typical stereo audio file.
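Keep in mind that this reports only what the file header declares. As a small illustration (a Python sketch using the standard wave module, with made-up helper names), a file written with two identical channels is still reported as having 2 channels:

```python
import os
import struct
import tempfile
import wave

def write_dual_mono(path, mono_samples, rate=8000):
    """Write a 16-bit WAV whose left and right channels are the same samples."""
    frames = b"".join(struct.pack("<hh", s, s) for s in mono_samples)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(frames)

def channel_count(path):
    """Return the channel count declared in the WAV header."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels()

with tempfile.TemporaryDirectory() as folder:
    fake = os.path.join(folder, "fake_stereo.wav")
    write_dual_mono(fake, [0, 1000, -1000, 500])
    declared = channel_count(fake)

print(declared)  # 2, although both channels carry identical audio
```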

source

For comparing the two channels, the post Comparing two supposedly identical tracks has this advice that uses the free Audacity:

Import them both into Audacity. Apply the "Invert" effect to one of the tracks. Select both tracks, then from the "Tracks menu > Mix and Render". If the tracks were identical, the result will be silence. To check that it is absolute silence, select the full (mix) track, and open the "Amplify" effect. If the Amplify effect says that the "New Peak Amplitude" is "-infinity", then the mix track is totally silent and the two imported files have identical audio.

The post contains much discussion that will interest you and some alternatives.


einpoklum


Updated on September 18, 2022

Comments

einpoklum almost 2 years
I have an audio file (WAV format to be specific). When I open it with an editor (e.g. audacity), I see two channels. I suspect that the recording is actually mono rather than stereo, i.e. I suspect the tracks are duplicate. What's an easy way to check whether they are...
- "perfectly" duplicate?
- "nearly" duplicate, undistinguishable to the ear?
I'm using Devuan GNU/Linux. A command-line solution would be nice, GUI is ok too.
- einpoklum over 3 years
  
  @psusi: You're suggesting playing both tracks separately? I guess that's possible (although not in my personal case due to a hearing impairment). Please make that an answer.
- psusi over 3 years
  
  Not at all. I'm saying that if both tracks sound the same, then who cares whether they are slightly different? It may as well be mono, so why not save some space and drop the second track? Though I suppose if you are hearing impaired, that changes things.
- einpoklum over 3 years
  
  "I'm saying that if both tracks sound the same," <- I don't know if they sound the same. You're suggesting I check by carefully listening to both tracks separately and comparatively.
- Tanner Swett over 3 years
  
  @psusi "I'm saying that if both tracks sound the same, then who cares whether they are slightly different?" – That's exactly why einpoklum asked the question, isn't it? They want to detect whether or not they have a file that "may as well be mono."
- Fattie over 3 years
  
  @psusi your comments are, no offense, a bit "whacky". I can immediately think of at least four completely obvious reasons why one would need to know exactly what is asked in the question. Great question! It's handy there's an Audacity solution.
- psusi over 3 years
  
  I originally understood the question in a different light not thinking that the OP was hearing impaired. Disregard my comments.
- Karl Knechtel over 3 years
  
  As someone dropping in randomly who found the question and discussion interesting: "I can immediately think of at least four completely obvious reasons" I can only think of the one that actually applies, and only because it was explicitly brought up here. Please enlighten me as to what you find obvious.
- marshal craft over 3 years
  
  I don't know audacity, but I've used the windows audio base apis for microphone and speakers, i know if you try to play one as the other, it should be noise. Also simply if you hear two speakers... It because if it stereo, it has the data for one at that time, then next channel before going in to next time sample. So you can select mono as stereo, and you would cut the sample rate in half or what ever, and not get correct sounds.
- marshal craft over 3 years
  
  And to answer, why carry duplicate data for basic stereo where both channels are same, for one, audio needs minimum amount of real time performance. Many people think audio is much easier then video, not true, there a lot of information and needs real time performance, especially for higher sample rates and sample sizes, like 8,16,32 bit audio. So it saves valuable computation
- marshal craft over 3 years
  
  As well, the way the audio is recorded can be mono but it should work for most all stereo devices. The recording will be one channel. But you can any number of channels, and any number of speakers. The audio codec could easily just double the mono channel. It's up to the users how they record and store audio.
- marshal craft over 3 years
  
  Yeah basically my experience is digital audio is tough. It struggles to do what analog can in many ways. This is just because sound occurs at high frequencies.
- alephzero over 3 years
  
  Just print out the first few kb of the file in hex with od. It will be obvious if the data is exact pairs of repeated 16-bit values. If you want to test for both channels sounding the same, just listen to them separately.
- Tetsujin over 3 years
  
  @marshalcraft - your four comments above show an astounding lack of comprehension as to how this works. To save on the noise level this QA has already raised, it would be easier on everybody if you just deleted them. [I know that may sound 'cruel' or 'unfair' but this is a very specific, very simple, audio task which has attracted an unfounded level of miscomprehension already].
Kamil Maciorowski over 3 years

@einpoklum In theory: destructive interference when you are equally distant from both speakers. But your head is not a point, in practice it's probably hard to tell. With headphones you may not hear any obvious difference at all. I tested with headphones: opposite phases sound somewhat strange in comparison to identical tracks. But non-identical tracks may sound similar. Notes: (1) I'm not a musician nor a sound expert; (2) my current headphones are dumb and work like speakers (my "good Bluetooth headphones" mentioned in this answer no longer work).
Pokechu22 over 3 years

+1, but a slightly simpler approach is to use "Split Stereo to Mono", which automatically sets the pan to center so you don't need to manually do it afterwards.
Cort Ammon over 3 years

@einpoklum When I was working with professional sound equipment, we had a button to reverse the phase of the subwoofers with respect to the rest of the sound. While we could never quite work out the physics of why it mattered, we did notice that one setting sounded better than the other, but which one sounded better was venue dependent. The human ear is an amazing signal processing device that does lots of things that can seem really strange!
Arsenal over 3 years

@einpoklum My rear speakers have two tweeters and a switch to toggle between single tweeter, dual in phase tweeter and dual counter phase tweeters. The effects are quite interesting. Dual counter phase creates a more "diffuse" sound. It feels like the room behind me got suddenly larger, which is great because I sit right in front of the wall. So there is a strange effect going on with opposite phase audio.
Austin Hemmelgarn over 3 years

@Arsenal It’s not really all that ‘strange’ if you look at it scientifically. The destructive interference from the signals being out of phase results in attenuation, and if you’re in a proper surround setup and the rear channels are quieter than the rest, you’ll usually perceive the sounds as coming from further away, which in turn makes the ‘room’ (or alternatively ‘sound stage’) sound bigger than it physically is.
Arsenal over 3 years

@AustinHemmelgarn the effect then would be the same as turning down the volume of the rear speakers. But it isn't.
Fattie over 3 years

Ah great you CAN do it in ffmpeg! thanks!
einpoklum over 3 years

What should be a reasonable cutoff, in your opinion? Or perhaps I should ask - what would I get for a proper stereo file? After all, even when you have proper stereo, the tracks still exhibit some similarity.
blerontin over 3 years

In some quick tests the RMS value of the Left-Right channel signal is around -18..-16dB. So maybe put the threshold at -30dB, or even lower? It basically is relative to the original input level. If the input file is already at -20dB the channel delta will also be much lower than in case the input file would have 0dB.
MackTuesday over 3 years

@einpoklum - In addition to the other answers here, it will mess with the stereo image.
TooTea over 3 years

@KamilMaciorowski Exactly, "destructive interference when you are equally distant from both speakers". And what "equally distant" means depends on the wavelength. Your head is very much a point for anything below say 200 Hz (wavelengths of metres or more), so flipping the phase of one track leads to very dull and washed-out bass. This is why speaker terminals and cables are always color-coded, because connecting one speaker the wrong way round makes everything sound like a laptop.
Mark over 3 years

Won't it also show channels=2 and channel_layout=stereo for two-channel mono being passed off as stereo?
Peter Cordes over 3 years

To check for the possibility of opposite-phase, zoom in the time resolution so you can visually see the actual peaks and troughs of the two tracks. It will be visually obvious if they're in phase or inverted.
Peter Cordes over 3 years

@einpoklum: If the file has been through lossy compression, that could have introduced differences that are just compression artifacts, if the compressor didn't notice and fully exploit the redundancy in the first place. (e.g. by using a joint-stereo mode to compress sum or average and difference, instead of each channel separately.) So you should generally listen to it to get some kind of idea what the difference sounds like.
einpoklum over 3 years

@PeterCordes: If I know the file's provenance, I would probably know whether it's real stereo or not. But wav is barely compressed...
harrymc over 3 years

@Mark: It's not an absolute measure, but it works for non-fake stereos.
ComicSansMS over 3 years

At least on my version of Audacity (3.0, Windows) the Nyquist Prompt command is located under the Tools menu, not under Effect.