Plot spectrogram from mp3

python audio signal-processing

11,281

Solution 1

I'd install the Debian/Ubuntu package libav-tools and call avconv to decode the mp3 to a temporary wav file:

Edit: Your other question was closed, so I'll expand my answer here with a simple bandpass filtering example. In the file you linked, most of the birdsong appears to be concentrated between 4 kHz and 5.5 kHz.

```
import os
from subprocess import check_call
from tempfile import mktemp
from scikits.audiolab import wavread, play
from scipy.signal import remez, lfilter
from pylab import *

# convert mp3, read wav
mp3filename = 'XC124158.mp3'
wname = mktemp('.wav')
check_call(['avconv', '-i', mp3filename, wname])
sig, fs, enc = wavread(wname)
os.unlink(wname)

# bandpass filter
bands = array([0,3500,4000,5500,6000,fs/2.0]) / fs
desired = [0, 1, 0]
b = remez(513, bands, desired)
sig_filt = lfilter(b, 1, sig)
sig_filt /= 1.05 * max(abs(sig_filt)) # normalize

subplot(211)
specgram(sig, Fs=fs, NFFT=1024, noverlap=0)
axis('tight'); axis(ymax=8000)
title('Original')
subplot(212)
specgram(sig_filt, Fs=fs, NFFT=1024, noverlap=0)
axis('tight'); axis(ymax=8000)
title('Filtered')
show()

play(sig_filt, fs)
```
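The decode-to-a-temporary-wav step can also be wrapped so the temporary file is always cleaned up. The following is only a sketch under assumptions: the helper name `decoded_wav`, the default decoder `ffmpeg` (for systems where `avconv` is not installed), and the injectable `run` argument are illustrative choices, not part of the original answer. It uses `tempfile.mkstemp` because `mktemp` only returns a name and leaves a window in which another process could create the file first.

```python
import os
import subprocess
import tempfile
from contextlib import contextmanager


@contextmanager
def decoded_wav(mp3_path, decoder="ffmpeg", run=subprocess.check_call):
    """Yield the path of a temporary wav decoded from mp3_path, then delete it.

    `decoder` and `run` are hypothetical parameters for this sketch; `run`
    defaults to subprocess.check_call and can be replaced for testing.
    """
    # mkstemp creates the file itself, so the name cannot be taken by someone else
    fd, wav_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    try:
        # -y allows the decoder to overwrite the empty file mkstemp created
        run([decoder, "-y", "-i", mp3_path, wav_path])
        yield wav_path
    finally:
        # remove the temporary file even if decoding or reading fails
        os.unlink(wav_path)
```

Inside `with decoded_wav('XC124158.mp3') as wname:` the wav file can be read the same way as in the code above, and it is removed when the block exits.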

Bird Song Spectrograms
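As a rough check of what such a band-pass filter does, here is a minimal sketch that designs the same 4 kHz to 5.5 kHz filter and applies it to two synthetic tones. It assumes a SciPy version in which `remez` accepts the `fs` keyword, so the band edges can be given directly in Hz. The helper names `bandpass_fir` and `rms`, the 44.1 kHz sample rate, and the test tones are assumptions for illustration only.

```python
import numpy as np
from scipy.signal import remez, lfilter


def bandpass_fir(fs, low=4000.0, high=5500.0, width=500.0, numtaps=513):
    """Equiripple FIR band-pass with `width` Hz transition bands on each side."""
    bands = [0, low - width, low, high, high + width, fs / 2.0]
    desired = [0, 1, 0]  # stop, pass, stop
    return remez(numtaps, bands, desired, fs=fs)


def rms(x):
    return np.sqrt(np.mean(np.square(x)))


fs = 44100  # assumed sample rate for this synthetic test
t = np.arange(fs) / fs  # one second of samples
in_band = np.sin(2 * np.pi * 4500 * t)   # inside the pass band
out_band = np.sin(2 * np.pi * 1000 * t)  # inside the lower stop band

b = bandpass_fir(fs)

# Skip the first len(b) samples so the filter's start-up transient is excluded
filtered_in = lfilter(b, 1, in_band)[len(b):]
filtered_out = lfilter(b, 1, out_band)[len(b):]

gain_in_db = 20 * np.log10(rms(filtered_in) / rms(in_band))
gain_out_db = 20 * np.log10(rms(filtered_out) / rms(out_band))
print("pass-band gain (dB):", gain_in_db)
print("stop-band gain (dB):", gain_out_db)
```

The pass-band tone should come through at close to 0 dB while the 1 kHz tone is strongly attenuated. Anything that falls inside the pass band, including noise, is passed along with the birdsong.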

Solution 2

Another very simple way to plot the spectrogram of an mp3 file:

```
from pydub import AudioSegment
import matplotlib.pyplot as plt
from scipy.io import wavfile
from tempfile import mktemp

mp3_audio = AudioSegment.from_file('speech.mp3', format="mp3")  # read mp3
wname = mktemp('.wav')  # use temporary file
mp3_audio.export(wname, format="wav")  # convert to wav
FS, data = wavfile.read(wname)  # read wav file
plt.specgram(data, Fs=FS, NFFT=128, noverlap=0)  # plot
plt.show()
```

This uses the pydub library, which is more convenient than calling external commands. This way you can iterate over all your .mp3 files without having to convert them to .wav before plotting.
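One caveat worth sketching: for a stereo file, `wavfile.read` returns a 2-D array of shape (samples, channels), while `specgram` is documented as taking 1-D input, so a stereo file may need to be mixed down first. The helper name `to_mono` and the synthetic stereo tone below are assumptions for illustration. The sketch uses `scipy.signal.spectrogram` instead of plotting, so that the result can be inspected numerically.

```python
import numpy as np
from scipy.signal import spectrogram


def to_mono(data):
    """Average the channels of a (samples, channels) array; return 1-D data unchanged."""
    data = np.asarray(data, dtype=np.float64)
    return data.mean(axis=1) if data.ndim == 2 else data


# Synthetic stand-in for what wavfile.read returns for a stereo int16 file
fs = 44100
t = np.arange(fs) / fs
tone = (0.5 * 32767 * np.sin(2 * np.pi * 4500 * t)).astype(np.int16)
stereo = np.column_stack([tone, tone])  # shape (44100, 2)

mono = to_mono(stereo)

# Same idea as specgram: short-time spectra over non-overlapping windows
freqs, times, power = spectrogram(mono, fs=fs, nperseg=1024, noverlap=0)

# Frequency bin with the most power, averaged over time
peak_hz = freqs[power.mean(axis=1).argmax()]
print("strongest frequency bin (Hz):", peak_hz)
```

With the code above, this would amount to calling `plt.specgram(to_mono(data), Fs=FS, NFFT=128, noverlap=0)`.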

Author by

Majid

Updated on June 06, 2022

Comments

Majid almost 2 years
I am trying to plot a spectrogram straight from an mp3 file in Python 2.7.3 (using Ubuntu). I can do it from a wav file as follows.
```
#!/usr/bin/python
from scikits.audiolab import wavread
from pylab import *

signal, fs, enc = wavread('XC124158.wav')
specgram(signal)
show()
```
What's the cleanest way to do the same thing from an mp3 file instead of a wav? I don't want to convert all the mp3 files to wav if I can avoid it.
Majid about 11 years

Thanks. That works, but it gives a slightly different spectrogram. The original mp3 is at xeno-canto.org/download.php?XC=124158 . The main difference, apart from the x-axis being labelled differently, is that the version using the original mp3 and your code includes a blank period at the end and also at the top of the image. I made the wav version just by doing lame --decode XC124158.mp3.
Majid about 11 years

I just saw that you might know about stackoverflow.com/questions/15309155/… too. It would be great if you had any views on that too.
Eryk Sun about 11 years

I simplified it to just use mktemp (it turns out avconv can't write a proper wav header in a pipe) and added more bins to the FFT, with no overlap. axis('tight') gets rid of the blank sections. You might want to also use axis(ymax=8000) since most of the power is below 8 kHz.
Eryk Sun about 11 years

You're welcome. The filter isn't a perfect solution, but it delivers pretty decently for how simple it is, IMO. Typical audio isn't spectrally stationary, so normally it requires adaptive filtering.
Majid about 11 years

@erykson Thanks. Now I have the small task of trying to find the bird sounds in amongst all the noise. Do you have any idea where the buzzing/crackling noise comes from in the version that your python code plays?
Eryk Sun about 11 years

That noise is in the pass band and gets amplified with the bird song. A simple bandpass filter can't block that.
jbuddy_13 almost 4 years

Should be the accepted answer, most straightforward and can be done directly from python in a single script.