Plot spectrogram from mp3
Solution 1
I'd install the Debian/Ubuntu package libav-tools and call avconv
to decode the mp3 to a temporary wav file:
Edit: Your other question was closed, so I'll expand my answer here a bit with a simple bandpass filtering example. In the file you linked it looks like most of the birdsong is concentrated in 4 kHz - 5.5 kHz.
import os
from subprocess import check_call
from tempfile import mktemp
from scikits.audiolab import wavread, play
from scipy.signal import remez, lfilter
from pylab import *
# convert mp3, read wav
mp3filename = 'XC124158.mp3'
wname = mktemp('.wav')
check_call(['avconv', '-i', mp3filename, wname])
sig, fs, enc = wavread(wname)
os.unlink(wname)
# bandpass filter
bands = array([0,3500,4000,5500,6000,fs/2.0]) / fs
desired = [0, 1, 0]
b = remez(513, bands, desired)
sig_filt = lfilter(b, 1, sig)
sig_filt /= 1.05 * max(abs(sig_filt)) # normalize
subplot(211)
specgram(sig, Fs=fs, NFFT=1024, noverlap=0)
axis('tight'); axis(ymax=8000)
title('Original')
subplot(212)
specgram(sig_filt, Fs=fs, NFFT=1024, noverlap=0)
axis('tight'); axis(ymax=8000)
title('Filtered')
show()
play(sig_filt, fs)
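To check what the bandpass filter above does without needing the recording, here is a minimal sketch that applies the same remez design to a synthetic two-tone signal. The 44.1 kHz sample rate, the two test tones, and the helper names `design_bandpass` and `tone_amplitude` are assumptions for illustration only; they are not part of the original answer.

```python
import numpy as np
from scipy.signal import remez, lfilter


def design_bandpass(fs, numtaps=513):
    """Equiripple FIR bandpass with a 4 kHz - 5.5 kHz passband."""
    # Band edges in Hz, divided by fs so they fall in [0, 0.5],
    # which is the scale remez expects by default.
    bands = np.array([0, 3500, 4000, 5500, 6000, fs / 2.0]) / fs
    desired = [0, 1, 0]  # stop band, pass band, stop band
    return remez(numtaps, bands, desired)


def tone_amplitude(x, freq, fs):
    """Estimate the amplitude of a sinusoid at freq by projecting x
    onto a complex exponential of the same frequency."""
    t = np.arange(len(x)) / fs
    return 2.0 * abs(np.mean(x * np.exp(-2j * np.pi * freq * t)))


fs = 44100  # assumed sample rate
t = np.arange(fs) / fs  # one second of samples
# One tone below the pass band (1 kHz) and one inside it (4.5 kHz).
sig = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 4500 * t)

b = design_bandpass(fs)
# Drop the first len(b) output samples, where the filter is still filling up.
filtered = lfilter(b, 1, sig)[len(b):]

print("4.5 kHz tone amplitude:", tone_amplitude(filtered, 4500, fs))
print("1 kHz tone amplitude:  ", tone_amplitude(filtered, 1000, fs))
```

The in-band tone should come through at close to its original amplitude, while the 1 kHz tone should be strongly attenuated. Anything that falls inside 4 kHz - 5.5 kHz passes through unchanged, noise included.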
Solution 2
Another very simple way to plot the spectrogram of an mp3 file:
from pydub import AudioSegment
import matplotlib.pyplot as plt
from scipy.io import wavfile
from tempfile import mktemp
mp3_audio = AudioSegment.from_file('speech.mp3', format="mp3") # read mp3
wname = mktemp('.wav') # use temporary file
mp3_audio.export(wname, format="wav") # convert to wav
FS, data = wavfile.read(wname) # read wav file
plt.specgram(data, Fs=FS, NFFT=128, noverlap=0) # plot
plt.show()
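The NFFT and noverlap arguments set the trade-off between frequency and time detail in the plot. As a rough sketch of the arithmetic, assuming a 44.1 kHz sample rate and no zero padding (the helper name `specgram_resolution` is made up for this example):

```python
def specgram_resolution(fs, nfft, noverlap=0):
    """Return (Hz per frequency bin, seconds between columns)."""
    if not 0 <= noverlap < nfft:
        raise ValueError("noverlap must be in [0, nfft)")
    freq_step = fs / nfft               # spacing between adjacent frequency bins
    time_step = (nfft - noverlap) / fs  # hop between successive FFT windows
    return freq_step, time_step


fs = 44100  # assumed sample rate; in practice use the value read from the file
for nfft in (128, 1024):
    df, dt = specgram_resolution(fs, nfft)
    print(f"NFFT={nfft}: {df:.1f} Hz per bin, {dt * 1000:.1f} ms per column")
```

A larger NFFT gives finer frequency bins but coarser time steps, so the better choice depends on whether pitch detail or timing detail matters more for the recording.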
This approach uses the pydub library, which is more convenient than calling external commands yourself (pydub still needs ffmpeg or libav installed to decode mp3). This way you can iterate over all your .mp3 files without having to convert them to .wav by hand prior to plotting.
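For iterating over many files, one possible pattern is to keep the intermediate .wav files in a temporary directory that cleans itself up, rather than using mktemp, which the Python documentation marks as deprecated and unsafe. The sketch below uses only the standard library; the function name `wav_info_for_each` and the `convert` callback are hypothetical, and the callback is where the pydub conversion (or any other decoder) would go.

```python
import tempfile
import wave
from pathlib import Path


def wav_info_for_each(mp3_paths, convert):
    """Convert each input to a temporary WAV and collect basic info.

    convert(src, dst) is a caller-supplied function (hypothetical here)
    that must write a WAV file at the path dst.
    Returns a list of (file name, sample rate, frame count) tuples.
    """
    results = []
    # The directory and every file in it are deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmpdir:
        for src in map(Path, mp3_paths):
            dst = Path(tmpdir) / (src.stem + ".wav")
            convert(str(src), str(dst))
            with wave.open(str(dst), "rb") as wf:
                results.append((src.name, wf.getframerate(), wf.getnframes()))
    return results
```

With pydub, `convert` could be a small function that calls `AudioSegment.from_file(src, format="mp3").export(dst, format="wav")`, and the body of the loop could plot each file with `plt.specgram` instead of just collecting the sample rate and frame count.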
Majid
Updated on June 06, 2022
Comments
-
Majid almost 2 years
I am trying to plot a spectrogram straight from an mp3 file in Python 2.7.3 (using Ubuntu). I can do it from a wav file as follows.
#!/usr/bin/python
from scikits.audiolab import wavread
from pylab import *
signal, fs, enc = wavread('XC124158.wav')
specgram(signal)
show()
What's the cleanest way to do the same thing from an mp3 file instead of a wav? I don't want to convert all the mp3 files to wav if I can avoid it.
-
Majid about 11 years
Thanks. That works, but it gives a slightly different spectrogram. The original mp3 is at xeno-canto.org/download.php?XC=124158. The main difference, apart from the x-axis being labelled differently, is that the version using the original mp3 and your code includes a blank period at the end and also at the top of the image. I made the wav version just by doing lame --decode XC124158.mp3.
-
Majid about 11 years
I just saw that you might know about stackoverflow.com/questions/15309155/… too. It would be great if you had any views on that too.
-
Eryk Sun about 11 years
I simplified it to just use mktemp (it turns out avconv can't write a proper wav header in a pipe) and added more bins to the FFT, with no overlap. axis('tight') gets rid of the blank sections. You might also want to use axis(ymax=8000) since most of the power is below 8 kHz.
-
Eryk Sun about 11 years
You're welcome. The filter isn't a perfect solution, but it delivers pretty decently for how simple it is, IMO. Typical audio isn't spectrally stationary, so normally it requires adaptive filtering.
-
Majid about 11 years
@erykson Thanks. Now I have the small task of trying to find the bird sounds in amongst all the noise. Do you have any idea where the buzzing/crackling noise comes from in the version that your python code plays?
-
Eryk Sun about 11 years
That noise is in the pass band and gets amplified with the bird song. A simple bandpass filter can't block that.
-
jbuddy_13 almost 4 years
Should be the accepted answer: it is the most straightforward and can be done directly from Python in a single script.