Frequency detection from a sound file


Solution 1

I'm not sure this is what you want, but if you just want the FFT:

import numpy as np
import scipy.io.wavfile
fs, x = scipy.io.wavfile.read(filename)  # fs: sample rate, x: samples (mono file assumed)
X = np.fft.fft(x)

If you want the magnitude response:

import numpy as np
import matplotlib.pyplot as plt
Xdb = 20*np.log10(np.absolute(X))
f = np.linspace(0, fs, len(Xdb), endpoint=False)  # bin k sits at k*fs/len(X) Hz
plt.plot(f, Xdb)
plt.show()
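
Note that the FFT returns an array with one complex value per frequency bin, not a single frequency. As a rough sketch (the helper name dominant_frequency and the synthetic test tone are assumptions for illustration, not part of the original answer), the strongest bin can be mapped back to Hz with numpy.fft.rfftfreq:

```python
import numpy as np

def dominant_frequency(x, fs):
    """Return the frequency in Hz of the strongest non-DC bin of a mono signal."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x))            # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)  # bin k corresponds to k * fs / len(x) Hz
    k = np.argmax(spectrum[1:]) + 1              # skip bin 0 (the DC offset)
    return freqs[k]

# Hypothetical stand-in for a real recording: one second of a 440 Hz tone.
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
print(dominant_frequency(tone, fs))
```

This gives one value for the whole file, so it only makes sense when the sound is a single steady tone.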

Solution 2

I think what you need is a Short-Time Fourier Transform (STFT). Basically, you take FFTs of short, partially overlapping segments of the signal, which gives you one spectrum for each point in time. Then you find the peak in each spectrum. I haven't done this myself, but I've looked into it in the past and this is definitely the way to go.

There's some Python code to do a STFT here and here.
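
For illustration, here is a minimal NumPy-only sketch of that idea. It is not the code behind the links above; the function name stft_peaks, the frame length, the hop size, the Hann window and the synthetic test signal are all assumptions chosen for the example:

```python
import numpy as np

def stft_peaks(x, fs, frame_len=1024, hop=256):
    """Return (times, peak_freqs, peak_db): the strongest frequency in each frame."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    times, peak_freqs, peak_db = [], [], []
    # Slide a window over the signal; consecutive frames overlap by frame_len - hop samples.
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        k = np.argmax(power[1:]) + 1                     # skip the DC bin
        times.append((start + frame_len / 2) / fs)       # frame centre in seconds
        peak_freqs.append(freqs[k])
        peak_db.append(10 * np.log10(power[k] + 1e-12))  # small offset avoids log(0)
    return np.array(times), np.array(peak_freqs), np.array(peak_db)

# Hypothetical test signal: 0.2 s at 1000 Hz followed by 0.2 s at 2000 Hz.
fs = 16000
t = np.arange(int(0.2 * fs)) / fs
x = np.concatenate([np.sin(2 * np.pi * 1000 * t), np.sin(2 * np.pi * 2000 * t)])
times, peak_freqs, peak_db = stft_peaks(x, fs)
for tm, f in zip(times[::5], peak_freqs[::5]):
    print("%.3f s: %.1f Hz" % (tm, f))
```

Keeping only the peak discards harmonics. If those matter, the whole power array for each frame could be stored instead of just its maximum.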

Author: Mieke Zwart

Updated on July 24, 2022

Comments

  • Mieke Zwart, almost 2 years ago

    What I am trying to achieve is the following: I need the frequency values of a sound file (.wav) for analysis. I know a lot of programs will give a visual graph (spectrogram) of the values, but I need the raw data. I know this can be done with an FFT and should be fairly easy to script in Python, but I'm not sure exactly how to do it. Say a signal in a file is 0.4 s long: I would like multiple measurements, output as an array, giving for each time point the program measures the frequency it found (and possibly the power in dB too). The complication is that I want to analyse bird songs, which often have harmonics or a signal spread over a range of frequencies (e.g. 1000-2000 Hz). I would like the program to output this information as well, since it is important for the analysis I want to do with the data :)

    Now there is a piece of code that looks very much like what I wanted, but I think it does not give me all the values I want (thanks to Justin Peel for posting this to a different question :)). I gather that I need numpy and pyaudio, but unfortunately I am not familiar with Python, so I am hoping that a Python expert can help me with this.

    Source Code:

    # Read in a WAV and find the freq's
    import pyaudio
    import wave
    import numpy as np
    
    chunk = 2048
    
    # open up a wave
    wf = wave.open('test-tones/440hz.wav', 'rb')
    swidth = wf.getsampwidth()
    RATE = wf.getframerate()
    # use a Blackman window
    window = np.blackman(chunk)
    # open stream
    p = pyaudio.PyAudio()
    stream = p.open(format =
                    p.get_format_from_width(wf.getsampwidth()),
                    channels = wf.getnchannels(),
                    rate = RATE,
                    output = True)
    
    # read some data
    data = wf.readframes(chunk)
    # play stream and find the frequency of each chunk
    while len(data) == chunk*swidth:
        # write data out to the audio stream
        stream.write(data)
        # unpack the 16-bit samples and multiply by the Blackman window
        indata = np.frombuffer(data, dtype='<i2') * window
        # Take the fft and square each value
        fftData = abs(np.fft.rfft(indata))**2
        # find the maximum (skipping the DC bin)
        which = fftData[1:].argmax() + 1
        # use quadratic interpolation around the max
        if which != len(fftData)-1:
            y0, y1, y2 = np.log(fftData[which-1:which+2])
            x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
            # find the frequency and output it
            thefreq = (which+x1)*RATE/chunk
            print("The freq is %f Hz." % (thefreq))
        else:
            thefreq = which*RATE/chunk
            print("The freq is %f Hz." % (thefreq))
        # read some more data
        data = wf.readframes(chunk)
    if data:
        stream.write(data)
    stream.close()
    p.terminate()
    
  • Mieke Zwart, over 13 years ago
    Thanks! The second link definitely looks like what I need. I'll try this out!
  • Mieke Zwart, over 13 years ago
    I got this to work, but only on mono sound files. Stereo seems to be a problem.
  • optimus prime, almost 8 years ago
    Printing X gives this output: [-1.15917969+0.j -0.06542969+0.j -0.06542969+0.j ..., -0.06542969+0.j -0.06542969+0.j -0.06542969+0.j]. But I should get only one frequency, right? Where is the frequency?