Estimate formants using LPC in Python

12,302

Solution 1

The problem had to do with the order being passed to the lpc function. 2 + fs / 1000 where fs is the sampling frequency is the rule of thumb according to:

http://www.phon.ucl.ac.uk/courses/spsci/matlab/lect10.html

Solution 2

I have not been able to get the results you expect, but I do notice two things which might cause some differences:

  1. Your code uses [1, -0.63] where the MATLAB code from the link you provided has [1 0.63].
  2. Your processing is being applied to the entire x vector at once instead of smaller segments of it (see where the MATLAB code does this: x = mtlb(I0:Iend); ).

Hope that helps.

Solution 3

There are at least two problems:

  • According to the link, the "pre-emphasis filter is a highpass all-pole (AR(1)) filter". The signs of the coefficients given there are correct: [1, 0.63]. If you use [1, -0.63], you get a lowpass filter.

  • You have the first two arguments to scipy.signal.lfilter reversed.

So, try changing this:

x1 = lfilter([1., -0.63], 1, x1)

to this:

x1 = lfilter([1.], [1., 0.63], x1)

I haven't tried running your code yet, so I don't know if those are the only problems.

Share:
12,302
pcaisse
Author by

pcaisse

Currently writing a web app using Django framework

Updated on June 14, 2022

Comments

  • pcaisse
    pcaisse almost 2 years

    I'm new to signal processing (and numpy, scipy, and matlab for that matter). I'm trying to estimate vowel formants with LPC in Python by adapting this matlab code:

    http://www.mathworks.com/help/signal/ug/formant-estimation-with-lpc-coefficients.html

    Here is my code so far:

    #!/usr/bin/env python
    import sys
    import numpy
    import wave
    import math
    from scipy.signal import lfilter, hamming
    from scikits.talkbox import lpc
    
    """
    Estimate formants using LPC.
    """
    
    def get_formants(file_path):
    
        # Read from file.
        spf = wave.open(file_path, 'r') # http://www.linguistics.ucla.edu/people/hayes/103/Charts/VChart/ae.wav
    
        # Get file as numpy array.
        x = spf.readframes(-1)
        x = numpy.fromstring(x, 'Int16')
    
        # Get Hamming window.
        N = len(x)
        w = numpy.hamming(N)
    
        # Apply window and high pass filter.
        x1 = x * w
        x1 = lfilter([1., -0.63], 1, x1)
    
        # Get LPC.
        A, e, k = lpc(x1, 8)
    
        # Get roots.
        rts = numpy.roots(A)
        rts = [r for r in rts if numpy.imag(r) >= 0]
    
        # Get angles.
        angz = numpy.arctan2(numpy.imag(rts), numpy.real(rts))
    
        # Get frequencies.
        Fs = spf.getframerate()
        frqs = sorted(angz * (Fs / (2 * math.pi)))
    
        return frqs
    
    print get_formants(sys.argv[1])
    

    Using this file as input, my script returns this list:

    [682.18960189917243, 1886.3054773107765, 3518.8326108511073, 6524.8112723782951]
    

    I didn't even get to the last steps where they filter the frequencies by bandwidth because the frequencies in the list aren't right. According to Praat, I should get something like this (this is the formant listing for the middle of the vowel):

    Time_s     F1_Hz        F2_Hz         F3_Hz         F4_Hz
    0.164969   731.914588   1737.980346   2115.510104   3191.775838 
    

    What am I doing wrong?

    Thanks very much

    UPDATE:

    I changed this

    x1 = lfilter([1., -0.63], 1, x1)

    to

    x1 = lfilter([1], [1., 0.63], x1)

    as per Warren Weckesser's suggestion and am now getting

    [631.44354635609318, 1815.8629524985781, 3421.8288991389031, 6667.5030877036006]

    I feel like I'm missing something since F3 is very off.

    UPDATE 2:

    I realized that the order being passed to scikits.talkbox.lpc was off due to a difference in sampling frequency. Changed it to:

    Fs = spf.getframerate()
    ncoeff = 2 + Fs / 1000
    A, e, k = lpc(x1, ncoeff)
    

    Now I'm getting:

    [257.86573127888488, 774.59006835496086, 1769.4624576002402, 2386.7093679399809, 3282.387975973973, 4413.0428174593926, 6060.8150432549655, 6503.3090645887842, 7266.5069407315023]

    Much closer to Praat's estimation!

  • pcaisse
    pcaisse over 9 years
    1) I got the pre-emphasis filter code from here: pydoc.net/Python/scikits.talkbox/0.2.5/… I've tried with [1., 0.63] and [1., -0.63] and neither gave me the result I was hoping for. 2) I tried setting x to the middle third of the signal but it made very little difference in the frequencies returned.
  • pcaisse
    pcaisse over 9 years
    I changed that line to x1 = lfilter([1], [1., 0.63], x1) (first argument must be a array-like). That got me these results: [631.44354635609318, 1815.8629524985781, 3421.8288991389031, 6667.5030877036006]. I still feel like I'm missing something since that third formant is very off.
  • Warren Weckesser
    Warren Weckesser over 9 years
    I fixed the missing brackets--thanks. Are you sure that lpc from scikits.talkbox gives the same result as matlab's lpc? Do you have matlab so you can compare the results?