From Amplitude or FFT to dB

15,218

A decibel meter measures a signal's mean power. So from your time signal recording you can calculate the mean signal power with:

chunk_size = 44100
num_chunk  = len(signal) // chunk_size
sn = []
for chunk in range(0, num_chunk):
  sn.append(np.mean(signal[chunk*chunk_size:(chunk+1)*chunk_size]**2))

Then the corresponding mean signal power in decibels is simply given by:

logsn = 10*np.log10(sn)

A equivalent relationship could also be obtained for a frequency domain signal with the use of Parseval's theorem, but in your case would require unecessary FFT computations (this relationship is mostly useful when you already have to compute the FFT for other purposes).

Note however that depending on what you compare there may be some (hopefully small) discrepancies. For example the use of non-linear amplifier and speakers would affect the relationship. Similarly ambient noises would add to the measured power by the decibel meter.

Share:
15,218
Selim Turkoglu
Author by

Selim Turkoglu

Updated on June 15, 2022

Comments

  • Selim Turkoglu
    Selim Turkoglu almost 2 years

    I've a Python code which performs FFT on a wav file and plot the amplitude vs time / amplitude vs freq graphs. I want to calculate dB from these graphs (they are long arrays). I do not want to calculate exact dBA, I just want to see a linear relationship after my calculations. I've dB meter, I will compare it. Here is my code:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    
    from __future__ import print_function
    import scipy.io.wavfile as wavfile
    import scipy
    import scipy.fftpack
    import numpy as np
    from matplotlib import pyplot as plt
    
    fs_rate, signal = wavfile.read("output.wav")
    print ("Frequency sampling", fs_rate)
    l_audio = len(signal.shape)
    print ("Channels", l_audio)
    if l_audio == 2:
        signal = signal.sum(axis=1) / 2
    N = signal.shape[0]
    print ("Complete Samplings N", N)
    secs = N / float(fs_rate)
    print ("secs", secs)
    Ts = 1.0/fs_rate # sampling interval in time
    print ("Timestep between samples Ts", Ts)
    t = scipy.arange(0, secs, Ts) # time vector as scipy arange field / numpy.ndarray
    FFT = abs(scipy.fft(signal))
    FFT_side = FFT[range(N//4)] # one side FFT range
    freqs = scipy.fftpack.fftfreq(signal.size, t[1]-t[0])
    fft_freqs = np.array(freqs)
    freqs_side = freqs[range(N//4)] # one side frequency range
    fft_freqs_side = np.array(freqs_side)
    
    makespositive = signal[44100:]*(-1)
    logal = np.log10(makespositive)
    
    sn1 = np.mean(logal[1:44100])
    sn2 = np.mean(logal[44100:88200])
    sn3 = np.mean(logal[88200:132300])
    sn4 = np.mean(logal[132300:176400])
    
    print(sn1)
    print(sn2)
    print(sn3)
    print(sn4)
    
    abs(FFT_side)
    for a in range(500):
        FFT_side[a] = 0
    
    plt.subplot(311)
    p1 = plt.plot(t[44100:], signal[44100:], "g") # plotting the signal
    plt.xlabel('Time')
    plt.ylabel('Amplitude')
    
    plt.subplot(312)
    p1 = plt.plot(t[44100:], logal, "r") # plotting the signal
    plt.xlabel('Time')
    plt.ylabel('Amplitude')
    
    plt.subplot(313)
    p3 = plt.plot(freqs_side, abs(FFT_side), "b") # plotting the positive fft spectrum
    plt.xlabel('Frequency (Hz)')
    plt.ylabel('Count single-sided')
    plt.show()
    

    First plot is amplitude vs time, second one is logarithm of previous graph and the last one is FFT. In sn1,sn2 part I tried to calculate dB from signal. First I took log and then calculated mean value for each second. It did not give me a clear relationship. I also tried this and did not worked.

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.io.wavfile as wf
    
    fs, signal = wf.read('output.wav')  # Load the file
    ref = 32768  # 0 dBFS is 32678 with an int16 signal
    
    N = 8192
    win = np.hamming(N)                                                       
    x = signal[0:N] * win                             # Take a slice and multiply by a window
    sp = np.fft.rfft(x)                               # Calculate real FFT
    s_mag = np.abs(sp) * 2 / np.sum(win)              # Scale the magnitude of FFT by window and factor of 2,
                                                      # because we are using half of FFT spectrum
    s_dbfs = 20 * np.log10(s_mag / ref)               # Convert to dBFS
    freq = np.arange((N / 2) + 1) / (float(N) / fs)   # Frequency axis
    plt.plot(freq, s_dbfs)
    plt.grid(True)
    

    So which steps should I perform? (Sum/mean all freq amplitudes then take log or reverse, or perform it for signal etc.)

    import numpy as np
    import matplotlib.pyplot as plt
    import scipy.io.wavfile as wf
    
    fs, signal = wf.read('db1.wav') 
    
    signal2 = signal[44100:]
    chunk_size = 44100
    num_chunk  = len(signal2) // chunk_size
    sn = []
    for chunk in range(0, num_chunk):
        sn.append(np.mean(signal2[chunk*chunk_size:(chunk+1)*chunk_size].astype(float)**2))
    
    print(sn)
    
    logsn = 20*np.log10(sn)
    
    print(logsn)
    

    Output:

    [4.6057844427695475e+17, 5.0025315250895744e+17, 5.028593412665193e+17, 4.910948397471887e+17]
    [353.26607217 353.98379668 354.02893044 353.82330741]
    

    revised plot

  • Selim Turkoglu
    Selim Turkoglu over 5 years
    I added the plot of first graph. According to your code the first and last part of the sn is positive, middle ones are negative: [16105357.619081633, -3698520.4044557824, -4771604.623208617, 48875571.17558957]. First and last second is silence (around 33 dB), between these there are vacuum cleaner (around 55 dB). I do not understand why first and last value is not same or close. Also because of negativity, I faced with error: [16105357.619081633, -3698520.4044557824, -4771604.623208617, 48875571.17558957]
  • SleuthEye
    SleuthEye over 5 years
    sn being the mean of squared quantities (thus positive) shouldn't be negative. Assuming you've used my code exactly as-is (i.e. without typos), then it may be that signal contains integer values which overflow during the squaring operation. You could try (signal[...].astype(float))**2.
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    You can check the edit in first entry. I just added .astype(float) to the previous one. Still I'm performing it on same wav file. Results are too similar that I cannot distinguish silence and vacuum cleaner. @SleuthEye
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    Additionally we always checked mean value. Bu at silence amplitude is around -0.5, at noise amplitude is around between -0.25 -0.75. So mean value of silence -0.5, but mean value of noise is also -0.5 from (-0.25-0.75)/2. I know you said RMS but mean confused me @SleuthEye
  • SleuthEye
    SleuthEye over 5 years
    It looks like you have a fairly large bias in your input signal. I'm not sure exactly where it's coming from, but typical audio usually oscillate around 0. You'd have better results once you figure out where that bias comes from and remove it (else you could remove the mean value of the signal over smallish windows of say 100ms).
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    I fixed that bias issue (it was adding -5x10^8 so I substracted). My output is now: [2.3161987933791843e+17, 2.5553999430996653e+17, 2.568668790355783e+17, 2.4620444302688246e+17] [347.29551662 348.14917762 348.19416218 347.82591772] Also you can check new graph of first code @SleuthEye
  • SleuthEye
    SleuthEye over 5 years
    From new graph, with a maximum amplitude of ~2e8, the mean squared value should be less than ~4e16, so your output of ~2.5e+17 does not seem to match.
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    I presented the exact code and plot. Why I faced with such that issue? Also mean squared values are close for both silence and noise situation. Additionally first I can took the abs(signal) then performed loop. sn is still negative without astype. Is it okay? @SleuthEye
  • SleuthEye
    SleuthEye over 5 years
    In one case you are loading db1.wav and another output.wav I assume that might be the reason for the difference. What does print(signal2.dtype) give you?
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    int32. wav files are same, I just changed the name because I tried different versions to ensure I wrote correct. Overflowing issue may cause this problem as you pointed before since Python is dealing with huge numbers. @SleuthEye
  • SleuthEye
    SleuthEye over 5 years
    Right now with chunks of about 1sec you end up capturing some of the noisier part of the signal in the first quieter chunk which reduces the difference between the noisy and quiet part. Try splitting the signal in smaller chunks.
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    For 8820: [2.139055863683949e+17, 2.249777252317077e+17, 2.3267567615403248e+17, 2.3808157320083248e+17, 2.484 5883573462458e+17, 2.5547951221054723e+17, 2.5414628804703328e+17, 2.5556553717743168e+17, 2.5664029 974492352e+17, 2.558683343698968e+17, 2.591194857540977e+17, 2.5686734537040586e+17, 2.5666697693860 944e+17, 2.5736359217607843e+17, 2.543169949387e+17, 2.5226561521156406e+17, 2.4434635201936656e+17, 2.448911185499071e+17, 2.4480967355009898e+17, 2.4470945580347578e+17] @SleuthEye
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    [346.60444254 347.04279043 347.33501969 347.53451567 347.90508892 348.14712157 348.10167542 348.15004578 348.18649708 348.16033084 348.27000146 348.19417795 348.18739991 348.21094219 348.10750766 348.03716117 347.76011719 347.7794607 347.7765715 347.77301502]
  • SleuthEye
    SleuthEye over 5 years
    Out of curiosity what's np.min & np.max on those same chunks as float?
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    Chunk size:22050 max 60201654.0 min 6186645.5 max 262774628.0 min -233976502.0 max 281736633.5 min -273703238.5 max 268182806.0 min -250401337.5 max 314386002.5 min -255943050.5 max 282537845.0 min -248732147.0 max 295691067.5 min -236847509.5 max 20141079.0 min -5764759.5 Every max,min represents half second respectively. @SleuthEye
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    By playing with signal (mono/stereo) I got new results on same wav file. Chunk is still 22050. [1000700752140674.9, 1660613646741485.5, 5472748741534938.0, 5354087369440628.0, 5607701170444393.0, 5266715634192847.0, 1259943910814627.2, 54102395761750.03] [300.00608452 304.40537205 314.7641102 314.57370908 314.97569725 314.4307974 302.00702424 274.66432994] As I understood; around 300 is silence, around 315 is vacuum cleaner. Not a significant difference but may work to distinguish If you say results seem appropriate. @SleuthEye
  • SleuthEye
    SleuthEye over 5 years
    Other than the fact that I had incorrectly provided you with the formula 20*np.log10() instead of 10*np.log10(), these values are more in line with what I'd expect from the time signal plots you've shown. If the mono/stereo affected the result, I'd guess you probably had overflow when mixing the channels. If that's the case, you could fix the mixing with signal = signal.astype(float).sum(axis=1) / 2
  • Selim Turkoglu
    Selim Turkoglu over 5 years
    I corrected according to your reply [1000700752140674.9, 1660613646741485.5, 5472748741534938.0, 5354087369440628.0, 5607701170444393.0, 5266715634192847.0, 1259943910814627.2, 54102395761750.03] [150.00304226 152.20268603 157.3820551 157.28685454 157.48784862 157.2153987 151.00351212 137.33216497] In my previous answer I wrote signal.sum(axis=1) / 2 now added float part and changed multiplexer to 10. Seems fine and same. In output 137 is silence, 157 is noise. And the actual difference was around 20 dB which is so nice. Thanks a lot
  • WinEunuuchs2Unix
    WinEunuuchs2Unix over 3 years
    I'm curious if you guys ever made sample output of end result?