Get the amplitude data from an MP3 audio file using Python
An MP3 file is encoded (compressed) audio plus tags and other metadata. All you need to do is decode it with an MP3 decoder, which gives you the full audio data you need for further processing.
How do you decode MP3? I am surprised how few tools are available for Python, although I found a good one in this question. It is called pydub, and I hope I can use a sample snippet from its author (I updated it with more info from the wiki):
from pydub import AudioSegment
sound = AudioSegment.from_mp3("test.mp3")
# get raw audio data as a bytestring
raw_data = sound.raw_data
# get the frame rate
sample_rate = sound.frame_rate
# get amount of bytes contained in one sample
sample_size = sound.sample_width
# get channels
channels = sound.channels
Note that raw_data is 'on air' (held in memory, not written to disk) at this point ;). What you do with the gathered data is now up to you, but this module seems to give you everything you need.
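If some later step really does expect WAV-formatted input, one possible approach is to wrap the raw bytes in a WAV container that lives only in memory. The sketch below uses just the standard library (`wave` and `io.BytesIO`); the helper name `raw_to_wav_bytes` and the stand-in `fake_raw_data` are made up for illustration, and in real use you would pass the four values gathered from pydub above.

```python
import io
import wave


def raw_to_wav_bytes(raw_data, channels, sample_width, frame_rate):
    """Wrap raw PCM bytes in a WAV container held entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(frame_rate)
        wav_file.writeframes(raw_data)
    # Closing the wave writer does not close a buffer it did not open itself.
    return buffer.getvalue()


# Stand-in for sound.raw_data: four frames of silent 16-bit stereo audio.
fake_raw_data = b"\x00\x00" * 2 * 4
wav_bytes = raw_to_wav_bytes(
    fake_raw_data, channels=2, sample_width=2, frame_rate=44100
)
```

Nothing touches the disk here: `wav_bytes` can be handed to any library that accepts a file-like object by wrapping it in `io.BytesIO(wav_bytes)`.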
Nik391
Updated on June 18, 2022
Comments
-
Nik391 about 2 years
I have an mp3 file and I want to plot the amplitude spectrum of that audio sample. I know this is easy with a wav file, since there are a lot of Python packages for handling the wav format. However, I do not want to convert the file to wav, store it, and then use it. What I am trying to achieve is to get the amplitude of an mp3 file directly; even if I have to convert it to wav, the script should do it on air at runtime without actually storing the file in the database. I know we can convert the file as follows:
from pydub import AudioSegment
sound = AudioSegment.from_mp3("test.mp3")
sound.export("temp.wav", format="wav")
and it creates temp.wav as it is supposed to, but can we just use the content without storing the actual file?
-
Nik391 almost 8 years
That's excellent. That's exactly what I needed: the raw audio data.
-
P.hunter over 6 years
@Nik391 can you please tell me how you managed to use that raw data to get the amplitude? That would be extremely helpful to me.
-
Nik391 over 6 years
@PaulNicolashunter the raw data returned by the function is a bytestring; you just need to convert it into integers using numpy, something like this:
np.frombuffer(raw_data, dtype=np.int16)
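To go from those integers to an amplitude spectrum, something along these lines might work. It is only a sketch under stated assumptions: samples are 16-bit, little-endian and interleaved by channel, and only the first channel is analysed. It uses `np.frombuffer`, the non-deprecated counterpart of `np.fromstring` for binary data. The function name `amplitude_spectrum` and the synthetic 440 Hz tone standing in for `sound.raw_data` are made up for illustration.

```python
import numpy as np


def amplitude_spectrum(raw_data, channels, sample_width, frame_rate):
    """Return (frequencies in Hz, magnitudes) for the first channel."""
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit samples")
    # Assumption: little-endian signed 16-bit samples.
    samples = np.frombuffer(raw_data, dtype="<i2")
    # Samples are interleaved frame by frame: one row per frame,
    # one column per channel.
    frames = samples.reshape(-1, channels)
    first_channel = frames[:, 0].astype(np.float64)
    magnitudes = np.abs(np.fft.rfft(first_channel))
    frequencies = np.fft.rfftfreq(first_channel.size, d=1.0 / frame_rate)
    return frequencies, magnitudes


# Synthetic stand-in for sound.raw_data: one second of a 440 Hz stereo tone.
rate = 8000
t = np.arange(rate) / rate
tone = (10000 * np.sin(2 * np.pi * 440 * t)).astype("<i2")
fake_raw_data = np.column_stack([tone, tone]).tobytes()

freqs, mags = amplitude_spectrum(
    fake_raw_data, channels=2, sample_width=2, frame_rate=rate
)
peak_hz = freqs[np.argmax(mags)]  # frequency of the strongest component
```

With real data you would pass `sound.raw_data`, `sound.channels`, `sound.sample_width` and `sound.frame_rate` instead, and plot `mags` against `freqs`.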
-
P.hunter over 6 years
@Nik391 so what I'm getting is that the string ('raw_data'), which is in a unicode format, represents the amplitude per second, right? And converting it to a numpy array gives us the integer representation of amplitude per second, am I right?
-
Jacek over 6 years
You need sample_size and channels to interpret raw_data as a sound wave. Each frame is channels*sample_size bytes long. So if the audio is mono (channels = 1) and sample_size = 2 bytes, you need to take the first 2 bytes from raw_data, make a 2-byte integer out of them, and you get the amplitude of the first frame.
-
P.hunter over 6 years
So if channels are 2 it means the audio is stereo, and sample_size is the sample width? And as my channels are 2, I have to take the first 2 bytes of my raw_data; how am I supposed to achieve that? Isn't raw_data the data of all the frames?
-
Jacek over 6 years
If _ is a sample and you have 3 channels, then the song
|_ _ _| |_ _ _| |_ _ _|
has 6 samples, 3 frames. Each _ is sample_size bytes long. If sample_size = 2 bytes then my song is 12 bytes long, and played at sample_rate = 6 Hz will have a duration of 1 second.
-
Jacek over 6 years
Yes, channels = 2 means the audio is stereo. Each frame carries the information to send to each channel, so the channels are always synced together.
-
Jacek over 6 years
"how i'm supposed to achieve that?" That is a matter for another question: how to deal with a bytestring in Python. Maybe this can help: stackoverflow.com/questions/22824539/…
-
P.hunter over 6 years
Thanks mate. It was *9 samples, 3 frames, because I see 9 _ there, and then it means the song is 18 bytes long if the sample size is 2 bytes, right? And what about sample_width? Does it have any connection with it?
-
Jacek over 6 years
Yes, my bad, it has 9 samples of course, and is 18 bytes long if sample_size=2. sample_size is sample_width here.
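
The corrected numbers from this exchange (3 channels, 3 frames, 9 samples, 18 bytes) can be checked with a few lines of standard-library Python. This is only an illustrative sketch: the sample values are made up, and little-endian signed 16-bit samples are an assumption about the byte layout.

```python
import struct

channels = 3
sample_size = 2  # bytes per sample (pydub calls this sample_width)

# Three frames of three made-up 16-bit samples each, packed little-endian.
frames_in = [(1, -2, 3), (4, -5, 6), (7, -8, 9)]
raw_data = b"".join(struct.pack("<3h", *frame) for frame in frames_in)

frame_size = channels * sample_size        # bytes in one frame
frame_count = len(raw_data) // frame_size  # number of frames in the song
sample_count = frame_count * channels      # total number of samples

# Slice out the first frame and decode one signed integer per channel.
first_frame = struct.unpack("<3h", raw_data[:frame_size])

# Walk over every frame in the bytestring.
frames_out = list(struct.iter_unpack("<3h", raw_data))
```

Each tuple in `frames_out` holds one amplitude per channel for that frame, which is the "take the first bytes and make an integer out of them" step repeated over the whole bytestring.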