How to read an MP3 audio file into a numpy array / save a numpy array to MP3?
Solution 1
Calling ffmpeg and manually parsing its stdout, as suggested in many posts about reading an MP3, is tedious (there are many corner cases, because different numbers of channels are possible, etc.), so here is a working solution using pydub (you need to pip install pydub first).
This code reads an MP3 into a numpy array and writes a numpy array to an MP3 file, with an API similar to scipy.io.wavfile.read/write:
import pydub
import numpy as np

def read(f, normalized=False):
    """MP3 to numpy array"""
    a = pydub.AudioSegment.from_mp3(f)
    y = np.array(a.get_array_of_samples())
    if a.channels == 2:
        y = y.reshape((-1, 2))
    if normalized:
        return a.frame_rate, np.float32(y) / 2**15
    else:
        return a.frame_rate, y

def write(f, sr, x, normalized=False):
    """numpy array to MP3"""
    channels = 2 if (x.ndim == 2 and x.shape[1] == 2) else 1
    if normalized:  # normalized array - each item should be a float in [-1, 1)
        y = np.int16(x * 2 ** 15)
    else:
        y = np.int16(x)
    song = pydub.AudioSegment(y.tobytes(), frame_rate=sr, sample_width=2, channels=channels)
    song.export(f, format="mp3", bitrate="320k")
Notes:
- It only works for 16-bit files for now (24-bit WAV files are fairly common, but I've rarely seen 24-bit MP3 files. Do they exist?)
- normalized=True lets you work with a float array (each item in [-1, 1))
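One thing to watch with normalized=True: if a float sample is exactly 1.0 (or slightly above), x * 2 ** 15 no longer fits in an int16. Below is a minimal sketch of a clipping conversion, not part of the solution above. The helper names float_to_int16 and int16_to_float are hypothetical, and it rounds to the nearest integer, which differs slightly from the plain np.int16(...) cast used in write.

```python
import numpy as np

def float_to_int16(x):
    """Scale floats in [-1, 1] to int16, clipping out-of-range values.

    Hypothetical helper: rounds to nearest, then clips to the int16 range
    so that 1.0 maps to 32767 instead of overflowing.
    """
    scaled = np.round(np.asarray(x, dtype=np.float64) * 2**15)
    return np.clip(scaled, -2**15, 2**15 - 1).astype(np.int16)

def int16_to_float(y):
    """Scale int16 samples to float32 in [-1, 1), as read(..., normalized=True) does."""
    return np.asarray(y, dtype=np.float32) / 2**15

samples = float_to_int16([1.0, -1.0, 0.0, 0.5])
print(samples)
```

If you want this behaviour, you could call such a helper inside write in place of np.int16(x * 2 ** 15).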
Usage example:
sr, x = read('test.mp3')
print(x)
#[[-225 707]
# [-234 782]
# [-205 755]
# ...,
# [ 303 89]
# [ 337 69]
# [ 274 89]]
write('out2.mp3', sr, x)
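The two-column array printed above is the de-interleaved form of the samples: for a stereo file, get_array_of_samples() returns one flat sequence with the channels interleaved (L, R, L, R, ...), and the reshape turns it into one row per frame. Here is a small numpy-only sketch of that step, generalized to any channel count. The helper names deinterleave and interleave are hypothetical, and the sample values are just the first three rows of the output above.

```python
import numpy as np

def deinterleave(samples, channels):
    """Reshape a flat, interleaved sample sequence to shape (n_frames, channels).

    Mono input is returned as a 1-D array, matching read() above.
    """
    y = np.asarray(samples)
    if channels == 1:
        return y
    return y.reshape((-1, channels))

def interleave(x):
    """Flatten a (n_frames, channels) array back to interleaved order."""
    return np.ascontiguousarray(x).reshape(-1)

# First three stereo frames from the usage example, in interleaved order.
flat = np.array([-225, 707, -234, 782, -205, 755], dtype=np.int16)
stereo = deinterleave(flat, 2)
left, right = stereo[:, 0], stereo[:, 1]
print(stereo.shape, left, right)
```

In read, the same idea would amount to replacing the hard-coded 2 with a.channels, assuming you also adapt write accordingly.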
Solution 2
You can use the audio2numpy library. Install it with
pip install audio2numpy
Then, your code would be:
import audio2numpy as a2n
x, sr = a2n.audio_from_file("test.mp3")
For writing, use @Basj's answer.
Basj
Updated on July 09, 2022
Comments
-
Basj almost 2 years
Is there a way to read/write an MP3 audio file into/from a numpy array with an API similar to scipy.io.wavfile.read and scipy.io.wavfile.write:
sr, x = wavfile.read('test.wav')
wavfile.write('test2.wav', sr, x)
?
Note: pydub's AudioSegment object doesn't give direct access to a numpy array.
PS: I have already read "Importing sound files into Python as NumPy arrays (alternatives to audiolab)" and tried all the answers, including those that require using Popen to run ffmpeg and reading the content from the stdout pipe, etc. I have also read "Trying to convert an mp3 file to a Numpy Array, and ffmpeg just hangs", etc., and tried the main answers, but there was no simple solution. After spending hours on this, I'm posting it here with "Answer your own question – share your knowledge, Q&A-style". I have also read "How to create a numpy array from a pydub AudioSegment?" but this does not easily cover the multi-channel case, etc.