How to process all .wav files in a folder and append the results to a Python list
Solution 1
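If you really just want the raw audio frames, the standard-library wave module is enough:
import glob
import os
import wave
zero = []
path = '/path/to/directory'
# Match only the .wav files in the directory
for filename in glob.glob(os.path.join(path, '*.wav')):
    w = wave.open(filename, 'rb')
    # Read every frame of the file as one bytes object
    d = w.readframes(w.getnframes())
    zero.append(d)
    w.close()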
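Note that readframes() returns raw bytes rather than numbers. As a minimal sketch, assuming the files are 16-bit PCM (the helper name read_int16_samples is hypothetical, not part of the wave module), the bytes could be unpacked into integers using only the standard library:

```python
import struct
import wave


def read_int16_samples(filename):
    """Return the samples of a 16-bit PCM .wav file as a list of ints."""
    with wave.open(filename, 'rb') as w:
        # Assumption: 2 bytes per sample; anything else is rejected.
        if w.getsampwidth() != 2:
            raise ValueError('expected 16-bit samples')
        raw = w.readframes(w.getnframes())
    # Each sample is a little-endian signed 16-bit integer ('<h').
    # Stereo files come back interleaved (left, right, left, right, ...).
    count = len(raw) // 2
    return list(struct.unpack('<%dh' % count, raw))
```

With this helper, the loop body would become zero.append(read_int16_samples(filename)).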
Solution 2
I realized this is a coursework assignment from the Data Science Professional Program by Microsoft, and haven't seen any good answers yet. So here's the scenario you gave us:
- 50 .wav files in a folder
- Need to loop through each .wav file and load them all in
- You want to append only the audio data (data) and not the sample_rate, as returned from scipy.io.wavfile
- You already have an empty list named zero
I assumed:
zero = []
Here's the solution (change the path to match your directory):
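import glob
import os
from scipy.io import wavfile
path = "Datasets/recordings/"
# Match only the .wav files in the directory
for filename in glob.glob(os.path.join(path, '*.wav')):
    # wavfile.read returns (sample rate, samples); keep only the samples
    samplerate, data = wavfile.read(filename)
    zero.append(data)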
What you don't want to do is simply loop through every file in that directory. Tools and operating systems often leave extra files behind (Git metadata is one example, .DS_Store on macOS is another), so you should read only the files with a .wav extension.
You then use a simple .append() to add each file's data to the list.
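To illustrate the filtering, here is a small sketch of a helper (list_wav_files is a hypothetical name chosen for this example) that returns only the .wav files in a directory. Sorting is an added assumption, included so the order is the same on every run:

```python
import glob
import os


def list_wav_files(path):
    """Return the .wav files directly inside path, sorted by name."""
    # The '*.wav' pattern skips everything else, including hidden files
    # such as .DS_Store; sorting makes the order reproducible.
    return sorted(glob.glob(os.path.join(path, '*.wav')))
```

The loop could then iterate over list_wav_files(path) instead of calling glob.glob directly.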
Sanity Check
len(zero) should return 50, since you have 50 .wav files in the specified directory.
Good luck!
AlK
Updated on December 01, 2022
Comments
-
AlK over 1 year
I have 50 .wav files in a folder and I need to loop through the dataset and load up all 50 files. For each audio file, I should simply append the audio data (not the sample_rate, just the data) to my Python list named 'zero'.
Could you help me? Thank you.
-
AlK over 7 years
Thank you for your reply. I am completing an assignment which states that: "Each .wav file is actually just a bunch of numeric samples, "sampled" from the analog signal. Sampling is a type of discretization. When we mention 'samples', we mean observations. When we mention 'audio samples', we mean the actually "features" of the audio file." So I would expect to see numbers in the list zero. However, the contents of the list look like this: "b'\x8f\xfeQ\xfe%\xfe\xe1\xfd\xc5\xfd\xd3\xfd\xf0\xfd9\xfev\xfe\xcf\xfe \xff\xa8\xff\n\x00".
-
AlK over 7 years
Then the assignment requires that I: "convert zero into a DataFrame. When you do so, set the dtype to np.int16, since the input audio files are 16 bits per sample." When I attempt to do so by writing: "zero =pd.DataFrame(dtype=np.int16)" I get an empty DataFrame. Could you help me further?
-
charliebeckwith over 7 years
Those are bytes. Look up the wave Python lib. I'm sure you can figure out how to get it to do what you need.
-
charliebeckwith over 7 years
16 bits = 2 bytes. Loop over the file, read each frame, and convert the returned bytestring to an int.
-
AlK over 7 years
Thank you. I finally opted to use scipy.io.wavfile.read, which imports the data directly in integer format.
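On the DataFrame step raised in the comments: pd.DataFrame(dtype=np.int16) comes back empty because no data is passed to it. A minimal sketch follows, assuming each entry of zero is a 1-D array of samples. The helper name recordings_to_frame is hypothetical, and trimming every recording to the shortest length is an assumption made here so that all rows have the same width; the assignment may expect a different way of handling unequal lengths.

```python
import numpy as np
import pandas as pd


def recordings_to_frame(recordings):
    """Stack 1-D sample arrays into a DataFrame, one row per recording."""
    # Assumption: recordings may differ in length, so trim each one to
    # the shortest so that every row has the same number of columns.
    shortest = min(len(r) for r in recordings)
    stacked = np.vstack([np.asarray(r)[:shortest] for r in recordings])
    return pd.DataFrame(stacked, dtype=np.int16)
```

It would be called as zero = recordings_to_frame(zero) once the list has been filled.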