How to process all .wav files in a folder and append results to a Python list

12,424

Solution 1

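If you really just want the raw audio data, the standard-library wave module is enough. Note that readframes() returns raw bytes, not numbers: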

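import glob
import os
import wave

zero = []
path = '/path/to/directory'
# Match only the .wav files in the directory
for filename in glob.glob(os.path.join(path, '*.wav')):
    # The context manager closes the file even if reading fails
    with wave.open(filename, 'rb') as w:
        # readframes() returns the raw frames as a bytes object
        zero.append(w.readframes(w.getnframes()))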
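To interpret the bytes that readframes() returns, it helps to know the channel count and sample width of the file. Here is a minimal sketch of how to inspect them with the wave module. The file demo.wav and its three samples are made up for illustration, and it is written to a temporary directory so the snippet runs on its own:

```python
import os
import tempfile
import wave

# Write a tiny 16-bit mono file so the example is self-contained
# (the file name and its contents are made up for illustration).
tmpdir = tempfile.mkdtemp()
demo_path = os.path.join(tmpdir, 'demo.wav')
with wave.open(demo_path, 'wb') as w:
    w.setnchannels(1)      # mono
    w.setsampwidth(2)      # 2 bytes = 16 bits per sample
    w.setframerate(8000)   # samples per second
    w.writeframes(b'\x01\x00\x02\x00\x03\x00')  # three samples: 1, 2, 3

# Read it back and inspect the header before touching the raw frames
with wave.open(demo_path, 'rb') as w:
    info = (w.getnchannels(), w.getsampwidth(), w.getnframes())
    raw = w.readframes(w.getnframes())

print(info)      # (channels, bytes per sample, frame count)
print(len(raw))  # channels * bytes per sample * frame count
```

The length of the bytes object is the channel count times the sample width times the frame count, which is why a list of readframes() results shows byte strings rather than numbers.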

Solution 2

I realized this is a coursework assignment from the Data Science Professional Program by Microsoft, and haven't seen any good answers yet. So here's the scenario you gave us:

  • 50 .wav files in a folder
  • Need to loop through each .wav file and load them all in
  • You want to append only the audio data, not the sample_rate, as returned by scipy.io.wavfile

I assume you already have an empty list named zero:

zero = []

Here's the solution (change the path to match your directory):

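import glob
import os
from scipy.io import wavfile

path = "Datasets/recordings/"

# Match only the .wav files in the directory
for filename in glob.glob(os.path.join(path, '*.wav')):
    # wavfile.read returns (sample rate, NumPy array of samples)
    samplerate, data = wavfile.read(filename)
    zero.append(data)  # keep the samples, discard the sample rate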

What you don't want to do is simply loop through every file in that directory. Operating systems and tools often leave extra files there (Git metadata is one example, .DS_Store on macOS is another), so you want to read only the files with a .wav extension.

You then use a simple .append() to append them.
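To see the difference between listing the whole directory and globbing for *.wav, here is a small sketch. The file names are hypothetical and are created empty in a temporary directory, so nothing here depends on your actual dataset:

```python
import glob
import os
import tempfile

# Build a throwaway directory with two .wav files and two stray files
# (all names are hypothetical, the files are empty placeholders).
tmpdir = tempfile.mkdtemp()
for name in ('a.wav', 'b.wav', '.DS_Store', 'notes.txt'):
    open(os.path.join(tmpdir, name), 'wb').close()

# os.listdir returns every entry, including the stray files
everything = sorted(os.listdir(tmpdir))

# glob with '*.wav' returns only the paths ending in .wav
wav_only = sorted(
    os.path.basename(p) for p in glob.glob(os.path.join(tmpdir, '*.wav'))
)

print(everything)
print(wav_only)
```

Looping over the os.listdir result would hand the stray files to the reader as well, while the glob pattern keeps the loop limited to the .wav files.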

Sanity Check

len(zero) should return 50 since you have 50 .wav files in that specified directory.

Good luck!

Author: AlK

Updated on December 01, 2022

Comments

  • AlK
    AlK over 1 year

    I have 50 .wav files in a folder and I need to loop through the dataset and load up all 50 files. For each audio file, I should simply append the audio data (not the sample_rate, just the data) to my Python list named 'zero'.

    Could you help me? Thank you.

  • AlK
    AlK over 7 years
    Thank you for your reply. I am completing an assignment which states that: "Each .wav file is actually just a bunch of numeric samples, "sampled" from the analog signal. Sampling is a type of discretization. When we mention 'samples', we mean observations. When we mention 'audio samples', we mean the actually "features" of the audio file." So I would expect to see numbers in the list zero. However the contents of the list look like this: "b'\x8f\xfeQ\xfe%\xfe\xe1\xfd\xc5\xfd\xd3\xfd\xf0\xfd9\xfev\xfe\xcf\xfe \xff\xa8\xff\n\x00".
  • AlK
    AlK over 7 years
    Then the assignment requires that I : "convert zero into a DataFrame. When you do so, set the dtype to np.int16, since the input audio files are 16 bits per sample." When I attempt to do so by writing: "zero =pd.DataFrame(dtype=np.int16)" I get an empty dataframe. Could you help me further?
  • charliebeckwith
    charliebeckwith over 7 years
    Those are bytes. Look up the wave python lib. I'm sure you can figure out how to get it to do what you need.
  • charliebeckwith
    charliebeckwith over 7 years
    16bits = 2 bytes. For loop, read frame. Convert returned bytestring to int.
  • AlK
    AlK over 7 years
    Thank you. I opted finally to use scipy.io.wavfile.read which imports directly the data in the integer format.
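For anyone who stays with the wave module, the byte-to-integer conversion suggested in the comments can be sketched with the standard-library struct module. This assumes 16-bit mono PCM, which WAV files store little-endian. The helper name bytes_to_int16 is made up for illustration, and the input is the first six bytes of the output quoted in the comment above:

```python
import struct

def bytes_to_int16(raw):
    """Decode little-endian signed 16-bit PCM bytes into a tuple of ints."""
    count = len(raw) // 2  # 16 bits = 2 bytes per sample
    # '<' = little-endian, 'h' = signed 16-bit integer, repeated count times
    return struct.unpack('<%dh' % count, raw[:count * 2])

# First six bytes of the bytes object shown in the comment thread
raw = b'\x8f\xfeQ\xfe%\xfe'
samples = bytes_to_int16(raw)
print(samples)  # (-369, -431, -475)
```

As for the empty DataFrame: pd.DataFrame(dtype=np.int16) is called without any data, so it has nothing to put in the frame. Passing the list as the first argument, for example pd.DataFrame(zero, dtype=np.int16), is presumably what the assignment intends. That assumes every recording has the same number of samples; recordings of different lengths would need trimming or padding first.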