How to get wav samples from a wav file?

Solution 1

The wave module of the standard library is the key. After import wave at the top of your code, wave.open('the.wav', 'rb') returns a Wave_read object. Its readframes(n) method reads up to n frames and returns them as a bytes object (a plain str in Python 2) holding the samples in whatever format the wave file stores them. Two parameters tell you how to decompose frames into samples: getnchannels() gives the number of channels, and getsampwidth() gives the number of bytes per sample.
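
As a minimal sketch of the calls described above: the helper names describe_and_read and make_demo_wav are made up for this illustration, and the demo file is built in memory only so that the example does not depend on a real the.wav.

```python
import io
import wave


def describe_and_read(wav_source, max_frames=1024):
    """Return (nchannels, sampwidth, framerate, raw_bytes).

    wav_source may be a file name or a seekable file-like object.
    At most max_frames frames are read.
    """
    with wave.open(wav_source, "rb") as wf:
        nchannels = wf.getnchannels()   # 1 = mono, 2 = stereo
        sampwidth = wf.getsampwidth()   # bytes per sample
        framerate = wf.getframerate()   # frames per second
        raw = wf.readframes(max_frames)
    return nchannels, sampwidth, framerate, raw


def make_demo_wav(nb_frames=10):
    """Hypothetical helper: build a silent 16-bit stereo WAV in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(2)
        wf.setsampwidth(2)
        wf.setframerate(8000)
        wf.writeframes(bytes(nb_frames * 2 * 2))  # all-zero samples
    buf.seek(0)
    return buf


if __name__ == "__main__":
    nch, width, rate, raw = describe_and_read(make_demo_wav())
    # One frame holds one sample per channel.
    print(nch, width, rate, len(raw) // (nch * width), "frames")
```

Note that one frame is nchannels * sampwidth bytes long, so the byte string returned by readframes() has to be decoded before it is usable as numbers.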

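The best way to turn the string of bytes into a sequence of numeric values is with the array module, and a type code of (respectively) 'B', 'h', 'i' for 1, 2, 4 bytes per sample: 8-bit WAV samples are unsigned, while 16- and 32-bit samples are signed. The sizes behind these type codes depend on the platform (for example, 'l' is 4 bytes on a 32-bit build of Python but 8 bytes on most 64-bit Unix builds), so use the itemsize value of your array object to double-check. WAV data is little-endian, so call byteswap() on the array if you are on a big-endian machine. If you have a sample width that array cannot provide (24-bit audio, for instance), you'll need to slice up the byte string, padding each little slice with a zero byte at the low-order end so the sign bit stays in place, and use the struct module instead (but that's clunkier and slower, so use array instead if you can).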
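
Here is a rough sketch of both routes, assuming plain PCM data (unsigned for 8-bit, signed little-endian for wider samples). The function names bytes_to_samples and unpack_24bit are invented for this example.

```python
import array
import struct
import sys

# Assumed mapping for PCM WAV data: 8-bit is unsigned, 16/32-bit are signed.
_TYPECODES = {1: "B", 2: "h", 4: "i"}


def bytes_to_samples(raw, sampwidth):
    """Decode raw frame bytes into an array of integers (1, 2 or 4 bytes wide)."""
    samples = array.array(_TYPECODES[sampwidth])  # KeyError for other widths
    if samples.itemsize != sampwidth:
        # Type code sizes are platform dependent, so double-check.
        raise ValueError("array item size does not match the sample width")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples


def unpack_24bit(raw):
    """Decode 3-byte samples by padding each slice to 4 bytes for struct."""
    values = []
    for i in range(0, len(raw) - len(raw) % 3, 3):
        # The zero byte goes at the low-order end, so the sign bit is kept;
        # shifting right by 8 then undoes the scaling by 256.
        (value,) = struct.unpack("<i", b"\x00" + raw[i:i + 3])
        values.append(value >> 8)
    return values
```

For example, bytes_to_samples(raw, wf.getsampwidth()) can be applied directly to the result of readframes(); for stereo files the values alternate between the channels.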

Solution 2

You can use the wave module. First, read the metadata, such as the sample width and the number of channels. The readframes() method then gives you the samples, but only as a byte string; based on the sample format, you have to convert it to numeric samples using struct.unpack().

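Alternatively, if you want the samples as a NumPy array, you can use SciPy's io.wavfile module: scipy.io.wavfile.read() returns the sample rate and an array whose dtype matches the file (int16 for 16-bit PCM, for example), which you can then convert to floating point if you need to.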

Solution 3

Here's a function to read samples from a wave file (tested with mono & stereo):

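import struct

def read_samples(wave_file, nb_frames):
    # Read up to nb_frames frames and unpack them into a flat tuple of integers
    # (for stereo files the channels are interleaved).
    frame_data = wave_file.readframes(nb_frames)
    if frame_data:
        sample_width = wave_file.getsampwidth()
        nb_samples = len(frame_data) // sample_width
        # WAV data is little-endian; 8-bit samples are unsigned, 16/32-bit are signed.
        fmt = {1: "<%dB", 2: "<%dh", 4: "<%dl"}[sample_width] % nb_samples
        return struct.unpack(fmt, frame_data)
    else:
        return ()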
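
Because a WAV frame holds one sample per channel, the flat tuple returned for a stereo file alternates left, right, left, right. A small sketch of separating the channels follows; split_channels is a name made up for this illustration, not part of the wave module.

```python
def split_channels(samples, nchannels):
    """Split a flat sequence of interleaved samples into one tuple per channel."""
    if nchannels < 1:
        raise ValueError("nchannels must be at least 1")
    # Channel ch owns every nchannels-th sample, starting at index ch.
    return [tuple(samples[ch::nchannels]) for ch in range(nchannels)]


if __name__ == "__main__":
    left, right = split_channels((1, -1, 2, -2, 3, -3), 2)
    print(left, right)
```

With a mono file the result is a one-element list holding all the samples.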

And here's the full script, which mixes or concatenates multiple .wav files one window (buffer) of frames at a time. All input files need to have the same parameters: number of channels, sample width, frame rate, and compression type.

import argparse
import itertools
import struct
import sys
import wave

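def _struct_format(sample_width, nb_samples):
    # Little-endian; 8-bit WAV samples are unsigned, 16- and 32-bit are signed.
    return {1: "<%dB", 2: "<%dh", 4: "<%dl"}[sample_width] % nb_samples

def _mix_samples(samples):
    # Average the corresponding samples from every input file.
    return sum(samples) // len(samples)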

def read_samples(wave_file, nb_frames):
    frame_data = wave_file.readframes(nb_frames)
    if frame_data:
        sample_width = wave_file.getsampwidth()
        nb_samples = len(frame_data) // sample_width
        format = _struct_format(sample_width, nb_samples)
        return struct.unpack(format, frame_data)
    else:
        return ()

def write_samples(wave_file, samples, sample_width):
    format = _struct_format(sample_width, len(samples))
    frame_data = struct.pack(format, *samples)
    wave_file.writeframes(frame_data)

def compatible_input_wave_files(input_wave_files):
    nchannels, sampwidth, framerate, nframes, comptype, compname = input_wave_files[0].getparams()
    for input_wave_file in input_wave_files[1:]:
        nc,sw,fr,nf,ct,cn = input_wave_file.getparams()
        if (nc,sw,fr,ct,cn) != (nchannels, sampwidth, framerate, comptype, compname):
            return False
    return True

def mix_wave_files(output_wave_file, input_wave_files, buffer_size):
    output_wave_file.setparams(input_wave_files[0].getparams())
    sampwidth = input_wave_files[0].getsampwidth()
    max_nb_frames = max(input_wave_file.getnframes() for input_wave_file in input_wave_files)
    # Process the inputs one window of buffer_size frames at a time.
    for frame_window in range(max_nb_frames // buffer_size + 1):
        all_samples = [read_samples(wave_file, buffer_size) for wave_file in input_wave_files]
        # zip_longest (izip_longest in Python 2) pads shorter inputs with 0.
        mixed_samples = [_mix_samples(samples) for samples in itertools.zip_longest(*all_samples, fillvalue=0)]
        write_samples(output_wave_file, mixed_samples, sampwidth)

def concatenate_wave_files(output_wave_file, input_wave_files, buffer_size):
    output_wave_file.setparams(input_wave_files[0].getparams())
    sampwidth = input_wave_files[0].getsampwidth()
    # Copy each input in turn, one window of buffer_size frames at a time.
    for input_wave_file in input_wave_files:
        nb_frames = input_wave_file.getnframes()
        for frame_window in range(nb_frames // buffer_size + 1):
            samples = read_samples(input_wave_file, buffer_size)
            if samples:
                write_samples(output_wave_file, samples, sampwidth)

def argument_parser():
    parser = argparse.ArgumentParser(description='Mix or concatenate multiple .wav files')
    parser.add_argument('command', choices=("mix", "concat"), help='command')
    parser.add_argument('output_file', help='output .wav file')
    parser.add_argument('input_files', metavar="input_file", help='input .wav files', nargs="+")
    parser.add_argument('--buffer_size', type=int, help='nb of frames to read per iteration', default=1000)
    return parser

if __name__ == '__main__':
    args = argument_parser().parse_args()

    input_wave_files = [wave.open(name, "rb") for name in args.input_files]
    if not compatible_input_wave_files(input_wave_files):
        print("ERROR: mixed wave files must have the same params.")
        sys.exit(2)

    output_wave_file = wave.open(args.output_file, "wb")
    if args.command == "mix":
        mix_wave_files(output_wave_file, input_wave_files, args.buffer_size)
    elif args.command == "concat":
        concatenate_wave_files(output_wave_file, input_wave_files, args.buffer_size)

    # Closing the output file also fixes up the frame count in its header.
    output_wave_file.close()
    for input_wave_file in input_wave_files:
        input_wave_file.close()
Author: kaki

Updated on June 12, 2022

Comments

  • kaki, about 2 years ago

    I want to know how to get samples out of a .wav file in order to perform windowed join of two .wav files.

    Can anyone please tell me how to do this?