Algorithm to draw waveform from audio

c++ algorithm audio ffmpeg

16,804

Solution 1

First, you need to determine where on the screen each sample will end up.

int x = x0 + sample_number * (xn - x0) / number_of_samples;

Now, for all samples that share the same x, determine the min and the max separately for the positive and the negative values. Draw two vertical lines: first a dark one from the negative max to the positive max, then a light one from the negative min to the positive min over the top of it.

Edit: thinking about this a little more, you probably want to use an average instead of the min for the inner lines.
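The binning described above can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the samples are already floats in the range -1 to 1, x0 is 0 and xn is the column count, and the names Column and build_columns are made up for this example. It follows the edit's suggestion and stores an average (rather than the min) for the inner, lighter line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One vertical line of the waveform (hypothetical layout for this sketch).
struct Column {
    float pos_max = 0.0f;  // largest positive sample in the column (dark line, top)
    float neg_max = 0.0f;  // most negative sample in the column (dark line, bottom)
    float pos_avg = 0.0f;  // mean of the positive samples (light line, top)
    float neg_avg = 0.0f;  // mean of the negative samples (light line, bottom)
};

// Bins samples into `width` columns using x = sample_number * width / number_of_samples.
std::vector<Column> build_columns(const std::vector<float>& samples, std::size_t width) {
    assert(width > 0);
    std::vector<Column> cols(width);
    std::vector<double> pos_sum(width, 0.0), neg_sum(width, 0.0);
    std::vector<std::size_t> pos_n(width, 0), neg_n(width, 0);

    const std::size_t n = samples.size();
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t x = i * width / n;  // always < width because i < n
        const float s = samples[i];
        if (s >= 0.0f) {
            cols[x].pos_max = std::max(cols[x].pos_max, s);
            pos_sum[x] += s;
            ++pos_n[x];
        } else {
            cols[x].neg_max = std::min(cols[x].neg_max, s);
            neg_sum[x] += s;
            ++neg_n[x];
        }
    }

    // Turn the running sums into averages; columns without samples stay at 0.
    for (std::size_t x = 0; x < width; ++x) {
        if (pos_n[x] > 0) cols[x].pos_avg = static_cast<float>(pos_sum[x] / pos_n[x]);
        if (neg_n[x] > 0) cols[x].neg_avg = static_cast<float>(neg_sum[x] / neg_n[x]);
    }
    return cols;
}
```

To draw, each column would become a dark line from neg_max to pos_max with a light line from neg_avg to pos_avg on top, scaled to the pixel height of the view.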

Solution 2

An explanation for everybody: I develop a DJ app and was searching for similar answers, so I will explain the music waveforms you see in any software, including Audacity.

There are three types of waveform that music software displays: samples, average, and RMS.

1) Samples are the actual music points plotted on a graph, for example an array of raw audio data (the points you see when you zoom into the waveform in Audacity).

2) Average: the most commonly used. Suppose you are displaying a 3-minute song on screen: a single point on screen must then represent at least about 100 ms of the song, which contains many raw audio points. To display it, we calculate the average of all the points in that 100 ms span, and so on for the rest of the track (the big dark blue waveform in Audacity).

3) RMS: similar to the average, but instead of the average, the root mean square of that span is taken (the small light blue waveform inside the blue one is the RMS waveform in Audacity).

Now, how to calculate each waveform.

1) Samples are the raw data: when you decode a song using any technique, you get raw samples/points. Based on the format of the points, you convert them to the range -1 to 1. For example, if the format is 16-bit, you divide all points by 32768 (the magnitude of the most negative 16-bit value) and then draw the points.
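A minimal sketch of that conversion step, assuming signed 16-bit and unsigned 8-bit PCM input. The function names are hypothetical, and the 8-bit case relies on the usual convention that unsigned 8-bit audio is centered on 128, which the answer itself does not cover:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Signed 16-bit PCM: divide by 32768 so the result lies in [-1, 1).
std::vector<float> normalize_int16(const std::int16_t* pcm, std::size_t count) {
    assert(pcm != nullptr || count == 0);
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = static_cast<float>(pcm[i]) / 32768.0f;
    }
    return out;
}

// Unsigned 8-bit PCM (assumption: silence is stored as 128):
// shift to a signed range first, then scale by 128.
std::vector<float> normalize_uint8(const std::uint8_t* pcm, std::size_t count) {
    assert(pcm != nullptr || count == 0);
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = (static_cast<float>(pcm[i]) - 128.0f) / 128.0f;
    }
    return out;
}
```

Float and double sample formats are normally already in the -1 to 1 range, so they would only need a cast.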

2) For the average waveform: first add up all the points, converting negative values to positive, then multiply the sum by 2 and divide by the number of points.

//samples is the array and nb_samples is the length of array
float sum = 0;
for(int i = 0 ; i < nb_samples ; i++){
    if(samples[i] < 0)
        sum += -samples[i];
    else
        sum += samples[i];
}
float average_point = (sum * 2) / nb_samples; //average after multiplying by 2
//now draw this point

3) RMS: it's simple, take the root mean square. First square every sample, then sum the squares, then take the mean, and finally take the square root. In code:

//samples is the array and nb_samples is the length of array
float squaredsum = 0;
for(int i = 0 ; i < nb_samples ; i++){
    squaredsum += samples[i] * samples[i]; // square and sum
}
float mean = squaredsum / nb_samples; // calculated mean
float rms_point = std::sqrt(mean); // finally take the square root (std::sqrt needs #include <cmath>)
//now draw this point

Note that samples here is the array of points used to calculate one point/pixel for a particular span of the song. For example, if you want to draw 1 minute of song data in 60 pixels, the samples array holds all the points in 1 second, i.e. the number of audio points displayed in 1 pixel.
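To connect the two snippets above with the per-pixel windows just described, here is a rough sketch that splits a whole track into one window per pixel and computes both values for each. It assumes the samples are already normalized floats; PixelPoint and reduce_to_pixels are names invented for this example:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// The two values computed for one pixel (hypothetical type for this sketch).
struct PixelPoint {
    float average;  // mean absolute value, multiplied by 2 as in the answer
    float rms;      // root mean square of the window
};

// Splits `samples` into `pixels` consecutive windows and reduces each one.
std::vector<PixelPoint> reduce_to_pixels(const std::vector<float>& samples, std::size_t pixels) {
    assert(pixels > 0);
    std::vector<PixelPoint> out;
    out.reserve(pixels);

    const std::size_t n = samples.size();
    for (std::size_t p = 0; p < pixels; ++p) {
        // Window boundaries; integer division spreads any remainder evenly.
        const std::size_t begin = p * n / pixels;
        const std::size_t end = (p + 1) * n / pixels;

        double abs_sum = 0.0;  // double accumulators avoid precision loss and overflow
        double sq_sum = 0.0;
        for (std::size_t i = begin; i < end; ++i) {
            abs_sum += std::fabs(samples[i]);
            sq_sum += static_cast<double>(samples[i]) * samples[i];
        }

        PixelPoint pt{0.0f, 0.0f};  // an empty window is drawn as silence
        const std::size_t count = end - begin;
        if (count > 0) {
            pt.average = static_cast<float>(abs_sum * 2.0 / count);
            pt.rms = static_cast<float>(std::sqrt(sq_sum / count));
        }
        out.push_back(pt);
    }
    return out;
}
```

For the example in the note, calling reduce_to_pixels with one minute of samples and pixels set to 60 gives one point per second of audio.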

Hope this will help someone to clarify the concepts about audio waveform.

Solution 3

I think you are referring to a waveform described here.

http://manual.audacityteam.org/o/man/audacity_waveform.html

I have not read the whole page, but each vertical bar represents a window of waveform samples. The dark blue shows the maximum positive and minimum negative values in that window (I think), and the light blue is the RMS, the root mean square: http://www.mathwords.com/r/root_mean_square.htm. Basically, you square the values within each window, take the average, and then take the square root.

Hope this helps.

Solution 4

showwavespic

ffmpeg can draw a waveform with the showwavespic filter.

ffmpeg -i input -filter_complex "showwavespic=split_channels=1" output.png

See showwavespic filter documentation for options.

showwaves

You can also make a video of the live waveform with the showwaves filter.

ffmpeg -i input -filter_complex \
"showwaves=s=600x240:mode=line:split_channels=1,format=yuv420p[v]"  \
-map "[v]" -map 0:a -movflags +faststart output.mp4

See showwaves filter documentation for options.

Solution 5

There is a nice program, audiowaveform from BBC R&D, that does what you want; you might consult its source.

yayuj

Updated on June 15, 2022

Comments

yayuj almost 2 years
I'm trying to draw a waveform from a raw audio file. I demuxed/decoded an audio file using FFmpeg and I have the following information: the samples buffer, the size of the samples buffer, the duration of the audio file (in seconds), the sample rate (44100, 48000, etc.), the sample size, the sample format (uint8, int16, int32, float, double), and the raw audio data itself.

Digging on the Internet I found this algorithm (more here):

White Noise:

The Algorithm

All you need to do is randomize every sample from –amplitude to amplitude. We don’t care about the number of channels in most cases so we just fill every sample with a new random number.
```
Random rnd = new Random();
short randomValue = 0;

for (int i = 0; i < numSamples; i++)
{
    randomValue = Convert.ToInt16(rnd.Next(-amplitude, amplitude));
    data.shortArray[i] = randomValue;
}
```
It's really good but I don't want to draw that way, but this way:

Is there any algorithm or idea for how I can draw it using the information that I have?
yayuj over 9 years

For sure it will help. Thanks.
predatflaps about 8 years

I wrote some code that uses a bar graph to approximate a zigzag graph. I found the canvas draw functions a bit slow and memory-hungry. I found that if you divide 44100 by 8 or 16 (for example, take the max of every 16 samples), it still looks very clear; the sample rate is then about 2,756 per second (44100 / 16), which is fine and saves memory. I also found that the graphics card was much faster at displaying vertices than attempting to do it in a texture, so I made a graph of flat polygons as lines. In 2D canvas code it's much faster anyway; you probably wouldn't miss that much compared to sending it to a graphics shader to make polys. The DX11 code is on the Unity forum.
v01pe almost 4 years

After doing some research and experimentation myself, I found that calculating the max and min values per point (a window of samples from the original file) and drawing the max upwards and the min downwards (I plan to post the algorithm as an answer as well) looks closest to all the audio software I tried (Reaper, Audacity, Reason, Live). When following the "average approach", the waveform shrinks considerably and I have to scale it up again to get nice results, which look considerably different from the min/max approach or what I saw in common DAWs.
v01pe almost 4 years

"maximum positive and and minimum negative" – I did some experimentation and this looks the closest.
v01pe almost 4 years

Jup, found it in the source: WaveClip::GetWaveDisplay: github.com/audacity/audacity/blob/master/src/WaveClip.cpp - They calculate min, max and RMS per window for display.
CraftedCart over 3 years

Will note that squaredsum being a float here is important - I was trying to implement this myself and was having issues with the rms waveform disappearing because I was overflowing int32 squaredsum
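The overflow mentioned in this comment is easy to reproduce on paper: 32767 squared is 1,073,676,289, so three full-scale 16-bit samples already push a sum of squares past the int32 maximum of 2,147,483,647. A small sketch, with hypothetical function names, that accumulates in 64 bits and checks whether a 32-bit accumulator would have been enough:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Sum of squares of 16-bit samples, accumulated in 64 bits so it cannot
// overflow for any realistic window length.
std::int64_t sum_of_squares(const std::int16_t* samples, std::size_t count) {
    assert(samples != nullptr || count == 0);
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::int64_t s = samples[i];  // widen before multiplying
        sum += s * s;
    }
    return sum;
}

// True if the same sum would not fit in a signed 32-bit accumulator.
bool would_overflow_int32(const std::int16_t* samples, std::size_t count) {
    return sum_of_squares(samples, count) > std::numeric_limits<std::int32_t>::max();
}
```

A float or double accumulator, as in the answer, sidesteps the problem as well, at the cost of some rounding for very long windows.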
Diljeet about 3 years

@CraftedCart In my opinion, drawing the maximum value of the samples and drawing the RMS value are the two most important waveforms. The average method is for a complete-song waveform (used by only a few programs), but mostly RMS is used for the complete-song waveform (without the sqrt part).
Mehmet Efe Akça about 3 years

I was wondering if there is a way to export the waveform data (not the pcm data but the samples with RMS ran through them). I manually calculate RMS from the pcm data but it is pretty slow so I was thinking ffmpeg might have a filter for it.
llogan about 3 years

@MehmetEfeAkça Should be asked as a new question.