Convert 16 bit pcm to 8 bit

Solution 1

I can't see right now why it's not enough to just take the upper byte, i.e. discard the lower 8 bits of each sample.

That of course assumes that the samples are linear; if they're not then maybe you need to do something to linearize them before dropping bits.

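// uint16_t and uint8_t come from <stdint.h>.
uint16_t sixteenBit = 0xfeed;        // one 16-bit sample
uint8_t eightBit = sixteenBit >> 8;  // keep only the upper byte
// eightBit is now 0xfe.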

As suggested by AShelly in a comment, it might be a good idea to round, i.e. add 1 if the byte we're discarding is higher than half its maximum:

eightBit += eightBit < 0xff && ((sixteenBit & 0xff) > 0x80);

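The test against 0xff implements clamping, so we don't risk adding 1 to 0xff and wrapping it to 0x00, which would be bad. Note that this assumes eightBit is an unsigned 8-bit value; for a signed 8-bit sample the largest value is 0x7f, so the clamp would have to test against that instead.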
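The truncate, round and clamp steps above can be put together roughly as follows. This is only a sketch, assuming the samples are unsigned 16-bit linear PCM; the function name to8_rounded is made up for illustration. It also rounds up when the discarded byte is exactly 0x80, which differs slightly from the one-liner above.

```c
#include <stdint.h>

/* Hypothetical helper: reduce an unsigned 16-bit sample to 8 bits,
   rounding to nearest and clamping so that 0xff never wraps to 0x00.
   Assumes unsigned, linear samples. */
static uint8_t to8_rounded(uint16_t sample)
{
    uint8_t high = (uint8_t)(sample >> 8);    /* the byte we keep */
    uint8_t low  = (uint8_t)(sample & 0xff);  /* the byte we discard */

    /* Round up if the discarded part is at least half a step,
       unless the result is already at its maximum. */
    if (low >= 0x80 && high < 0xff)
        high++;

    return high;
}
```

For example, to8_rounded(0x1234) keeps 0x12, while to8_rounded(0x1280) rounds up to 0x13.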

Solution 2

16-bit samples are usually signed (almost always stored in the range -32768 to +32767), while 8-bit samples are usually unsigned. So the simplest answer is to convert each 16-bit sample from signed to unsigned and then take the top 8 bits of the result. In C, this could be expressed as output = (unsigned char)((unsigned short)(input + 32768) >> 8). This is a good start, and might be good enough for your needs, but it won't sound very nice: the result sounds rough because of "quantization noise".
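Applied to a whole buffer, the expression above might look like this. It is a minimal sketch, assuming the input is signed 16-bit linear PCM already held in an int16_t array; the function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: shift the signed range [-32768, 32767] up to
   [0, 65535], then keep the top 8 bits. */
static uint8_t s16_to_u8(int16_t input)
{
    return (uint8_t)((uint16_t)(input + 32768) >> 8);
}

/* Convert `count` samples from `in` to `out` (no rounding, no dither). */
static void convert_s16_to_u8(const int16_t *in, uint8_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = s16_to_u8(in[i]);
}
```

With this mapping, silence (0) becomes 128, the most negative sample becomes 0 and the most positive becomes 255.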

Quantization noise is the difference between the original input and your algorithm's output. No matter what you do, you're going to have noise, and the noise will be "half a bit" on average. There's nothing you can do about that, but there are ways to make the noise less noticeable.

The main problem with the quantization noise is that it tends to form patterns. If the difference between input and output were completely random, things would actually sound fine, but instead the output will repeatedly be too high for a certain part of the waveform and too low for the next part. Your ear picks up on this pattern.

To get a result that sounds good, you need to add dithering. Dithering is a technique that tries to smooth out the quantization noise. The simplest dithering just breaks up the patterns in the noise so that they don't distract from the actual signal. Better dithering can go a step further and reduce the noise by accumulating the error values from multiple samples and then adding in a correction when the total error gets large enough to be worth correcting.
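As an illustration of the simplest kind of dithering only, here is a sketch that adds a small random value to each sample before the low bits are dropped. The choice of triangular noise built from two rand() calls is an assumption made for this example, not something taken from SoX or any other tool, and the function name is invented. It does not implement the error-accumulating variant described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: signed 16-bit to unsigned 8-bit with simple
   triangular dither. Uses rand(), so seed it with srand() if you
   want different noise on each run. */
static void convert_s16_to_u8_dithered(const int16_t *in, uint8_t *out,
                                       size_t count)
{
    for (size_t i = 0; i < count; i++) {
        /* Shift the signed sample into the unsigned range 0..65535. */
        int32_t v = (int32_t)in[i] + 32768;

        /* Difference of two uniform values in 0..255 gives triangular
           noise in -255..255, i.e. roughly +/- one 8-bit step. */
        int32_t r1 = (rand() >> 7) & 0xff;
        int32_t r2 = (rand() >> 7) & 0xff;
        int32_t noise = r1 - r2;

        /* Add the noise plus half a step so the shift below rounds. */
        v += noise + 128;

        /* Clamp so the noise cannot push the value out of range. */
        if (v < 0)
            v = 0;
        if (v > 65535)
            v = 65535;

        out[i] = (uint8_t)(v >> 8);
    }
}
```

The output is no longer identical from run to run for the same input unless the random generator is seeded the same way, which is expected: the point is to trade patterned error for unpatterned noise.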

You can find explanations and code samples for various dithering algorithms online. One good area to investigate might be the SoX tool, http://en.wikipedia.org/wiki/SoX. Check the source for its dithering effect, and experiment with converting various sounds from 16-bit to 8-bit with and without dithering enabled. You will be surprised by the difference in quality that dithering can make when converting to 8-bit sound.

Solution 3

byteData = (byte) (((shortData + 32768) >> 8) & 0xFF);

This worked for me.

Solution 4

Normalize the 16-bit samples (map each one to a fraction of the full 16-bit range), then rescale by the maximum value of your 8-bit sample.

This yields a more accurate conversion, as the lower 8 bits of each sample aren't simply being discarded. However, my solution is more computationally expensive than the selected answer.
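One way to read "normalize, then rescale" is sketched below. It assumes signed 16-bit input and unsigned 8-bit output, which the answer does not state, and the function name is made up for the example.

```c
#include <stdint.h>

/* Hypothetical helper: map a signed 16-bit sample onto 0..255 by
   normalizing to 0.0..1.0 first and then rescaling. */
static uint8_t s16_to_u8_scaled(int16_t input)
{
    /* Normalize: -32768 maps to 0.0 and 32767 maps to 1.0. */
    double normalized = ((double)input + 32768.0) / 65535.0;

    /* Rescale to the 8-bit maximum; adding 0.5 before the cast
       rounds to nearest because the value is never negative. */
    return (uint8_t)(normalized * 255.0 + 0.5);
}
```

The floating-point divide and multiply per sample are where the extra cost mentioned above comes from.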

Author: gop

Mobile applications development

Updated on June 17, 2022

Comments

  • gop almost 2 years

    I have pcm audio stored in a byte array. It is 16 bits per sample. I want to make it 8 bit per sample audio.

    Can anyone suggest a good algorithm to do that?

    I haven't mentioned the bitrate because I think it isn't important for the algorithm - right?

  • Sunilsingh about 13 years
    You might also want to round instead of truncate. Add 'eightbit += (sixteenbit & 0x80)>>7;' to add 1 if the lower byte is more than half its range.
  • unwind about 13 years
    @AShelly: true, that might be a good idea ... Your code will cause values in the range 0xff00 to 0xffff to wrap to 0x00 though, which is probably worse than not rounding at all. I'll edit.
  • gop about 13 years
    Thanks. If my input is in a byte array (byte[] arr, not short), does this mean I should just drop half of the bytes, i.e. take arr[0], arr[2], arr[4], etc.?
  • unwind about 13 years
    @gosho-ot-pochivka: Yes, I think so. I'm a bit worried about sign issues, if the samples are signed 16-bit numbers, but since the sign bit will be preserved, it should be fine.
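
For the byte-array case raised in the comments, which bytes to keep depends on the byte order of the data. The sketch below assumes little-endian signed 16-bit samples (as in typical WAV files), so the upper byte of each sample is at the odd indices arr[1], arr[3], ... rather than the even ones; for big-endian data it would be the even indices. It also assumes unsigned 8-bit output, so the sign bit is flipped. The function name is invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a raw byte buffer of little-endian
   signed 16-bit samples to unsigned 8-bit samples.
   Returns the number of samples written to `out`. */
static size_t bytes_s16le_to_u8(const uint8_t *in, size_t in_len,
                                uint8_t *out)
{
    size_t samples = in_len / 2;  /* a trailing odd byte is ignored */

    for (size_t i = 0; i < samples; i++) {
        /* In little-endian order the most significant byte comes second. */
        uint8_t high = in[2 * i + 1];

        /* Flipping the top bit is the same as adding 32768 to the
           16-bit sample before taking its upper byte. */
        out[i] = (uint8_t)(high ^ 0x80);
    }
    return samples;
}
```

If the 8-bit output is meant to stay signed, the XOR with 0x80 would be dropped and the upper byte copied as-is.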