Decoding Opus audio data

Solution 1

I think the opus_demo.c program from the source tarball has what you want.

It's pretty complicated though, because of all the unrelated code pertaining to:

  • encoding, parsing encoder parameters from command line arguments
  • artificial packet loss injection
  • random framesize selection/changing on-the-fly
  • inband FEC (meaning decoding into two buffers, toggling between the two)
  • debug and verification
  • bit-rate statistics reporting

Removing all these bits turns out to be a very tedious job. But once you do, you end up with pretty clean, understandable code (see below).

Note that I

  • kept the 'packet-loss' protocol code (even though packet loss won't happen reading from a file) for reference
  • kept the code that verifies the final range after decoding each frame

I kept both mostly because they don't seem to complicate the code, and you might be interested in them.

I tested this program in two ways:

  • aurally (by verifying that a mono wav previously encoded using opus_demo was correctly decoded using this stripped decoder). The test wav was ~23 MB, 2.9 MB compressed.
  • regression tested alongside the vanilla opus_demo when called with ./opus_demo -d 48000 1 <opus-file> <pcm-file>. The resultant file had the same md5sum checksum as the one decoded using the stripped decoder here.

MAJOR UPDATE I C++-ified the code. This should get you somewhere using iostreams.

  • Note the loop on fin.readsome now; this loop could be made 'asynchronous', i.e. it could be made to return, and continue reading when new data arrives (on the next invocation of your Decode function?) [1]
  • I have cut the dependencies on opus.h from the header file
  • I have replaced "all" manual memory management by standard library (vector, unique_ptr) for exception safety and robustness.
  • I have implemented an OpusErrorException class deriving from std::exception which is used to propagate errors from libopus

See all the code + Makefile here: https://github.com/sehe/opus/tree/master/contrib

[1] for true async IO (e.g. network or serial communication) consider using Boost Asio, see e.g. http://www.boost.org/doc/libs/1_53_0/doc/html/boost_asio/overview/networking/iostreams.html
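
As a rough sketch of the 'asynchronous' idea above (and of the buffer you would need if callers hand you chunks of arbitrary size), the framing can be separated from the decoding. The layout assumed here is the one decode_frame below reads: a 4-byte big-endian payload length, a 4-byte big-endian final range value, then the payload. Packet and PacketFramer are hypothetical names made up for this illustration; they are not part of libopus or opus_demo.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for illustration only (not part of libopus).
struct Packet
{
    std::uint32_t final_range;           // encoder's range coder state, as stored in the file
    std::vector<unsigned char> payload;  // an empty payload marks a lost packet
};

class PacketFramer
{
public:
    // Append a chunk of any size, e.g. whatever a socket or caller delivered.
    void push(const unsigned char* bytes, std::size_t n)
    {
        _buf.insert(_buf.end(), bytes, bytes + n);
    }

    // Move the next complete packet into 'out'.
    // Returns false when more input is needed; buffered bytes are kept.
    bool pop(Packet& out)
    {
        const std::size_t header = 8; // 4 bytes length + 4 bytes final range
        if (_buf.size() < header)
            return false;

        const std::uint32_t len = be32(&_buf[0]);
        if (_buf.size() - header < len)
            return false;

        out.final_range = be32(&_buf[4]);
        out.payload.assign(_buf.begin() + header, _buf.begin() + header + len);
        _buf.erase(_buf.begin(), _buf.begin() + header + len);
        return true;
    }

private:
    // Assemble a big-endian 32-bit value; widen each byte before shifting.
    static std::uint32_t be32(const unsigned char* p)
    {
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) <<  8) |  std::uint32_t(p[3]);
    }

    std::vector<unsigned char> _buf;
};
```

Each popped payload could then be handed to opus_decode in one call. A real implementation would probably also want to reject lengths above the maximum packet size before buffering them.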

Header File

// (c) Seth Heeren 2013
//
// Based on src/opus_demo.c in opus-1.0.2
// License see http://www.opus-codec.org/license/
#include <cstdint>   // int32_t
#include <stdexcept>
#include <memory>
#include <iosfwd>

struct OpusErrorException : public virtual std::exception
{
    OpusErrorException(int code) : code(code) {}
    const char* what() const noexcept;
private:
    const int code;
};

struct COpusCodec
{
    COpusCodec(int32_t sampling_rate, int channels);
    ~COpusCodec();

    bool decode_frame(std::istream& fin, std::ostream& fout);
private:
    struct Impl;
    std::unique_ptr<Impl> _pimpl;
};

Implementation File

// (c) Seth Heeren 2013
//
// Based on src/opus_demo.c in opus-1.0.2
// License see http://www.opus-codec.org/license/
#include "COpusCodec.hpp"
#include <vector>
#include <iomanip>
#include <memory>
#include <sstream>

#include "opus.h"

#define MAX_PACKET 1500

const char* OpusErrorException::what() const noexcept
{
    return opus_strerror(code);
}

// I'd suggest reading with boost::spirit::big_dword or similar
// Assembles a big-endian 32-bit value; each byte is widened to uint32_t before shifting
static uint32_t char_to_int(char ch[4])
{
    return (static_cast<uint32_t>(static_cast<unsigned char>(ch[0]))<<24) |
        (static_cast<uint32_t>(static_cast<unsigned char>(ch[1]))<<16) |
        (static_cast<uint32_t>(static_cast<unsigned char>(ch[2]))<< 8) |
        (static_cast<uint32_t>(static_cast<unsigned char>(ch[3]))<< 0);
}

struct COpusCodec::Impl
{
    Impl(int32_t sampling_rate = 48000, int channels = 1)
    : 
        _channels(channels),
        _decoder(nullptr, &opus_decoder_destroy),
        _state(_max_frame_size, MAX_PACKET, channels)
    {
        int err = OPUS_OK;
        auto raw = opus_decoder_create(sampling_rate, _channels, &err);
        _decoder.reset(err == OPUS_OK? raw : throw OpusErrorException(err) );
    }

    bool decode_frame(std::istream& fin, std::ostream& fout)
    {
        char ch[4] = {0};

        if (!fin.read(ch, 4) && fin.eof())
            return false;

        uint32_t len = char_to_int(ch);

        if(len>_state.data.size())
            throw std::runtime_error("Invalid payload length");

        if (!fin.read(ch, 4))
            throw std::runtime_error("Truncated frame header");
        const uint32_t enc_final_range = char_to_int(ch);
        const auto data = reinterpret_cast<char*>(&_state.data.front());

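        size_t read = 0ul;
        while (fin && read<len)
        {
            // readsome only returns what is immediately available; append after the bytes already read
            const auto got = fin.readsome(data + read, len-read);
            if (got <= 0)
                break; // nothing available right now: reported as "Ran out of input" below
            read += static_cast<size_t>(got);
        }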

        if(read<len)
        {
            std::ostringstream oss;
            oss << "Ran out of input, expecting " << len << " bytes got " << read << " at " << fin.tellg();
            throw std::runtime_error(oss.str());
        }

        int output_samples;
        const bool lost = (len==0);
        if(lost)
        {
            opus_decoder_ctl(_decoder.get(), OPUS_GET_LAST_PACKET_DURATION(&output_samples));
        }
        else
        {
            output_samples = _max_frame_size;
        }

        output_samples = opus_decode(
                _decoder.get(), 
                lost ? NULL : _state.data.data(),
                len,
                _state.out.data(),
                output_samples,
                0);

        if(output_samples>0)
        {
            for(int i=0; i<(output_samples)*_channels; i++)
            {
                short s;
                s=_state.out[i];
                _state.fbytes[2*i]   = s&0xFF;
                _state.fbytes[2*i+1] = (s>>8)&0xFF;
            }
            if(!fout.write(reinterpret_cast<char*>(_state.fbytes.data()), sizeof(short)* _channels * output_samples))
                throw std::runtime_error("Error writing");
        }
        else
        {
            throw OpusErrorException(output_samples); // negative return is error code
        }

        uint32_t dec_final_range;
        opus_decoder_ctl(_decoder.get(), OPUS_GET_FINAL_RANGE(&dec_final_range));

        /* compare final range encoder rng values of encoder and decoder */
        if(enc_final_range!=0
                && !lost && !_state.lost_prev
                && dec_final_range != enc_final_range)
        {
            std::ostringstream oss;
            oss << "Error: Range coder state mismatch between encoder and decoder in frame " << _state.frameno << ": " <<
                    "0x" << std::setw(8) << std::setfill('0') << std::hex << (unsigned long)enc_final_range <<
                    " vs 0x" << std::setw(8) << std::setfill('0') << std::hex << (unsigned long)dec_final_range;

            throw std::runtime_error(oss.str());
        }

        _state.lost_prev = lost;
        _state.frameno++;

        return true;
    }
private:
    const int _channels;
    const int _max_frame_size = 960*6;
    std::unique_ptr<OpusDecoder, void(*)(OpusDecoder*)> _decoder;

    struct State
    {
        State(int max_frame_size, int max_payload_bytes, int channels) :
            out   (max_frame_size*channels),
            fbytes(max_frame_size*channels*sizeof(decltype(out)::value_type)),
            data  (max_payload_bytes)
        { }

        std::vector<short>         out;
        std::vector<unsigned char> fbytes, data;
        int32_t frameno   = 0;
        bool    lost_prev = true;
    };
    State _state;
};

COpusCodec::COpusCodec(int32_t sampling_rate, int channels)
    : _pimpl(std::unique_ptr<Impl>(new Impl(sampling_rate, channels)))
{
    //
}

COpusCodec::~COpusCodec()
{
    // this instantiates the pimpl deletor code on the, now-complete, pimpl class
}

bool COpusCodec::decode_frame(
        std::istream& fin,
        std::ostream& fout)
{
    return _pimpl->decode_frame(fin, fout);
}

test.cpp

// (c) Seth Heeren 2013
//
// Based on src/opus_demo.c in opus-1.0.2
// License see http://www.opus-codec.org/license/
#include <fstream>
#include <iostream>

#include "COpusCodec.hpp"

int main(int argc, char *argv[])
{
    if(argc != 3)
    {
        std::cerr << "Usage: " << argv[0] << " <input> <output>\n";
        return 255;
    }

    try
    {
        std::basic_ifstream<char> fin (argv[1], std::ios::binary);
        std::basic_ofstream<char> fout(argv[2], std::ios::binary);

        // thrown inside the try block so the handlers below report them
        if(!fin)  throw std::runtime_error("Could not open input file");
        if(!fout) throw std::runtime_error("Could not open output file");

        COpusCodec codec(48000, 1);

        size_t frames = 0;
        while(codec.decode_frame(fin, fout))
        {
            frames++;
        }

        std::cout << "Successfully decoded " << frames << " frames\n";
    }
    catch(OpusErrorException const& e)
    {
        std::cerr << "OpusErrorException: " << e.what() << "\n";
        return 255;
    }
    catch(std::exception const& e)
    {
        // I/O failures, bad payload lengths and range mismatches arrive as std::runtime_error
        std::cerr << "Error: " << e.what() << "\n";
        return 255;
    }
}

Solution 2

libopus provides an API for turning opus packets into chunks of PCM data, and vice-versa.

But to store opus packets in a file, you need some kind of container format that stores the packet boundaries. opus_demo is, well, a demo app: it has its own minimal container format for testing purposes that is not documented, and thus files produced by opus_demo should not be distributed. The standard container format for opus files is Ogg, which also provides support for metadata and sample-accurate decoding and efficient seeking for variable-bitrate streams. Ogg Opus files have the extension ".opus".

The Ogg Opus spec is at https://wiki.xiph.org/OggOpus.
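
To illustrate what "storing the packet boundaries" means in Ogg: each Ogg page starts with a 27-byte header (beginning with the capture pattern "OggS"), followed by a segment table of "lacing values". A packet's size is the sum of consecutive lacing values up to and including the first one below 255. The sketch below only computes packet sizes for a single page; packet_sizes_on_page is a hypothetical helper written for this illustration, and it ignores CRC checks and packets continued from a previous page. Real code should rely on libogg or libopusfile instead.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper for illustration; not part of libogg or libopusfile.
// Returns the sizes of the packets that END on this page. If the segment
// table ends with a 255 lacing value, the last packet continues on the next
// page and the bytes seen so far are reported through 'unfinished'.
// Note: if the page itself continues a packet from the previous page, the
// first returned size covers only the part stored on this page.
std::vector<std::size_t> packet_sizes_on_page(const unsigned char* page,
                                              std::size_t n,
                                              std::size_t& unfinished)
{
    const std::size_t header = 27; // fixed part of the Ogg page header
    if (n < header || std::memcmp(page, "OggS", 4) != 0)
        throw std::runtime_error("Not an Ogg page");

    const std::size_t segments = page[26]; // number of lacing values
    if (n < header + segments)
        throw std::runtime_error("Truncated segment table");

    std::vector<std::size_t> sizes;
    std::size_t current = 0;
    for (std::size_t i = 0; i < segments; ++i)
    {
        const unsigned char lacing = page[header + i];
        current += lacing;
        if (lacing < 255) // a value below 255 terminates the packet
        {
            sizes.push_back(current);
            current = 0;
        }
    }
    unfinished = current;
    return sizes;
}
```

The payload bytes follow directly after the segment table, in the same order as the lacing values.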

(Since Opus is also a VoIP codec, there are uses of Opus that do not require a container, such as transmitting Opus packets directly over UDP.)

So firstly you should encode your files using opusenc from opus-tools, not opus_demo. Other software can produce Ogg Opus files too (I believe gstreamer and ffmpeg can, for example) but you can't really go wrong with opus-tools as it's the reference implementation.

Then, assuming your files are standard Ogg Opus files (that can be read by, say, Firefox), what you need to do is: (a) extract opus packets from the Ogg container; (b) pass the packets to libopus and get raw PCM back.

Conveniently, there's a library called libopusfile that does precisely this. libopusfile supports all of the features of Ogg Opus streams, including metadata and seeking (including seeking over an HTTP connection).

libopusfile is available at https://git.xiph.org/?p=opusfile.git and https://github.com/xiph/opusfile. The API is documented here, and opusfile_example.c (xiph.org | github) provides example code for decoding to WAV. Since you're on Windows, I should add that there are prebuilt DLLs on the downloads page.

Author by

tmighty

Updated on March 24, 2020

Comments

  • tmighty
    tmighty over 4 years

    I am trying to decode an Opus file back to raw 48 kHz. However I am unable to find any sample code to do that.

    My current code is this:

    void COpusCodec::Decode(unsigned char* encoded, short* decoded, unsigned int len)
    {
         int max_size=960*6;//not sure about this one
    
         int error;
         dec = opus_decoder_create(48000, 1, &error);//decode to 48kHz mono
    
         int frame_size=opus_decode(dec, encoded, len, decoded, max_size, 0);
    }
    

    The argument "encoded" might be larger amounts of data, so I think I have to split it into frames. I am not sure how I could do that.

    And with being a beginner with Opus, I am really afraid to mess something up.

    Could anybody perhaps help?

  • tmighty
    tmighty about 11 years
    Thank you for your efforts and your explanations. Could you relate to my code more exactly? I have worked with the demo code from the Opus website as well, and it works, no doubt. But I don't understand it. I need it in a much simpler form. I have read the encoded bytes, and I would like to turn them into decoded shorts. I don't have the file open for reading anymore, so the fread(ch, 1, 4, fin) is not needed, but I also don't know how I could do it without it because I don't understand the sample code. I am glad that you do! You seem to be one of the few people in the world who do.
  • sehe
    sehe about 11 years
    See the comment at the original post. I think you should be able to work exactly with the above, potentially after adding a buffer to decouple from the external users of COpusCodec::Decode in case you can't "control" them and they might call your API with indiscriminate packet sizes.
  • sehe
    sehe about 11 years
    @tmighty I have added a major update with a C++ version of the code. Regression test checks out fine for my test files. Test program now much cleaner :) github.com/sehe/opus/blob/master/contrib/test.cpp. See also notes in answer.
  • sehe
    sehe about 11 years
    +1 for highly informative. I'd worked out the same ideas about file format, but it's not entirely clear to me whether the OP is interested in decoding realtime packets instead, perhaps receiving them over UDP ("mono" seems to indicate no hifi intentions; also, the original tags included speex, which I'd associate with voice chat applications)
  • Chase
    Chase almost 10 years
    Any chance you still have the C code as well? C++ streams give me nightmares.
  • sehe
    sehe almost 10 years
    @Chase sure. It's open source! And the link is right in the first sentence of my answer :)
  • sehe
    sehe almost 10 years
    @Chase just realized I had a cleaned up version of the C code there historically (why?!?! :)) and SO keeps older versions: stackoverflow.com/revisions/16551709/2
  • AnthumChris
    AnthumChris over 6 years
    Great mention of opusfile_example.c, I wish they had that listed on the Opus website.
  • Ibrahim
    Ibrahim over 5 years
    @sehe Could you please explain the structure of the input file for decoding raw Opus? My guess is that the input file must contain the Opus packets one after another, each preceded by 4 bytes giving the length of the packet. Is that true?
  • tytyryty
    tytyryty over 3 years
    Is it possible to do only (a), extracting Opus packets from the Ogg container, using libopusfile, or is it an end-to-end solution only?