FFmpeg - resampling from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16 got very bad sound quality (slow, out of tune, noise)


Solution 1

Keep in mind that AV_SAMPLE_FMT_FLTP is a planar format: each channel is stored in its own buffer. If your code expects AV_SAMPLE_FMT_S16 (interleaved) output, you have to reorder the samples as part of the conversion. With 2 audio channels, interleaved samples are ordered "c0, c1, c0, c1, c0, c1, ...", whereas planar samples are ordered "c0, c0, c0, ..., c1, c1, c1, ...".
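
To make the two layouts concrete, here is a minimal sketch in plain C++ with no FFmpeg dependency. The helper name `interleave_planar` and the use of `std::vector` are assumptions made for illustration and are not part of the FFmpeg API; with a real `AVFrame`, each plane would instead come from `frame->extended_data[c]`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an FFmpeg function): reorders planar samples
// "c0 c0 c0 ... c1 c1 c1 ..." into interleaved order "c0 c1 c0 c1 ...".
// planes[c] holds all samples of channel c; all planes must have equal length.
template <typename T>
std::vector<T> interleave_planar(const std::vector<std::vector<T>>& planes) {
    const std::size_t channels = planes.size();
    const std::size_t nb_samples = channels ? planes[0].size() : 0;
    std::vector<T> out(nb_samples * channels);
    for (std::size_t i = 0; i < nb_samples; ++i) {
        for (std::size_t c = 0; c < channels; ++c) {
            // Sample i of channel c lands at position i * channels + c.
            out[i * channels + c] = planes[c][i];
        }
    }
    return out;
}
```

Calling it with two planes `{1, 2, 3}` and `{10, 20, 30}` yields `1, 10, 2, 20, 3, 30`. Feeding planar data to a player that expects interleaved data skips this reordering, which is one plausible source of the distortion described in the question.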

Similar question: What is the difference between AV_SAMPLE_FMT_S16P and AV_SAMPLE_FMT_S16?

Details here: http://www.ffmpeg.org/doxygen/2.0/samplefmt_8h.html

Solution 2

I've had good luck doing something similar. In your code block:

int nb_samples = frame_->nb_samples;
int channels = frame_->channels;
int outputBufferLen = nb_samples * channels * 2;
auto outputBuffer = (int16_t*)outbuf;

for (int i = 0; i < nb_samples; i++) {
   for (int c = 0; c < channels; c++) {
      float* extended_data = (float*)frame_->extended_data[c];
      float sample = extended_data[i];
      if (sample < -1.0f) sample = -1.0f;
      else if (sample > 1.0f) sample = 1.0f;
      outputBuffer[i * channels + c] = (int16_t)round(sample * 32767.0f);
   }

}

Try replacing it with the following:

int nb_samples = frame_->nb_samples;
int channels = frame_->channels;
int outputBufferLen = nb_samples * channels * 2; // bytes: 2 per 16-bit sample
auto outputBuffer = (int16_t*)outbuf;

for(int i=0; i < nb_samples; i++) {
   for(int c=0; c < channels; c++) {
      // read sample i from the plane of channel c, then scale to 16-bit range
      outputBuffer[i*channels+c] = (int16_t)(((float *)frame_->extended_data[c])[i] * 32767.0f);
   }
}
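
Note that the shorter loop drops the clamping from the original code. Decoded float samples can fall slightly outside [-1.0, 1.0], and in C++ converting a float whose value does not fit in the target integer type is undefined behavior, so keeping the clamp is the safer choice. A minimal sketch of the per-sample conversion follows; the helper name `float_to_s16` is an assumption for illustration, not an FFmpeg function, and mapping NaN to silence is a design choice made here.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not an FFmpeg function): maps one float sample in the
// nominal range [-1.0, 1.0] to signed 16-bit PCM.
inline std::int16_t float_to_s16(float sample) {
    if (std::isnan(sample)) return 0;       // assumption: treat NaN as silence
    if (sample < -1.0f) sample = -1.0f;     // clamp before scaling so the
    else if (sample > 1.0f) sample = 1.0f;  // result always fits in int16_t
    return static_cast<std::int16_t>(std::lround(sample * 32767.0f));
}
```

Inside the interleaving loop this would be used as `outputBuffer[i*channels+c] = float_to_s16(((float *)frame_->extended_data[c])[i]);`.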
Author by

kaienfr

Updated on July 26, 2022

Comments

  • kaienfr almost 2 years

    I am confused by the resampling result in the new ffmpeg. I decode AAC audio into PCM, and ffmpeg shows the audio information as:

    Stream #0:0: Audio: aac, 44100 Hz, stereo, fltp, 122 kb/s
    

    In the new ffmpeg, the output samples are in fltp format, so I have to convert them from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16.

    PS: in old ffmpeg versions such as libavcodec 54.12.100, the output is directly S16, so no resampling is needed and there is no sound quality problem.

    I have tried three ways to resample:

    1. using swr_convert

    2. using avresample_convert

    3. converting manually

    All of them yield the same result: the sound quality is really bad, very slow and out of tune, with some noise too.

    My resampling code is as follows:

    void resampling(AVFrame* frame_, AVCodecContext* pCodecCtx, int64_t want_sample_rate, uint8_t* outbuf){
        SwrContext      *swrCtx_ = 0;
        AVAudioResampleContext *avr = 0;
    
        // Initializing the sample rate convert. We only really use it to convert float output into int.
        int64_t wanted_channel_layout = AV_CH_LAYOUT_STEREO;
    
    #ifdef AV_SAMPLEING
        avr = avresample_alloc_context();
        av_opt_set_int(avr, "in_channel_layout", frame_->channel_layout, 0);
        av_opt_set_int(avr, "out_channel_layout", wanted_channel_layout, 0);
        av_opt_set_int(avr, "in_sample_rate", frame_->sample_rate, 0);
        av_opt_set_int(avr, "out_sample_rate", 44100, 0);
        av_opt_set_int(avr, "in_sample_fmt", pCodecCtx->sample_fmt, 0); //AV_SAMPLE_FMT_FLTP
        av_opt_set_int(avr, "out_sample_fmt", AV_SAMPLE_FMT_S16, 0);
        av_opt_set_int(avr, "internal_sample_fmt", pCodecCtx->sample_fmt, 0);
        avresample_open(avr);
        avresample_convert(avr, &outbuf, frame_->linesize[0], frame_->nb_samples, frame_->extended_data, frame_->linesize[0], frame_->nb_samples);
        avresample_close(avr);
        return;
    #endif
    
    #ifdef USER_SAMPLEING
        if (pCodecCtx->sample_fmt == AV_SAMPLE_FMT_FLTP)
        {
                int nb_samples = frame_->nb_samples;
                int channels = frame_->channels;
                int outputBufferLen = nb_samples * channels * 2;
                auto outputBuffer = (int16_t*)outbuf;
    
                for (int i = 0; i < nb_samples; i++)
                {
                        for (int c = 0; c < channels; c++)
                        {
                                float* extended_data = (float*)frame_->extended_data[c];
                                float sample = extended_data[i];
                                if (sample < -1.0f) sample = -1.0f;
                                else if (sample > 1.0f) sample = 1.0f;
                                outputBuffer[i * channels + c] = (int16_t)round(sample * 32767.0f);
                        }
                }
                return;
        }
    #endif
        swrCtx_ = swr_alloc_set_opts(
                NULL, //swrCtx_,
                wanted_channel_layout,
                AV_SAMPLE_FMT_S16,
                want_sample_rate,
                pCodecCtx->channel_layout,
                pCodecCtx->sample_fmt,
                pCodecCtx->sample_rate,
                0,
                NULL);
    
        if (!swrCtx_ || swr_init(swrCtx_) < 0) {
                printf("swr_init: Failed to initialize the resampling context");
                return;
        }
    
        // convert audio to AV_SAMPLE_FMT_S16
        int swrRet = swr_convert(swrCtx_, &outbuf, frame_->nb_samples, (const uint8_t **)frame_->extended_data, frame_->nb_samples);
        if (swrRet < 0) {
                printf("swr_convert: Error while converting %d", swrRet);
                return;
        }
    }
    

    What should I do?

    PS1: playing the file with ffplay works fine.

    PS2: saving the resampled S16 PCM to a file and playing it shows the same sound quality problem.

    Thanks a lot for your help and suggestions!


    I've also noticed that in old ffmpeg, AAC is recognized as FLT format and directly decoded into 16-bit PCM, while in new ffmpeg, AAC is treated as FLTP format and still produces 32-bit IEEE float output.

    Thus the same code produces quite different outputs with different versions of ffmpeg. So I'd like to ask: what is the right way to convert AAC audio to 16-bit PCM in the new version?

    Thanks a lot in advance!