How can I convert an FFmpeg AVFrame with pixel format AV_PIX_FMT_CUDA to a new AVFrame with pixel format AV_PIX_FMT_RGB

Solution 1

This is my understanding of hardware decoding in FFmpeg 4.1. Below are my conclusions after studying the source code.

First, I recommend taking inspiration from the hw_decode example:

https://github.com/FFmpeg/FFmpeg/blob/release/4.1/doc/examples/hw_decode.c

With the new API, you send a packet to the decoder using avcodec_send_packet(), then use avcodec_receive_frame() to retrieve the decoded frame.

There are two different kinds of AVFrame: software frames, stored in "CPU" memory (a.k.a. RAM), and hardware frames, stored in graphics card memory.

Getting AVFrame from the hardware

To retrieve the hardware frame and get it into a readable AVFrame that can be converted with swscale, av_hwframe_transfer_data() must be used to copy the data from the graphics card. Then look at the pixel format of the retrieved frame; it is usually NV12 when using NVIDIA decoding.

// According to the API, if the format of the AVFrame is set before calling
// av_hwframe_transfer_data(), the graphics card will try to automatically
// convert to the desired format (with some limitations; see below).
m_swFrame->format = AV_PIX_FMT_NV12;

// retrieve data from GPU to CPU
err = av_hwframe_transfer_data(
     m_swFrame, // The frame that will contain the usable data.
     m_decodedFrame, // Frame returned by avcodec_receive_frame()
     0);

const char* gpu_pixfmt = av_get_pix_fmt_name((AVPixelFormat)m_decodedFrame->format);
const char* cpu_pixfmt = av_get_pix_fmt_name((AVPixelFormat)m_swFrame->format);

Listing supported "software" pixel formats

A side note here: if you want to select the pixel format yourself, note that not all AVPixelFormat values are supported. AVHWFramesConstraints is your friend here:

AVBufferRef* hwDeviceCtx = nullptr;
AVHWDeviceType type = AV_HWDEVICE_TYPE_CUDA;
int err = av_hwdevice_ctx_create(&hwDeviceCtx, type, nullptr, nullptr, 0);
if (err < 0) {
    // Err
}

AVHWFramesConstraints* hw_frames_const = av_hwdevice_get_hwframe_constraints(hwDeviceCtx, nullptr);
if (hw_frames_const == nullptr) {
    // Err
}

// Check if we can convert the pixel format to a readable format.
AVPixelFormat found = AV_PIX_FMT_NONE;
for (AVPixelFormat* p = hw_frames_const->valid_sw_formats; 
    *p != AV_PIX_FMT_NONE; p++)
{
    // Check if we can convert to the desired format.
    if (sws_isSupportedInput(*p))
    {
        // Ok! This format can be used with swscale!
        found = *p;
        break;
    }
}

// Don't forget to free the constraint object.
av_hwframe_constraints_free(&hw_frames_const);

// Attach your hw device to your codec context if you want to use hw decoding.
// Check AVCodecContext.hw_device_ctx!

Finally, a quicker way is probably av_hwframe_transfer_get_formats(), but you need to have decoded at least one frame.

Hope this will help!

Solution 2

I am not an ffmpeg expert, but I had a similar problem and managed to solve it. I was getting AV_PIX_FMT_NV12 from cuvid (mjpeg_cuvid decoder), and wanted AV_PIX_FMT_CUDA for cuda processing.

I found that setting the pixel format just before decoding the frame worked.

    pCodecCtx->pix_fmt = AV_PIX_FMT_CUDA; // change format here
    avcodec_decode_video2(pCodecCtx, pFrame, &frameFinished, &packet);
    // do something with pFrame->data[0] (Y) and pFrame->data[1] (UV)

You can check which pixel formats are supported by your decoder using pix_fmts:

    AVCodec *pCodec = avcodec_find_decoder_by_name("mjpeg_cuvid");
    for (int i = 0; pCodec->pix_fmts[i] != AV_PIX_FMT_NONE; i++)
            std::cout << pCodec->pix_fmts[i] << std::endl;

I'm sure there's a better way of doing this, but I then used this list to map the integer pixel format IDs to human-readable pixel formats.

If that doesn't work, you can do a cudaMemcpy to transfer your pixels from device to host:

    cudaMemcpy(pLocalBuf, pFrame->data[0], size, cudaMemcpyDeviceToHost);

The conversion from YUV to RGB/RGBA can be done in many ways. This example does it using the libavdevice API.

Solution 3

You must use vf_scale_npp to do this. You can use either nppscale_deinterleave or nppscale_resize, depending on your needs.

Both have the same input parameters: an AVFilterContext that should be initialized with nppscale_init, an NPPScaleStageContext which takes your in/out pixel formats, and two AVFrames which are, of course, your input and output frames.

For more information you can see the npplib\nppscale definition, which has done the CUDA-accelerated format conversion and scaling since FFmpeg 3.1.

Anyway, I recommend using the NVIDIA Video Codec SDK directly for this purpose.

Author: costef

Updated on June 06, 2022

Comments

  • costef
    costef almost 2 years

    I have a simple C++ application that uses FFmpeg 3.2 to receive an H264 RTP stream. In order to save CPU, I'm doing the decoding part with the codec h264_cuvid. My FFmpeg 3.2 is compiled with hw acceleration enabled. In fact, if I do the command:

    ffmpeg -hwaccels
    

    I get

    cuvid
    

    This means that my FFmpeg setup has everything needed to "speak" with my NVIDIA card. The frames that the function avcodec_decode_video2 provides me have the pixel format AV_PIX_FMT_CUDA. I need to convert those frames to new ones with AV_PIX_FMT_RGB. Unfortunately, I can't do the conversion using the well-known functions sws_getContext and sws_scale, because the pixel format AV_PIX_FMT_CUDA is not supported. If I try with swscale I get the error:

    "cuda is not supported as input pixel format"

    Do you know how to convert an FFmpeg AVFrame from AV_PIX_FMT_CUDA to AV_PIX_FMT_RGB ? (pieces of code would be very appreciated)

  • costef
    costef over 6 years
    Hi Hamed. Thanks a lot for your answer. I'm going to study that vf_scale_npp. The functions static int nppscale_deinterleave (AVFilterContext *ctx, NPPScaleStageContext * stage, AVFrame *out, AVFrame *in); static int nppscale_resize (AVFilterContext *ctx, NPPScaleStageContext * stage, AVFrame *out, AVFrame *in); seem really well promising. I'll give me feedback soon. thanks again
  • costef
    costef over 6 years
    Hi Hamed. I tried, but with no success. I get success from the function "nppscale_init" but failure from "nppscale_deinterleave". From the latter I get the error code: [in @ 0x7fff69b97820] NPP deinterleave error: -8. Apparently the problem is in my "in" AVFrame, but what? Do you know what it means? You also suggested using the NVIDIA Video Codec SDK directly for such conversions. I'm willing to use it. FFmpeg lacks documentation and good examples here. Do you have a piece of code, a function maybe, that receives an AVFrame got from cuvid and returns a new one in AV_PIX_FMT_RGB?
  • costef
    costef over 6 years
    Ok, no problem. Thanks anyway. I've investigated deeper. I saw that "nppscale_deinterleave" calls the "nppiYCbCr420_8u_P2P3R" function, and it returns the error code -8, which is NPP_NULL_POINTER_ERROR. I've checked that I don't pass any NULL to "nppscale_deinterleave", but the error still occurs. I suspect that I have to bring more of that vf_scale_npp code into my software.