How to use constant memory for beginners (CUDA C)


Constant memory in CUDA is a dedicated memory space of 65536 bytes (64 KB). It is dedicated because it has special features: it is cached, and when the threads of a warp read the same address, the value can be broadcast to all of them.

The constant memory space resides in device memory and is cached in the constant cache described in the Compute Capability 1.x and Compute Capability 2.x sections of the programming guide.

See Sect. 5.3.2. Device Memory Access and Sect. G.4.4. Constant Memory in the CUDA C Programming Guide for more details.
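
To make the broadcast behaviour concrete, here is a minimal sketch in which every thread reads the same constant addresses. The names `c_coeff` and `evalPoly`, the polynomial, and the launch configuration are illustrative assumptions, not part of the original question.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical coefficients of y = c[0] + c[1]*x + c[2]*x*x.
__constant__ float c_coeff[3];

// All threads read the same c_coeff[k], which is the uniform access
// pattern that the constant cache is designed to serve.
__global__ void evalPoly(const float *x, float *y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float xi = x[i];
        y[i] = c_coeff[0] + xi * (c_coeff[1] + xi * c_coeff[2]);
    }
}

int main(void)
{
    const int n = 8;
    float h_coeff[3] = {1.0f, 2.0f, 3.0f};
    float h_x[n], h_y[n];
    for (int i = 0; i < n; ++i)
        h_x[i] = (float)i;

    float *d_x = NULL, *d_y = NULL;
    cudaMalloc((void **)&d_x, n * sizeof(float));
    cudaMalloc((void **)&d_y, n * sizeof(float));
    cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);

    // Fill the constant array from the host before launching the kernel.
    cudaMemcpyToSymbol(c_coeff, h_coeff, sizeof(h_coeff));

    evalPoly<<<1, n>>>(d_x, d_y, n);
    cudaError_t err = cudaMemcpy(h_y, d_y, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Compare against the same polynomial evaluated on the host.
    for (int i = 0; i < n; ++i) {
        float ref = h_coeff[0] + h_x[i] * (h_coeff[1] + h_x[i] * h_coeff[2]);
        if (fabsf(h_y[i] - ref) > 1e-5f) {
            fprintf(stderr, "Mismatch at index %d\n", i);
            return 1;
        }
    }
    printf("All %d results match the host reference\n", n);
    return 0;
}
```

If each thread indexed `c_coeff` with a different, thread-dependent index, the reads would no longer be uniform, and constant memory would likely lose its advantage over ordinary global memory.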

So you can allocate constant memory for a single element, as you already did, and you can also allocate it for an array of elements.

__constant__ float c_ABC[3]; // 3 elements of type float (12 bytes)

However, dynamic allocation of constant memory is not allowed in CUDA: the size must be known at compile time. Therefore, you must copy the data from the CPU to the GPU, as you did with one element.

float pABC[] = {1, 2, 3};
...
cudaMemcpyToSymbol(c_ABC, pABC, 3 * sizeof(float)); // the array name already decays to a pointer

You can initialize pABC on the CPU, for example in a loop or by loading data from a file, and then copy the data into the constant memory of the GPU.
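
Putting these pieces together, a complete program might look like the sketch below. The macro `N_CONST`, the kernel `readConstants`, and the error handling style are assumptions added for illustration; only `c_ABC` and `pABC` come from the snippets above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N_CONST 3  // assumed number of constants; must be known at compile time

__constant__ float c_ABC[N_CONST];

// Copies the constants into global memory so the host can verify them.
__global__ void readConstants(float *out)
{
    int i = threadIdx.x;
    if (i < N_CONST)
        out[i] = c_ABC[i];
}

int main(void)
{
    // Fill the host array in a loop; it could equally be loaded from a file.
    float pABC[N_CONST];
    for (int i = 0; i < N_CONST; ++i)
        pABC[i] = (float)(i + 1);

    // One call copies the whole array into constant memory.
    cudaError_t err = cudaMemcpyToSymbol(c_ABC, pABC, sizeof(pABC));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    float *d_out = NULL;
    err = cudaMalloc((void **)&d_out, sizeof(pABC));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    readConstants<<<1, N_CONST>>>(d_out);

    // This copy waits for the kernel and may also surface launch errors.
    float h_out[N_CONST];
    err = cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel or copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Check that the device saw the same values the host wrote.
    for (int i = 0; i < N_CONST; ++i) {
        if (h_out[i] != pABC[i]) {
            fprintf(stderr, "Mismatch at index %d\n", i);
            return 1;
        }
    }
    printf("Constant memory holds the expected %d values\n", N_CONST);
    return 0;
}
```

Adding more constants then only requires changing `N_CONST` and the code that fills `pABC`; the copy itself stays a single call.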

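Note that I have adjusted your example to always use floats: your original code declares the constants as int but copies them from float variables, so the raw float bits would be stored rather than the values 1, 2, and 3.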
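
If the constants really do need different types, one possible alternative to an array is to group them in a struct and copy the struct in a single call. This is only a sketch: the struct `SimParams`, its field types, and the kernel `useParams` are hypothetical and do not appear in the original question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical parameter block; the field types only illustrate mixing.
struct SimParams {
    int   A;
    float B;
    float C;
};

__constant__ SimParams c_params;

// A single thread is enough to show that all fields are visible on the device.
__global__ void useParams(float *out)
{
    *out = c_params.A * c_params.B + c_params.C;
}

int main(void)
{
    SimParams h_params = {1, 2.0f, 3.0f};

    // The whole struct is copied to constant memory in one call.
    cudaError_t err = cudaMemcpyToSymbol(c_params, &h_params,
                                         sizeof(SimParams));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemcpyToSymbol failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    float *d_out = NULL;
    err = cudaMalloc((void **)&d_out, sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    useParams<<<1, 1>>>(d_out);

    float h_out = 0.0f;
    err = cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel or copy failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("A * B + C = %g\n", h_out);
    return 0;
}
```

Because the host and device share the same struct definition, each field keeps its own type, which avoids the int/float mismatch described above.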

Author: Federico Gentile


Updated on June 23, 2022

Comments

  • Federico Gentile almost 2 years

    I have 3 constant values (A, B, C) which I would like to save in constant memory. I figured out one way to do it by typing these lines of code:

    // CUDA global constants
    __constant__ int A;
    __constant__ int B;
    __constant__ int C;
    
    int main(void)
    {
    
        float pA=1;
        float pB=2;
        float pC=3;
        ...
        cudaMemcpyToSymbol(A, &pA, sizeof(A));
        cudaMemcpyToSymbol(B, &pB, sizeof(B));
        cudaMemcpyToSymbol(C, &pC, sizeof(C));
        ...
    }
    

    However, I believe this is not the best way to proceed, since it would become very inconvenient if I had a larger number of constants.

    Here is my question: how can I replicate the lines of code I wrote above in order to have a more compact form?