Compute Shaders Input 3d array of floats


Use a 1D buffer and index into it as though it were 3D, applying the same index arithmetic on both the CPU and the GPU.

There are only 1-dimensional buffers in HLSL. Use a function or formula to convert an N-dimensional (say 2D or 3D) index vector into a single 1D index, and use that to index into your 1D array.

If we have a 3D array indexed [z][y][x] (see footnote #1 for why), created as array[Z_MAX][Y_MAX][X_MAX], we can turn [z][y][x] into a linear index [i].

Here's how it's done...

Imagine a block cut into horizontal slices from top to bottom, so that it piles up like a stack of coins: each xy plane is one layer / slice, and the layers stack along z, which is the vertical axis. Every increment in z (upwards) skips one whole slice, which is X_MAX (width) * Y_MAX (height) elements. To that total, add how far we have walked within the current 2D slice: every step in y skips one whole row, which is X_MAX (width) elements. Finally, add the number of steps taken within the current row, which is x. You now have a 1D index.

i = z * (Y_MAX * X_MAX) + y * (X_MAX) + x; // *_MAX are the element counts per axis, matching array[Z_MAX][Y_MAX][X_MAX]
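As a minimal sketch of the formula above, here is the mapping and its inverse in C (the same arithmetic carries over to HLSL and C#). The helper names `flatten_index` and `unflatten_index` and the parameters `x_count` / `y_count` are hypothetical, introduced only for illustration; they stand for X_MAX and Y_MAX as element counts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: map a 3D coordinate (x, y, z) to a linear index.
   x_count and y_count are the element counts along x and y
   (X_MAX and Y_MAX in the formula above). */
static size_t flatten_index(size_t x, size_t y, size_t z,
                            size_t x_count, size_t y_count)
{
    assert(x < x_count && y < y_count); /* stay inside the current row and slice */
    return z * (y_count * x_count) + y * x_count + x;
}

/* Hypothetical helper: recover (x, y, z) from a linear index. */
static void unflatten_index(size_t i, size_t x_count, size_t y_count,
                            size_t *x, size_t *y, size_t *z)
{
    *x = i % x_count;             /* steps within the row     */
    *y = (i / x_count) % y_count; /* whole rows in the slice  */
    *z = i / (x_count * y_count); /* whole slices below       */
}
```

For a 16 x 16 x 16 volume, the coordinate x = 3, y = 2, z = 1 maps to 1 * 256 + 2 * 16 + 3 = 291, and the last element (15, 15, 15) maps to 4095, so the flat buffer needs 4096 elements.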

Footnote #1 I don't use Unity's coordinate system here because the idea is easier to explain with y and z swapped. With [z][y][x] indexing, neighbouring x values sit next to each other in memory, which prevents jumps all over memory; see this article. Unity would swap [z][y][x] for [y][z][x] (to operate primarily on slices laid out this same way).

Footnote #2 This principle is exactly what uint3 id : SV_DispatchThreadID does as compared with uint3 threadID : SV_GroupThreadID and uint3 groupID : SV_GroupID. See the docs:

SV_DispatchThreadID is the sum of SV_GroupID * numthreads and GroupThreadID.

...So use this instead where possible, given the structure of your program.
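The quoted relationship can be sketched on the CPU in C. This is only an illustration of the arithmetic, not how the GPU implements it; the `uint3` struct and the `dispatch_thread_id` function are hypothetical stand-ins for the HLSL type and the system-generated value.

```c
#include <assert.h>

/* Hypothetical stand-in for HLSL's uint3. */
typedef struct { unsigned x, y, z; } uint3;

/* Per axis: SV_DispatchThreadID = SV_GroupID * numthreads + SV_GroupThreadID. */
static uint3 dispatch_thread_id(uint3 group_id, uint3 num_threads,
                                uint3 group_thread_id)
{
    /* A thread's position inside its group is always below the group size. */
    assert(group_thread_id.x < num_threads.x);
    assert(group_thread_id.y < num_threads.y);
    assert(group_thread_id.z < num_threads.z);

    uint3 id = {
        group_id.x * num_threads.x + group_thread_id.x,
        group_id.y * num_threads.y + group_thread_id.y,
        group_id.z * num_threads.z + group_thread_id.z
    };
    return id;
}
```

Assuming, for example, groups of 8 x 8 x 8 threads, thread (3, 7, 0) in group (1, 0, 2) gets the ID (11, 7, 16). That ID can then be fed through the linear index formula to address a flat buffer.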

Footnote #3 This is the same way that N-dimensional indexing is achieved in C, under the hood.
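A small C sketch can check this claim. It compares the byte offset of every `a[z][y][x]` in a nested array against the offset predicted by the linear index formula. The array sizes and the function name `nested_matches_flat` are arbitrary choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { Z_MAX = 2, Y_MAX = 3, X_MAX = 4 }; /* arbitrary small extents */

/* Returns 1 if every a[z][y][x] sits at the byte offset predicted by
   i = z * (Y_MAX * X_MAX) + y * X_MAX + x, and 0 otherwise. */
static int nested_matches_flat(void)
{
    static float a[Z_MAX][Y_MAX][X_MAX];

    /* The nested array occupies exactly as many bytes as the flat one would. */
    assert(sizeof a == (size_t)Z_MAX * Y_MAX * X_MAX * sizeof(float));

    for (int z = 0; z < Z_MAX; ++z) {
        for (int y = 0; y < Y_MAX; ++y) {
            for (int x = 0; x < X_MAX; ++x) {
                size_t i = (size_t)z * Y_MAX * X_MAX + (size_t)y * X_MAX + (size_t)x;
                size_t byte_offset =
                    (size_t)((const char *)&a[z][y][x] - (const char *)a);
                if (byte_offset != i * sizeof(float)) {
                    return 0;
                }
            }
        }
    }
    return 1;
}
```

The function returns 1, since C lays out multidimensional arrays in row-major order with the last index varying fastest.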


Author: PandemoniumSyndicate

Updated on June 04, 2022

Comments

  • PandemoniumSyndicate
    PandemoniumSyndicate almost 2 years

    Writing a Compute Shader to be used in Unity 4. I'm attempting to get 3d noise.

    The goal is to get a multidimensional float3 array into my compute shader from my C# code. Is this possible in a straightforward manner (using some kind of declaration), or can it only be achieved using Texture3D objects?

    I currently have an implementation of simplex noise working on individual float3 points, outputting a single float -1 to 1. I ported the code found here for the compute shader.

    I would like to extend this to work on a 3D array of float3's (I suppose the closest comparison in C# would be Vector3[,,]) by applying the noise operation to each float3 point in the array.

    I've tried a few other things, but they feel bizarre and completely miss the point of using a parallel approach. The above is what I imagine it should look like.

    I also managed to get Scrawk's implementation working as Vertex Shaders. Scrawk got a 3D float4 array into the shader using a Texture3D, but I wasn't able to extract the floats from the texture. Is that how Compute Shaders work as well? Relying on Textures? I have probably overlooked something about getting the values out of the Texture. This seems to be how this user was getting data in, in this post. It's a similar question to mine, but not quite what I'm looking for.

    New to shaders in general, and I feel like I'm missing something pretty fundamental about Compute Shaders and how they work. The goal is to (as I'm sure you've guessed) get noise generation and mesh computation using marching cubes onto the GPU using Compute Shaders (or whatever shader is best suited to this kind of work).

    Constraints are the Free Trial Edition of Unity 4.

    Here's a skeleton of the C# code I'm using:

        int volumeSize = 16; 
        compute.SetInt ("simplexSeed", 10); 
    
        // This will be a float[,,] array with our density values. 
        ComputeBuffer output = new ComputeBuffer (/* size goes here, no idea */, 16);
        compute.SetBuffer (compute.FindKernel ("CSMain"), "Output", output);  
    
        // Buffer filled with float3[,,] equivalent, what ever that is in C#. Also what is 'Stride'? 
        // Haven't found anything exactly clear. I think it's the size of basic datatype we're using in the buffer?
        ComputeBuffer voxelPositions = new ComputeBuffer (/* size goes here, no idea */, 16); 
        compute.SetBuffer (compute.FindKernel ("CSMain"), "VoxelPos", voxelPositions);    
    
    
        compute.Dispatch(0,16,16,16);
        float[,,] res = new float[volumeSize, volumeSize, volumeSize];
    
        output.GetData(res); // <=== populated with float density values
    
        MarchingCubes.DoStuff(res); // <=== The goal (Obviously not implemented yet)
    

    And here's the Compute Shader

    #pragma kernel CSMain
    
    uniform int simplexSeed;
    RWStructuredBuffer<float3[,,]> VoxelPos;  // I know these won't work, but it's what I'm trying
    RWStructuredBuffer<float[,,]> Output;     // to get in there. 
    
    float simplexNoise(float3 input)
    {
        /* ... A bunch of awesome stuff the pastebin guy did ...*/
    
        return noise;
    }
    
    /** A bunch of other awesome stuff to support the simplexNoise function **/
    /* .... */
    
    /* Here's the entry point, with my (supposedly) supplied input kicking things off */
    [numthreads(16,16,16)] // <== Not sure if this thread count is correct? 
    void CSMain (uint3 id : SV_DispatchThreadID)
    {
        Output[id.xyz] = simplexNoise(VoxelPos.xyz); // Where the action starts.     
    }
    
  • PandemoniumSyndicate
    PandemoniumSyndicate about 10 years
    Yeah, I'm definitely trying to grok the numthreads stuff. Yep, the goal is voxel terrain :) Seems this is the trial-by-fire most proc terrain devs face. I'm going for a 3D array because I want a point cloud, not a height map. This allows generation of overhangs and caves and such. I have an implementation running on the CPU, but it's about 20 fps while generating chunks. So is the only way to get multidimensional float data into a compute shader by using a Texture2D or Texture3D? Or is there another type syntax I can use for the RWStructuredBuffer<multi-dim array type>?
  • War
    War about 10 years
    This is probably a bit presumptuous of me, but I think you are going about it the wrong way ... generating the array like that is not feasible. It's a 2-pass process: first generate the "height of a stack" up to ground level and fill it with "dirt" or whatever, then run a procedure (compute kernel) on the same array to "dig tunnels".
  • War
    War about 10 years
    Of course there's always the GPU Gems approach ... http.developer.nvidia.com/GPUGems3/gpugems3_ch01.html ... if you are using that, then I think you might be confusing a density function with a noise function; they are alike but still different functions.
  • PandemoniumSyndicate
    PandemoniumSyndicate about 10 years
    Right, but for both those passes I still need noise. I'm starting with the 3D noise first. The issue is: how do I get a 3D array of Vector3s into an HLSL compute shader through Unity 4? Perhaps I shouldn't have mentioned the terrain, because now we're off topic. Forget about the proc gen stuff. I just need to get initialization data into my shader.
  • War
    War about 10 years
    Surely you just declare a new compute buffer on the CPU, assign it to your compute shader, then fill it up in your GPU code. If that's the case, then what you are doing is already correct, but you can't have a "3D array buffer"; you have to declare a flat one ... RWStructuredBuffer<float> buffer;
  • War
    War about 10 years
    Might be worth us getting together some time and talking in more detail. email me if you're interested in working with me on this paul.ward at ccoder.co.uk i'm looking to build (hopefully) within unity a layer that lets me store the whole scenes voxels in buffers on the GPU which i simply manipulate with functions (compute kernels).