How do the CPU and GPU interact in displaying computer graphics?


Solution 1

I decided to write a bit about the programming aspect and how components talk to each other. Maybe it'll shed some light on certain areas.

The Presentation

What does it take to even have that single image, that you posted in your question, drawn on the screen?

There are many ways to draw a triangle on the screen. For simplicity, let's assume no vertex buffers were used. (A vertex buffer is an area of memory where you store coordinates.) Let's assume the program simply told the graphics processing pipeline about every single vertex (a vertex is just a coordinate in space) in a row.

But, before we can draw anything, we first have to run some scaffolding. We'll see why later:

// Clear The Screen And The Depth Buffer
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT); 

// Reset The Current Modelview Matrix
glMatrixMode(GL_MODELVIEW); 
glLoadIdentity();

// Drawing Using Triangles
glBegin(GL_TRIANGLES);

  // Red
  glColor3f(1.0f,0.0f,0.0f);
  // Top Of Triangle (Front)
  glVertex3f( 0.0f, 1.0f, 0.0f);

  // Green
  glColor3f(0.0f,1.0f,0.0f);
  // Left Of Triangle (Front)
  glVertex3f(-1.0f,-1.0f, 1.0f);

  // Blue
  glColor3f(0.0f,0.0f,1.0f);
  // Right Of Triangle (Front)
  glVertex3f( 1.0f,-1.0f, 1.0f);

// Done Drawing
glEnd();

So what did that do?

When you write a program that wants to use the graphics card, you'll usually pick some kind of interface to the driver. Some well known interfaces to the driver are:

  • OpenGL
  • Direct3D
  • CUDA

For this example we'll stick with OpenGL. Now, your interface to the driver is what gives you all the tools you need to make your program talk to the graphics card (or the driver, which then talks to the card).

This interface is bound to give you certain tools. These tools take the shape of an API which you can call from your program.

That API is what we see being used in the example above. Let's take a closer look.

The Scaffolding

Before you can really do any actual drawing, you'll have to perform some setup. You have to define your viewport (the area that will actually be rendered), your perspective (the camera into your world), what anti-aliasing you will be using (to smooth out the edges of your triangle)...
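For instance, the one-time part of that setup could look something like this (a minimal sketch; the 800x600 window size and the field of view are arbitrary example values):

// Map rendering to the window area (here assuming an 800x600 window)
glViewport(0, 0, 800, 600);

// Set up the perspective (the camera) via the projection matrix
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective(45.0,           // vertical field of view, in degrees
               800.0 / 600.0,  // aspect ratio of the viewport
               0.1, 100.0);    // near and far clipping planes

// Switch back to the modelview matrix for drawing
glMatrixMode(GL_MODELVIEW);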

But we won't go into any of that in detail. We'll just take a peek at the stuff you'll have to do every frame. Like:

Clearing the screen

The graphics pipeline is not going to clear the screen for you every frame. You'll have to tell it to. Why?


If you don't clear the screen, you'll simply draw over it every frame. That's why we call glClear with the GL_COLOR_BUFFER_BIT set. The other bit (GL_DEPTH_BUFFER_BIT) tells OpenGL to clear the depth buffer. This buffer is used to determine which pixels are in front (or behind) other pixels.

Transformation

enter image description here
Image source

Transformation is the part where we take all the input coordinates (the vertices of our triangle) and apply our ModelView matrix. This is the matrix that describes how our model (the vertices) is rotated, scaled, and translated (moved).

Next, we apply our Projection matrix. This transforms all coordinates into the camera's view, applying perspective.

Now we transform once more, with our Viewport matrix. We do this to scale our model to the size of our monitor. Now we have a set of vertices that are ready to be rendered!
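In the fixed-function pipeline used in this example, the ModelView part of that chain could look roughly like this (all values are arbitrary examples):

glMatrixMode(GL_MODELVIEW);
glLoadIdentity();

// Translate (move), scale, and rotate the model
glTranslatef(0.0f, 0.0f, -5.0f);      // move it away from the camera
glScalef(2.0f, 2.0f, 2.0f);           // double its size
glRotatef(30.0f, 0.0f, 1.0f, 0.0f);   // rotate 30 degrees around the Y axis

// Every glVertex3f issued from now on is transformed by this matrix;
// projection and the viewport transformation are applied afterwards
// by the pipeline itself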

We'll come back to transformation a bit later.

Drawing

To draw a triangle, we can simply tell OpenGL to start a new list of triangles by calling glBegin with the GL_TRIANGLES constant.
There are also other shapes you can draw, like a triangle strip or a triangle fan. These are primarily optimizations, as they require less communication between the CPU and the GPU to draw the same number of triangles.

After that, we can provide a list of sets of 3 vertices which should make up each triangle. Every triangle uses 3 coordinates (as we're in 3D-space). Additionally, I also provide a color for each vertex, by calling glColor3f before calling glVertex3f.

The shade between the 3 vertices (the 3 corners of the triangle) is calculated by OpenGL automatically. It will interpolate the color over the whole face of the polygon.

Interaction

Now, when you click the window, the application only has to capture the window message that signals the click. Then you can run any action in your program you want.

This gets a lot more difficult once you want to start interacting with your 3D scene.

You first have to know exactly at which pixel the user clicked the window. Then, taking your perspective into account, you can calculate the direction of a ray, from the point of the mouse click into your scene. You can then calculate whether any object in your scene intersects with that ray. Now you know if the user clicked an object.
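With the fixed-function pipeline, gluUnProject can do the perspective math for us. A minimal sketch, assuming mouseX and mouseY hold the click position (with the y coordinate already flipped to OpenGL's bottom-up convention):

GLdouble model[16], proj[16];
GLint viewport[4];
glGetDoublev(GL_MODELVIEW_MATRIX, model);
glGetDoublev(GL_PROJECTION_MATRIX, proj);
glGetIntegerv(GL_VIEWPORT, viewport);

// Unproject the click at the near (z=0) and far (z=1) clipping planes
GLdouble nx, ny, nz, fx, fy, fz;
gluUnProject(mouseX, mouseY, 0.0, model, proj, viewport, &nx, &ny, &nz);
gluUnProject(mouseX, mouseY, 1.0, model, proj, viewport, &fx, &fy, &fz);

// The ray starts at (nx, ny, nz) and points towards (fx, fy, fz);
// intersect it with the objects in your scene to find what was clicked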

So, how do you make it rotate?

Transformation

I am aware of two types of transformations that are generally applied:

  • Matrix-based transformation
  • Bone-based transformation

The difference is that bones can affect individual vertices, while matrices always affect all drawn vertices in the same way. Let's look at an example.

Example

Earlier, we loaded our identity matrix before drawing our triangle. The identity matrix is one that simply provides no transformation at all. So whatever I draw is affected only by my perspective, and the triangle will not be rotated at all.

If I want to rotate it now, I could either do the math myself (on the CPU) and simply call glVertex3f with other coordinates (that are rotated). Or I could let the GPU do all the work, by calling glRotatef before drawing:

// Rotate The Triangle On The Y axis
glRotatef(amount,0.0f,1.0f,0.0f);               

amount is, of course, just a fixed value. If you want to animate, you'll have to keep track of amount and increase it every frame.
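A sketch of what that per-frame bookkeeping could look like (drawTriangle is a hypothetical helper wrapping the glBegin/glEnd block from earlier):

static float amount = 0.0f;

void drawFrame(void)
{
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();

    glRotatef(amount, 0.0f, 1.0f, 0.0f);  // rotate around the Y axis
    drawTriangle();                       // the glBegin/glEnd block above

    amount += 1.0f;                       // one degree further next frame
}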

So, wait, what happened to all the matrix talk earlier?

In this simple example, we don't have to care about matrices. We simply call glRotatef and it takes care of all that for us.

glRotate produces a rotation of angle degrees around the vector (x, y, z). The current matrix (see glMatrixMode) is multiplied by a rotation matrix, with the product replacing the current matrix, as if glMultMatrix were called with the following matrix as its argument:

    | x²(1-c)+c    xy(1-c)-zs   xz(1-c)+ys   0 |
    | yx(1-c)+zs   y²(1-c)+c    yz(1-c)-xs   0 |
    | xz(1-c)-ys   yz(1-c)+xs   z²(1-c)+c    0 |
    |     0            0            0       1 |

where c = cos(angle), s = sin(angle), and (x, y, z) has been normalized.

Well, thanks for that!

Conclusion

What becomes obvious is that we talk to OpenGL a lot, but OpenGL isn't telling us much back. Where is the communication?

The only thing that OpenGL is telling us in this example is when it's done. Every operation will take a certain amount of time. Some operations take incredibly long, others are incredibly quick.

Sending a vertex to the GPU will be so fast, I wouldn't even know how to express it. Sending thousands of vertices from the CPU to the GPU, every single frame, is, most likely, no issue at all.

Clearing the screen can take a millisecond or worse (keep in mind, you usually only have about 16 milliseconds to draw each frame at 60 FPS), depending on how large your viewport is. To clear it, OpenGL has to draw every single pixel in the color you want to clear to, and that could be millions of pixels.

Other than that, we can pretty much only ask OpenGL about the capabilities of our graphics adapter (max resolution, max anti-aliasing, max color depth, ...).

But we can also fill a texture with pixels that each have a specific color. Each pixel thus holds a value and the texture is a giant "file" filled with data. We can load that into the graphics card (by creating a texture buffer), then load a shader, tell that shader to use our texture as an input and run some extremely heavy calculations on our "file".

We can then "render" the result of our computation (in the form of new colors) into a new texture.

That's how you can make the GPU work for you in other ways. I assume CUDA works in a similar fashion, but I never had the opportunity to work with it.

We really only slightly touched the whole subject. 3D graphics programming is a hell of a beast.


Solution 2

It's hard to understand exactly what it is you don't understand.

The GPU has a series of registers that the BIOS maps. These permit the CPU to access the GPU's memory and instruct the GPU to perform operations. The CPU plugs values into those registers to map some of the GPU's memory so that the CPU can access it. Then it loads instructions into that memory. It then writes a value to a register that tells the GPU to execute the instructions the CPU loaded into its memory.

The information consists of the software that the GPU needs to run. This software is bundled with the driver and then the driver handles the responsibility split between the CPU and GPU (by running portions of its code on both devices).

The driver then manages a series of "windows" into GPU memory that the CPU can read from and write to. Generally, the access pattern involves the CPU writing instructions or information into mapped GPU memory and then instructing the GPU, through a register, to execute those instructions or process that information. The information includes shader logic, textures, and so on.
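A highly simplified sketch of that access pattern (every address, offset, and register meaning below is invented for illustration; real GPUs each have their own, mostly undocumented, register layouts):

#include <stdint.h>

// Hypothetical memory-mapped GPU registers, as mapped by the BIOS/driver
volatile uint32_t *gpu_regs = (volatile uint32_t *)0xF0000000;

#define REG_CMD_OFFSET 4   // where in GPU memory the instructions live
#define REG_CMD_KICK   5   // writing here tells the GPU to execute them

void run_on_gpu(uint32_t gpu_mem_offset)
{
    gpu_regs[REG_CMD_OFFSET] = gpu_mem_offset;  // point the GPU at the code
    gpu_regs[REG_CMD_KICK]   = 1;               // tell it to start executing
}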

Solution 3

I was just curious and wanted to know the whole process from double clicking on Triangle.exe under Windows XP until I can see the triangle rotating on the monitor. What happens, how do CPU (which first handles the .exe) and GPU (which finally outputs the triangle on the screen) interact?

Let's assume you actually know how an executable runs on an operating system and how the output of your GPU is sent to your monitor, but don't know what happens in between. So, let's take a look from a hardware aspect and extend further on the programmer-aspect answer above...

What is the interface between CPU and GPU?

Using a driver, the CPU can talk to the graphics card through motherboard interfaces like PCI and send commands to it: execute GPU instructions, access or update GPU memory, load code to be executed on the GPU, and more...

But you can't really talk straight to the hardware or the driver from your code, so this has to happen through APIs like OpenGL, Direct3D, CUDA, HLSL, and Cg. While the former issue GPU instructions and/or update GPU memory, the latter actually execute code on the GPU, as they are physics/shader languages.

Why run code on the GPU and not on the CPU?

While the CPU is good at running our daily workstation and server programs, it wasn't designed with all those shiny graphics you see in today's games in mind. Back in the day there were software renderers which did the trick for some 2D and 3D things, but they were very limited. So, here is where the GPU came into play.

The GPU is optimized for one of the most important calculations in graphics: matrix manipulation. While the CPU has to compute each multiplication in a matrix manipulation one by one (later, things like 3DNow! and SSE caught up), the GPU can do all those multiplications at once! Parallelism.
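To illustrate, transforming a single vertex on the CPU means computing each of these multiplications in turn (a plain 4x4 matrix times a vector, no SIMD):

// out = m * in, computed element by element on the CPU
void transform(const float m[4][4], const float in[4], float out[4])
{
    for (int row = 0; row < 4; row++) {
        out[row] = m[row][0] * in[0] + m[row][1] * in[1]
                 + m[row][2] * in[2] + m[row][3] * in[3];
    }
}
// A GPU runs this same computation for thousands of vertices in parallel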

But parallel computation isn't the only reason; another reason is that the GPU is much closer to the video memory, which makes it much faster than having to do round trips through the CPU, etc...

How do these GPU instructions / memory / code show graphics?

There is one missing piece to make this all work: we need something we can write to, which we can then read and send to the screen. We can do this by creating a framebuffer. Whatever operation you do, you will eventually update the pixels in the framebuffer, which besides location also hold information about color and depth.

Let's give you an example where you want to draw a blood sprite (an image) somewhere. First, the sprite texture itself is loaded into GPU memory, which makes it easy to redraw it at will. Next, to actually draw the sprite somewhere, we can translate the sprite using vertices (putting it in the right position), rasterise it (turning it from a 3D object into pixels) and update the framebuffer. To get a better idea, see the OpenGL pipeline flow chart on Wikipedia.

This is the main gist of the whole graphics idea; more research is left as homework for the reader.

Solution 4

To keep things simple we can describe it like this. Some memory addresses are reserved (by the BIOS and/or operating system) not for RAM but for the video card. Any data written to those addresses goes to the card. So, in theory, any program can write directly to the video card just by knowing the address range, and this is exactly how it was done back in the old days. In practice, with modern OSes, this is managed by the video driver and/or the graphics library on top (DirectX, OpenGL etc.).
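The classic example of the old-days approach is DOS mode 13h (320x200 pixels, 256 colors), where video memory starts at address 0xA0000 and plotting a pixel is just a memory write (16-bit real-mode code with Borland-style far pointers; it won't run on a modern OS):

unsigned char far *vga = (unsigned char far *)0xA0000000L;

void put_pixel(int x, int y, unsigned char color)
{
    vga[y * 320 + x] = color;  // write straight into video memory
}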

Solution 5

GPUs are usually driven by DMA buffers. That is, the driver compiles the commands it receives from the user space program into a stream of instructions (switch state, draw this in that way, switch contexts, etc.), that are then copied to device memory. It then instructs the GPU to execute this command buffer via a PCI register or similar methods.

So on every draw call, the user-space driver compiles the command and then calls the kernel-space driver, which finally submits the command buffer to device memory and instructs the GPU to start rendering.
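Conceptually, such a command buffer is just a stream of words the driver fills in and hands over (all opcodes and fields below are invented; every GPU family has its own, often undocumented, command format):

#include <stdint.h>
#include <stddef.h>

enum { CMD_SET_STATE = 1, CMD_DRAW = 2 };   // hypothetical opcodes

static uint32_t cmd_buf[1024];
static size_t   cmd_len = 0;

static void emit(uint32_t word) { cmd_buf[cmd_len++] = word; }

void record_draw(uint32_t first_vertex, uint32_t vertex_count)
{
    emit(CMD_SET_STATE);   // "switch state"
    emit(0);               // e.g. disable blending
    emit(CMD_DRAW);        // "draw this in that way"
    emit(first_vertex);
    emit(vertex_count);
    // The driver then copies cmd_buf into device memory and pokes a
    // PCI register to make the GPU start executing it
}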

On consoles you can even have the fun of doing all of that yourself, especially on the PS3.


Comments

  • JohnnyFromBF almost 2 years

    Here you can see a screenshot of a small C++ program called Triangle.exe with a rotating triangle based on the OpenGL API.

    [Screenshot: Triangle.exe rendering a rotating triangle]

    Admittedly a very basic example, but I think it's applicable to other graphics card operations.

    I was just curious and wanted to know the whole process from double clicking on Triangle.exe under Windows XP until I can see the triangle rotating on the monitor. What happens, how do CPU (which first handles the .exe) and GPU (which finally outputs the triangle on the screen) interact?

    I guess the following hardware/software, among others, is primarily involved in displaying this rotating triangle:

    Hardware

    • HDD
    • System Memory (RAM)
    • CPU
    • Video memory
    • GPU
    • LCD display

    Software

    • Operating System
    • DirectX/OpenGL API
    • Nvidia Driver

    Can anyone explain the process, maybe with some sort of flow chart for illustration?

    It should not be a complex explanation that covers every single step (guess that would go beyond the scope), but an explanation an intermediate IT guy can follow.

    I'm pretty sure a lot of people that would even call themselves IT professionals could not describe this process correctly.

    • Admin over 10 years
      Your dilemma would be over if you could just consider GPU an extension of CPU!
  • BlueRaja - Danny Pflughoeft almost 12 years
    -1 he asks how a DirectX API call from the CPU can communicate with the GPU, and your answer is "it's managed by the driver and/or DirectX"? This also doesn't explain how custom code (à la CUDA) could be run.
  • AZ. almost 12 years
    Please learn to read. I said: by writing to specific memory addresses that are reserved for the GPU instead of RAM. And this explains how you can run everything. A memory range is registered for a card; everything you write in that range goes to the GPU, whether it's vertex processing, CUDA, whatever.
  • Axel Gneiting almost 12 years
    There is no CPU instruction set involved. The driver & runtime compile your CUDA, OpenGL, Direct3D, etc. to native GPU programs/kernels, which are then also uploaded to device memory. The command buffer then refers to those like any other resource.
  • David Schwartz almost 12 years
    @sidran32: For example, in nVidia's Kepler architecture, kernels, streams, and events are created by software that runs on the GPU, not (usually) the CPU. GPU-side software manages RDMA as well. All of that software is loaded into GPU memory by the driver and runs as a "mini-OS" on the GPU that handles the GPU-side of the CPU/GPU cooperating pair.
  • user3359503 almost 12 years
    @DavidSchwartz I did forget about GPU compute tasks. However, they still behave similar to shaders, in implementation, anyway. I wouldn't call it a "mini-OS", though, since it doesn't have the same functionality typically associated with OSs. It's still very specialized software, since the GPU isn't designed like a CPU (for good reason).
  • DavidPostill about 8 years
    This duplicates another answer and adds no new content. Please don't post an answer unless you actually have something new to contribute.
  • AnotherDeveloper over 4 years
    This aptly describes the Physics + Programming part of the problem. Kudos!!!
  • twitchdotcom slash KANJICODER about 4 years
    +1, shame this answer has fewer upvotes. Most of the other answers say "driver" or "API". But both of those are, in the end, code, so the question is really about what the driver or API is doing under the hood.