Why do you program in assembly?

c performance low-level assembly

63,070

Solution 1

I think you're misreading this statement:

For example, many modern 3-D Games have their high performance core engine written in C++ and Assembly.

Games (and most programs these days) aren't "written in assembly" the same way they're "written in C++". That blog isn't saying that a significant fraction of the game is designed in assembly, or that a team of programmers sit around and develop in assembly as their primary language.

What this really means is that developers first write the game and get it working in C++. Then they profile it, figure out what the bottlenecks are, and if it's worthwhile they optimize the heck out of them in assembly. Or, if they're already experienced, they know which parts are going to be bottlenecks, and they've got optimized pieces sitting around from other games they've built.

The point of programming in assembly is the same as it always has been: speed. It would be ridiculous to write a lot of code in assembler, but there are some optimizations the compiler isn't aware of, and for a small enough window of code, a human is going to do better.

For example, for floating point, compilers tend to be pretty conservative and may not be aware of some of the more advanced features of your architecture. If you're willing to accept some error, you can usually do better than the compiler, and it's worth writing that little bit of code in assembly if you find that lots of time is spent on it.

Here are some more relevant examples:

Examples from Games

Article from Intel about optimizing a game engine using SSE intrinsics. The final code uses intrinsics (not inline assembler), so the amount of pure assembly is very small. But they look at the assembler output by the compiler to figure out exactly what to optimize.
Quake's fast inverse square root. Again, the routine doesn't have assembler in it, but you need to know something about architecture to do this kind of optimization. The authors know what operations are fast (multiply, shift) and which are slow (divide, sqrt). So they come up with a very tricky implementation of square root that avoids the slow operations entirely.

High-Performance Computing

Outside the domain of games, people in scientific computing frequently optimize the crap out of things to get them to run fast on the latest hardware. Think of this as games where you can't cheat on the physics.

A great recent example of this is Lattice Quantum Chromodynamics (Lattice QCD). This paper describes how the problem pretty much boils down to one very small computational kernel, which was optimized heavily for PowerPC 440's on an IBM Blue Gene/L. Each 440 has two FPUs, and they support some special ternary operations that are tricky for compilers to exploit. Without these optimizations, Lattice QCD would've run much slower, which is costly when your problem requires millions of CPU hours on expensive machines.

If you are wondering why this is important, check out the article in Science that came out of this work. Using Lattice QCD, these guys calculated the mass of a proton from first principles, and showed last year that 90% of the mass comes from strong force binding energy, and the rest from quarks. That's E=mc² in action. Here's a summary.

For all of the above, the applications are not designed or written 100% in assembly -- not even close. But when people really need speed, they focus on writing the key parts of their code to fly on specific hardware.

Solution 2

I have not coded in assembly language for many years, but I can give several reasons that I frequently saw:

Not all compilers can make use of certain CPU optimizations and instruction set (e.g., the new instruction sets that Intel adds once in a while). Waiting for compiler writers to catch up means losing a competitive advantage.
Easier to match actual code to known CPU architecture and optimization. For example, things you know about the fetching mechanism, caching, etc. This is supposed to be transparent to the developer, but the fact is that it is not, that's why compiler writers can optimize.
Certain hardware level accesses are only possible/practical via assembly language (e.g., when writing device driver).
Formal reasoning is sometimes actually easier for the assembly language than for the high-level language since you already know what the final or almost final layout of the code is.
Programming certain 3D graphic cards (circa late 1990s) in the absence of APIs was often more practical and efficient in assembly language, and sometimes not possible in other languages. But again, this involved really expert-level games based on the accelerator architecture like manually moving data in and out in certain order.

I doubt many people use assembly language when a higher-level language would do, especially when that language is C. Hand-optimizing large amounts of general-purpose code is impractical.

Solution 3

Usually, a layman's assembly is slower than C (due to C's optimization) but many games (I distinctly remember Doom) had to have specific sections of the game in Assembly so it would run smoothly on normal machines.

Here's the example to which I am referring.

Solution 4

I started professional programming in assembly language in my very first job (80's). For embedded systems the memory demands - RAM and EPROM - were low. You could write tight code that was easy on resources.

By the late 80's I had switched to C. The code was easier to write, debug and maintain. Very small snippets of code were written in assembler - for me it was when I was writing the context switching in an roll-your-own RTOS. (Something you shouldn't do anymore unless it is a "science project".)

You will see assembler snippets in some Linux kernel code. Most recently I've browsed it in spinlocks and other synchronization code. These pieces of code need to gain access to atomic test-and-set operations, manipulating caches, etc.

I think you would be hard pressed to out-optimize modern C compilers for most general programming.

I agree with @altCognito that your time is probably better spent thinking harder about the problem and doing things better. For some reason programmers often focus on micro-efficiencies and neglect the macro-efficiencies. Assembly language to improve performance is a micro-efficiency. Stepping back for a wider view of the system can expose the macro problems in a system. Solving the macro problems can often yield better performance gains. Once the macro problems are solved then collapse to the micro level.

I guess micro problems are within the control of a single programmer and in a smaller domain. Altering behavior at the macro level requires communication with more people - a thing some programmers avoid. That whole cowboy vs the team thing.

Solution 5

"Yes". But, understand that for the most part the benefits of writing code in assembler are not worth the effort. The return received for writing it in assembly tends to be smaller than the simply focusing on thinking harder about the problem and spending your time thinking of a better way of doing thigns.

John Carmack and Michael Abrash who were largely responsible for writing Quake and all of the high performance code that went into IDs gaming engines go into this in length detail in this book.

I would also agree with Ólafur Waage that today, compilers are pretty smart and often employ many techniques which take advantage of hidden architectural boosts.

View more solutions

63,070

Author by

Adam

Updated on February 05, 2020

Comments

Adam over 4 years

I have a question for all the hardcore low level hackers out there. I ran across this sentence in a blog. I don't really think the source matters (it's Haack if you really care) because it seems to be a common statement.

For example, many modern 3-D Games have their high performance core engine written in C++ and Assembly.

As far as the assembly goes - is the code written in assembly because you don't want a compiler emitting extra instructions or using excessive bytes, or are you using better algorithms that you can't express in C (or can't express without the compiler mussing them up)?

I completely get that it's important to understand the low-level stuff. I just want to understand the why program in assembly after you do understand it.