Is it possible to store the address of a label in a variable and use goto to jump to it?
Solution 1
The C and C++ standards do not support this feature. However, the GNU Compiler Collection (GCC) includes a non-standard extension for doing this as described in this article. Essentially, they have added a special operator "&&" that reports the address of the label as type "void*". See the article for details.
P.S. In other words, just use "&&" instead of "&" in your example, and it will work on GCC.
P.P.S. I know you don't want me to say it, but I'll say it anyway,... DON'T DO THAT!!!
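As a minimal sketch of the extension described above, assuming GCC or Clang (it is not standard C), the question's example can be turned into a small function. The name jumps_taken and the counter are hypothetical and exist only for illustration.

```c
/* Sketch only: relies on the GNU C "labels as values" extension
 * (&&label and goto *ptr), so it needs GCC or Clang.
 * jumps_taken is a hypothetical name, not part of any library. */
int jumps_taken(int n)
{
    void *target;     /* holds the address of a label */
    int jumps = 0;

again:
    target = &&again; /* && yields the label's address as void* */
    if (n-- > 0) {
        jumps++;
        goto *target; /* indirect jump through the stored pointer */
    }
    return jumps;
}
```

The stored pointer is only meaningful inside the function that defines the label; jumping to it from another function is undefined.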
Solution 2
I know the feeling when everybody says it shouldn't be done; it just has to be done. In GNU C, use &&the_label to take the address of a label (https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html). The syntax you guessed, goto *ptr on a void*, is actually what GNU C uses.
Or if you want to use inline assembly for some reason, here's how to do it with GNU C asm goto:
// unsafe: this needs to use asm goto so the compiler knows
// execution might not come out the other side
#define unsafe_jumpto(a) asm("jmp *%0"::"r"(a):)
// target pointer, possible targets
#define jumpto(a, ...) asm goto("jmp *%0" : : "r"(a) : : __VA_ARGS__)
int main (void)
{
    int i=1;
    void* the_label_pointer;
the_label:
    the_label_pointer = &&the_label;
label2:
    if( i-- )
        jumpto(the_label_pointer, the_label, label2, label3);
label3:
    return 0;
}
The list of labels must include every possible value for the_label_pointer.
The macro expansion will be something like
asm goto("jmp *%0" : : "r"(the_label_pointer) : : the_label, label2, label3);
This compiles with gcc 4.5 and later, and with the latest clang, which just got asm goto support some time after clang 8.0: https://godbolt.org/z/BzhckE. The resulting asm looks like this for GCC 9.1, which optimized away the "loop" of i=1 / i-- and just put the_label after the jumpto. So it still runs exactly once, like in the C source.
# gcc9.1 -O3 -fpie
main:
leaq .L2(%rip), %rax # ptr = &&label
jmp *%rax # from inline asm
.L2:
xorl %eax, %eax # return 0
ret
But clang didn't do that optimization and still has the loop:
# clang -O3 -fpie
main:
movl $1, %eax
leaq .Ltmp1(%rip), %rcx
.Ltmp1: # Block address taken
subl $1, %eax
jb .LBB0_4 # jump over the JMP if i was < 1 (unsigned) before SUB. i.e. skip the backwards jump if i wrapped
jmpq *%rcx # from inline asm
.LBB0_4:
xorl %eax, %eax # return 0
retq
The label address operator && is a GNU C extension, so it only works with GCC and compatible compilers such as Clang. And obviously the jumpto assembly macro needs to be implemented specifically for each processor (this one works with both 32-bit and 64-bit x86).
Also keep in mind that (without asm goto) there would be no guarantee that the state of the stack is the same at two different points in the same function. And at least with some optimization turned on, it's possible that the compiler assumes some registers contain some value at the point after the label. These kinds of things can easily get screwed up when doing crazy shit the compiler doesn't expect. Be sure to proofread the compiled code.
This is why asm goto is necessary: it makes the jump safe by letting the compiler know where you will or might jump, so you get consistent code-gen for the jump and the destination.
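Without any inline assembly, the plain labels-as-values extension is most often seen as a dispatch table in interpreter loops. The following is a rough sketch under that assumption; the opcodes and the name run_program are hypothetical, and it again requires GCC or Clang.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

/* Sketch only: runs `program` and returns the accumulator.
 * Assumes every byte is a valid opcode and the program ends with OP_HALT. */
int run_program(const unsigned char *program)
{
    /* Table of label addresses, indexed by opcode (GNU C &&label). */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };
    int acc = 0;
    size_t pc = 0;

    goto *dispatch[program[pc++]];  /* jump to the first handler */

do_inc:
    acc += 1;
    goto *dispatch[program[pc++]];  /* each handler dispatches the next opcode */
do_double:
    acc *= 2;
    goto *dispatch[program[pc++]];
do_halt:
    return acc;
}
```

Each handler ends with its own indirect jump instead of returning to a central switch, which is the property interpreter authors usually want from this extension.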
Solution 3
You can do something similar with setjmp/longjmp.
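#include <setjmp.h>

int main (void)
{
    jmp_buf buf;
    // volatile: a non-volatile local modified after setjmp has an
    // indeterminate value after longjmp
    volatile int i=1;
    // this acts sort of like a dynamic label
    setjmp(buf);
    if( i-- )
        // and this effectively does a goto to the dynamic label
        longjmp(buf, 1);
    return 0;
}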
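A slightly fuller sketch of the same idea uses the return value of setjmp to tell the first pass from a later arrival via longjmp. The function name count_longjmps is hypothetical. The locals are volatile because standard C leaves non-volatile locals that were modified after setjmp with indeterminate values once longjmp returns.

```c
#include <setjmp.h>

/* Sketch only: counts how many times control re-entered through longjmp. */
int count_longjmps(int n)
{
    jmp_buf buf;
    volatile int remaining = n;  /* volatile so the values survive longjmp */
    volatile int jumps = 0;

    if (setjmp(buf) != 0)        /* nonzero means we arrived via longjmp */
        jumps++;

    if (remaining-- > 0)
        longjmp(buf, 1);         /* "goto" the point where setjmp was called */

    return jumps;
}
```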
Solution 4
According to the C99 standard, § 6.8.6, the syntax for a goto is:
goto identifier ;
So, even if you could take the address of a label, you couldn't use it with goto.
You could combine a goto with a switch, which is like a computed goto, for a similar effect:
#include <stdio.h>   /* puts, printf */
#include <stdlib.h>  /* exit */

int foo() {
    static int i=0;
    return i++;
}
int main(void) {
    enum {
        skip=-1,
        run,
        jump,
        scamper
    } label = skip;
#define STATE(lbl) case lbl: puts(#lbl); break
computeGoto:
    switch (label) {
    case skip: break;
    STATE(run);
    STATE(jump);
    STATE(scamper);
    default:
        printf("Unknown state: %d\n", label);
        exit(0);
    }
#undef STATE
    label = foo();
    goto computeGoto;
}
If you use this for anything other than an obfuscated C contest, I will hunt you down and hurt you.
Solution 5
In a very, very old version of the C language (think of the time dinosaurs roamed the Earth), known as the "C Reference Manual" version (which refers to a document written by Dennis Ritchie), labels formally had type "array of int" (strange, but true), meaning that you could declare an int * variable
int *target;
and assign the address of a label to that variable
target = label; /* where `label` is some label */
Later you could use that variable as the operand of a goto statement
goto target; /* jumps to label `label` */
However, in ANSI C this feature was thrown out. In standard modern C you cannot take the address of a label and you cannot do a "parametrized" goto. This behavior is supposed to be simulated with switch statements, pointers-to-functions and other methods. Actually, even the "C Reference Manual" itself said that "Label variables are a bad idea in general; the switch statement makes them almost always unnecessary" (see "14.4 Labels").
CanadianGirl827x
Updated on July 05, 2022
Comments
-
CanadianGirl827x almost 2 years
I know everyone hates gotos. In my code, for reasons I have considered and am comfortable with, they provide an effective solution (ie I'm not looking for "don't do that" as an answer, I understand your reservations, and understand why I am using them anyway).
So far they have been fantastic, but I want to expand the functionality in such a way that requires me to essentially be able to store pointers to the labels, then go to them later.
If this code worked, it would represent the type of functionality that I need. But it doesn't work, and 30 min of googling hasn't revealed anything. Does anyone have any ideas?
int main (void)
{
    int i=1;
    void* the_label_pointer;
the_label:
    the_label_pointer = &the_label;
    if( i-- )
        goto *the_label_pointer;
    return 0;
}
-
mrduclaw over 14 years
+1 for just doing it in assembly, that's how I solved a similar issue previously.
-
RickNZ over 14 years
Just a caution that setjmp/longjmp can be slow, since they save and restore much more than just the program counter.
-
Ahmed over 14 years
What is the difference between puts(#lbl) and puts(lbl)?
-
outis over 14 years
The # is the preprocessor stringizing operator (en.wikipedia.org/wiki/C_preprocessor#Quoting_macro_arguments). It converts identifiers into strings. puts(lbl) won't compile because lbl isn't a char *.
-
outis over 14 years
Rather, it will compile with warnings and crash if you run it.
-
EvilTeach about 14 years
+1 for evil thinking and use of macros above and beyond the call of duty.
-
sam hocevar over 12 years
There is no guarantee that the switch/case will be implemented as a computed goto. Quite often it is compiled as if it were a series of if/else if/else if/... and the generated assembly will test for each value rather than compute a single address to jump to.
-
Brian Campbell over 12 years
@SamHocevar Sure, you can't depend on how it will be implemented (though cases like this, in which you are using a small range with no holes, are much more likely to be optimized this way). But regardless of whether the optimization is applied, it is semantically equivalent to a goto that is conditional on the value that you pass in, due to the fall-through behavior. The behavior is the same; the implementation only affects the performance. And it seems to be a relevant answer to the OP's question, since he's looking to build a state machine using gotos, for which switch would do the trick.
-
Justin Dennahower over 10 years
goto label address is great for writing an interpreter.
-
Calmarius over 9 years
Can't you just lea eax, label; mov label_ptr, eax (Intel syntax), to store the pointer in a variable?
-
Fabel over 9 years
There is no doubt it can be implemented in assembly (which maybe could be considered better in this case). One benefit of implementing it in C is that the compiler does some optimizations.
-
Dwayne Robinson over 9 years
I'd like to know why in the world they used double ampersands (logical and), when the existing get-the-address-of-an-identifier '&' would have made the most sense. The only reason I can figure is that label identifiers appear to exist in a parallel but separate scope from variable identifiers, and thus there could be ambiguity between getting the address of a label vs. a variable if both were named the same (arguably, though, it's just bad practice to declare an int foo and foo: in the same function). If this ever gets into the standard, I'd hope for '&', not '&&'.
-
kungfooman about 8 years
One of the best answers here, thanks very much, helped me out in a reverse engineering project.
-
Pietro Braione over 7 years
Note that this is not standard C++, rather an extension provided by the GNU C++ compiler (see gcc.gnu.org/onlinedocs/gcc-6.2.0/gcc/…). Clang also has this extension, while Visual C++ does not (see stackoverflow.com/questions/6421433/address-of-labels-msvc).
-
Kariddi over 6 years
Totally do it. If you are writing an interpreter loop that's the way to do it.
-
chqrlie almost 5 years
Your implementation of Duff's device is broken: the case 0: should be moved to the end of the do body and followed by an empty statement. As coded, sending 0 bytes incorrectly sends 8 bytes.
-
chqrlie almost 5 years
This does not work: depending on whether i is stored in a register or on the stack, its original value (1) will be restored by longjmp() or not, hence potentially causing an infinite loop.
-
HelloWorld over 4 years
The benefit of an address label is also having access to the stack, not just the (faster) function call. But indeed might be one of the few solutions for MSVC
-
Peter Cordes about 2 years
Oh interesting, so the GNU C labels as values extension is just reintroducing a historical C feature, with somewhat different syntax (void *target = &&label and goto *target).
-
glades about 2 years
I think this is actually a very useful feature for border cases where you can't do infinite recursion because it would blow your stack frame and you need to track context without branching every time before you jump. Sad that it's only implemented in gcc :(
-
Fabel about 2 years
@glades The same thing can be achieved with a switch statement, since the labels need to belong to a predefined set anyway. If you place all functions in one switch you can both call and goto any label in perfectly portable C. Yes, case labels can go anywhere in the code, even inside if blocks or whatever. (This is true for the answer rewritten by Peter Cordes; my original answer allowed jumping between code in different object files in a less limited and less secure way.)
-
glades about 2 years
@Fabel I'm considering that, but how would you do it if your code jumps into a label from multiple places and then has to return to the section it jumped from? It can't be a function for some reason, so how would you do that with a switch statement?
-
Fabel about 2 years
@glades Let the function have two arguments: the variable used in the switch statement and a pointer to a struct containing the actual arguments for the "function" (which is just one of the cases). This way the single function can be called recursively just as if it were a different function. If the "functions" need to return different kinds of values, the struct can be used for that too. Of course each "function" can use a different struct (or a union if preferred). It's perfectly safe since the caller and the "function" agree on it.
-
glades about 2 years
@Fabel: That would be a possibility if I could use functions. The problem is that within switch statements I need to recursively call another code section that might itself call this code section again. As nobody knows how many times this will happen, I run the risk of a stack overflow.
-
glades about 2 years
@chqrlie I guess OP copied the example from Wikipedia, where it's stated that "This code assumes that initial count > 0." On another note, I don't think this kind of loop unrolling makes sense now, as the compiler will unroll the loop if it makes sense, and even if it doesn't, ALU pipelining will forward calculate the exit conditions of the loop for many iterations, so this kind of manual trickery is irrelevant on modern processors.
-
glades about 2 years
@PietroBraione It should be in the C standard; it makes sense in some edge cases when you don't want to dive down to assembly just for doing that, and for portability reasons.
-
Fabel almost 2 years
@glades You can jump to another case label in the switch statement (as in a common state machine), but then you cannot return to the previous state if you haven't saved it in some way. Or you can call the single function recursively and be able to return (as with a normal function) but risk a stack overflow. Those methods can be mixed safely. And you can store a previous state in any way you like (like with a pointer to a label). I fail to see any limitations, except for the finite set of states/functions/labels (you cannot add additional "states" in another object file and jump between them).
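One possible reading of the switch approach discussed in this thread, with the "return address" saved in a variable instead of on the call stack, is sketched below in portable C. The states and the name run_machine are hypothetical; SHARED stands for a section that is entered from two places and resumes wherever it was entered from.

```c
/* Hypothetical states; SHARED is reached from both FIRST and SECOND. */
enum state { FIRST, SECOND, SHARED, DONE };

/* Sketch only: returns how many times the shared section ran. */
int run_machine(void)
{
    enum state current = FIRST;
    enum state resume = DONE;   /* where SHARED continues afterwards */
    int shared_visits = 0;

    for (;;) {
        switch (current) {
        case FIRST:
            resume = SECOND;    /* save the "return address" as a state */
            current = SHARED;
            break;
        case SECOND:
            resume = DONE;
            current = SHARED;
            break;
        case SHARED:
            shared_visits++;
            current = resume;   /* "return" without using the call stack */
            break;
        case DONE:
            return shared_visits;
        }
    }
}
```

A single resume variable handles only one level of nesting; deeper nesting would need an explicit stack of saved states, which is bounded by memory you control, not by the call stack.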