GNU gcc/ld - wrapping a call to symbol with caller and callee defined in the same object file

Solution 1

You have to weaken and globalize the symbol using objcopy.

-W symbolname
--weaken-symbol=symbolname
    Make symbol symbolname weak. This option may be given more than once.
--globalize-symbol=symbolname
    Give symbol symbolname global scoping so that it is visible outside of the file in which it is defined. This option may be given more than once.

This worked for me

bar.c:

#include <stdio.h>

/* Replacement for foo(); the signature must match the one in foo.c. */
void foo(void){
  printf("Wrap-FU\n");
}

foo.c:

#include <stdio.h>

void foo(void){
  printf("foo\n");
}

int main(void){
  printf("main\n");
  foo();
  return 0;
}

Compile it

$ gcc -c foo.c bar.c 

Weaken the foo symbol and make it global, so that the linker can resolve it again and prefer the strong definition from bar.o.

$ objcopy foo.o --globalize-symbol=foo --weaken-symbol=foo foo2.o

Now you can link the new object file with the replacement from bar.c:

$ gcc -o nowrap foo.o #for reference
$ gcc -o wrapme foo2.o bar.o

Test

$ ./nowrap 
main
foo

And the wrapped one:

$ ./wrapme 
main
Wrap-FU

Solution 2

You can put __attribute__((weak)) before the implementation of the callee, so that someone can reimplement it without the linker complaining about multiple definitions.

For example, suppose you want to mock the world function in the following hello.c translation unit. You can prepend the attribute to make the function overridable.

#include "hello.h"
#include <stdio.h>

__attribute__((weak))
void world(void)
{
    printf("world from lib\n");
}

void hello(void)
{
    printf("hello\n");
    world();
}

You can then override it in another translation unit. This is very useful for unit testing/mocking:

#include <stdio.h>
#include "hello.h"

/* overrides */
void world(void)
{
    printf("world from main.c\n");
}

int main(void)
{
    hello();
    return 0;
}
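
For unit testing it can help to have the functions return a value, so that a test can assert which implementation actually ran. The following is only a minimal sketch of the library side, assuming GCC or Clang on an ELF target; the names world_id and hello_id and the return values are hypothetical and not part of the original hello.c.

```c
#include <stdio.h>

/* Hypothetical names for illustration; not part of the original hello.c. */

/* Weak default. A strong world_id() defined in another object file,
 * e.g. `int world_id(void) { return 2; }`, replaces it at link time. */
__attribute__((weak)) int world_id(void)
{
    puts("world from lib");
    return 1;
}

/* Caller defined in the same translation unit as the weak callee. */
int hello_id(void)
{
    puts("hello");
    return world_id();
}
```

Linked on its own, hello_id() returns 1. Linked together with a test object that provides a strong world_id() returning 2, it should return 2 instead, without a multiple-definition error.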

Solution 3

#include <stdio.h>
#include <stdlib.h>

//gcc -ggdb -o test test.c -Wl,-wrap,malloc
void* __real_malloc(size_t bytes);

int main()
{
   int *p = NULL;
   int i = 0;

   p = malloc(100*sizeof(int));

   for (i=0; i < 100; i++)
       p[i] = i;

   free(p);
   return 0;
}

void* __wrap_malloc(size_t bytes)
{
      return __real_malloc(bytes);
}

Then compile this code and step through it in a debugger. When the program calls malloc, the function actually called is __wrap_malloc, and __real_malloc resolves to the original malloc.

I think this is the way to intercept the calls.

Basically, it's the --wrap option provided by ld.

Solution 4

This appears to be working as documented:

 --wrap=symbol
       Use a wrapper function for symbol. 
       Any undefined reference to symbol will be resolved to "__wrap_symbol". ...

Note the word undefined above. When the linker processes foo.o, the reference to bar() is not undefined (bar is defined in that same object file), so the linker does not wrap it. I am not sure why it's done that way, but there is probably a use case that requires this.

Solution 5

You can achieve what you want if you use --undefined with --wrap:

  -u SYMBOL, --undefined SYMBOL
      Start with undefined reference to SYMBOL

Author: luis.espinal

Updated on July 09, 2022

Comments

  • luis.espinal
    luis.espinal almost 2 years

    To clarify, my question is about wrapping/intercepting calls from one function/symbol to another when the caller and the callee are defined in the same compilation unit, using the GCC compiler and linker.

    I have a situation resembling the following:

    /* foo.c */
    void bar(void);  /* forward declaration, so foo() can call bar() */
    void foo(void)
    {
      /* ... some stuff */
      bar();
    }
    
    void bar(void)
    {
      /* ... some other stuff */
    }
    

    I would like to wrap calls to these functions, and I can do that (to a point) with ld's --wrap option (and then I implement __wrap_foo and __wrap_bar which in turn call __real_foo and __real_bar as expected by the result of ld's --wrap option).

    gcc -Wl,--wrap=foo -Wl,--wrap=bar ...
    

    The problem I'm having is that this only takes effect for references to foo and bar from outside of this compilation unit (and resolved at link time). That is, calls to foo and bar from other functions within foo.c do not get wrapped.

    Calls from within the compilation unit get resolved before the linker's wrapping takes place.

    I tried using objcopy --redefine-sym, but that only renames the symbols and their references.

    I would like to replace calls to foo and bar (within foo.o) to __wrap_foo and __wrap_bar (just as they get resolved in other object files by the linker's --wrap option) BEFORE I pass the *.o files to the linker's --wrap options, and without having to modify foo.c's source code.

    That way, the wrapping/interception takes place for all calls to foo and bar, and not just the ones taking place outside of foo.o.

    Is this possible?

    • Chris Stratton
      Chris Stratton over 11 years
      You could probably solve your problem with find/replace in your editor, or using sed...
    • luis.espinal
      luis.espinal over 11 years
      Are you suggesting to simply hack the obj with an editor?
    • Chris Stratton
      Chris Stratton over 11 years
      I'm suggesting you bulk-modify the source code to replace the calls to the function with those to a wrapper, or with something that you can #define to be either the real function or the wrapper.
    • luis.espinal
      luis.espinal over 11 years
      Ok, so how would I go about using sed and an editor to modify the object file so the calls to, say, foo, get replaced with an offset to a symbol, say __real_foo that will be resolved later by the linker? I ask in earnest btw.
    • Chris Stratton
      Chris Stratton over 11 years
      I'd recommend that you modify the source code, not the object file.
    • Chris Stratton
      Chris Stratton over 11 years
      If you must do it to the object file, you'd probably need to overwrite the start of the function with a call to some wrapping logic, but this would require understanding the platform-specific function call, register save, etc. sequence and hoping that it doesn't change. Just a find-and-replace on addresses won't work since they are often relative. You could pattern match whatever call instructions you think the compiler will use, work out their targets and change them, but this gets ugly fast.
    • luis.espinal
      luis.espinal over 11 years
      Sorry if I was being sarcastic (frustration got the best of me). If there are no ready-made tools for this (like objcopy), then I'm afraid I will have to follow this route (I will have to decide if the ROI is sufficient to justify going this way). Thanks.
    • Chris Stratton
      Chris Stratton over 11 years
      If you can modify the source code / build commands to implement the sort of fix you were hoping for, why can't you simply solve it at the level of the function name in the source? Or move the function to its own compilation unit?
    • luis.espinal
      luis.espinal over 11 years
      Contractual/process/red-tape problems. We need to perform black-box testing of a subsystem A to be linked with another subsystem B (with the latter to be linked as-is, in pre-compiled form). And we need to get some tracing of calls a bit different from what we can get with gprof or callgrind. Changing the source is easy, but so painful in terms of procedure and red tape that I'm actually considering whether it is worth the trouble of hacking the objs in a manner that is automated and cheap. Let's just say that it is not the type of thing normal-thinking people would do under a sensible, normal-looking process :P
    • Chris Stratton
      Chris Stratton over 11 years
      I'm not sure I see the difference between a script which automatically alters a working copy of the source and one that does a much harder to prove out modification of the object. stackoverflow.com/questions/617554/… presents some variations. If it's just for profiling, can you do something with breakpoint debugger functionality?
    • luis.espinal
      luis.espinal over 11 years
      I agree with you. It's just one of those client/contract combos that contractually demand things be done a certain way even when it makes no sense :/ The debugger option might be a possible way to go (break at given points, print the stack, resume...)
    • Vegard
      Vegard over 10 years
      This is not exactly what you asked, but I came here looking for a slightly different problem: How do I replace a function in an already compiled object file so that callers inside the existing object file refer to a new function from another file? The answer is to use objcopy --weaken-symbol=called_function and link with a new object that defines called_function().
    • Alexey Yahno
      Alexey Yahno over 9 years
      It would be interesting to know whether anyone managed to achieve the goal using --wrap. I didn't. But I found that the goal can be achieved with run-time function replacement using the LD_PRELOAD technique.
  • luis.espinal
    luis.espinal over 11 years
    I use this to wrap calls across compilation units (see my original question for an example). However, it does not work to intercept/wrap calls from within compilation units (which is what I'm interested in intercepting). Apparently, within the compilation units, the references are already resolved. By the time the linker comes in, it is already too late to wrap those calls using the --wrap linker option.
  • Employed Russian
    Employed Russian over 11 years
    @luis.espinal "it is already too late" -- no, it isn't. The linker could easily change the call target; it just doesn't (for reasons I don't know).
  • luis.espinal
    luis.espinal over 11 years
    Well, when I say "it is too late", I say so within the context of GNU ld (not within the context of linkers in general). Yes, a linker could easily change that call target. But the linker in question (GNU ld) does not. And the reason is that it limits itself to replacing/rewriting the references that are not resolved within the compilation unit. It is because of that last step that I say the linking stage is already too late for GNU ld (though it would not be too late for a smarter linker).
  • luis.espinal
    luis.espinal over 9 years
    I know this option. It is pretty much what I use. This does not work in the scenario I mentioned. See my original question again.
  • luis.espinal
    luis.espinal over 6 years
    That's a nice idea. Will use next time. Unfortunately, at the time I asked the question, I was dealing with software that I could not modify to add such an attribute. This is good, however, and will certainly use in my toolbox in the future.
  • MicroJoe
    MicroJoe over 6 years
    Well yes, if you cannot modify the source then @PeterHuewe's answer is the solution, using objcopy. If you can modify the source then this one seems easier to set up.
  • Yahya Tawil
    Yahya Tawil over 6 years
    I tried this trick in the following case: 1- I have an SDK for an embedded platform which has a function I need to replace with another declaration. 2- I made the symbol weak and global again in the object file of the target library, using gcc-objcopy after compilation. The problem is that the build process includes making an archive file (called core.a) which includes the old library object file. 3- I added a step to delete the object file from core.a and replace it with the new one (with the weak symbol) using gcc-ar. As a result, the trick didn't succeed (multiple definition of ..). Help?
  • luis.espinal
    luis.espinal over 4 years
    Thanks! I haven't touched this problem in ages :)
  • Gregory Morse
    Gregory Morse over 3 years
    This script has a bug since it was tested with only a function at 0. Namely, objcopy will interpret the value as decimal while objdump gives hexadecimal, so "0x" must be prepended, e.g. --add-symbol real_foo=.text:0x$(shell objdump -t foo.o | grep foo | grep .text | cut -d' ' -f 1),function,global and --add-symbol real_foo=.text:0x0000000000000000,function,global would make this work outside of this special zero case.
  • Matthijs Kooijman
    Matthijs Kooijman over 3 years
    The example in this answer shows how to use --wrap, but it does not show the case where the wrapped function (malloc in this case) is defined in the same compilation unit as the call, which is the core of the original question. So it's not really an answer to the question and I'll downvote this answer.
  • Matthijs Kooijman
    Matthijs Kooijman over 3 years
    Where would you add this option? Could you maybe show a more complete example? I did a quick try adding -u bar on the linker commandline along with -Wl,--wrap=bar, but that did not seem to change anything? It probably makes foo undefined at the start, but not inside foo.c...
  • Matthijs Kooijman
    Matthijs Kooijman over 3 years
    Interesting, it seems like --defsym just allows overriding existing symbols in the .o files (i.e. there's no multiple definition error here from the --defsym and the foo defined in the .o file). It seems like --defsym is essentially handled the same as assignments in the linker script, which may behave the same. However, I believe this approach does not allow also defining __real_* symbols: as soon as you override the foo symbol, I think you'll lose access to the original symbol...
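
When changing the source is acceptable, the suggestion made in the comments above (route the calls through something that can be #defined to be either the real function or the wrapper) might look roughly like the sketch below. It is only an illustration under stated assumptions: the names bar_calls, wrap_bar, CALL_BAR and NO_WRAP are hypothetical, and bar() is given an int parameter and return value purely so that the effect is observable.

```c
#include <stdio.h>

/* All names below (bar_calls, wrap_bar, CALL_BAR, NO_WRAP) are
 * hypothetical and chosen only for this sketch. */

int bar_calls;                     /* number of intercepted calls */

int bar(int x);                    /* the real callee */
int wrap_bar(int x);               /* the tracing wrapper */

/* Call sites use CALL_BAR; building with -DNO_WRAP bypasses the wrapper. */
#ifdef NO_WRAP
#define CALL_BAR(x) bar(x)
#else
#define CALL_BAR(x) wrap_bar(x)
#endif

int bar(int x)
{
    return 2 * x;
}

int wrap_bar(int x)
{
    ++bar_calls;
    fprintf(stderr, "bar(%d) called\n", x);
    return bar(x);                 /* forward to the real function */
}

/* Caller and callee live in the same translation unit. */
int foo(int x)
{
    return CALL_BAR(x) + 1;
}
```

Because the redirection happens in the preprocessor, it also covers calls made from within the same file, which is the case that ld's --wrap does not handle. The cost is that every call site has to be edited once.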