How can I extract static libs containing repeated object files?

12,967

Solution 1

I tried 'ar p', but after talking to a friend we decided that the following Python solution would be better. With it, the repeated object files can be extracted, each under a unique name.

import os
import sys

def extract_archive(pathtoarchive, destfolder):

    with open(pathtoarchive, 'rb') as archive:

        global_header = archive.read(8)
        if global_header != b'!<arch>\n':
            print("Oops!, " + pathtoarchive + " seems not to be an archive file!")
            sys.exit(1)

        print('Trying to extract object files from ' + pathtoarchive)

        unique_key = 0

        while True:

            # Every member starts with a fixed 60-byte header
            content_descriptor = archive.read(60)

            if len(content_descriptor) < 60:
                break

            # Name field is bytes 0-15; size field is the 10 bytes at 48-57
            name = content_descriptor[0:16].rstrip()
            chunk_size = int(content_descriptor[48:58])
            data = archive.read(chunk_size)

            # Members start on even offsets, so skip the padding byte
            if chunk_size % 2 == 1:
                archive.read(1)

            # We don't need the GNU symbol and name tables.
            # Note: BSD/macOS '#1/<len>' names are not handled here.
            if name in (b'/', b'//', b'/SYM64/'):
                continue

            # Number the outputs so repeated names never overwrite each other
            out_name = os.path.basename(pathtoarchive) + '.' + str(unique_key) + '.o'
            with open(os.path.join(destfolder, out_name), 'wb') as output_obj:
                output_obj.write(data)

            unique_key = unique_key + 1

    print('Object files extracted to ' + destfolder + '.')
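
The script relies on the fixed 60-byte header that precedes every archive member. As a minimal sketch of that layout (the field offsets follow the common ar format as described on Wikipedia; the helper name parse_member_header is hypothetical and not part of the script above), the fields could be unpacked like this:

```python
def parse_member_header(header):
    """Split a 60-byte ar member header into its fields (hypothetical helper)."""
    if len(header) != 60 or header[58:60] != b'`\n':
        raise ValueError('not a valid ar member header')
    return {
        'name': header[0:16].rstrip(),    # member name, space padded
        'mtime': header[16:28].strip(),   # modification time, decimal text
        'uid': header[28:34].strip(),     # owner id, decimal text
        'gid': header[34:40].strip(),     # group id, decimal text
        'mode': header[40:48].strip(),    # file mode, octal text
        'size': int(header[48:58]),       # 10-byte decimal size field
    }


# Build a made-up header for a 1234-byte member named "foo.o/" (GNU style)
sample = (b'foo.o/'.ljust(16) + b'0'.ljust(12) + b'0'.ljust(6) + b'0'.ljust(6)
          + b'100644'.ljust(8) + b'1234'.ljust(10) + b'`\n')
fields = parse_member_header(sample)
print(fields['name'], fields['size'])
```

Checking the two-byte end marker before trusting the size field is a cheap way to notice that the reader has drifted out of step with the archive.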

Solution 2

  1. Extract the objects from the static library that contains the duplicated objects.
  2. Build a new library from the extracted objects.
  3. The new library will contain ONLY ONE instance of each duplicated object.
  4. Use 'ar t' to list the objects in both libraries (the original one with duplicates and the new one without).
  5. Compare the two lists, e.g. with vimdiff.
  6. Write down all the differences.
  7. Extract only the objects noted in step 6 from the original library, using the command ar x my_original_lib.a object.o
  8. Rename the extracted object to any name you like.
  9. Use the command ar m my_original_lib.a object.o to rearrange object.o inside the archive (without modifiers, 'ar m' moves the named member to the end).
  10. Run the same command as in step 7 again; this time it extracts the second object.o
  11. Give the newly extracted object a different name.
  12. Use both of them to build the new library.
  13. The method holds for ANY number of duplicated objects in the static library. Just repeat steps 9 and 7 to extract all the duplicates.
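
Steps 4 to 6 amount to finding the member names that occur more than once in the 'ar t' listing of the original library. A minimal sketch of that check, assuming the listing text has already been captured (for example from 'ar t my_original_lib.a'); the function name duplicated_members and the sample names are made up for illustration:

```python
from collections import Counter


def duplicated_members(listing):
    """Return {name: count} for names that occur more than once in `ar t` output."""
    names = [line.strip() for line in listing.splitlines() if line.strip()]
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


# Sample listing, as `ar t my_original_lib.a` might print it (made-up names)
listing = "object.o\nother.o\nobject.o\nutil.o\nobject.o\n"
dups = duplicated_members(listing)
print(dups)
```

Each name reported with a count of n needs n extractions in total, renaming the extracted file after each one.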
Author by

Gustavo Meira

I'm a C++/Python software engineer that's been working in the telecom industry for a few years.

Updated on June 04, 2022

Comments

  • Gustavo Meira
    Gustavo Meira almost 2 years

    I'm trying to build a big static library by merging two static libraries. At the moment I'm using the 'ar' command to extract the objects from, for example, 'a.a' and 'b.a', and then reassembling these objects using 'ar' again:

    $ ar x a.a
    $ ar x b.a
    $ ar r merged.a *.o
    

    Unfortunately this doesn't work for my purpose, since a.a contains different objects with the SAME NAME. The 'ar' command extracts the repeated objects and overwrites the already extracted ones that have the same name. Despite sharing a name, these objects define different symbols, so I get undefined references because some symbols are lost along with the overwritten files.

    I have no access to the original objects, and I have already tried 'ar xP', 'ar xv' and lots of other 'ar' options. Can anyone help me by showing how to merge these libs?

    Thanks in advance.

  • Gustavo Meira
    Gustavo Meira over 12 years
    The problem is not that they're repeated between archives; the objects are repeated inside the same lib, inside the same archive! When 'ar' extracts them, objects from the same archive with the same name get overwritten. But thanks for your help :)
  • paulsepolia
    paulsepolia over 9 years
    C++ code that merges many libraries into a single new one, without overwriting possibly duplicated objects, is here: bazaar.launchpad.net/~paulsepolia/+junk/arbet/files/head:/…
  • Hiroshi Ichikawa
    Hiroshi Ichikawa almost 8 years
    Great, I had the same problem and this script solved the issue. But I believe you should write content_descriptor[48:58] instead of 48:57. The file size field is 10 bytes, not 9, according to Wikipedia.
  • Hiroshi Ichikawa
    Hiroshi Ichikawa almost 8 years
    Also, if you handle the BSD variant of the .a file format (e.g., on Mac), the file name can be stored as part of the body, as explained on Wikipedia. So you would also need some logic to strip it out.
  • Tapan Thaker
    Tapan Thaker over 2 years
    Since this problem still isn't solved in 2022, here is a script that handles the macOS variant, inspired by the script above: github.com/tapthaker/extract-archive
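
For the BSD variant mentioned in the comments, a member whose name field reads '#1/<length>' stores its real name in the first <length> bytes of the body. A minimal sketch of stripping it out, assuming the 16-byte name field and the member body have already been read (the helper name split_bsd_member is hypothetical and is not taken from any of the linked projects):

```python
def split_bsd_member(name_field, body):
    """Return (name, data) for an ar member, handling BSD '#1/<len>' names."""
    name_field = name_field.rstrip()
    if name_field.startswith(b'#1/'):
        name_len = int(name_field[3:])
        # The real name sits at the start of the body, padded with NUL bytes
        name = body[:name_len].rstrip(b'\x00')
        return name, body[name_len:]
    # Any other name field is returned unchanged along with the full body
    return name_field, body


# Made-up member: an 8-byte name area ("foo.o" plus NUL padding), then the data
name, data = split_bsd_member(b'#1/8'.ljust(16), b'foo.o\x00\x00\x00DATA')
print(name, data)
```

The size field in the header covers the embedded name as well, so the name has to be removed after reading the whole body rather than before.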