What is Smali Code Android

46,872

Solution 1

When you build an application, the APK file contains a .dex file, which holds binary Dalvik bytecode. This is the format that the platform actually understands. However, binary code is not easy to read or modify, so there are tools that convert it to and from a human-readable representation. The most common human-readable format is known as Smali. This is essentially the same as the disassembler you mentioned.

For example, say you have Java code that does something like

int x = 42;

Assuming this is the first variable, the dex code for the method will most likely contain the hexadecimal sequence

13 00 2A 00

If you run baksmali on it, you'd get a text file containing the line

const/16 v0, 42

This is obviously a lot more readable than the binary code. But the platform doesn't know anything about smali; it's just a tool to make the bytecode easier to work with.
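To make the mapping between the bytes and the text concrete, here is a minimal Python sketch of what a disassembler and an assembler do for this one instruction. It assumes the `const/16` layout from the Dalvik bytecode documentation: opcode `0x13`, the destination register in the high byte of the first 16-bit code unit, and a signed 16-bit literal in the second unit. The function names `decode_const16` and `encode_const16` are made up for illustration; this is not how baksmali or smali are implemented, and the real tools cover the whole instruction set.

```python
import struct

CONST_16 = 0x13  # opcode of `const/16 vAA, #+BBBB`


def decode_const16(code: bytes) -> str:
    """Turn the 4 bytes of one const/16 instruction into smali-like text."""
    # Dalvik code is a sequence of 16-bit little-endian code units.
    # Unit 0 holds the register (high byte) and opcode (low byte);
    # unit 1 holds the signed 16-bit literal.
    unit0, literal = struct.unpack("<Hh", code[:4])
    opcode = unit0 & 0xFF
    register = unit0 >> 8
    if opcode != CONST_16:
        raise ValueError(f"not a const/16 instruction: opcode 0x{opcode:02x}")
    return f"const/16 v{register}, {literal}"


def encode_const16(register: int, literal: int) -> bytes:
    """Inverse of decode_const16: build the 4 instruction bytes."""
    if not 0 <= register <= 0xFF:
        raise ValueError("register must fit in 8 bits")
    if not -0x8000 <= literal <= 0x7FFF:
        raise ValueError("literal must fit in a signed 16-bit value")
    return struct.pack("<Hh", (register << 8) | CONST_16, literal)


print(decode_const16(bytes.fromhex("13002A00")))
print(encode_const16(0, 42).hex())
```

Decoding and then re-encoding gives back the same four bytes, which is the sense in which smali is only another spelling of the bytecode. Real baksmali output may format the literal differently, for example in hexadecimal.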

Dalvik and ART both take .dex files containing Dalvik bytecode. The choice of runtime is completely transparent to the application developer; the only difference is what happens behind the scenes when the application is installed and run.
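Since both runtimes consume the same .dex files, you can inspect what an APK ships without knowing which runtime will execute it. The sketch below is an illustration, not part of any Android tool. It relies on two facts: an APK is a ZIP archive, and every dex file starts with an 8-byte magic made of `dex\n`, a three-digit version, and a NUL byte, as described in the dex format documentation. The helper name `dex_versions` and the stand-in archive are hypothetical.

```python
import io
import zipfile

DEX_MAGIC_PREFIX = b"dex\n"


def dex_versions(apk_file) -> dict:
    """Map each .dex entry in an APK (a ZIP archive) to its format version."""
    versions = {}
    with zipfile.ZipFile(apk_file) as apk:
        for name in apk.namelist():
            if not name.endswith(".dex"):
                continue
            magic = apk.read(name)[:8]
            # Expected layout: b"dex\n" + three ASCII digits + b"\x00".
            if magic[:4] != DEX_MAGIC_PREFIX or magic[7:8] != b"\x00":
                raise ValueError(f"{name} does not start with a dex magic")
            versions[name] = magic[4:7].decode("ascii")
    return versions


# Stand-in archive so the example runs without a real APK.
# The payload is only a magic followed by zero padding, not a valid dex file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as archive:
    archive.writestr("classes.dex", b"dex\n035\x00" + bytes(104))
    archive.writestr("AndroidManifest.xml", b"")

print(dex_versions(buf))
```

With a real APK you would pass its path instead of `buf`.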

Solution 2

High-level programming languages include extra tools that make programming easier and save the programmer time. Once a program has been compiled, decompiling it back to the original source code would need a lot of code analysis to determine the structure and flow of the program, most likely over more than one pass. The decompiler would then have to structure the source according to the features of the compiler that produced the code, the version of that compiler, and the operating system it was compiled on, e.g. if any OS-specific features, frameworks, parsers or external libraries were involved, such as .net or dome.dll, and their versions.

The next best result would be to output the whole program flow as if the source code had been written in one large file, i.e. with no separate objects, libraries, dependencies, inheritance, classes or APIs. This is where the decompiler spits out code that produces errors when compiled, since it has no access to the source code and structure of the other files and dependencies. See example here.

The 3rd & best option would be to follow what the operating system is doing based on the programmed instructions, which would be machine code, or dex (in the case of Android). Unless you're sitting in the Nebuchadnezzar captained by Morpheus, you probably don't have time to decode every opcode in the instruction set of the architecture your processor is running, so you'd want something more readable than Unicode characters scrolling on the screen as you monitor the program's execution.

[Image: executing machine code viewed in monitor]

This is where assembly code makes the difference: it's almost a direct translation of machine code into a human-readable format. I say "almost" direct because microprocessors have helpers like microcode, multithreaders for pipelining, and hardware accelerators to give a better user experience.

If you have the source code, you edit in the language the code is written in. Similarly, if you don't have the source code and you're editing the compiled app, you're still editing in the language the code is written in; in this case that is machine code, or the next best thing: smali.

Here's a diagram to illustrate "Dalvik VM, dex and Smali" and "its place in chain of compilers".

[Image: dex from java]

Author by

Admin

Updated on August 30, 2021

Comments

  • Admin
    Admin almost 3 years

    I am going to learn a little bit about Dalvik VM, dex and Smali.

    I have read about smali, but I still cannot clearly understand its place in the chain of compilers, or what its purpose is.
    Here are some questions:

    1. As far as I know, Dalvik, like other virtual machines, runs bytecode; in the case of Android it is dex bytecode.
    2. What is smali? Does the Android OS or the Dalvik VM work with it directly, or is it just the same dex bytecode in a form more readable for humans?
    3. Is it something like a disassembler for Windows (like OllyDbg)? A program executable consists of different machine codes (D3, 5F for example), and there is an appropriate assembly command for each machine code. But the Dalvik VM is also software, so is smali a readable representation of the bytecodes?
    4. There is the new ART environment. Does it still use bytecode, or does it execute native code directly?

    Thank you in advance.

  • Admin
    Admin about 9 years
    Can you suggest something to read to get deeper into this? It is very interesting to me.
  • Antimony
    Antimony about 9 years
    You can see the details of the dex format at source.android.com/devices/tech/dalvik/dex-format.html
  • JustAMartin
    JustAMartin about 7 years
    But why use smali at all if we have tools like jadx decompiler, which can get much more human readable Java code instead of smali?
  • Antimony
    Antimony about 7 years
    Because smali can't be mapped to Java code 1:1. Decompilation is lossy.
  • Emanuel Moecklin
    Emanuel Moecklin almost 7 years
    Just to clarify, decompiling to Java is lossy (as Antimony mentioned) meaning the code is probably good enough to get the gist of what it's doing but not good enough to re-build the apk. An apk decompiled to smali with apktool can be modified (change resources, inject code, modify code) and then built into a working apk again.
  • Mehran Torki
    Mehran Torki over 5 years
    @Antimony Can you please clarify why decompilation is lossy? Is it because of some instructions such as goto in Dalvik instruction set that has no equivalent in JVM instruction set?
  • Antimony
    Antimony over 5 years
    @MehranTorki Both Java and Dalvik bytecode have goto instructions. But there is no equivalent of goto at the Java language level, which is perhaps what you meant. Bytecode is not meant to be a 1:1 mapping of the source language.
  • Mehran Torki
    Mehran Torki over 5 years
    @Antimony Aha, I think I've got it now: decompilation is lossy because of the difference between the source language (Java) and the JVM/Dalvik bytecode. Thanks.
  • Mehran Torki
    Mehran Torki over 5 years
    @Antimony From the comments on this question, you say that even jar -> dex or dex -> jar are lossy due to differences in the bytecode format; am I getting you right? It would be great if you could post an answer to that question, as I think it is interesting for others too.
  • Antimony
    Antimony over 5 years
    @Mehran Yes, jar -> dex and dex -> jar are also lossy, since the two bytecode formats are not identical. However, the differences are relatively slight, so it is not something you have to worry about in common cases.
  • West
    West over 2 years
    So in reality there is no way to decompile and recompile mobile apps right? I tried using apktool on a very simple app I created and got errors on recompiling even with nothing changed
  • IgorGanapolsky
    IgorGanapolsky about 2 years
    @West The only thing you can do is modify the Smali code, re-sign the apk with your own certificate, and re-compile the APK. This is the link Zimba posted in his answer above.