How to open a file in assembler and modify it?
Solution 1
This is x86 Linux (x86 is not the only assembly language, and Linux is not the only Unix!)...
section .data
textoutput db 'Hello world!', 10
lentext equ $ - textoutput
filetoopen db 'hi.txt'
The filename string requires a 0-byte terminator: filetoopen db 'hi.txt', 0
section .text
global _start
_start:
mov eax, 5 ;open
mov ebx, filetoopen
mov ecx, 2 ;read and write mode
2 is the O_RDWR flag for the open syscall. If you want the file to be created if it doesn't already exist, you will need the O_CREAT flag as well; and if you specify O_CREAT, you need a third argument, which is the permissions mode for the file. If you poke around in the C headers, you'll find that O_CREAT is defined as 0100 - beware of the leading zero: this is an octal constant! You can write octal constants in nasm using the o suffix.
So you need something like mov ecx, 0102o to get the right flags and mov edx, 0666o to set the permissions.
int 80h
The return code from a syscall is passed in eax. Here, this will be the file descriptor (if the open succeeded) or a small negative number, which is a negative errno code (e.g. -1 for EPERM). Note that the convention for returning error codes from a raw syscall is not quite the same as the C syscall wrappers (which generally return -1 and set errno in the case of an error)...
mov eax, 4
mov ebx, filetoopen ;I'm not sure what do i have to put here, what is the "file descriptor"?
...so here you need to mov ebx, eax first (to save the open result before eax is overwritten) then mov eax, 4. (You might want to think about checking that the result is not negative first, and handling the failure to open in some way if it is.)
mov ecx, textoutput
mov edx, lentext
Missing int 80h here.
mov eax, 1
mov ebx, 0
int 80h ; finish without errors
Solution 2
Did you read the Linux Assembly HOWTO? It covers your question.
You can also compile some C code with gcc -S -fverbose-asm -O1 and look at the generated assembly. For example, with foo.c, run gcc -S -Wall -fverbose-asm -O1 foo.c (as a command in some terminal) then look (using some editor, perhaps GNU emacs) into the generated foo.s assembler file.
Finally, I don't think it is worth bothering a lot about assembler. In 2020, a recent GCC compiler will surely generate better code than what you could write (if you invoke it with optimizations, at least -O2). See this draft report for more.
Rama
Updated on July 05, 2022
Comments
-
Rama almost 2 years
I'm starting to learn Assembler and I'm working in Unix. I want to open a file and write 'Hello world' on it.
section .data
textoutput db 'Hello world!', 10
lentext equ $ - textoutput
filetoopen db 'hi.txt'
section .text
global _start
_start:
mov eax, 5 ;open
mov ebx, filetoopen
mov ecx, 2 ;read and write mode
int 80h
mov eax, 4
mov ebx, filetoopen ;I'm not sure what do i have to put here, what is the "file descriptor"?
mov ecx, textoutput
mov edx, lentext
mov eax, 1
mov ebx, 0
int 80h ; finish without errors
But when I compile it, it doesn't do anything. What am I doing wrong? When I open a file where does the file descriptor value return to?
-
Zhanger almost 11 years
Great answer! I just wanted to mention that the flags for open can be found in /usr/include/bits/fcntl.h on debian/ubuntu, here's a gist of the constants
-
Peter Cordes about 5 years
Why would you want to store the flags / mode in .data as qword integers? You could use equ the same way you are for msglen to define them there but assemble into mov-immediate. Also, normally you should use mode = 0666o and let the user's umask take care of permissions, so they have the choice of 644 or 664, or of 600 if they want.
-
Peter Cordes about 5 years
There's also no reason to store the file descriptor to memory at all. You could copy it to a call-preserved register (e.g. ebx) so it survives function calls that clobber RAX. Or to edi if your print function uses a custom calling convention like Irvine32 that preserves RDI. (BTW, int open(const char *pathname, int flags, mode_t mode); returns an int, which is only 32 bits in the x86-64 System V ABI. It's pointless to copy around the whole 64-bit register when you only need eax.) Also, this answer apparently depends on some basicFunctions.asm which you haven't even linked to.
-
Peter Cordes about 5 years
You need fopen / fprintf to write to a file. (Or more simply, fputs if you don't need formatting.) Or you need to make open and dup2 system calls to redirect your stdout to the file before using stdio printf. Writing asm that uses more portable APIs seems pointless when the calling convention is platform-specific (like for x86-64). 32-bit Windows and Linux calling conventions are close enough for most things, except Windows uses _printf in 32-bit code, doesn't it? But you forgot to ensure 16-byte stack alignment before the call; i386 System V requires that. (sub esp,8)
-
platinoob_ almost 4 years
Excuse me, but how do I have to assemble and run this in the Linux terminal? Because doing it with
nasm -f elf my_code.asm
ld -m elf_i386 my_code.o -s -o my_code
mycode
does nothing