How to open a file in assembler and modify it?

Solution 1

This is x86 Linux (x86 is not the only assembly language, and Linux is not the only Unix!)...

section .data

textoutput db 'Hello world!', 10
lentext equ $ - textoutput
filetoopen db 'hi.txt'

The filename string requires a 0-byte terminator: filetoopen db 'hi.txt', 0

section .text
global _start

_start:

mov eax, 5            ;open
mov ebx, filetoopen
mov ecx, 2            ;read and write mode

2 is the O_RDWR flag for the open syscall. If you want the file to be created if it doesn't already exist, you will need the O_CREAT flag as well; and if you specify O_CREAT, you need a third argument which is the permissions mode for the file. If you poke around in the C headers, you'll find that O_CREAT is defined as 0100 - beware of the leading zero: this is an octal constant! You can write octal constants in nasm using the o suffix.

So you need something like mov ecx, 0102o to get the right flags and mov edx, 0666o to set the permissions.
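One way to double-check those octal constants is to compare them with the names from the C headers. The sketch below does that in C; the helper names open_flags_rdwr_creat and default_create_mode are made up for illustration, and the numeric values are the ones used on x86 Linux (other platforms may define them differently).

```c
#include <fcntl.h>   /* O_RDWR, O_CREAT */

/* Hypothetical helper: the flags value to load into ecx for
   "open read/write, create the file if it does not exist". */
int open_flags_rdwr_creat(void)
{
    return O_RDWR | O_CREAT;   /* 02 | 0100 = 0102 (octal) on x86 Linux */
}

/* Hypothetical helper: the permission bits to load into edx.
   Only consulted by the kernel when O_CREAT actually creates the file. */
unsigned int default_create_mode(void)
{
    return 0666;               /* octal: rw-rw-rw-, before the umask applies */
}
```

In NASM the same values are written with the o suffix (0102o and 0666o), since a leading zero alone does not make a constant octal there.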

int 80h

The return code from a syscall is passed in eax. Here, this will be the file descriptor (if the open succeeded) or a small negative number, which is a negative errno code (e.g. -1 for EPERM). Note that the convention for returning error codes from a raw syscall is not quite the same as the C syscall wrappers (which generally return -1 and set errno in the case of an error)...
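As a rough illustration of the difference between the two conventions, the C sketch below converts a raw return value (what you would find in eax after int 80h) into the -1-plus-errno form used by the C wrappers. The function names are hypothetical, and the sketch assumes the usual Linux rule that raw values from -4095 to -1 are negated errno codes.

```c
#include <errno.h>

/* Raw Linux syscalls report failure as a negated errno code.
   Assumption: values in [-4095, -1] are errors, anything else is a result. */
int raw_syscall_failed(long raw)
{
    return raw < 0 && raw >= -4095;
}

/* Convert a raw return value to the C-wrapper convention:
   -1 with errno set on failure, the value itself on success. */
long raw_to_libc_convention(long raw)
{
    if (raw_syscall_failed(raw)) {
        errno = (int)-raw;   /* e.g. raw == -2 becomes errno == ENOENT */
        return -1;
    }
    return raw;              /* e.g. a file descriptor such as 3 */
}
```

In the assembly program the equivalent check is simply a test of the sign of eax right after the int 80h for open, before the value is used as a file descriptor.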

mov eax, 4
mov ebx, filetoopen   ;I'm not sure what do i have to put here, what is the "file descriptor"?

...so here you need to mov ebx, eax first (to save the open result before eax is overwritten), then mov eax, 4. (You might want to check that the result is non-negative first, and handle the failure to open in some way if it isn't.)

mov ecx, textoutput
mov edx, lentext

Missing int 80h here.

mov eax, 1
mov ebx, 0
int 80h              ; finish without errors

Solution 2

Did you read the Linux Assembly HOWTO? It covers your question.

You can also compile some C code with gcc -S -fverbose-asm -O1 and look at the generated assembly. For example, with foo.c, run gcc -S -Wall -fverbose-asm -O1 foo.c (as a command in a terminal), then look at the generated foo.s assembler file in an editor (perhaps GNU emacs).
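A possible foo.c to try this with is sketched below. It is only an illustration: the function name write_greeting and its parameters are invented here, and it uses the libc open and write wrappers, so the generated foo.s will show calls to those wrappers rather than raw int 80h system calls.

```c
/* foo.c: hypothetical example file to feed to gcc -S */
#include <fcntl.h>    /* open, O_RDWR, O_CREAT */
#include <unistd.h>   /* write, close, size_t, ssize_t */

/* Open path read/write (creating it if needed) and write len bytes
   of text to it. Returns 0 on success, -1 on any failure. */
int write_greeting(const char *path, const char *text, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);
    if (fd < 0)
        return -1;                    /* open failed; errno says why */

    ssize_t written = write(fd, text, len);
    close(fd);
    return written == (ssize_t)len ? 0 : -1;
}
```

Reading the resulting foo.s shows how the arguments are set up for each call, which is the part that carries over to hand-written assembly even though the system call mechanism itself is hidden inside libc.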

Finally, I don't think it is worth bothering much with assembler. In 2020, a recent GCC compiler will surely generate better code than what you could write (if you invoke it with optimizations, at least -O2). See this draft report for more.

Author: Rama

Updated on July 05, 2022

Comments

  • Rama
    Rama almost 2 years

    I'm starting to learn Assembler and I'm working in Unix. I want to open a file and write 'Hello world' on it.

    section .data
    
    textoutput db 'Hello world!', 10
    lentext equ $ - textoutput
    filetoopen db 'hi.txt'
    
    section .text
    global _start
    
    _start:
    
    mov eax, 5            ;open
    mov ebx, filetoopen
    mov ecx, 2            ;read and write mode
    int 80h
    
    mov eax, 4
    mov ebx, filetoopen   ;I'm not sure what do i have to put here, what is the "file descriptor"?
    mov ecx, textoutput
    mov edx, lentext
    
    mov eax, 1
    mov ebx, 0
    int 80h              ; finish without errors
    

    But when I compile it, it doesn't do anything. What am I doing wrong? When I open a file where does the file descriptor value return to?

  • Zhanger
    Zhanger almost 11 years
    Great answer! I just wanted to mention that the flags for open can be found in /usr/include/bits/fcntl.h on debian/ubuntu, here's a gist of the constants
  • Peter Cordes
    Peter Cordes about 5 years
    Why would you want to store the flags / mode in .data as qword integers? You could use equ, the same way you do for msglen, to define them there but have them assemble into mov immediates. Also, normally you should use mode = 0666o and let the user's umask take care of permissions, so they have the choice of 644 or 664, or of 600 if they want.
  • Peter Cordes
    Peter Cordes about 5 years
    There's also no reason to store the file descriptor to memory at all. You could copy it to a call-preserved register (e.g. ebx) so it survives function calls that clobber RAX. Or to edi if your print function uses a custom calling convention like Irvine32 that preserves RDI. (BTW, int open(const char *pathname, int flags, mode_t mode); returns an int, which is only 32 bits in the x86-64 System V ABI. It's pointless to copy around the whole 64-bit register when you only need eax.) Also, this answer apparently depends on some basicFunctions.asm which you haven't even linked to.
  • Peter Cordes
    Peter Cordes about 5 years
    You need fopen / fprintf to write to a file. (Or more simply, fputs if you don't need formatting.) Or you need to make open and dup2 system calls to redirect your stdout to the file before using stdio printf. Writing asm that uses more portable APIs seems pointless when the calling convention is platform-specific (like for x86-64). 32-bit Windows and Linux calling conventions are close enough for most things, except Windows uses _printf in 32-bit code, doesn't it? But you forgot to ensure 16-byte stack alignment before the call; i386 System V requires that. (sub esp,8)
  • platinoob_
    platinoob_ almost 4 years
    Excuse me, but how do I assemble and run this in the Linux terminal? Because with nasm -f elf my_code.asm and ld -m elf_i386 my_code.o -s -o my_code, mycode does nothing.
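On the permissions point raised in the comments above (pass 0666o and let the umask decide), the arithmetic can be sketched in C as follows. The function name effective_mode is hypothetical, and the sketch covers only the common case where the kernel clears the umask bits from the requested mode.

```c
#include <sys/types.h>   /* mode_t */

/* Permission bits a newly created file ends up with in the common case:
   the mode passed to open() with the process umask bits cleared. */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 0777;
}
```

With a requested mode of 0666, a umask of 022 yields 0644, a umask of 002 yields 0664, and a umask of 077 yields 0600, which matches the choices mentioned in that comment.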