Read and print user input with x86 assembly (GNU/Linux)

Solution 1

str: db 100 is wrong: it allocates a single byte with the value 100. To allocate 100 bytes, each with the value 0, write str: times 100 db 0 instead.
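To make the difference concrete, here is a small sketch of three NASM data definitions side by side. The label names (one_byte, filled, reserved) are made up for illustration, and the resb variant is an alternative that the answer itself does not use.

```nasm
section .data
    one_byte: db 100            ; 1 byte holding the value 100
    filled:   times 100 db 0    ; 100 bytes, each 0, stored in the executable

section .bss
    reserved: resb 100          ; 100 bytes reserved, zero-filled when the program is loaded
```

For a buffer that is only ever written to at run time, reserving it in .bss is a common choice because it takes no space in the file.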

There are two things to fix:

1) To get the number of bytes the user entered, use the return value of the read system call (int 80h, function 3), which is left in EAX.

2) If the user enters more characters than the buffer can hold, the remainder stays in the input buffer, and you have to drain it. The following example shows one way to do this:

global _start

section .data
    str: times 100 db 0 ; Allocate buffer of 100 bytes
    lf:  db 10          ; LF, placed directly after str (printed when str is full)

section .bss
    e1_len resd 1
    dummy resd 1

section .text

_start:
    mov eax, 3          ; Read user input into str
    mov ebx, 0          ; |
    mov ecx, str        ; | <- destination
    mov edx, 100        ; | <- length
    int 80h             ; \

    mov [e1_len],eax    ; Store number of bytes read
    cmp eax, edx        ; buffer completely filled?
    jb .2               ; no: the whole line fit
    mov bl,[ecx+eax-1]  ; BL = last byte in buffer
    cmp bl,10           ; does the input end with LF?
    je .2               ; yes: nothing left to drain
    inc DWORD [e1_len]  ; no: length++ (also print 'lf', which directly follows str)

    .1:                 ; Loop
    mov eax,3           ; SYS_READ
    mov ebx, 0          ; EBX=0: STDIN
    mov ecx, dummy      ; pointer to a temporary buffer
    mov edx, 1          ; read one byte
    int 0x80            ; syscall
    test eax, eax       ; EOF?
    jz .2               ; yes: ok
    mov al,[dummy]      ; AL = character
    cmp al, 10          ; character = LF ?
    jne .1              ; no -> next character
    .2:                 ; end of loop

    mov eax, 4          ; Print e1_len bytes starting from str
    mov ebx, 1          ; |
    mov ecx, str        ; | <- source
    mov edx, [e1_len]   ; | <- length
    int 80h             ; \

    mov eax, 1          ; Return
    mov ebx, 0          ; | <- return code
    int 80h             ; \
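The example above treats EAX purely as a byte count. With the int 80h interface, a failed system call returns a negative errno value in EAX, and sys_read returns 0 at end of input, so a more defensive program might branch on the result first. The following is a minimal sketch of that check, not part of the original answer; the buffer name buf and the labels .fail and .done are made up for illustration.

```nasm
global _start

section .bss
    buf resb 100            ; hypothetical 100-byte input buffer

section .text
_start:
    mov eax, 3              ; sys_read
    xor ebx, ebx            ; fd 0: stdin
    mov ecx, buf            ; destination
    mov edx, 100            ; maximum number of bytes
    int 80h

    test eax, eax
    js  .fail               ; EAX < 0: error (negative errno)
    jz  .done               ; EAX = 0: end of input, nothing to print

    mov edx, eax            ; print exactly the bytes that were read
    mov eax, 4              ; sys_write
    mov ebx, 1              ; fd 1: stdout
    mov ecx, buf            ; source
    int 80h

.done:
    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; status 0
    int 80h

.fail:
    mov eax, 1              ; sys_exit
    mov ebx, 1              ; status 1
    int 80h
```

Assuming the file is saved as echo.asm (a placeholder name), it can be built as a 32-bit executable with nasm -f elf32 echo.asm followed by ld -m elf_i386 -o echo echo.o.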

Solution 2

Here's one way of calculating the length of a string in x86 assembly:

lea edi,[string] ; scasb compares al with the byte at [edi], so the pointer goes in edi
mov ecx,-1    ; Start with ecx = -1
xor eax,eax   ; Clear eax (al = 0, the terminator to search for)
cld           ; Make scasb scan forward
repne scasb   ; while (ecx != 0) { ecx--; if (*edi++ == al) break; }
; ecx now contains -1 - (strlen(string) + 1) == -strlen(string) - 2
not ecx       ; Inverting ecx gives us -(-strlen(string) - 2) - 1 == strlen(string) + 1
dec ecx       ; Subtract 1 to get strlen(string)

This assumes that the string is NUL-terminated ('\0'). If the string uses some other terminator you'll have to initialize al to that value before repne scasb.
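Input obtained with sys_read is normally terminated by LF rather than NUL, and the buffer may contain no terminator at all, so it can be safer to bound the scan by the number of bytes actually read. The following is a sketch under those assumptions: buf, nread and line_length are hypothetical names, and nread is assumed to hold the value that sys_read returned.

```nasm
section .bss
    buf   resb 100          ; hypothetical input buffer
    nread resd 1            ; hypothetical: byte count returned by sys_read

section .text
; Returns in EAX the number of bytes in buf before the first LF,
; or nread if there is no LF. Clobbers ECX, EDX and EDI.
line_length:
    mov ecx, [nread]        ; scan at most nread bytes
    xor eax, eax            ; default result: 0
    test ecx, ecx
    jle .done               ; nothing was read (or read failed): length 0
    mov edx, ecx            ; keep a copy of the limit
    mov edi, buf            ; scasb reads from [edi]
    mov al, 10              ; terminator to search for: LF
    cld                     ; scan forward
    repne scasb             ; stops after a match, or when ECX reaches 0
    jne .no_lf              ; ZF clear: no LF among the scanned bytes
    inc ecx                 ; match found: do not count the LF itself
.no_lf:
    sub edx, ecx            ; EDX = bytes before the LF (or nread)
    mov eax, edx
.done:
    ret
```

The explicit check before repne scasb matters because the instruction leaves the flags untouched when ECX is already 0, so the following jne would otherwise act on stale flags.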

Author by

Vittorio Romeo

Updated on May 06, 2020

Comments

  • Vittorio Romeo almost 4 years

    I'm learning x86 assembly on GNU/Linux, and I'm trying to write a program that reads user input from stdin and prints it on stdout.

    The following code does work, but it prints extra characters if the size of the user-entered string is less than 100 bytes.

    section .data
        str: db 100    ; Allocate buffer of 100 bytes
    
    section .bss
    
    section .text
    
    global _start
    
    _start:
        mov eax, 3          ; Read user input into str 
        mov ebx, 0          ; |
        mov ecx, str        ; | <- destination
        mov edx, 100        ; | <- length
        int 80h             ; \
    
        mov eax, 4          ; Print 100 bytes starting from str
        mov ebx, 1          ; |
        mov ecx, str        ; | <- source
        mov edx, 100        ; | <- length
        int 80h             ; \ 
    
        mov eax, 1          ; Return
        mov ebx, 0          ; | <- return code
        int 80h             ; \
    

    How can I reliably calculate the length of the user-entered string?

    How can I avoid printing extra characters?

  • supmethods almost 6 years
    I noticed you used file descriptor 0 (stdin) to read the keyboard input, which seems correct to me. However, the following website uses file descriptor 2 instead. Is this an error? tutorialspoint.com/assembly_programming/…
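  • rkhb almost 6 years
    @dave558: That's an interesting point. When the program runs in a terminal, file descriptor 2 (stderr) usually refers to the same terminal device and is open for reading as well as writing, so reading from it works. Using file descriptor 2 isn't strictly a mistake, but it is unnecessary and awkward.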