This post is part of my attempts at learning assembly, using NASM as the assembler. It walks through a simple hello world program.
; helloworld.nasm
; Author: HRDARJI

global _start

section .text

_start:
    ; printing hello world
    mov eax, 0x4
    mov ebx, 0x1
    mov ecx, message
    mov edx, mlen
    int 0x80

    ; exiting properly
    mov eax, 0x1
    mov ebx, 0x5
    int 0x80

section .data
    message: db "Hello World"
    mlen equ $-message
    ; in the above line mlen will store length of the message.
First, to let the linker know where our program starts, we mark the entry point with the “_start:” label and export it with “global _start”.
To add comments in an assembly program, we use “;”. Everything from the “;” to the end of the line is ignored by the assembler.
All the code (instructions) for our program lives in the text section, which is declared with “section .text”.
All the initialized data lives in the data section, which is declared with “section .data”.
The message that we want to print, “Hello World”, is initialized under the label “message”. db is “define byte”; it stores our string in consecutive memory locations (thanks to nasm).
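As a rough analogy in C (an illustration only, not part of the NASM program), “mlen equ $-message” measures the distance between the current assembly position “$”, which sits right after the string, and the start of the string. The names `message_length` and `end` below are made up for this sketch.

```c
#include <stddef.h>

/* 11 bytes with no terminating NUL, like `db "Hello World"` in NASM. */
static const char message[] = {'H', 'e', 'l', 'l', 'o', ' ',
                               'W', 'o', 'r', 'l', 'd'};

/* Hypothetical helper that mimics `mlen equ $-message`:
   the address just past the string minus the address of its start. */
static size_t message_length(void)
{
    const char *end = message + sizeof message; /* plays the role of `$` */
    return (size_t)(end - message);
}
```

Calling `message_length()` gives 11, the same value NASM substitutes for mlen at assembly time.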
To print our string, we will use a system call.
On IA-32 Linux, we use interrupt 0x80 (the “int 0x80” instruction) to enter a system call.
For more information on interrupts and int 0x80, I recommend reading this Stack Overflow post and the wiki page on the interrupt vector table.
How does a system call with the interrupt instruction work?
When the kernel's interrupt handler receives the interrupt, it acts on the values currently held in the processor's registers. In our case, when the first “int 0x80” executes, the value of the eax register is 0x4, which tells the OS that the write syscall needs to be executed. The mapping of values to specific syscalls can be found in “/usr/include/i386-linux-gnu/asm/unistd_32.h” on 32-bit Ubuntu.
This file defines many system calls. Examples include writing to the screen, making a directory, linking files, fork, and exit. Here is a small preview of the file.
#ifndef _ASM_X86_UNISTD_32_H
#define _ASM_X86_UNISTD_32_H 1

#define __NR_restart_syscall 0
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
#define __NR_close 6
#define __NR_waitpid 7
#define __NR_creat 8
#define __NR_link 9
#define __NR_unlink 10
#define __NR_execve 11
#define __NR_chdir 12
#define __NR_time 13
#define __NR_mknod 14
#define __NR_chmod 15
#define __NR_lchown 16
#define __NR_break 17
#define __NR_oldstat 18
#define __NR_lseek 19
#define __NR_getpid 20
#define __NR_mount 21
#define __NR_umount 22
#define __NR_setuid 23
#define __NR_getuid 24
#define __NR_stime 25
For our example, we will be using the write system call (line 8 in unistd_32.h).
According to the Linux manual page for the write system call, it takes 3 arguments.
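To see the "number selects the syscall" idea outside of assembly, here is a minimal C sketch that invokes write by its number through the generic `syscall()` function, assuming Linux with glibc. `raw_write` is a made-up name for this example. `SYS_write` comes from the system headers and matches `__NR_write` for the build target, so it is 4 on 32-bit x86 but a different value on other architectures (1 on x86-64).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE      /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <stddef.h>
#include <sys/syscall.h> /* SYS_write: the write syscall number for this target */
#include <unistd.h>      /* syscall() */

/* Hypothetical helper: call write by its syscall number, much like the
   assembly program does by loading the number into eax before int 0x80. */
static long raw_write(int fd, const void *buf, size_t count)
{
    return syscall(SYS_write, fd, buf, count);
}
```

For example, `raw_write(1, "Hello World", 11)` sends the same 11 bytes to stdout that the assembly program does.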
ssize_t write(int fd, const void *buf, size_t count);
- fd (file descriptor) is stdout in our case. For any program in Linux, by default, stdin (input) is mapped to fd 0, stdout to fd 1 and stderr to fd 2. This goes in the ebx register.
- buf points to our hello world string. This goes in the ecx register.
- count is the length of our string. This goes in the edx register.
Almost every system call has a return value, which is stored in the eax register. The write syscall’s return value is the number of bytes written, which here is the length of the string.
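The same call can be sketched in C through the libc `write()` wrapper, which makes the three arguments and the return value easy to see. The names `hello_text` and `print_hello` are invented for this sketch, and the register mapping in the comment applies to the int 0x80 convention described above, not to how the C library passes arguments internally.

```c
#include <unistd.h> /* write(), STDOUT_FILENO, ssize_t */

static const char hello_text[] = "Hello World";

/* Hypothetical helper mirroring the three arguments of the syscall:
   fd (ebx) = stdout, buf (ecx) = the string, count (edx) = its length. */
static ssize_t print_hello(void)
{
    /* sizeof counts C's terminating NUL, so subtract 1 to get 11 bytes. */
    return write(STDOUT_FILENO, hello_text, sizeof hello_text - 1);
}
```

On success `print_hello()` returns the number of bytes written, and on failure it returns -1, which corresponds to the value the kernel leaves in eax (a negative error code at the raw syscall level).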
Similarly, when the next interrupt is received, the OS finds the value 1 in the eax register and knows that this is an exit syscall. The exit status is 0x5, as per the argument passed in the ebx register.
To summarize, we triggered two interrupts in our program: one for printing a string to stdout and a second one for exiting with status 0x5.
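One way to observe that exit status is from the parent process. The C sketch below, assuming a POSIX system, forks a child that exits with a given status and reads it back with `waitpid()`. `observed_exit_status` is a made-up helper, and glibc's `_exit()` may use a different underlying syscall than the `exit` (number 1) used in the assembly, so treat it as an analogy.

```c
#include <sys/types.h>
#include <sys/wait.h> /* waitpid(), WIFEXITED, WEXITSTATUS */
#include <unistd.h>   /* fork(), _exit() */

/* Hypothetical helper: run a child that exits with `status` and return
   the status the parent observes, or -1 if something went wrong. */
static int observed_exit_status(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;      /* fork failed */
    if (pid == 0)
        _exit(status);  /* child: terminate immediately with this status */

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;      /* waiting for the child failed */
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

A shell does the same bookkeeping for us: after running the assembled program, `echo $?` should print 5.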