Understanding Compilation in Linux/GCC

On Linux, gcc builds a program in four stages:

  1. Preprocessing stage --> gcc -E file_name -o output_file.i or gcc -E -v file_name -o output_file.i (expands #include directives and macros; -v prints the commands gcc runs)

  2. Compilation stage --> gcc -S -v file_name.i -o output_file.s (translates the preprocessed source into equivalent assembly instructions in output_file.s)

  3. Assembly stage --> gcc -c -v file_name.s -o output_file.o (generates relocatable object code). Given a .s file, the '-c' flag makes gcc run the assembler and stop before linking. This code is not directly executable.

  4. Linking stage --> gcc file_name.o -o output_file. The linker turns the relocatable object code into an executable binary.

View contents of Relocatable/Executable object code

objdump -D file_name.o

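objdump -D output_file (executable file)

The disassembly of the executable contains runtime symbols that the relocatable object file does not have:

_init -> Runs initialization code before main (global constructors, runtime state).

_start -> True entry point of the program. The kernel places argc/argv/envp on the stack; _start picks them up and then calls __libc_start_main, which in turn calls main.

_fini -> Runs at program exit and releases resources that were acquired during _init.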

Executable code = relocatable object code + runtime code (OS specific)

The linker generates the executable from the relocatable object code, adding this runtime code and resolving the addresses left open in the object file.

C program/process memory layout

The diagram below shows the memory layout of a typical C process (Linux/Unix). The process's load segments (corresponding to "text" and "data" in the diagram) are mapped at the process's base address. Exact addresses and the relative placement of the regions vary by OS and architecture.

1. The main stack is located just below the load segments and grows downwards. Each additional thread gets its own stack, located below the main stack. Function calls do not get separate stacks; they push stack frames onto the current thread's stack. The thread stacks are separated by guard pages so that an overflow of one stack is detected before it runs into the next.
2. The heap is located above the load segments and grows upwards.
3. In the middle of the process's address space, a region is reserved for shared objects.

When a new process is created, the process manager (on Linux, the kernel's program loader) first maps the two segments from the executable into memory. It then decodes the program's ELF header. If the program headers indicate that the executable was linked against a shared library, the process manager (PM) extracts the name of the dynamic interpreter from the program header (the PT_INTERP entry). The dynamic interpreter is a shared library that contains the runtime linker code. The process manager loads this shared library into memory and then passes control to the runtime linker code in it.

Systems-N-Sentience

Systems N Sentience is a technology blog that explores the foundations and evolution of modern computing systems.