Running Code from NULL Pointer in C

If you have ever programmed in C, you know you can’t dereference a NULL pointer. Consider the following program:

#include <unistd.h>
#include <sys/syscall.h>

__attribute__((section(".zerotext")))
static void foo() {
    syscall(SYS_write, 1, "Hello World!\n", 13);
    syscall(SYS_exit, 0);
}

int main() {
    void (*f)() = 0;
    f();
}

This should cause a segmentation fault, right? Let’s run it.

$ ./a.out
Hello World!

It prints “Hello World!”, or more precisely, foo is called. How is this possible? In this article, we explain the technical details of how to achieve this. First, we show how to dereference a NULL pointer, and then we create an executable that loads its code at the NULL address.

Dereferencing NULL Pointer

Dereferencing a NULL pointer is actually not difficult:

void *p = mmap(0, sysconf(_SC_PAGESIZE), PROT_READ | PROT_WRITE | PROT_EXEC,
               MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);

Usually this call fails and mmap returns MAP_FAILED, that is (void *) -1. But if you run the program with root privilege or with vm.mmap_min_addr=0, it actually returns a valid NULL address. Then fill p with opcodes that you synthesize somehow and you are ready to go. But this is not elegant; I want my program to be loaded at the NULL address in the first place. To achieve this, we need some ELF hacks.
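
To make the mmap behaviour concrete, here is a minimal sketch that tries to map page 0 and then writes and reads one byte through the returned pointer. The function name try_zero_page is hypothetical, and the sketch only does a data round trip with PROT_READ | PROT_WRITE; it does not place or execute any opcodes. The pointer comes from mmap rather than from a literal 0, so the compiler cannot assume anything about its value.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if page 0 could be mapped and a byte
 * written through the returned pointer reads back correctly, 0 if the
 * kernel refused the mapping (the usual case without privileges). */
int try_zero_page(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return 0;

    /* MAP_FIXED with address 0 asks for the mapping at address 0 exactly. */
    void *p = mmap(0, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 0;

    /* volatile keeps the store and the load from being optimized away. */
    volatile unsigned char *bytes = p;
    bytes[0] = 0x2a;
    int ok = (bytes[0] == 0x2a);

    munmap(p, (size_t)page);
    return ok;
}
```

Calling try_zero_page() from main as an ordinary user is expected to return 0 on a typical Linux configuration; with root privilege or vm.mmap_min_addr=0 it should return 1.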

Linker script

You must have noticed the __attribute__((section(".zerotext"))) above the foo function; it tells the compiler to put foo into a custom section named .zerotext. By default, code goes into the .text section, initialized globals into the .data section, and uninitialized globals into the .bss section. The linker orders these sections to form the program image. Usually you don’t have to modify this, but sometimes you do need to control the program image layout, for example on bare-metal embedded systems; libopencm3 is a good example of that.
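
As a small illustration of how the section attribute and the linker cooperate, the sketch below places three integers in a custom section and then walks over them. It assumes GNU ld (or a compatible linker), which defines __start_NAME and __stop_NAME symbols for a section whose name is a valid C identifier. The section name demo_ints and all function names are made up for this example.

```c
#include <stddef.h>

/* "demo_ints" is a hypothetical section name chosen for this sketch.
 * "used" keeps the otherwise unreferenced variables in the object file. */
#define IN_DEMO __attribute__((section("demo_ints"), used))

IN_DEMO static int demo_a = 10;
IN_DEMO static int demo_b = 20;
IN_DEMO static int demo_c = 12;

/* Assumption: the linker provides these bounds for the merged section. */
extern int __start_demo_ints[];
extern int __stop_demo_ints[];

/* Number of ints the linker gathered into the output section. */
size_t demo_count(void) {
    return (size_t)(__stop_demo_ints - __start_demo_ints);
}

/* Sum of every int in the section, wherever it was defined. */
int demo_sum(void) {
    int sum = 0;
    for (int *p = __start_demo_ints; p < __stop_demo_ints; ++p)
        sum += *p;
    return sum;
}
```

The code never names demo_a, demo_b, or demo_c when iterating; it only relies on the linker having laid them out contiguously in one output section, which is the same mechanism that collects foo into .zerotext.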

To control the program image layout, you need a linker script.

To dump the default linker script, use ld -v --verbose.

OUTPUT_FORMAT("elf64-littleaarch64", "elf64-bigaarch64",
"elf64-littleaarch64")
OUTPUT_ARCH(aarch64)
ENTRY(_start)
SEARCH_DIR("=/usr/local/lib/aarch64-linux-gnu"); SEARCH_DIR("=/lib/aarch64-linux-gnu"); SEARCH_DIR("=/usr/lib/aarch64-linux-gnu"); SEARCH_DIR("=/usr/local/lib"); SEARCH_DIR("=/lib"); SEARCH_DIR("=/usr/lib"); SEARCH_DIR("=/usr/aarch64-linux-gnu/lib");
SECTIONS
{
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }
/* ... */

The most important part is SECTIONS. The symbol . means the current address, and . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS; sets the current address to SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS, which, as you can see, is typically a little above 0x400000. Because this is the first assignment in SECTIONS, it determines the program’s lowest load address.
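
The PROVIDE line in the default script also defines the symbol __executable_start, so a normally linked program can observe this base address at run time. The following is a minimal sketch that assumes the default GNU ld script (or a linker that defines the same symbol); image_base and image_probe are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Assumption: defined by the linker script via PROVIDE(__executable_start = ...). */
extern char __executable_start[];

/* An ordinary read-only object that ends up somewhere inside the image. */
const char image_probe[] = "probe";

/* Address that the linker script assigned to the start of the image
 * (plus the load bias if the executable is position independent). */
uintptr_t image_base(void) {
    return (uintptr_t)__executable_start;
}
```

In a program linked with the default script, image_base() is nonzero and every object in the image, such as image_probe, lives at a higher address. That is exactly the property the modified script in this article breaks on purpose.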

The line .interp : { *(.interp) } creates an output section named .interp and puts all input sections named .interp into it. The point is that an executable file is linked from multiple object files: the first .interp names the section created in the executable, and the second .interp, inside the braces, selects its contents, namely the .interp sections of all those object files.

Loading our .zerotext section at address 0 is fairly easy: just put .zerotext : { *(.zerotext) } on the first line of SECTIONS, where the current address still has its initial value of 0:

SECTIONS
{
.zerotext : { *(.zerotext) }
/* Read-only sections, merged into text segment: */
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
.interp : { *(.interp) }

Compiling and Running the Program

Let’s compile our code with the linker script we created above and run it with root privilege. You will see “Hello World!” printed!

$ gcc -nostartfiles -static -fno-pie -fno-stack-protector -emain -T aarch64.ld  main.c
$ sudo ./a.out
Hello World!
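
Whether sudo is needed depends on the vm.mmap_min_addr setting mentioned earlier. As a rough sketch, the current value can be read from /proc/sys/vm/mmap_min_addr on Linux; the helper name read_mmap_min_addr is hypothetical, and it returns -1 when the file is missing or unreadable.

```c
#include <stdio.h>

/* Hypothetical helper: returns the current vm.mmap_min_addr value,
 * or -1 if it cannot be read (for example on a non-Linux system). */
long read_mmap_min_addr(void) {
    FILE *f = fopen("/proc/sys/vm/mmap_min_addr", "r");
    if (f == NULL)
        return -1;

    long value = -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

A result of 0 means an unprivileged process may map address 0, so the program above should run without sudo; any positive value means the lowest pages are reserved for privileged processes.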