Native Backend
Overview
The native backend compiles TAC IR directly to machine code bytes. No LLVM, no intermediate assembly — tape emits raw instructions into a code buffer that the binary emitters wrap in PE/ELF/Mach-O format.
Two code generators exist:
| Backend | Architecture | Targets |
|---|---|---|
| x64_codegen | x86-64 | win64, linux, uusi |
| aarch64_codegen | ARM64 | macos-arm64 |
Both produce an x64_module struct (shared output format) that the binary emitters consume.
Register strategy
Both backends use “spill everything” — every TAC virtual register lives at a fixed stack slot. Before each operation, operands are loaded into scratch registers; after, the result is stored back. This is correct-first, fast-later:
; TAC: %r5 = add %r3, %r4
mov rax, [rbp - 8*4] ; load %r3
mov rcx, [rbp - 8*5] ; load %r4
add rax, rcx
mov [rbp - 8*6], rax ; store %r5

Stack slot for TAC register N: [RBP - 8*(N+1)] (x64) or [x29 - 8*(N+1)] (ARM64).
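The load/operate/store pattern above can be sketched as a small byte emitter. This is an illustrative Python model, not tape's implementation: the helper names emit_load_slot and emit_store_slot come from this page, but their signatures here are assumptions, and the sketch always uses a 32-bit displacement for simplicity.

```python
import struct

def slot_disp(n: int) -> int:
    """Frame-pointer displacement of TAC register n: -8*(n+1)."""
    return -8 * (n + 1)

# ModR/M bytes for [rbp + disp32] with rax or rcx in the reg field.
MODRM_RAX_RBP = 0x85  # mod=10, reg=000 (rax), rm=101 (rbp)
MODRM_RCX_RBP = 0x8D  # mod=10, reg=001 (rcx), rm=101 (rbp)

def emit_load_slot(code: bytearray, modrm: int, n: int) -> None:
    # REX.W 8B /r : mov r64, [rbp + disp32]  (signature is hypothetical)
    code.extend(bytes([0x48, 0x8B, modrm]) + struct.pack("<i", slot_disp(n)))

def emit_store_slot(code: bytearray, modrm: int, n: int) -> None:
    # REX.W 89 /r : mov [rbp + disp32], r64  (signature is hypothetical)
    code.extend(bytes([0x48, 0x89, modrm]) + struct.pack("<i", slot_disp(n)))

def emit_add(code: bytearray, dst: int, lhs: int, rhs: int) -> None:
    """Spill-everything lowering of: %dst = add %lhs, %rhs."""
    emit_load_slot(code, MODRM_RAX_RBP, lhs)   # mov rax, [slot lhs]
    emit_load_slot(code, MODRM_RCX_RBP, rhs)   # mov rcx, [slot rhs]
    code.extend(bytes([0x48, 0x01, 0xC8]))     # add rax, rcx
    emit_store_slot(code, MODRM_RAX_RBP, dst)  # mov [slot dst], rax

code = bytearray()
emit_add(code, dst=5, lhs=3, rhs=4)
print(code.hex())
```

Every TAC operation costs two loads and a store in this scheme, which is the price of never having to track which value lives in which register.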
x86-64 scratch registers
| Register | Purpose |
|---|---|
| RAX | Primary scratch, return value, left operand |
| RCX | Right operand, shift count, arg passing (Win64) |
| RDX | Division, arg passing |
| R8, R9 | Arg passing |
| R11 | Extra scratch |
ARM64 scratch registers
| Register | Purpose |
|---|---|
| x0 | Primary scratch, return value, left operand |
| x1 | Right operand, arg passing |
| x2 | Extra scratch, arg passing |
| x9 | Address calculations |
| x10 | Memcpy loops |
| x16 | Indirect calls (intra-procedure-call scratch) |
Calling convention
The compiler uses the platform ABI for all calls — both tape-to-tape and tape-to-extern go through the same convention:
Win64
- Args 1–4: RCX, RDX, R8, R9 (positional — float args use XMM0–3 in same position)
- Args 5+: stack (right-to-left)
- 32-byte shadow space reserved by caller
- Return: RAX (or XMM0 for floats)
SysV (Linux, macOS x64)
- Integer args: RDI, RSI, RDX, RCX, R8, R9 (separate counter)
- Float args: XMM0–7 (separate counter)
- Overflow args: stack
- Return: RAX (or XMM0 for floats)
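The difference between Win64's positional assignment and SysV's separate integer/float counters is easiest to see side by side. The sketch below is a simplified model that handles only scalar integer and float arguments (no structs or varargs); the function names are made up for illustration.

```python
WIN64_INT = ["RCX", "RDX", "R8", "R9"]
WIN64_FLT = ["XMM0", "XMM1", "XMM2", "XMM3"]
SYSV_INT = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]
SYSV_FLT = [f"XMM{i}" for i in range(8)]

def assign_win64(kinds):
    """kinds is a list of 'int' / 'float'. One shared position counter."""
    out = []
    for pos, kind in enumerate(kinds):
        if pos < 4:
            out.append(WIN64_FLT[pos] if kind == "float" else WIN64_INT[pos])
        else:
            out.append("stack")
    return out

def assign_sysv(kinds):
    """Integer and float arguments consume registers independently."""
    out = []
    next_int = next_flt = 0
    for kind in kinds:
        if kind == "float" and next_flt < len(SYSV_FLT):
            out.append(SYSV_FLT[next_flt])
            next_flt += 1
        elif kind == "int" and next_int < len(SYSV_INT):
            out.append(SYSV_INT[next_int])
            next_int += 1
        else:
            out.append("stack")
    return out

sig = ["int", "float", "int", "float", "int"]
print(assign_win64(sig))  # ['RCX', 'XMM1', 'R8', 'XMM3', 'stack']
print(assign_sysv(sig))   # ['RDI', 'XMM0', 'RSI', 'XMM1', 'RDX']
```

The same five-argument signature spills to the stack on Win64 but fits entirely in registers under SysV.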
AAPCS64 (macOS ARM64)
- Args: x0–x7 (integer), d0–d7 (float)
- Return: x0 (or d0 for floats)
- Frame pointer: x29, Link register: x30
- SP 16-byte aligned at call sites
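One common way to keep the stack pointer 16-byte aligned at call sites is to round each function's frame size up to a multiple of 16. The sketch below assumes that approach and assumes the prologue leaves SP aligned before the frame is allocated; how tape actually computes frame_size is not stated here, and the outgoing_bytes parameter is hypothetical.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)

def frame_size(num_slots: int, outgoing_bytes: int = 0) -> int:
    """Bytes to reserve for num_slots 8-byte spill slots plus any
    outgoing-call area (for example 32 bytes of Win64 shadow space)."""
    return align_up(8 * num_slots + outgoing_bytes, 16)

print(frame_size(7))      # 56 bytes of slots, rounded up to 64
print(frame_size(6, 32))  # 48 + 32 = 80, already a multiple of 16
```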
Fixups
During code emission, forward references (branch targets, function calls, string addresses) emit placeholder bytes and record a fixup entry:
| Fixup kind | Purpose |
|---|---|
| FIXUP_REL32 | Direct relative call to internal function |
| FIXUP_BLOCK | Intra-function branch to a basic block |
| FIXUP_EXTERN_RIP | Indirect call through IAT/GOT (extern functions) |
| FIXUP_STRING_RIP | RIP-relative load of string constant address |
For ARM64, the same fixup kinds map to different encodings: BL (26-bit immediate), B/B.cond (26/19-bit), ADRP+ADD+BLR sequences.
The binary emitters resolve all fixups once final code layout is known.
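The placeholder-then-patch flow can be illustrated with a minimal model of FIXUP_REL32. The Fixup record layout, the emit_call and resolve names, and the symbol table are assumptions made for this sketch; only the fixup kind name comes from the table above.

```python
import struct
from dataclasses import dataclass

@dataclass
class Fixup:
    kind: str     # e.g. "FIXUP_REL32"
    offset: int   # where the 4 placeholder bytes start in the code buffer
    target: str   # symbolic name of the callee

def emit_call(code: bytearray, fixups: list, target: str) -> None:
    code.append(0xE8)                                    # call rel32 opcode
    fixups.append(Fixup("FIXUP_REL32", len(code), target))
    code.extend(b"\x00\x00\x00\x00")                     # placeholder

def resolve(code: bytearray, fixups: list, symbols: dict) -> None:
    for fx in fixups:
        # rel32 is measured from the end of the 4-byte field,
        # i.e. from the address of the next instruction.
        rel = symbols[fx.target] - (fx.offset + 4)
        struct.pack_into("<i", code, fx.offset, rel)

code = bytearray()
fixups = []
emit_call(code, fixups, "helper")   # forward reference, occupies bytes 0..4
code.append(0xC3)                   # ret (byte 5)
symbols = {"helper": len(code)}     # helper starts at byte 6
code.append(0xC3)                   # helper body: ret
resolve(code, fixups, symbols)
print(code.hex())                   # e801000000c3c3
```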
Instruction encoding
x86-64
Direct byte emission: REX prefixes, ModR/M, SIB, displacement, immediates. A small built-in assembler provides helpers like emit_mov_reg_imm64, emit_load_slot, emit_store_slot.
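As an illustration of what such a helper has to produce, here is a sketch of the encoding for a 64-bit immediate move (REX.W, opcode B8+rd, 8-byte immediate). The name emit_mov_reg_imm64 appears above, but this Python signature and the register numbering constants are assumptions.

```python
import struct

RAX, RCX, R11 = 0, 1, 11  # hardware register numbers

def emit_mov_reg_imm64(code: bytearray, reg: int, imm: int) -> None:
    """Encode: mov r64, imm64."""
    rex = 0x48 | (1 if reg >= 8 else 0)   # REX.W, plus REX.B for r8..r15
    code.append(rex)
    code.append(0xB8 + (reg & 7))         # opcode carries the low 3 reg bits
    code.extend(struct.pack("<Q", imm & 0xFFFFFFFFFFFFFFFF))

buf = bytearray()
emit_mov_reg_imm64(buf, RAX, 0x1122334455667788)
emit_mov_reg_imm64(buf, R11, 1)
print(buf.hex())
```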
ARM64
Fixed 32-bit instruction words. Helpers for MOVZ/MOVK (64-bit immediates), LDR/STR (scaled offset), ADD/SUB, B/BL/B.cond, STP/LDP.
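Loading a 64-bit immediate on ARM64 takes one MOVZ followed by up to three MOVKs, each filling a 16-bit chunk. The sketch below shows the standard A64 encodings; the function names and the choice to skip zero chunks are illustrative, not taken from tape's source.

```python
import struct

def movz(rd: int, imm16: int, hw: int) -> int:
    """MOVZ Xd, #imm16, LSL #(16*hw)"""
    return 0xD2800000 | (hw << 21) | (imm16 << 5) | rd

def movk(rd: int, imm16: int, hw: int) -> int:
    """MOVK Xd, #imm16, LSL #(16*hw)"""
    return 0xF2800000 | (hw << 21) | (imm16 << 5) | rd

def load_imm64(rd: int, value: int) -> list:
    """Return the instruction words that materialize value in Xd."""
    value &= 0xFFFFFFFFFFFFFFFF
    words = [movz(rd, value & 0xFFFF, 0)]      # clears the upper 48 bits
    for hw in range(1, 4):
        chunk = (value >> (16 * hw)) & 0xFFFF
        if chunk:                              # zero chunks are already set
            words.append(movk(rd, chunk, hw))
    return words

words = load_imm64(0, 0x0001_0000_0000_5678)
code = b"".join(struct.pack("<I", w) for w in words)  # little-endian words
print([hex(w) for w in words])
```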
Output formats
PE64 (Windows) — pe64_emit / pe64_emit_dll
- Image base: 0x140000000
- .text: entry stub + user code
- .rdata: import directory, IAT, hint/name table, string constants
- Entry stub: calls main(), then ExitProcess via IAT
- DLL mode: adds export directory table (sorted by name)
ELF64 Uusi — elf64_emit
- Virtual base: 0x100000000000
- Single LOAD segment (flat binary, no dynamic linker)
- Entry stub: calls main, passes result to int 0x80 exit syscall
- Static linking only
ELF64 Uusi STO — elf64_emit_sto
- Shared object (.sto file, ET_DYN)
- Exports vtable: array of function pointers (one per pub fn)
- R_X86_64_RELATIVE relocations for the DSO loader to fix up at load time
ELF64 Linux — elf64_emit_linux
- Virtual base: 0x400000
- Dynamically linked (ET_EXEC with PT_INTERP → /lib64/ld-linux-x86-64.so.2)
- GOT/PLT for extern functions (R_X86_64_JUMP_SLOT)
- Entry stub: _start → calls main, then the exit_group syscall (nr 231)
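A stub of this shape needs only four instructions. The sketch below shows one plausible byte sequence (call main, move the return value into EDI, load syscall number 231, syscall); the actual stub emitted by elf64_emit_linux may differ, and build_start_stub is a hypothetical name.

```python
import struct

def build_start_stub(main_offset: int, stub_offset: int = 0) -> bytes:
    """Build a _start stub; offsets are relative to the start of .text."""
    stub = bytearray()
    stub.append(0xE8)                              # call rel32
    rel = main_offset - (stub_offset + 5)          # relative to next instruction
    stub.extend(struct.pack("<i", rel))
    stub.extend(bytes([0x89, 0xC7]))               # mov edi, eax (exit status)
    stub.extend(b"\xB8" + struct.pack("<I", 231))  # mov eax, 231
    stub.extend(bytes([0x0F, 0x05]))               # syscall
    return bytes(stub)

# Assume main is laid out immediately after the 14-byte stub.
stub = build_start_stub(main_offset=14)
print(len(stub), stub.hex())
```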
ELF64 Object — elf64_emit_obj
- Relocatable object file (.o) for external linkers
Mach-O (macOS) — macho64_emit / macho64_emit_obj
- Text vmaddr: 0x100000000
- Page size: 16KB (0x4000)
- Segments: __TEXT (code + string constants), __LINKEDIT
- Dyld chained fixups for extern function binding
- Object mode: relocatable .o for external linking
Codegen output structure
x64_module {
code // all function code concatenated (raw bytes)
functions[] // per-function: name, code_offset, code_size, frame_size, is_export
externs[] // extern functions: name, dll/library, fn_index
strings[] // string constants: data pointer + length
fixups[] // unresolved references for emitters to patch
target // which platform this was generated for
}
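To show how an emitter might consume this structure, here is a reduced Python model covering only the code and functions fields. The field names follow the listing above; the class names, types, and the function_bytes helper are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Function:
    name: str
    code_offset: int   # start of this function inside Module.code
    code_size: int
    frame_size: int
    is_export: bool

@dataclass
class Module:
    code: bytes                      # all function code concatenated
    functions: list = field(default_factory=list)
    target: str = "linux"

def function_bytes(mod: Module, name: str) -> bytes:
    """Slice one function's machine code out of the shared buffer."""
    for fn in mod.functions:
        if fn.name == name:
            return mod.code[fn.code_offset:fn.code_offset + fn.code_size]
    raise KeyError(name)

mod = Module(
    code=bytes([0x31, 0xC0, 0xC3, 0xC3]),          # xor eax,eax; ret | ret
    functions=[
        Function("main", 0, 3, 0, True),
        Function("noop", 3, 1, 0, False),
    ],
)
print(function_bytes(mod, "main").hex())  # 31c0c3
```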