Part 08 — The Freya Stack

91 compose layout

Our docker-compose layout

The KKB host (kkbfcfreyasrv01, internal 192.168.35.197, public 185.199.89.19) runs a single docker-compose.onprem.yml. There is no Kubernetes here, no Swarm, no orchestrator beyond docker compose up -d. That is the entire deployment surface. A single git pull on freya-onprem and a docker compose up -d ships a release.

Two things matter about the network layout:

Asterisk, the 30 voice-agent containers, and coturn run with network_mode: host. RTP sockets and SIP sockets bind directly to the host's NIC. Docker's userland NAT never touches voice. This is non-negotiable: once you NAT RTP through a Docker bridge, jitter and packet loss climb, and SDP advertisements lie about the reachable IP.
Everything stateful or HTTP-only is on a docker bridge. Postgres, Valkey, MinIO, dashboard, the workers. They talk to each other by service name on the internal freya bridge network, and only the dashboard is exposed externally (via nginx + Cloudflare tunnel for kkb.freya.host).

Voice-agents are a fleet. kkb-freya-voice-agent-1 through kkb-freya-voice-agent-30 are 30 identical pipecat-agent containers, each one a concurrent-call slot. Asterisk's dialplan picks an idle one when a call lands. Thirty was sized to fit comfortably on the box's 4× H100 NVL GPUs once STT, TTS, LLM, and noise cancellation are also resident.
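The host/bridge split and the 30-slot fleet can be checked mechanically. The following is a minimal Python sketch, assuming the compose file has already been parsed into a dict (for example by a YAML loader) and that service keys mirror the container names used here. The `coturn` key and the `check_network_layout` helper are hypothetical names for illustration.

```python
from __future__ import annotations

# Sketch: verify that voice-path services use host networking.
# Assumption: compose service keys mirror container names; "coturn" is a guessed key.
VOICE_AGENT_COUNT = 30  # one container per concurrent-call slot


def expected_host_services() -> set[str]:
    """Services that must run with network_mode: host."""
    agents = {f"kkb-freya-voice-agent-{n}" for n in range(1, VOICE_AGENT_COUNT + 1)}
    return agents | {"freya-asterisk", "coturn"}


def check_network_layout(compose: dict) -> list[str]:
    """Return human-readable problems; an empty list means the layout matches."""
    services = compose.get("services", {})
    problems = []
    for name in sorted(expected_host_services()):
        svc = services.get(name)
        if svc is None:
            problems.append(f"{name}: missing from compose file")
        elif svc.get("network_mode") != "host":
            problems.append(f"{name}: network_mode is not host")
    return problems
```

Run against the parsed docker-compose.onprem.yml before a release, a non-empty result would flag a voice service that has drifted onto the bridge.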

[Diagram: KKB host architecture. Legend groups: telephony / orchestration, GPU inference, HTTP / dashboard. Labeled stores: postgres, valkey, minio (S3). Solid arrows = SIP / RTP / API calls. Dashed = bridge-internal traffic.]

[Interactive: compose service explorer.] Mock environment values are shown there for shape. Real secrets live in .env on the host, never in the compose file.

92 configs on host

Where configs live on the host

The compose file does one important trick: it bind-mounts /etc/asterisk-custom/ on the host into /etc/asterisk/ inside freya-asterisk. That way you edit Asterisk config on the host with whatever editor you like, and the container picks it up after a reload — no image rebuild, no container restart.

# on kkbfcfreyasrv01: host filesystem
/home/freya/freya-onprem/
├── docker-compose.onprem.yml    # the single compose file
├── kkb/                         # KKB-specific overrides
│   ├── docker-compose.yml       # merged with above
│   └── nginx.conf               # dashboard ingress
└── playbooks/                   # runbooks
    └── sip-asterisk-debug.md

/etc/asterisk-custom/            → mounted as /etc/asterisk in freya-asterisk
├── pjsip.conf                   # endpoints, AORs, identifies (per customer)
├── extensions.conf              # dialplan, from-trunk + from-internal contexts
├── rtp.conf                     # rtpstart=10000 rtpend=10499
├── websocket_client.conf        # chan_websocket: the bridge to voice-agent
├── logger.conf                  # SIP logger toggles
├── modules.conf                 # whitelist of loaded modules
└── ari.conf                     # REST interface for campaign-worker

/var/log/asterisk/               # also visible via `docker logs freya-asterisk`
└── full

/var/spool/asterisk/recordings/  # MixMonitor wav files before MinIO upload

The first time you SSH onto the host you should know two paths by heart: /home/freya/freya-onprem/docker-compose.onprem.yml (what runs) and /etc/asterisk-custom/pjsip.conf (what the trunks look like).

Reloading is a one-liner. Never restart the container: you would drop active calls.

Try this on KKB

$ ssh freya@192.168.35.197 "docker exec freya-asterisk asterisk -rx 'pjsip reload'"
$ ssh freya@192.168.35.197 "docker exec freya-asterisk asterisk -rx 'dialplan reload'"
$ ssh freya@192.168.35.197 "docker exec freya-asterisk asterisk -rx 'pjsip show endpoints'"

If you ever find yourself typing docker compose restart freya-asterisk, stop. Use pjsip reload, dialplan reload, or core reload instead. The container restart kills every PJSIP dialog, drops every RTP socket, and hangs up every active call mid-sentence. Live calls survive a reload; they do not survive a restart.
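One way to make the reload-not-restart rule hard to break is to route reloads through a small wrapper that only knows the hot-reload commands. This is a Python sketch under stated assumptions: the `reload_argv` helper is hypothetical, and the host and container names are the ones quoted in this section.

```python
import shlex

# Reload commands named in this section; anything else is refused.
SAFE_RELOADS = {"pjsip reload", "dialplan reload", "core reload"}


def reload_argv(cli_command: str,
                host: str = "freya@192.168.35.197",
                container: str = "freya-asterisk") -> list[str]:
    """Build the ssh argv for a hot reload; raise on anything not allowlisted."""
    if cli_command not in SAFE_RELOADS:
        raise ValueError(f"refusing {cli_command!r}: not a hot-reload command")
    # shlex.quote keeps the Asterisk CLI command as one argument on the remote shell.
    remote = f"docker exec {container} asterisk -rx {shlex.quote(cli_command)}"
    return ["ssh", host, remote]
```

The returned list can be handed to `subprocess.run(..., check=True)`. A request for anything outside the allowlist, such as a restart, fails before anything reaches the host.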

93 pjsip knobs

Common Freya PJSIP knobs

Every customer endpoint in pjsip.conf starts from the same template. Each value below is pinned deliberately: it was set in response to a specific customer incident, and reverting any of them re-introduces a bug we already paid for. Most differ from stock Asterisk defaults; a few (rewrite_contact=no, dtmf_mode=rfc4733) match the stock default but are stated explicitly in the template.

[provider-template](!)
type=endpoint
disallow=all
allow=ulaw,alaw
direct_media=no
rtp_symmetric=yes
force_rport=no
rewrite_contact=no
trust_id_inbound=yes
identify_by=ip
dtmf_mode=rfc4733
context=from-trunk

[Table: PJSIP knobs, Freya defaults vs Asterisk defaults. Columns: Setting, Ours, Default, Why we override, Driver.]

"Driver" is the customer or incident that forced the override. "Default" is what stock Asterisk PJSIP ships with.

One pattern repeats: most of these knobs disable an Asterisk feature that quietly mutates SIP headers. The reasoning is consistent — when there is a SIP proxy, an SBC, or a carrier between us and the UA, we want to pass headers through, not rewrite them. Asterisk's defaults are tuned for the era when it was the edge SBC. We are not the edge anymore; the customer's SBC is.
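Because reverting any template value re-introduces a known bug, a drift check against the pinned values is cheap insurance. This Python sketch assumes pjsip.conf uses plain `key=value` lines and `;` comments, as in the template shown earlier. The `template_drift` helper is hypothetical and is not a full Asterisk config parser (it ignores includes and template inheritance).

```python
from __future__ import annotations

import re

# Pinned values copied from the provider-template shown in this section.
PINNED = {
    "disallow": "all",
    "allow": "ulaw,alaw",
    "direct_media": "no",
    "rtp_symmetric": "yes",
    "force_rport": "no",
    "rewrite_contact": "no",
    "trust_id_inbound": "yes",
    "identify_by": "ip",
    "dtmf_mode": "rfc4733",
    "context": "from-trunk",
}

SECTION = re.compile(r"^\[([^\]]+)\]")  # matches "[name]" and "[name](!)"


def template_drift(conf_text: str,
                   section: str = "provider-template") -> dict[str, tuple[str, str | None]]:
    """Map each drifted key to (pinned value, value found or None if missing)."""
    found: dict[str, str] = {}
    current = None
    for raw in conf_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line:
            continue
        header = SECTION.match(line)
        if header:
            current = header.group(1)
            continue
        if current == section and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            found[key] = value
    return {k: (v, found.get(k)) for k, v in PINNED.items() if found.get(k) != v}
```

An empty result means the template still carries every pinned value. Any entry names the knob, the value it should have, and what was found instead.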

94 custom headers

Custom SIP headers we use

SIP allows arbitrary X-… headers on any request. We use them to pass small bits of out-of-band metadata that don't fit cleanly into the standard headers — direction, our internal call UUID, eventually a workflow ID. They survive end-to-end as long as no proxy in the path strips unknown headers (most don't).

X-Freya-Direction

Set by dialplan / campaign-worker. Disambiguates inbound vs outbound on a single trunk endpoint that handles both.

X-Freya-Direction: outbound
X-Freya-Direction: inbound

X-Freya-Call-Id

Set by campaign-worker before Originate. Carries our internal call UUID into the carrier so the BYE / 200 OK response can be correlated back without depending on Asterisk's Call-ID.

X-Freya-Call-Id: 5f3a9c1e-7b22-4d8e-b1c0-9f4e22aa01c3

X-Freya-Workflow-Id

Header carries the workflow ID into the dialplan so we can route to a specific voice-agent slot without an extra DB lookup at register_call time. Saves one RTT on the boot path.

X-Freya-Workflow-Id: wf_kkb_collections_v3

Two rules for adding custom headers — both bought with customer pain:

Header names must be ASCII. The Anadolu Sigorta A4 incident: Genesys's outbound INVITE contained header names with Turkish characters (X-Genesi̇s-… with a dotted i). PJSIP's parser silently dropped the entire INVITE: no error, no log line, just tcpdump showing packets on the wire that never reached the dialplan. ASCII names only. Values can be any UTF-8, though that is rarely needed.
If a customer's SBC strips unknown headers, you'll find out fast. Test with one header you control end-to-end before you start designing routing logic around it. With pjsip set logger on enabled, the SIP trace for both legs of the bridge shows exactly what arrived and what didn't.
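The ASCII rule is easy to enforce before a header name ever reaches a config file. This Python sketch checks a name against the RFC 3261 `token` character set, which is what a SIP header name is allowed to contain. The `is_safe_header_name` helper and the example names other than X-Freya-Direction are hypothetical.

```python
import re

# RFC 3261 "token": alphanumerics plus - . ! % * _ + ` ' ~ (all ASCII).
SIP_TOKEN = re.compile(r"[A-Za-z0-9\-.!%*_+`'~]+")


def is_safe_header_name(name: str) -> bool:
    """True if the whole name is a valid ASCII SIP token."""
    return SIP_TOKEN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("X-Freya-Direction", "X-Freya-Y\u00f6n", "X Freya"):
        verdict = "ok" if is_safe_header_name(candidate) else "REJECT"
        print(f"{verdict:6} {candidate!r}")
```

A name containing a dotted i, a combining mark, or a space fails the check, so it can be rejected at review time rather than discovered in a packet capture.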

Adding a header from the dialplan:

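; PJSIP_HEADER(add) must run on the callee channel, so use a pre-dial handler (context name is illustrative)
exten => _X.,n,Dial(PJSIP/${EXTEN}@providers,60,b(freya-add-headers^s^1(${CALL_UUID}))U(handle-answer))

[freya-add-headers]
exten => s,1,Set(PJSIP_HEADER(add,X-Freya-Direction)=outbound)
 same => n,Set(PJSIP_HEADER(add,X-Freya-Call-Id)=${ARG1})
 same => n,Return()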

95 campaign-worker flow

The campaign-worker outbound flow

Outbound calls do not start at Asterisk — they start at the dashboard. A user creates a campaign (workspace, phone list, agent ID, schedule), the campaign-worker picks initiate_call jobs off the queue, resolves the agent config, and asks Asterisk to dial via the ARI REST API. Asterisk emits the INVITE, waits for 200 OK, then bridges the answered channel onto a WebSocket leg that lands in a voice-agent container.

1. Dashboard publishes a campaign row (workspace, phone list, agent ID).
2. Campaign-worker pulls initiate_call jobs from the queue.
3. Campaign-worker sends an HTTP POST to dashboard /api/v2/agent/resolve, which returns workflow + caller-ID.
4. Campaign-worker sends an HTTP POST to Asterisk ARI /channels with Originate.
5. Asterisk sends INVITE to the trunk; on 200 OK, dialplan kicks Dial(WebSocket/ai_media).
6. Voice-agent boot manifest hits Valkey, and audio media starts flowing.
7. On hangup, hangup-handler runs, the recording uploads to MinIO, and the dashboard is notified.
8. Campaign-worker updates the call row and schedules retries if needed.

[Diagram: outbound originate ladder, campaign-worker → Asterisk → trunk → handset]

Three legs glued together: REST (campaign-worker → ARI), SIP (Asterisk → trunk → handset), and WebSocket (Asterisk → voice-agent). The bridge happens after answer.
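The REST leg can be sketched as a single ARI request. The Python below only builds the request and does not send it. It assumes ARI is reachable on Asterisk's HTTP port (8088 is the stock default, not confirmed for this host) and that the answered channel enters the dialplan at a context and extension chosen by the caller. The `build_originate` helper, the `from-internal` default, and the `CALL_UUID` channel variable wiring are illustrative assumptions, not the campaign-worker's actual code.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_originate(base_url: str, user: str, password: str,
                    number: str, caller_id: str, call_uuid: str,
                    trunk: str = "providers",
                    context: str = "from-internal") -> Request:
    """Build (but do not send) a POST /ari/channels originate request."""
    query = urlencode({
        "endpoint": f"PJSIP/{number}@{trunk}",  # leg dialed toward the trunk
        "extension": number,                     # where the answered channel enters the dialplan
        "context": context,
        "priority": 1,
        "callerId": caller_id,
        "timeout": 60,
    })
    # Channel variables travel in the JSON body under "variables".
    body = json.dumps({"variables": {"CALL_UUID": call_uuid}}).encode()
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        f"{base_url}/ari/channels?{query}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {token}"},
    )
```

Passing the result to `urllib.request.urlopen` would place the call. Keeping construction separate from sending makes the request easy to log and to unit-test.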

The worker currently fires every pending job in one tick — no rate limiting, no per-caller-ID concurrency cap. Per the KKB 2026-04-24 outbound concurrency analysis, the next concrete change is a per-caller-ID concurrency cap and a small inter-INVITE gap (a few hundred ms) so we interact more politely with carriers like Verimor that throttle bursts. Verimor's 603 / Q.850 41 responses on five specific destinations during the 24 April test are stable carrier-side blocks (whitelist or capacity), but the SIP-level retry storm wasn't helping us tell those apart from transient congestion.
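The proposed change, a per-caller-ID concurrency cap plus a small inter-INVITE gap, can be sketched as a pure planning function. This is a Python illustration only: the `Job` type, the `plan_tick` helper, and the default cap of 2 and gap of 300 ms are assumptions chosen for the example, not values taken from the analysis.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    destination: str
    caller_id: str


def plan_tick(pending, in_flight, cap_per_caller_id=2, gap_ms=300):
    """Split pending jobs into (send_offset_ms, job) pairs and deferred jobs.

    in_flight maps caller_id -> number of calls currently up.
    """
    active = Counter(in_flight)
    scheduled, deferred = [], []
    for job in pending:
        if active[job.caller_id] >= cap_per_caller_id:
            deferred.append(job)  # over the cap: leave for a later tick
            continue
        active[job.caller_id] += 1
        # Space INVITEs out instead of firing them all at once.
        scheduled.append((len(scheduled) * gap_ms, job))
    return scheduled, deferred
```

Deferred jobs simply stay on the queue, so the carrier sees a steady trickle per caller ID instead of a burst.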

One bug to remember from that same day: a campaign run dialed every destination with a stray 94350 prefix instead of +90. Zero connections: every call in the 16:31 run returned 404. The fix is in the campaign upload path, but the deeper lesson is that the dashboard should surface SIP response codes and Q.850 cause inline so the operator notices the dialing pattern is wrong without needing shell access to the box.
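A pre-flight check at upload time would have caught the bad prefix before the first INVITE. The Python sketch below assumes every destination in a KKB campaign is a Turkish number in E.164 form (+90 followed by 10 digits). That assumption, the `preflight` helper, and the five-character prefix summary are illustrative, not a description of the actual fix.

```python
import re
from collections import Counter

# Assumption: destinations are Turkish E.164 numbers, +90 plus 10 digits.
TR_E164 = re.compile(r"\+90\d{10}")


def preflight(destinations):
    """Return (bad_numbers, most_common_bad_prefix_or_None)."""
    bad = [d for d in destinations if not TR_E164.fullmatch(d)]
    if not bad:
        return bad, None
    # A single dominant prefix among the rejects points at a systematic upload bug.
    prefix, _count = Counter(d[:5] for d in bad).most_common(1)[0]
    return bad, prefix
```

If most of a list is rejected and the rejects share one prefix, the upload can be refused with that prefix shown to the operator.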

Checkpoint

You SSH into KKB, edit /etc/asterisk-custom/pjsip.conf to add a new trunk, and want the change to take effect without dropping the 14 active calls. What do you run, and what do you not run?

Answer

Run docker exec freya-asterisk asterisk -rx 'pjsip reload'. Do not run docker compose restart freya-asterisk or docker restart freya-asterisk — that kills every PJSIP dialog and hangs up all 14 calls mid-conversation. PJSIP reload is hot: it adds the new endpoint without disturbing existing dialogs. Confirm with pjsip show endpoints.