High-scale cleartext RTP generation¶
Gossipper can drive thousands of parallel cleartext synthetic RTP streams toward a single SBC or media gateway (for example 10,000 PCMU streams at 20 ms packetization) without the per-stream goroutine and ticker model used by the default media layer.
This document describes the problem, the recommended approach, how to use -media_scale, and how the implementation fits into the codebase.
For a short operational cheat sheet, see also media-scale.md.
Summary¶
| Topic | Recommendation |
|---|---|
| Goal | Send-only cleartext RTP load (synthetic silence) to one or a few peers |
| Enable | -media_scale |
| Standalone | -rtp_send -media_scale -rtp_streams N |
| Scenarios | exec rtp_stream="synthetic,…" with -media_scale |
| Not supported in scale mode | SRTP/DTLS, PCAP/mic/echo, per-packet HEP RTP, RTCP/recv loops |
| Kernel bypass (XDP/eBPF) | Not used for this path; userspace scheduler + batched UDP send is sufficient for ~500k pps |
Problem: default media model¶
Each active RTP flow in the classic Session path typically has:
- One net.ListenUDP socket bound to [media_port] (per call: localPort + 2 + (callNumber-1)*2 in engine.go);
- A dedicated streamLoop goroutine with its own time.Ticker (default 20 ms);
- One WriteTo syscall per packet;
- In full dialog mode: additional goroutines for RTP receive, RTCP send, and RTCP receive (roughly 3–6 goroutines per stream).
At 10,000 streams and 50 packets per second per stream (20 ms PCMU):
| Metric | Approximate value |
|---|---|
| Aggregate packet rate | ~500,000 UDP packets/s |
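| Packet size and bandwidth | 12 B RTP header + 160 B PCMU = 172 B per packet; ≈ 80–100 kbps per stream with UDP/IP/Ethernet overhead, i.e. ≈ 0.8–1 Gbps aggregate |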
| Goroutines (classic path) | ~30k–60k (streams + SIP) |
| Syscalls | ~500k WriteTo/s without batching |
With a 10 GbE or faster NIC, bandwidth is not the bottleneck. Scheduling overhead (tickers, goroutines, mutex traffic in Session) and syscall rate are.
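The aggregate figures follow from simple arithmetic. The sketch below is illustrative only: loadEstimate is a hypothetical helper, not part of gossipper, and the 28-byte UDP/IPv4 overhead assumes IPv4 without options.

```go
package main

import "fmt"

// loadEstimate returns the aggregate packet rate (packets/s) and bit rate
// (bits/s) for a number of constant-rate RTP streams. bytesPerPacket should
// include whichever headers you want to account for. The per-stream rate
// uses integer division, so it is exact only when ptimeMs divides 1000.
func loadEstimate(streams, ptimeMs, bytesPerPacket int) (pps, bps int) {
	perStreamPPS := 1000 / ptimeMs // 20 ms packetization gives 50 packets/s
	pps = streams * perStreamPPS
	bps = pps * bytesPerPacket * 8
	return pps, bps
}

func main() {
	const (
		rtpHeader = 12  // fixed RTP header, no CSRC list or extensions
		payload   = 160 // 20 ms of PCMU at 8 kHz
		udpIPv4   = 28  // assumption: 8 B UDP + 20 B IPv4 header
	)
	pps, rtpBps := loadEstimate(10000, 20, rtpHeader+payload)
	_, ipBps := loadEstimate(10000, 20, rtpHeader+payload+udpIPv4)
	fmt.Printf("packets/s: %d\n", pps)
	fmt.Printf("RTP level: %d Mbit/s\n", rtpBps/1_000_000)
	fmt.Printf("IP level:  %d Mbit/s\n", ipBps/1_000_000)
}
```

Ethernet framing adds further per-packet overhead, so plan link capacity with some margin above the IP-level figure.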
Why not XDP / AF_XDP / eBPF first?¶
These technologies excel at ingress processing (filtering, redirect, mirroring at line rate) or a small number of very high-rate flows. They are a poor first fit for generating thousands of independent RTP flows from userspace:
| Technology | Strong fit | Weak fit for gossipper send-only load |
|---|---|---|
| XDP redirect | Ingress filter, DDoS, mirror to userspace | RTP state (SSRC, seq, timestamp) lives naturally in the generator |
| AF_XDP | Few flows, zero-copy RX/TX rings | 10k distinct 5-tuples and local ports complicate UMEM and binding |
| eBPF | Per-flow stats, drop rules | Does not remove userspace RTP state |
| DPDK | Sustained multi‑Mpps | Overkill for ~0.5 Mpps cleartext |
XDP may become relevant later if you need to mirror many inbound RTP streams to Homer without userspace receive—not for cleartext load generation.
Solution: ScaleEngine¶
Implementation: internal/media/scale_engine.go.
flowchart LR
calls[engine executeCall]
reg[StreamRegistry]
sched[Min-heap scheduler 1ms tick]
pool[Sender worker pool]
udp[UDP socket per local port]
calls -->|RegisterStream| reg
reg --> sched
sched -->|due batches| pool
pool -->|batched UDP send| udp
Components¶
- scaleStream: compact per-flow state (local/remote endpoints, prebuilt RTP template for synthetic silence, sequence/timestamp, next send time, *net.UDPConn).
- Scheduler: one goroutine, 1 ms tick, min-heap ordered by nextSendAt; no per-stream tickers.
- Sender pool: worker goroutines drain a queue; packets grouped by socket; buffers from sync.Pool; Linux path batches via grouped WriteTo (structure allows future sendmmsg).
- Scale mode constraints: no rtpReceiveLoop, rtcpLoop, or HEP observer on the hot path; synthetic cleartext only.
Expected effects¶
- Goroutine count O(workers) instead of O(streams).
- Syscall pressure reduced by batching sends per socket.
- Memory churn reduced via buffer pool and in-place RTP header patching (sequence, timestamp).
CLI and configuration¶
| Flag | Description |
|---|---|
| -media_scale | Enable ScaleEngine for eligible RTP actions |
| -rtp_streams N | With -rtp_send -media_scale, start N parallel streams (local ports increment by 2) |
Validation:
- -media_scale cannot be combined with -media_srtp.
- -rtp_streams > 1 requires -media_scale.
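The two validation rules can be expressed as a small check. This is only a sketch of the rules as stated; scaleFlags and validateScaleFlags are hypothetical names and the error wording is an assumption, not the messages produced by the real CLI parser.

```go
package main

import (
	"errors"
	"fmt"
)

// scaleFlags is a hypothetical subset of the parsed CLI configuration.
type scaleFlags struct {
	MediaScale bool // -media_scale
	MediaSRTP  bool // -media_srtp
	RTPStreams int  // -rtp_streams
}

// validateScaleFlags applies the two rules listed above and returns the
// first violation it finds, or nil if the combination is allowed.
func validateScaleFlags(f scaleFlags) error {
	if f.MediaScale && f.MediaSRTP {
		return errors.New("-media_scale cannot be combined with -media_srtp")
	}
	if f.RTPStreams > 1 && !f.MediaScale {
		return errors.New("-rtp_streams > 1 requires -media_scale")
	}
	return nil
}

func main() {
	cases := []scaleFlags{
		{MediaScale: true, RTPStreams: 1000},
		{MediaScale: true, MediaSRTP: true},
		{RTPStreams: 8},
	}
	for _, c := range cases {
		fmt.Printf("%+v -> %v\n", c, validateScaleFlags(c))
	}
}
```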
Engine wiring: internal/engine/engine_scale.go, internal/engine/actions_exec.go (rtp_stream synthetic + scale), internal/launcher/rtp_sender_scale.go (standalone).
Built-in scenario: invite_media_scale¶
The standard SIP + RTP path for thousands of calls is the built-in scenario -sn invite_media_scale (same XML as testdata/scenarios/uac_invite_media_scale.xml).
- INVITE with SDP (m=audio [media_port] RTP/AVP 0)
- On 200 OK: exec rtp_stream="synthetic,0,0,PCMU/8000,20" (ScaleEngine)
- 30 s media hold (pause), then rtp_stream stop and BYE
-media_scale is enabled automatically when you use -sn invite_media_scale (or any scenario whose XML name is invite_media_scale). For -sf you must still pass -media_scale explicitly.
Example toward one SBC:
gossipper sipp -sn invite_media_scale \
-rsa 10.0.0.5:5060 -i 10.0.0.1 -p 5060 \
-m 10000 -l 10000 -r 20 -t u1
Tune -m, -l, -r, and the scenario pause for your lab (shorter pause = faster call churn).
Usage examples¶
Standalone generator (no SIP)¶
# 1000 PCMU streams to one SBC
gossipper sipp -rtp_send -media_scale -rtp_streams 1000 \
-rtp_addr 10.0.0.5:30000 \
-rtp_codec PCMU/8000 \
-rtp_freq 20 \
-i 10.0.0.1
Optional duration:
gossipper sipp -rtp_send -media_scale -rtp_streams 10000 \
-rtp_addr 10.0.0.5:30000 -rtp_codec PCMU/8000 -rtp_freq 20 \
-rtp_dur 300000 -i 10.0.0.1
Inside a SIP scenario¶
Scenario action:
<exec rtp_stream="synthetic,0,0,PCMU/8000,20"/>
Run gossipper with -media_scale. The remote RTP address is still parsed from SDP (ParseAudioEndpoint). Local bind uses [media_port] as in normal UAC scenarios.
Pause / resume / stop map to ScaleEngine.PauseCall, ResumeCall, and UnregisterCall for the call ID.
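For readers unfamiliar with how a remote RTP address is derived from an SDP answer, the following sketch shows one way to do it. It is not the project's ParseAudioEndpoint: audioEndpoint is a hypothetical helper that takes the first m=audio line and prefers a media-level c= line over the session-level one.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"strconv"
	"strings"
)

// audioEndpoint returns "host:port" for the first m=audio section of an SDP
// body. Hypothetical helper for illustration only.
func audioEndpoint(sdp string) (string, error) {
	var sessionAddr, mediaAddr string
	port := -1
	seenMedia, inAudio := false, false

	sc := bufio.NewScanner(strings.NewReader(sdp))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case strings.HasPrefix(line, "m="):
			seenMedia = true
			f := strings.Fields(line[2:]) // e.g. "audio 30000 RTP/AVP 0"
			inAudio = port < 0 && len(f) >= 2 && f[0] == "audio"
			if inAudio {
				portStr, _, _ := strings.Cut(f[1], "/") // tolerate "port/count"
				p, err := strconv.Atoi(portStr)
				if err != nil {
					return "", fmt.Errorf("bad audio port %q: %w", f[1], err)
				}
				port = p
			}
		case strings.HasPrefix(line, "c="):
			f := strings.Fields(line[2:]) // e.g. "IN IP4 10.0.0.5"
			if len(f) != 3 {
				continue
			}
			if !seenMedia {
				sessionAddr = f[2] // session-level default
			} else if inAudio {
				mediaAddr = f[2] // media-level override for the audio section
			}
		}
	}
	addr := mediaAddr
	if addr == "" {
		addr = sessionAddr
	}
	if port < 0 || addr == "" {
		return "", errors.New("no usable audio endpoint in SDP")
	}
	return net.JoinHostPort(addr, strconv.Itoa(port)), nil
}

func main() {
	sdp := "v=0\r\n" +
		"o=- 1 1 IN IP4 10.0.0.5\r\n" +
		"s=-\r\n" +
		"c=IN IP4 10.0.0.5\r\n" +
		"t=0 0\r\n" +
		"m=audio 30000 RTP/AVP 0\r\n" +
		"a=rtpmap:0 PCMU/8000\r\n"
	ep, err := audioEndpoint(sdp)
	fmt.Println(ep, err)
}
```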
OS and host tuning¶
Before runs at 5k–10k streams:
1. File descriptors¶
One UDP socket per stream plus SIP sockets:
ulimit -n 20000
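A quick pre-flight check can compare the descriptor budget with the current soft limit. The sketch below assumes Linux (where syscall.Rlimit fields are uint64); requiredFDs is a hypothetical helper and the headroom of 256 descriptors is an arbitrary assumption, not a gossipper default.

```go
package main

import (
	"fmt"
	"syscall"
)

// requiredFDs estimates the descriptors a run needs: one UDP socket per
// stream, the SIP sockets, and headroom for stdio, logs, and the runtime.
func requiredFDs(streams, sipSockets, headroom int) uint64 {
	return uint64(streams + sipSockets + headroom)
}

func main() {
	need := requiredFDs(10000, 1, 256)

	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		fmt.Println("getrlimit:", err)
		return
	}
	fmt.Printf("need ~%d descriptors, soft limit %d, hard limit %d\n", need, lim.Cur, lim.Max)
	if lim.Cur < need {
		fmt.Println("soft limit too low; raise it with ulimit -n before starting")
	}
}
```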
2. UDP sysctl¶
Increase kernel buffer limits (same order of magnitude as Homer):
sudo scripts/tune-udp-sysctl.sh
# persistent: sudo cp examples/sysctl/gossipper-high-scale.conf /etc/sysctl.d/99-gossipper.conf && sudo sysctl --system
ulimit -n 1048576
Or manually:
sudo sysctl -w net.core.rmem_max=33554432
sudo sysctl -w net.core.wmem_max=33554432
sudo sysctl -w net.core.wmem_default=4194304
sudo sysctl -w net.core.netdev_max_backlog=250000
sudo sysctl -w net.ipv4.udp_mem="262144 524288 1048576"
sudo sysctl -w fs.file-max=2097152
Media scale sockets also set large SO_SNDBUF / SO_RCVBUF in scale_socket.go.
3. CPU¶
Set GOMAXPROCS to the number of physical cores on the generator host.
Acceptance criteria (10k / single SBC)¶
Use these checks in the lab before production load tests:
- 10,000 concurrent synthetic PCMU streams, 20 ms interval, stable for ≥ 5 minutes.
- media.rtp_packets_sent in summary/JSON ≈ streams × (1000 / freq_ms) × duration_seconds (within ~±1%).
- Goroutines stay < 500 (not tens of thousands).
- CPU on generator: target < 4–8 cores at ~500k pps cleartext (measure on your hardware).
- No sustained sendto errors on the generator; SBC should not show mass ICMP port unreachable.
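The packet-count criterion is easy to automate once the summary value is available. The sketch below applies the formula streams × (1000 / freq_ms) × duration_seconds with a ±1% tolerance; expectedPackets and withinTolerance are hypothetical helpers, and the sample counter values are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// expectedPackets applies streams * (1000 / freq_ms) * duration_seconds.
func expectedPackets(streams int, freqMs, durationSec float64) float64 {
	return float64(streams) * (1000 / freqMs) * durationSec
}

// withinTolerance reports whether sent deviates from expected by at most
// the given fraction (0.01 for one percent).
func withinTolerance(sent uint64, expected, tol float64) bool {
	return math.Abs(float64(sent)-expected) <= tol*expected
}

func main() {
	want := expectedPackets(10000, 20, 300) // 10k streams, 20 ms, 5-minute run
	// Illustrative counter values, not measurements.
	for _, sent := range []uint64{150_000_000, 149_000_000, 147_000_000} {
		fmt.Printf("sent=%d expected=%.0f ok=%v\n", sent, want, withinTolerance(sent, want, 0.01))
	}
}
```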
What scale mode does not do¶
- SRTP / DTLS-SRTP: use the normal Session path with -media_srtp.
- PCAP replay, microphone, echo, rtpcheck: unchanged classic media paths only.
- Per-packet HEP RTP: automatically off when -media_scale is set (SendMediaReport is not passed to HEP for scale runs). Use SIP HEP (-hep_addr) only if you need signaling capture.
- RTCP: not sent in scale mode. A future optional aggregated RTCP worker could add it, to be enabled only if your SBC strictly requires RTCP.
Testing in the repository¶
| Test / bench | Location |
|---|---|
| Unit: register, send, pause/resume | internal/media/scale_engine_test.go |
| Soak: 100 / 1000 streams | internal/media/scale_engine_soak_test.go |
| Benchmark | internal/media/scale_engine_bench_test.go |
| Engine integration | internal/engine/actions_exec_scale_test.go |
| CLI validation | internal/cli/config_test.go (TestParseMediaScale*) |
Run:
go test ./internal/media/... -run TestScale
go test ./internal/media/... -run TestScaleEngineSoak -timeout 120s
go test ./internal/engine/... -run TestApplyExecRTPStreamScale
Risks and mitigations¶
| Risk | Mitigation |
|---|---|
| SBC session/port limits | Ramp 1k → 5k → 10k; watch SBC counters |
| Port range / clampMediaPort | Verify media port span for max concurrent calls |
| localhost port exhaustion | Bind real -i / media IP, not only loopback |
| Generator FD limit | ulimit -n before large runs |
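To verify the media port span before a run, the per-call formula quoted in the problem description (localPort + 2 + (callNumber-1)*2) can be evaluated for the first and last call. This sketch does not model clampMediaPort, whose behavior is not described here; mediaPort and portSpan are hypothetical helpers.

```go
package main

import "fmt"

// mediaPort evaluates localPort + 2 + (callNumber-1)*2, with callNumber
// starting at 1.
func mediaPort(localPort, callNumber int) int {
	return localPort + 2 + (callNumber-1)*2
}

// portSpan returns the first and last media port for maxCalls concurrent
// calls and whether the whole range fits in the valid UDP port space.
func portSpan(localPort, maxCalls int) (first, last int, ok bool) {
	first = mediaPort(localPort, 1)
	last = mediaPort(localPort, maxCalls)
	return first, last, maxCalls >= 1 && first >= 1 && last <= 65535
}

func main() {
	for _, calls := range []int{10000, 40000} {
		first, last, ok := portSpan(5060, calls)
		fmt.Printf("calls=%d ports=%d..%d fits=%v\n", calls, first, last, ok)
	}
}
```

A range that fits numerically can still collide with ports already in use on the host, so treat a passing check as necessary rather than sufficient.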
Performance options (Linux)¶
| Flag / env | Effect |
|---|---|
| -media_scale | Central scheduler, per-stream UDP socket, batched send via sendmmsg (golang.org/x/net/ipv4.WriteBatch) |
| -media_iouring or GOSSIPPER_MEDIA_IOURING=1 | Send batches inline from the scheduler (no sender worker queue); lower goroutine count |
| -t u1 | Single UDP socket for SIP (recommended for 10k calls) |
| GOMAXPROCS | Set to physical core count on the generator |
Future work (only if scale mode is insufficient)¶
- XDP only for ingress mirror to Homer, not for RTP generation.
- Kernel io_uring IORING_OP_SENDMSG pool (not wired yet; -media_iouring is userspace direct-batch mode today).
See media-roadmap.md for SRTP/ICE/WebRTC milestones (separate from this load-generator path).