Before Baby Monitor Timmy can transmit audio and video, two devices must find each other and establish mutual trust. This process — pairing — is the most security-critical moment in the entire workflow. In this article, we explain step by step how pairing works, the cryptographic protocol behind it, and why a nearby attacker cannot silently hijack the connection.
The problem: How does my device know who it's talking to?
When two devices connect for the first time, they face a fundamental question: How does Device A know it's really talking to Device B — and not to an attacker who has inserted themselves in between? This problem is known in cryptography as a man-in-the-middle attack (MITM).
Timmy solves this through an Elliptic Curve Diffie-Hellman (ECDH) key exchange over Firebase, combined with visual verification by the user.
The following diagram shows the complete pairing flow at a glance:
sequenceDiagram
autonumber
participant A as 📱 Device A
participant F as ☁️ Firebase
participant B as 📱 Device B
Note over A,B: Phase 1 — Discovery
A->>A: Generate ECDH key pair (P-256)
B->>B: Generate ECDH key pair (P-256)
alt Auto-Pairing (Nearby BLE)
A-->>B: BLE broadcast: SBM:XKQM
B-->>A: BLE broadcast: SBM:R7NP
Note over A,B: Lower code wins → determines creator/joiner
else Manual Pairing
A->>A: Display 4-char code
Note right of A: User reads code
B->>B: User enters code
end
Note over A,B: Phase 2 — ECDH Key Exchange
A->>F: Write public key (PubA) to meeting doc
B->>F: Write public key (PubB) to meeting doc
F-->>B: Read PubA
F-->>A: Read PubB
Note over A,B: Phase 3 — Shared Secret
A->>A: sharedSecret = ECDH(privA, PubB)
B->>B: sharedSecret = ECDH(privB, PubA)
Note over A,B: Both compute identical 32-byte secret
A->>A: SAS = SHA-256("sas:" + sort(PubA,PubB) + secret) → 2-digit number
B->>B: SAS = SHA-256("sas:" + sort(PubA,PubB) + secret) → 2-digit number
Note over A,B: Phase 4 — Visual Verification
A->>A: Display SAS: 42
B->>B: Display SAS: 42
Note over A,B: 👤 User compares numbers on both screens
A->>A: User confirms ✓
B->>B: User confirms ✓
Note over A,B: Phase 5 — Key Derivation
A->>A: pairingKey = SHA-256("pair:" + secret)
B->>B: pairingKey = SHA-256("pair:" + secret)
A->>A: docKey = SHA-256("doc:" + pairingKey)
A->>A: encKey = SHA-256("enc:" + pairingKey)
Note over A,B: ✅ Paired — all future signaling encrypted with AES-256-GCM
Complete pairing protocol sequence — editable source: docs/diagrams/pairing-sequence.mmd
Step 1: Each device generates a key pair
When opening the pairing screen, each device generates an ephemeral ECDH key pair on the P-256 curve (secp256r1):
- A private key — stays exclusively on the device
- A public key — exchanged over Firebase
The keys are created using a cryptographically secure random number generator (Random.secure()) and are valid only for this single pairing attempt. Fresh keys are generated for every new attempt.
Step 2: Exchange public keys over Firebase
Both devices need a way to find each other. Timmy uses a short 4-character meeting code for this — it has no cryptographic value and serves purely as a routing address in Firebase Firestore.
In auto-pairing mode, Nearby Connections (Google) discovers nearby devices via Bluetooth Low Energy (BLE) and transmits the meeting code automatically. In manual pairing mode, the user reads the code from one screen and types it on the other device. Either way, both devices end up at the same Firestore document.
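The pairing diagram notes that in auto-pairing the lower code wins and decides who is creator and who is joiner. A minimal sketch of such a tie-break follows. It assumes plain lexicographic string comparison and assumes that the device with the lower code becomes the creator; neither detail is confirmed here, and `elect_roles` is a hypothetical helper name.

```python
def elect_roles(code_a: str, code_b: str) -> dict:
    """Pick creator and joiner from two broadcast meeting codes.

    Assumption: codes are compared as plain strings and the lower one
    becomes the creator. Both devices run the same rule on the same two
    codes, so they reach the same answer without extra messages.
    """
    if code_a == code_b:
        raise ValueError("identical codes cannot break the tie")
    creator, joiner = sorted([code_a, code_b])
    return {"creator": creator, "joiner": joiner}


# The two sample codes from the diagram, without the "SBM:" prefix.
roles = elect_roles("XKQM", "R7NP")
print(roles)
```

Because the rule is symmetric, the argument order does not matter: both devices derive the same roles from the same pair of codes.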
Once both devices share the meeting code, each one writes its ECDH public key to the corresponding Firestore document (compressed, 33 bytes as base64url). The other device reads it from there.
Crucially: Only the public key is sent. The private key never leaves the device. An attacker who reads the Firestore document sees only public keys — and cannot compute the shared secret from them. This relies on the mathematical difficulty of the Elliptic Curve Discrete Logarithm Problem (ECDLP).
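To make the "33 bytes as base64url" figure concrete, here is a small sketch of compressed point encoding. It uses the standard P-256 base point as a sample value only. The helper names are hypothetical, and stripping the base64 padding is an assumption about the wire format.

```python
import base64

# P-256 base point G, a public curve constant, used here only as sample data.
GX = 0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296
GY = 0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5


def compress_point(x: int, y: int) -> bytes:
    """Compressed form: one prefix byte (0x02 even y, 0x03 odd y) + 32-byte x."""
    prefix = b"\x03" if y & 1 else b"\x02"
    return prefix + x.to_bytes(32, "big")


def encode_public_key(x: int, y: int) -> str:
    """base64url text of the compressed point (padding stripped: an assumption)."""
    raw = compress_point(x, y)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


compressed = compress_point(GX, GY)
encoded = encode_public_key(GX, GY)
print(len(compressed), len(encoded))  # 33 bytes become 44 characters
```

The receiver can recover the full point from the x coordinate and the parity byte, which is why the y coordinate does not need to be transmitted.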
Step 3: Computing the shared secret
Once both devices have discovered each other's public key, they independently compute the same shared secret:
sharedSecret = ECDH(myPrivateKey, remotePublicKey)
→ 32 bytes (identical on both devices)
The mathematics of elliptic curves guarantees that both computations yield the same result, even though each device only knows its own private key and the other's public key.
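The following sketch illustrates why both sides arrive at the same value. It implements textbook P-256 arithmetic in pure Python for readability. It is not constant-time and is not the app's implementation; a real client would call a vetted cryptography library. The function names are hypothetical.

```python
import secrets

# NIST P-256 (secp256r1) domain parameters; these are public constants.
P = 2**256 - 2**224 + 2**192 + 2**96 - 1
A = P - 3
B = 0x5AC635D8AA3A93E7B3EBBD55769886BC651D06B0CC53B0F63BCE3C3E27D2604B
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
G = (
    0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296,
    0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5,
)


def is_on_curve(point):
    """Check y^2 = x^3 + A*x + B (mod P); None stands for the point at infinity."""
    if point is None:
        return True
    x, y = point
    return (y * y - (x * x * x + A * x + B)) % P == 0


def point_add(p1, p2):
    """Add two curve points using the affine chord-and-tangent formulas."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its negative is the point at infinity
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Compute k * point with double-and-add (not constant-time)."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_key_pair():
    """Return (private scalar in 1..N-1, public point)."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key, G)


def ecdh(my_private_key, remote_public_key):
    """Return the 32-byte x coordinate of my_private_key * remote_public_key."""
    if remote_public_key is None or not is_on_curve(remote_public_key):
        raise ValueError("remote public key is not a valid curve point")
    shared_point = scalar_mult(my_private_key, remote_public_key)
    return shared_point[0].to_bytes(32, "big")


priv_a, pub_a = generate_key_pair()
priv_b, pub_b = generate_key_pair()
secret_a = ecdh(priv_a, pub_b)
secret_b = ecdh(priv_b, pub_a)
print(secret_a == secret_b, len(secret_a))
```

Both calls compute the same point, because multiplying G by privA and then by privB gives the same result as multiplying in the opposite order. The on-curve check in `ecdh` rejects malformed remote keys before any secret-dependent computation happens.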
Step 4: The verification number (SAS)
From the shared secret, a Short Authentication String (SAS) is derived — a two-digit number displayed on both devices:
hash = SHA-256("sas:" + sort(pubkeyA, pubkeyB) + sharedSecret)
number = (hash[0] × 256 + hash[1]) mod 100 → 00 to 99
Both devices display the same number — for example, 42. The user visually compares whether the numbers on both screens match, then confirms on each device individually.
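The two formulas above translate almost directly into code. This sketch assumes that `sort` orders the two encoded public keys bytewise and that the inputs are joined by plain concatenation; the exact byte layout is not specified here, and `derive_sas` is a hypothetical name. The sample keys and secret are placeholders, not real key material.

```python
import hashlib


def derive_sas(pub_key_a: bytes, pub_key_b: bytes, shared_secret: bytes) -> str:
    """Derive the two-digit verification number shown on both screens."""
    # Sorting makes the result independent of which device is "A" or "B".
    first, second = sorted([pub_key_a, pub_key_b])
    digest = hashlib.sha256(b"sas:" + first + second + shared_secret).digest()
    number = (digest[0] * 256 + digest[1]) % 100
    return f"{number:02d}"  # zero-padded, so 7 is displayed as "07"


# Placeholder inputs: 33-byte stand-ins for compressed keys, 32-byte secret.
sample_pub_a = b"\x02" + bytes(range(32))
sample_pub_b = b"\x03" + bytes(range(32, 64))
sample_secret = bytes(32)

sas_on_a = derive_sas(sample_pub_a, sample_pub_b, sample_secret)
sas_on_b = derive_sas(sample_pub_b, sample_pub_a, sample_secret)
print(sas_on_a == sas_on_b)
```

Since 65536 is not a multiple of 100, the values 00 to 35 are very slightly more likely than 36 to 99 (656 versus 655 of the 65536 possible two-byte prefixes). The effect is negligible for a two-digit check.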
Why an attacker cannot forge this
A man-in-the-middle would need to intercept the key exchange in Firebase. Specifically, they would need to:
- Replace the real public keys stored in the Firestore document with their own
- Establish separate shared secrets with each device
sequenceDiagram
autonumber
participant A as 📱 Device A
participant M as 🕵️ Attacker (MITM)
participant B as 📱 Device B
Note over A,B: Attacker intercepts the Firebase key exchange
A->>A: Generate key pair (privA, PubA)
B->>B: Generate key pair (privB, PubB)
M->>M: Generate TWO key pairs (privM1, PubM1) + (privM2, PubM2)
A->>M: Write PubA to Firebase
M->>M: Replace PubA with PubM1
M->>B: B reads PubM1 (thinks it is PubA)
B->>M: Write PubB to Firebase
M->>M: Replace PubB with PubM2
M->>A: A reads PubM2 (thinks it is PubB)
Note over A,B: Each device computes a DIFFERENT shared secret
A->>A: secret_A = ECDH(privA, PubM2)
M->>M: secret_A = ECDH(privM2, PubA)
M->>M: secret_B = ECDH(privM1, PubB)
B->>B: secret_B = ECDH(privB, PubM1)
Note over A,M: secret_A ≠ secret_B
A->>A: SAS_A = SHA-256("sas:" + sort(PubA,PubM2) + secret_A) → 73
B->>B: SAS_B = SHA-256("sas:" + sort(PubM1,PubB) + secret_B) → 18
rect rgb(255, 230, 230)
Note over A,B: ❌ User sees DIFFERENT numbers!
A->>A: Display: 73
B->>B: Display: 18
Note over A,B: 👤 User notices mismatch → cancels pairing
end
Note over A,B: 🛡️ Attack detected — MITM cannot force SAS match (P = 1/100)
Man-in-the-middle detection via SAS mismatch — editable source: docs/diagrams/mitm-detection.mmd
In this case, the attacker computes a shared secret S_A with Device A and a different shared secret S_B with Device B. Since S_A ≠ S_B, the devices compute different verification numbers.
The attacker cannot make the numbers match because:
- They don't know the devices' private keys
- SHA-256 is not reversible
- A substituted key pair produces matching numbers only by chance, with a probability of 1 in 100 per attempt
The user sees different numbers on the screens and cancels the pairing. The attack has failed.
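The 1-in-100 figure can be checked empirically with a small simulation. This is a simplified model, not a real attack: random byte strings stand in for the public keys and for the two unrelated ECDH secrets, and the attacker substitutes keys once without testing candidates first. The function names and the fixed seed are illustrative assumptions.

```python
import hashlib
import random


def sas_number(pub_1: bytes, pub_2: bytes, secret: bytes) -> int:
    """Two-digit SAS as described in Step 4 (plain concatenation assumed)."""
    first, second = sorted([pub_1, pub_2])
    digest = hashlib.sha256(b"sas:" + first + second + secret).digest()
    return (digest[0] * 256 + digest[1]) % 100


def random_bytes(rng: random.Random, length: int) -> bytes:
    """Seeded pseudo-random bytes; fine for a simulation, never for real keys."""
    return rng.getrandbits(8 * length).to_bytes(length, "big")


def simulate_blind_mitm(trials: int, seed: int = 1) -> float:
    """Fraction of trials in which both victims happen to see the same number."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(trials):
        pub_a, pub_b, pub_m1, pub_m2 = (random_bytes(rng, 33) for _ in range(4))
        secret_a = random_bytes(rng, 32)  # stands in for ECDH(privA, PubM2)
        secret_b = random_bytes(rng, 32)  # stands in for ECDH(privB, PubM1)
        sas_a = sas_number(pub_a, pub_m2, secret_a)
        sas_b = sas_number(pub_m1, pub_b, secret_b)
        if sas_a == sas_b:
            matches += 1
    return matches / trials


rate = simulate_blind_mitm(20_000)
print(f"observed match rate: {rate:.4f}")
```

The observed rate should land close to 0.01, which is what two independent, roughly uniform two-digit numbers would give.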
Step 5: Completing the pairing
Only after the user has confirmed verification on both devices does pairing complete:
- A 64-character pairing key (256 bits) is derived from the shared secret: SHA-256("pair:" + sharedSecret) → pairingKey
- The document key is derived as SHA-256("doc:" + pairingKey) and serves as the Firestore document key
- The encryption key is derived as SHA-256("enc:" + pairingKey) and provides the AES-256-GCM key for encrypted signaling
- Both devices store the same pairing key and navigate to mode selection
From this point on, all further connection attempts (Firestore signaling, WebRTC setup) are encrypted with the shared AES-256-GCM key. The pairing key itself is never sent to the backend; only the document key, a SHA-256 hash derived from it, is.
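The derivation chain in this step can be sketched with the standard library alone. The sketch assumes that the 64-character pairing key is the hex encoding of the SHA-256 digest and that it is fed into the next hashes as ASCII text; both are assumptions about the encoding, and `derive_keys` is a hypothetical name.

```python
import hashlib


def derive_keys(shared_secret: bytes) -> dict:
    """Derive pairing, document, and encryption keys from the ECDH secret."""
    # 32-byte digest written as 64 hex characters (assumed encoding).
    pairing_key = hashlib.sha256(b"pair:" + shared_secret).hexdigest()
    pairing_bytes = pairing_key.encode("ascii")

    # Different prefixes give unrelated outputs from the same input, so
    # knowing the document key reveals nothing about the encryption key.
    doc_key = hashlib.sha256(b"doc:" + pairing_bytes).hexdigest()
    enc_key = hashlib.sha256(b"enc:" + pairing_bytes).digest()  # 32 bytes for AES-256

    return {"pairing_key": pairing_key, "doc_key": doc_key, "enc_key": enc_key}


keys = derive_keys(bytes(32))  # placeholder secret, not real key material
print(len(keys["pairing_key"]), len(keys["doc_key"]), len(keys["enc_key"]))
```

Because SHA-256 is one-way, someone who can see the document key in Firestore cannot work backwards to the pairing key or sideways to the encryption key.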
System architecture
The following diagram shows all components involved in the pairing and communication process and how they interact:
flowchart TB
BABY["📱 Baby Phone<br/>Baby Mode"]
PARENT["📱 Parent Phone<br/>Parent Mode"]
BABY <==>|"🔒 WebRTC Peer-to-Peer · DTLS-SRTP<br/>Audio · Video · DataChannel"| PARENT
BABY -.-|"🔵 Bluetooth LE · Nearby<br/>Auto-Discovery"| PARENT
subgraph FIREBASE["☁️ Firebase (Google Cloud)"]
direction LR
AUTH["🪪 Anonymous<br/>Authentication"]
FS["📄 Firestore<br/>Pairing + Signaling"]
CF["⚡ Cloud Functions<br/>getTurnCredentials"]
end
BABY <-->|"🔐 AES-256-GCM encrypted<br/>SDP · ICE · ECDH keys"| FS
FS <-->|"🔐 AES-256-GCM encrypted<br/>SDP · ICE · ECDH keys"| PARENT
BABY -.->|Token| AUTH
PARENT -.->|Token| AUTH
STUN["📡 STUN server<br/>stun.cloudflare.com:3478"]
TURN["🔄 TURN relay<br/>local or Cloudflare"]
BABY & PARENT -->|Short-lived credentials| CF
CF -->|local first, Cloudflare fallback| TURN
BABY & PARENT -.->|NAT Traversal| STUN
BABY -.->|"Relay Fallback"| TURN
TURN -.->|"Relay Fallback"| PARENT
style BABY fill:#f0f7ff,stroke:#6BAFB2,stroke-width:2px
style PARENT fill:#f0f7ff,stroke:#6BAFB2,stroke-width:2px
style FIREBASE fill:#fff5f5,stroke:#E9B44C,stroke-width:2px
style AUTH fill:#E9B44C,stroke:#2B2D42
style FS fill:#E9B44C,stroke:#2B2D42
style CF fill:#E9B44C,stroke:#2B2D42
style STUN fill:#D4EEEF,stroke:#6BAFB2
style TURN fill:#7BC47F,stroke:#2B2D42
System architecture overview — editable source: docs/diagrams/pairing-architecture.mmd
Communication paths in detail:
- WebRTC peer-to-peer (thick line): Audio, video, and DataChannel traffic flows between the devices, encrypted with DTLS-SRTP. No server can decrypt this data.
- Firebase Firestore (solid line): Pairing data (ECDH public keys) and signaling (SDP/ICE) go through Firestore. Once pairing completes, signaling is end-to-end encrypted with AES-256-GCM under a key that Firebase never sees, so Firebase cannot decrypt it.
- STUN server: Both devices discover their public IP address so a direct peer-to-peer connection can be established.
- TURN relay: If a direct connection is not possible (e.g., on mobile data), the selected local or Cloudflare TURN server relays the encrypted media. Short-lived credentials (24h) are fetched via Firebase Cloud Functions.
- Bluetooth LE (dotted line): Nearby Connections discovers nearby devices automatically — only the meeting code is transmitted, no key material.
Fallback: Manual code entry
If Bluetooth is unavailable (e.g., on older devices), the 4-character meeting code can also be entered manually. One device displays the code and the user types it on the other device. From that point on, the exact same ECDH + SAS verification flow applies: public keys are exchanged over Firebase, a shared secret is computed, and both devices display a verification number for the user to compare.
Security is identical in both modes because the ECDH key exchange always happens over Firebase — regardless of whether the meeting code was discovered via BLE or entered by hand. The 4-character code is purely a routing address and carries no cryptographic value. The resulting encryption key is always 256 bits, derived from the ECDH shared secret.
Summary
| Security mechanism | Protects against |
|---|---|
| ECDH key exchange (P-256) | Eavesdropping on key exchange traffic |
| Ephemeral key pairs | A later key compromise exposing past pairings (forward secrecy) |
| Visual verification number (SAS) | Man-in-the-middle (MITM) during key exchange |
| SHA-256 hash as document key | Code extraction from Firestore |
| AES-256-GCM encryption | Eavesdropping on signaling data |
| Dual-side confirmation | One-sided pairing without user knowledge |
| DTLS-SRTP (WebRTC) | Eavesdropping on audio/video |
Each security layer complements the others. ECDH protects the key exchange. The verification number protects against MITM. AES-256-GCM protects signaling. DTLS-SRTP protects the audio and video streams. Together, they form a chain that an attacker cannot break at any point without being detected.