Secure Pairing in Baby Monitor Timmy: ECDH and SAS Explained

Before Baby Monitor Timmy transmits audio and video, two devices need to find each other and trust each other. This pairing step is the most critical moment in the whole flow. Here I explain how Timmy pairs, which cryptography sits behind it, and why a nearby attacker cannot take over the connection unnoticed.

The problem: How does my device know who it's talking to?

When two devices connect for the first time, the central question is: is Device A really talking to Device B, or is someone sitting in between? In cryptography, that is called a man-in-the-middle attack (MITM).

Timmy solves this through an Elliptic Curve Diffie-Hellman (ECDH) key exchange over Firebase, combined with visual verification by the user.

The following diagram shows the complete pairing flow at a glance:

sequenceDiagram
    autonumber
    participant A as 📱 Device A
    participant F as ☁️ Firebase
    participant B as 📱 Device B

    Note over A,B: Phase 1 — Discovery

    A->>A: Generate ECDH key pair (P-256)
    B->>B: Generate ECDH key pair (P-256)

    alt Auto-Pairing (Nearby BLE)
        A-->>B: BLE broadcast: SBM:XKQM
        B-->>A: BLE broadcast: SBM:R7NP
        Note over A,B: Lower code wins → determines creator/joiner
    else Manual Pairing
        A->>A: Display 4-char code
        Note right of A: User reads code
        B->>B: User enters code
    end

    Note over A,B: Phase 2 — ECDH Key Exchange

    A->>F: Write public key (PubA) to meeting doc
    B->>F: Write public key (PubB) to meeting doc
    F-->>B: Read PubA
    F-->>A: Read PubB

    Note over A,B: Phase 3 — Shared Secret

    A->>A: sharedSecret = ECDH(privA, PubB)
    B->>B: sharedSecret = ECDH(privB, PubA)
    Note over A,B: Both compute identical 32-byte secret

    A->>A: SAS = SHA-256("sas:" + sort(PubA,PubB) + secret) → 2-digit number
    B->>B: SAS = SHA-256("sas:" + sort(PubA,PubB) + secret) → 2-digit number

    Note over A,B: Phase 4 — Visual Verification

    A->>A: Display SAS: 42
    B->>B: Display SAS: 42
    Note over A,B: 👤 User compares numbers on both screens

    A->>A: User confirms ✓
    B->>B: User confirms ✓

    Note over A,B: Phase 5 — Key Derivation

    A->>A: pairingKey = SHA-256("pair:" + secret)
    B->>B: pairingKey = SHA-256("pair:" + secret)
    A->>A: docKey = SHA-256("doc:" + pairingKey)
    A->>A: encKey = SHA-256("enc:" + pairingKey)

    Note over A,B: ✅ Paired — all future signaling encrypted with AES-256-GCM

Complete pairing protocol sequence — editable source: docs/diagrams/pairing-sequence.mmd

Step 1: Each device generates a key pair

When the pairing screen opens, each device generates an ephemeral ECDH key pair on the P-256 curve (secp256r1):

A private key — stays exclusively on the device
A public key — exchanged over Firebase

The keys are created using a cryptographically secure random number generator (Random.secure()) and are valid only for this single pairing attempt. Fresh keys are generated for every new attempt.
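As a minimal sketch of what "fresh, cryptographically secure" means here, the following Python snippet draws a private scalar for P-256. It is not the app's code: Python's secrets module stands in for Random.secure(), and the function name new_ephemeral_private_key is made up for illustration. The constant is the public order of the P-256 base point.

```python
import secrets

# Order n of the P-256 (secp256r1) base point, a public curve parameter.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def new_ephemeral_private_key() -> int:
    """Return a fresh private scalar in the valid range [1, n - 1]."""
    # secrets draws from the operating system's CSPRNG
    # (used here as a stand-in for Random.secure()).
    return secrets.randbelow(P256_ORDER - 1) + 1


# Each pairing attempt gets its own key; nothing is reused between attempts.
first_attempt = new_ephemeral_private_key()
second_attempt = new_ephemeral_private_key()
```

The matching public key is the curve point obtained by multiplying the base point by this scalar.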

Step 2: Exchange public keys over Firebase

To let two devices find each other, Timmy uses a 4-character code as a meeting point. The code is either discovered automatically via Nearby Connections (Bluetooth Low Energy) or entered manually. It has no cryptographic value; its only job is to point both devices at the same Firestore document.
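To make the meeting-point idea concrete, here is a small Python sketch of how such a code could be generated and how the "lower code wins" rule from the diagram could assign roles during auto-pairing. The alphabet (uppercase letters and digits, guessed from the sample codes XKQM and R7NP), the function names, and the choice that the lower code becomes the creator are all assumptions.

```python
import secrets
import string

# Assumed alphabet, inferred from the sample codes XKQM and R7NP.
ALPHABET = string.ascii_uppercase + string.digits


def new_meeting_code(length: int = 4) -> str:
    """Generate a random meeting code; it only names a rendezvous document."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def role_for(my_code: str, remote_code: str) -> str:
    """Tie-break for auto-pairing: the lower code wins.

    Assumption: the winner acts as creator, the other device as joiner.
    """
    if my_code == remote_code:
        raise ValueError("identical codes, regenerate and retry")
    return "creator" if my_code < remote_code else "joiner"


code = new_meeting_code()
```

With the codes from the diagram, role_for("R7NP", "XKQM") yields "creator" and the reverse call yields "joiner", so both devices reach the same decision without further messages.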

Once both devices know the code, each writes its public ECDH key to a shared Firestore document. Then each device reads the other device's public key from that document.

Crucially, only the public key is sent; the private key never leaves the device. Anyone watching Firebase traffic sees the public keys but cannot compute the shared secret from them. This guarantee rests on the hardness of the Elliptic Curve Discrete Logarithm Problem (ECDLP): recovering a private key from its public key is computationally infeasible.

Step 3: Computing the shared secret

Once both devices have read each other's public key, they independently compute the same shared secret:

sharedSecret = ECDH(myPrivateKey, remotePublicKey)
             → 32 bytes (identical on both devices)

The mathematics of elliptic curves guarantees that both computations yield the same result, even though each device only knows its own private key and the other's public key.
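The reason both sides agree is that scalar multiplication commutes: privA · (privB · G) equals privB · (privA · G). The following pure-Python sketch shows this on P-256 with textbook affine arithmetic. It is for illustration only: it is not constant-time, skips public-key validation, and is not how the app implements ECDH. The function names are my own; the 32-byte secret is taken to be the x-coordinate of the resulting point, which is the usual ECDH convention.

```python
import secrets

# P-256 (secp256r1) domain parameters.
P = 0xFFFFFFFF00000001000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFF
A = P - 3
GX = 0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296
GY = 0x4FE342E2FE1A7F9B8EE7EB4A7C0F9E162BCE33576B315ECECBB6406837BF51F5
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
G = (GX, GY)


def point_add(p1, p2):
    """Add two affine points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_keypair():
    """Return (private scalar, public point) for one pairing attempt."""
    private_key = secrets.randbelow(N - 1) + 1
    return private_key, scalar_mult(private_key, G)


def ecdh(my_private_key, remote_public_key):
    """Shared secret: x-coordinate of myPrivateKey * remotePublicKey, 32 bytes."""
    x, _ = scalar_mult(my_private_key, remote_public_key)
    return x.to_bytes(32, "big")


priv_a, pub_a = generate_keypair()
priv_b, pub_b = generate_keypair()
secret_a = ecdh(priv_a, pub_b)  # computed on Device A
secret_b = ecdh(priv_b, pub_a)  # computed on Device B
```

Each device only ever uses its own private scalar, yet secret_a and secret_b come out identical.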

Step 4: The verification number (SAS)

From the shared secret, a Short Authentication String (SAS) is derived — a two-digit number displayed on both devices:

hash   = SHA-256("sas:" + sort(pubkeyA, pubkeyB) + sharedSecret)
number = (hash[0] × 256 + hash[1]) mod 100   → 00 to 99

Both devices display the same number — for example, 42. The user visually compares whether the numbers on both screens match, then confirms on each device individually.
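The two formula lines above translate almost directly into code. This Python sketch assumes that the public keys are available as byte strings, that sort means byte-wise ordering, and that the parts are simply concatenated; the exact encoding in the app may differ. The function name sas_number is hypothetical.

```python
import hashlib


def sas_number(pubkey_a: bytes, pubkey_b: bytes, shared_secret: bytes) -> str:
    """Derive the two-digit verification number shown on both screens."""
    # Sorting makes the input independent of which device is "A" or "B",
    # so both sides hash exactly the same bytes.
    first, second = sorted([pubkey_a, pubkey_b])
    digest = hashlib.sha256(b"sas:" + first + second + shared_secret).digest()
    # Interpret the first two hash bytes as a 16-bit number, reduce to 00..99.
    number = (digest[0] * 256 + digest[1]) % 100
    return f"{number:02d}"


# Stand-in values; real inputs would be the exchanged keys and the ECDH secret.
key_a = bytes.fromhex("04" + "11" * 64)
key_b = bytes.fromhex("04" + "22" * 64)
secret = bytes(range(32))

on_device_a = sas_number(key_a, key_b, secret)
on_device_b = sas_number(key_b, key_a, secret)
```

Because of the sort, the argument order does not matter, and the zero-padded format keeps values below 10 at two digits.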

Why an attacker cannot forge this

A man-in-the-middle would need to intercept the key exchange in Firebase. Specifically, they would need to:

Replace the real public keys stored in the Firestore document with their own
Establish separate shared secrets with each device

sequenceDiagram
    autonumber
    participant A as 📱 Device A
    participant M as 🕵️ Attacker (MITM)
    participant B as 📱 Device B

    Note over A,B: Attacker intercepts the Firebase key exchange

    A->>A: Generate key pair (privA, PubA)
    B->>B: Generate key pair (privB, PubB)
    M->>M: Generate TWO key pairs (privM1, PubM1) + (privM2, PubM2)

    A->>M: Write PubA to Firebase
    M->>M: Replace PubA with PubM1
    M->>B: B reads PubM1 (thinks it is PubA)

    B->>M: Write PubB to Firebase
    M->>M: Replace PubB with PubM2
    M->>A: A reads PubM2 (thinks it is PubB)

    Note over A,B: Each device computes a DIFFERENT shared secret

    A->>A: secret_A = ECDH(privA, PubM2)
    M->>M: secret_A = ECDH(privM2, PubA)
    M->>M: secret_B = ECDH(privM1, PubB)
    B->>B: secret_B = ECDH(privB, PubM1)

    Note over A,M: secret_A ≠ secret_B

    A->>A: SAS_A = SHA-256("sas:" + sort(PubA,PubM2) + secret_A) → 73
    B->>B: SAS_B = SHA-256("sas:" + sort(PubM1,PubB) + secret_B) → 18

    rect rgb(255, 230, 230)
        Note over A,B: ❌ User sees DIFFERENT numbers!
        A->>A: Display: 73
        B->>B: Display: 18
        Note over A,B: 👤 User notices mismatch → cancels pairing
    end

    Note over A,B: 🛡️ Attack detected — MITM cannot force SAS match (P = 1/100)

Man-in-the-middle detection via SAS mismatch — editable source: docs/diagrams/mitm-detection.mmd

In this case, the attacker shares one secret (secret_A) with Device A and a different one (secret_B) with Device B. Since secret_A ≠ secret_B, the two devices compute their verification numbers from different inputs, and the numbers differ in all but about 1 in 100 cases.

The attacker cannot make the numbers match because:

They don't know the devices' private keys
SHA-256 is not reversible
The probability of a random match is only 1 in 100

The user sees different numbers on the screens and cancels pairing. At that point, the attack has become visible.
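The 1-in-100 figure can be checked empirically. The sketch below hashes pairs of unrelated random secrets and counts how often the two-digit numbers happen to coincide. It is a simplified model of a single, untargeted attempt: the public keys are left out of the hash input, and a seeded pseudo-random generator is used only to make the experiment repeatable. All names are hypothetical.

```python
import hashlib
import random


def sas(secret: bytes) -> int:
    """Simplified SAS: the public keys are omitted from the hash input."""
    digest = hashlib.sha256(b"sas:" + secret).digest()
    return (digest[0] * 256 + digest[1]) % 100


def estimate_match_rate(trials: int = 20000, seed: int = 0) -> float:
    """Fraction of trials in which two unrelated secrets give the same SAS."""
    rng = random.Random(seed)  # repeatable experiment, never use for real keys
    matches = 0
    for _ in range(trials):
        secret_a = rng.getrandbits(256).to_bytes(32, "big")
        secret_b = rng.getrandbits(256).to_bytes(32, "big")
        if sas(secret_a) == sas(secret_b):
            matches += 1
    return matches / trials


rate = estimate_match_rate()
```

The estimate lands close to 0.01, which matches the two decimal digits of the verification number.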

Step 5: Completing the pairing

Only after the user has confirmed verification on both devices does pairing complete:

A pairing key of 64 hexadecimal characters (256 bits) is derived from the shared secret: SHA-256("pair:" + sharedSecret) → pairingKey
The document key is derived as SHA-256("doc:" + pairingKey) and serves as the Firestore document identifier
The encryption key is derived as SHA-256("enc:" + pairingKey) and provides the AES-256-GCM key for encrypted signaling
Both devices store the same pairing key and navigate to mode selection
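The derivation chain in this list can be sketched in a few lines of Python. The sketch assumes that the pairing key is kept as a hex string and fed into the next hashes as ASCII text, that the document key is also hex, and that the encryption key is the raw 32-byte digest; the app's actual encodings may differ. The function name derive_keys is hypothetical.

```python
import hashlib


def derive_keys(shared_secret: bytes):
    """Derive pairing key, document key and encryption key from the ECDH secret."""
    # Assumption: the pairing key is stored as a 64-character hex string.
    pairing_key = hashlib.sha256(b"pair:" + shared_secret).hexdigest()
    pairing_key_bytes = pairing_key.encode("ascii")
    # Document key: identifies the Firestore document, reveals nothing usable
    # about the pairing key because SHA-256 is one-way.
    doc_key = hashlib.sha256(b"doc:" + pairing_key_bytes).hexdigest()
    # Encryption key: 32 raw bytes, the right size for AES-256-GCM.
    enc_key = hashlib.sha256(b"enc:" + pairing_key_bytes).digest()
    return pairing_key, doc_key, enc_key


# Stand-in secret; in the real flow this is the 32-byte ECDH output.
pairing_key, doc_key, enc_key = derive_keys(bytes(range(32)))
```

The distinct prefixes ("pair:", "doc:", "enc:") ensure that the three values differ even though they all stem from the same secret.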

From this point on, all further signaling (the Firestore messages used to set up WebRTC connections) is encrypted with the shared AES-256-GCM key. The pairing key itself is never sent to the backend; only a SHA-256 hash derived from it serves as the document identifier.

System architecture

The following diagram shows the components involved in pairing and communication:

flowchart TB
    BABY["📱 Baby Phone
Baby Mode"]
    PARENT["📱 Parent Phone
Parent Mode"]

    BABY <==>|"🔒 WebRTC Peer-to-Peer · DTLS-SRTP
Audio · Video · DataChannel"| PARENT
    BABY -.-|"🔵 Bluetooth LE · Nearby
Auto-Discovery"| PARENT

    subgraph FIREBASE["☁️ Firebase (Google Cloud)"]
        direction LR
        AUTH["🪪 Anonymous
Authentication"]
        FS["📄 Firestore
Pairing + Signaling"]
        CF["⚡ Cloud Functions
getTurnCredentials"]
    end

    BABY <-->|"🔐 AES-256-GCM encrypted
SDP · ICE · ECDH keys"| FS
    FS <-->|"🔐 AES-256-GCM encrypted
SDP · ICE · ECDH keys"| PARENT

    BABY -.->|Token| AUTH
    PARENT -.->|Token| AUTH

    STUN["📡 STUN server
stun.cloudflare.com:3478"]
    TURN["🔄 TURN relay
local or Cloudflare"]

    BABY & PARENT -->|Short-lived credentials| CF
    CF -->|local first, Cloudflare fallback| TURN

    BABY & PARENT -.->|NAT Traversal| STUN

    BABY -.->|"Relay Fallback"| TURN
    TURN -.->|"Relay Fallback"| PARENT

    style BABY fill:#FBF6F0,stroke:#B5734A,stroke-width:2px
    style PARENT fill:#FBF6F0,stroke:#B5734A,stroke-width:2px
    style FIREBASE fill:#fff5f5,stroke:#E9B44C,stroke-width:2px
    style AUTH fill:#E9B44C,stroke:#2B2D42
    style FS fill:#E9B44C,stroke:#2B2D42
    style CF fill:#E9B44C,stroke:#2B2D42
    style STUN fill:#F6E3D2,stroke:#B5734A
    style TURN fill:#7BC47F,stroke:#2B2D42

System architecture overview — editable source: docs/diagrams/pairing-architecture.mmd

Communication paths in detail:

WebRTC peer-to-peer (thick line): Audio, video and DataChannel flow directly between devices — encrypted with DTLS-SRTP. No server sees this data.
Firebase Firestore (solid line): Pairing data (ECDH public keys) and signaling (SDP/ICE) go through Firestore. The public keys are not secret, and the SAS check exposes any tampering with them; signaling is end-to-end encrypted with AES-256-GCM, so Firebase cannot decrypt it.
STUN server: Both devices discover their public IP address so a direct peer-to-peer connection can be established.
TURN relay: If a direct connection is not possible (e.g., on mobile data), the selected local or Cloudflare TURN server relays the encrypted media. Short-lived credentials (24h) are fetched via Firebase Cloud Functions.
Bluetooth LE (dotted line): Nearby Connections discovers nearby devices automatically — only the meeting code is transmitted, no key material.

Fallback: Manual code entry

If Bluetooth is unavailable (e.g., on older devices), the 4-character code can also be typed in manually. Manual entry uses the same ECDH key exchange and the same SAS verification as automatic pairing. The only difference is that the code is read and typed by the user instead of discovered through BLE.

Because the ECDH key exchange happens over Firebase in both cases, security is identical. The 4-character code is only a meeting point; the real encryption is based on the 256-bit key derived from ECDH.

Summary

Security mechanism	Protects against
ECDH key exchange (P-256)	Eavesdropping on key exchange traffic
Ephemeral key pairs	Forward secrecy — past pairings remain safe
Visual verification number (SAS)	Man-in-the-middle (MITM) during key exchange
SHA-256 hash as document key	Code extraction from Firestore
AES-256-GCM encryption	Eavesdropping on signaling data
Dual-side confirmation	One-sided pairing without user knowledge
DTLS-SRTP (WebRTC)	Eavesdropping on audio/video

These layers fit together: ECDH protects the key exchange, the verification number protects against MITM, AES-256-GCM protects signaling, and WebRTC protects media. An attacker would have to break this chain in several places without devices or parents noticing.
