Baby Monitor Timmy started with a simple idea: a baby monitor that respects your privacy. No cloud recordings, no subscription fees, no data leaving your home unnecessarily. What made this project unusual is how it was built — from the very first line of code, GitHub Copilot served as an AI pair programmer, turning a concept into a fully functional app in a fraction of the time traditional development would require.
The Human-AI Workflow
The development process followed a clear division of labor: the human acts as CTO — defining features, setting priorities, and making architectural decisions. GitHub Copilot handles the implementation: writing code, running tests, debugging issues, and even distributing releases to testers.
A typical sprint looks like this:
- Feature description: The developer describes what the next feature should do, including edge cases and constraints.
- Implementation: Copilot writes the code, following the project's existing conventions and architecture.
- Testing: Automated end-to-end tests run on two emulators to verify the feature works correctly.
- Distribution: Once tests pass, a release APK is built and distributed to testers via Firebase App Distribution.
This cycle — describe, implement, test, distribute — repeats for every feature and bugfix. The human never needs to write code manually, but retains full control over what gets built and why.
From Concept to WebRTC
The project began with the core challenge: real-time audio and video between two phones. WebRTC was the obvious choice, but integrating it with Flutter — handling ICE candidates, SDP negotiation, TURN fallback, and DataChannels — involves significant complexity.
Copilot navigated this complexity step by step: setting up the peer connection, implementing the correct ordering (DataChannel before offer, onTrack before setRemoteDescription), and building the signaling layer on top of Firebase Firestore. Each piece was tested on dual emulators before moving to the next.
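The ordering constraints above are easy to get wrong and fail silently when violated. A minimal sketch of the idea, as a small guard object: `CallerSession` is a hypothetical illustration, not the app's actual Flutter/WebRTC code, and it models only the two orderings named above.

```python
class OrderingError(RuntimeError):
    """Raised when a WebRTC setup step runs out of order."""


class CallerSession:
    """Tracks setup steps and rejects out-of-order calls.

    Illustrative only: mirrors the two ordering rules from the text,
    not a real peer-connection wrapper.
    """

    def __init__(self):
        self.data_channel_created = False
        self.on_track_registered = False
        self.remote_description_set = False

    def create_data_channel(self):
        # Must happen before createOffer, so the channel is negotiated
        # in the initial SDP instead of needing a later renegotiation.
        self.data_channel_created = True

    def create_offer(self):
        if not self.data_channel_created:
            raise OrderingError("create the DataChannel before the offer")

    def register_on_track(self):
        # Must happen before setRemoteDescription; otherwise tracks
        # announced by the remote SDP can arrive before the handler exists.
        self.on_track_registered = True

    def set_remote_description(self):
        if not self.on_track_registered:
            raise OrderingError("register onTrack before setRemoteDescription")
        self.remote_description_set = True
```

In the real app these rules are enforced by calling the platform APIs in the right sequence; the guard above just makes the invariant explicit.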
Secure Pairing with ECDH
One of the most critical features was the secure pairing system. Two devices need to establish mutual trust without relying on a central server to vouch for their identity. The solution: an ECDH P-256 key exchange over Firebase, combined with a visual verification number (SAS) that detects man-in-the-middle attacks.
Copilot implemented the entire cryptographic chain — key generation, public key exchange, shared secret derivation, SAS computation, and AES-256-GCM encryption for all subsequent signaling. The pairing key never touches the backend; only its SHA-256 hash is used as a Firestore document identifier.
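The two derivations at the end of that chain can be sketched in a few lines. This assumes the ECDH shared secret has already been computed on both devices; the `b"sas:"` domain-separation label, the 4-byte truncation, and the 6-digit SAS length are illustrative choices, not necessarily the app's actual parameters.

```python
import hashlib


def sas_from_shared_secret(shared_secret: bytes, digits: int = 6) -> str:
    """Derive a short verification number (SAS) from the ECDH shared
    secret: hash it and reduce to a fixed number of decimal digits.
    Both devices compute this independently; a man-in-the-middle
    produces two different secrets, so the displayed numbers differ."""
    h = hashlib.sha256(b"sas:" + shared_secret).digest()
    value = int.from_bytes(h[:4], "big") % (10 ** digits)
    return str(value).zfill(digits)


def firestore_doc_id(pairing_key: bytes) -> str:
    """Only the SHA-256 hash of the pairing key is used as the
    Firestore document identifier; the key itself never leaves
    the devices."""
    return hashlib.sha256(pairing_key).hexdigest()
```

Because the backend only ever sees the hash, compromising Firestore reveals which pairing documents exist but not the key material needed to decrypt the signaling traffic.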
Security Audit: Finding and Fixing Vulnerabilities
AI-assisted development isn't just about writing code faster — it's also about catching mistakes. During a dedicated security audit sprint, Copilot analyzed the entire codebase for vulnerabilities and found six issues that needed fixing:
- Missing input validation on signaling data
- Potential race conditions in the ICE candidate handling
- Stale session data that wasn't being cleaned up properly
- Firestore security rules that were too permissive
- Missing certificate pinning considerations
- Insufficient error handling in the TURN credential flow
All six were fixed in the same sprint. This kind of systematic code review — examining every file for security implications — is exactly where AI assistance shines.
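To make the first finding concrete, here is a hedged sketch of the kind of validation it called for. The field names (`type`, `payload`) and the size limit are illustrative, not the app's actual signaling schema.

```python
ALLOWED_TYPES = {"offer", "answer", "ice-candidate"}
MAX_PAYLOAD_BYTES = 64 * 1024  # SDP and ICE blobs are small; cap them


def validate_signal(message) -> None:
    """Raise ValueError for malformed signaling messages instead of
    passing untrusted peer data straight into the WebRTC stack."""
    if not isinstance(message, dict):
        raise ValueError("message must be an object")
    msg_type = message.get("type")
    if msg_type not in ALLOWED_TYPES:
        raise ValueError(f"unknown message type: {msg_type!r}")
    payload = message.get("payload")
    if not isinstance(payload, str) or not payload:
        raise ValueError("payload must be a non-empty string")
    if len(payload.encode("utf-8")) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload too large")
```

Rejecting bad input at the boundary keeps parser bugs in the WebRTC layer from becoming remotely reachable.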
Iterative Sprints: How the App Evolved
The app evolved through rapid iteration. Each version brought significant improvements:
- v1.8: Complete pairing redesign — 4-char code + ECDH P-256 over Firebase replaced the old direct-key approach.
- v1.10: Security hardening sprint — the six-vulnerability audit and fix cycle.
- v1.11: Dark mode across all screens, plus the homepage and blog you're reading right now.
- v1.12: Major parent screen overhaul, night vision mode, and motion detection via camera frame analysis.
Each sprint followed the same pattern: describe the goal, let Copilot implement it, verify with automated tests, and ship to testers.
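The motion detection from v1.12 rests on a simple idea: compare consecutive camera frames and flag large differences. A minimal sketch, treating frames as flat lists of grayscale pixel values; the threshold value here is illustrative, since real tuning has to tolerate sensor noise and night-vision grain.

```python
def motion_score(prev_frame, curr_frame) -> float:
    """Mean absolute per-pixel difference between two grayscale
    frames (flat lists of 0-255 values). Higher means more motion."""
    assert len(prev_frame) == len(curr_frame), "frames must match in size"
    total = sum(abs(a - b) for a, b in zip(prev_frame, curr_frame))
    return total / len(curr_frame)


def motion_detected(prev_frame, curr_frame, threshold: float = 8.0) -> bool:
    """Flag motion when the frame difference exceeds a tunable threshold."""
    return motion_score(prev_frame, curr_frame) > threshold
```

A production version would downsample frames and smooth the score over time to avoid false alarms from flicker, but the core signal is this per-pixel difference.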
E2E Testing with Dual Emulators
Testing a baby monitor requires two devices: one in baby mode, one in parent mode. The project uses two Android emulators running simultaneously, with an automated test script that:
- Installs the app on both emulators
- Navigates through pairing on both devices
- Verifies that audio and video connections are established
- Tests push-to-talk, camera control, and other features
Since both emulators share the same IP address (10.0.2.15), a direct peer-to-peer connection via STUN is impossible. This forces every test run through the Cloudflare TURN relay — which actually provides better test coverage, since it exercises the most complex connection path.
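The install step of such a script can be sketched as follows. The serials are the Android emulator defaults (the first two emulators get `emulator-5554` and `emulator-5556`), assumed here for illustration; a real script would execute each command with `subprocess.run` and check the return codes.

```python
# Default serials assigned to the first two running Android emulators.
EMULATORS = ["emulator-5554", "emulator-5556"]


def install_commands(apk_path: str, serials=EMULATORS):
    """Build the `adb -s <serial> install -r <apk>` command for each
    emulator, so the same build lands on both devices before the
    pairing and connection checks run."""
    return [["adb", "-s", serial, "install", "-r", apk_path]
            for serial in serials]
```

Keeping the command construction separate from execution also makes this part of the test harness trivially unit-testable.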
What We Learned
Building an entire app with an AI pair programmer taught us several things:
- Architecture matters more than ever. Clear conventions and a well-documented codebase let the AI produce consistent, high-quality code. Ambiguity leads to inconsistency.
- Testing is non-negotiable. AI-generated code needs the same rigorous testing as human-written code. Automated E2E tests caught issues that would have been easy to miss in manual review.
- The human stays in the loop. Every architectural decision, every security trade-off, every user-facing choice was made by a human. The AI accelerates implementation, but it doesn't replace judgment.
- Speed enables quality. Because features ship in hours instead of days, there's more time for testing, polish, and iteration. Faster doesn't mean sloppier — it means more cycles of improvement.
Looking Ahead
Baby Monitor Timmy continues to evolve. Upcoming milestones include the iOS version (Phase 3), additional sensor features, and continued security hardening. The development workflow remains the same: a human with a vision and an AI that makes it real.
The security-relevant building blocks now live under clear boundaries in the public baby-monitor-timmy-core repository — including the architectural decisions around pairing, signaling, and backend interfaces.