3 Native iOS Extensions
Streaming TTS Model
Remotion Lambda Video Engine
RAG-powered Memory
Confide is a daily video journaling app with an AI chatbot called 'Bestie' that remembers your entries, generates personalized weekly recap videos, and helps manage screen time. Users record daily video journal entries, and the AI extracts memories, builds a searchable knowledge base, and produces highlight reels.
The app features dual recording (video + speech transcript), streaming text-to-speech for AI responses, and iOS Screen Time integration through 3 native Swift extensions — enabling users to set app usage limits directly from the journal app.
The backend orchestrates a complex pipeline: each journal entry triggers memory extraction, vector storage, recap generation, and weekly video compilation using Remotion Lambda — all coordinated through Inngest durable functions.
The dual recording system composes a SpeechRecorder and a VideoRecorder around a react-native-vision-camera ref. Speech recognition drives the UX: transcripts are extracted in real time and displayed as the user records, and if no speech has been detected by the time recording ends, the video is automatically deleted.
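The "no speech, no video" rule can be sketched as a small pure function. This is only an illustration: `RecordingResult`, `RecordingDecision`, and `decideRecording` are hypothetical names, and it assumes an empty or whitespace-only transcript is what "no speech detected" means.

```typescript
// Hypothetical shape of what the two recorders hand back when recording stops.
interface RecordingResult {
  videoPath: string;
  transcript: string;
}

// Either keep the entry (with a cleaned transcript) or discard the video file.
type RecordingDecision =
  | { action: "keep"; videoPath: string; transcript: string }
  | { action: "discard"; videoPath: string };

// Assumption: an empty or whitespace-only transcript means no speech was detected.
function decideRecording(result: RecordingResult): RecordingDecision {
  const transcript = result.transcript.trim();
  if (transcript.length === 0) {
    return { action: "discard", videoPath: result.videoPath };
  }
  return { action: "keep", videoPath: result.videoPath, transcript };
}
```

Keeping the decision separate from the file deletion itself makes the rule easy to unit test without a camera or file system.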
ElevenLabs streaming TTS uses a WebSocket connection to the stream-input endpoint with the eleven_turbo_v2_5 model, requesting 16 kHz PCM output. Base64-encoded PCM chunks are accumulated, assembled into a WAV file (with proper header construction), and played through expo-audio. FFT-based volume analysis drives a visual waveform animation during playback.
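The WAV assembly step can be sketched as follows. This is a minimal illustration, not the app's actual code: `pcmToWav` is a hypothetical name, the chunks are assumed to be already Base64-decoded into bytes, and the audio is assumed to be mono 16-bit little-endian PCM (the source only states 16 kHz PCM).

```typescript
// Concatenate raw PCM chunks and prepend the standard 44-byte RIFF/WAVE header.
// Assumption: chunks were already Base64-decoded into Uint8Array values.
function pcmToWav(
  chunks: Uint8Array[],
  sampleRate = 16000,
  channels = 1, // assumed mono
  bitsPerSample = 16, // assumed 16-bit samples
): Uint8Array {
  const dataLength = chunks.reduce((sum, chunk) => sum + chunk.length, 0);
  const out = new Uint8Array(44 + dataLength);
  const view = new DataView(out.buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, dataLength, true);

  // Copy the PCM payload directly after the header.
  let offset = 44;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

All multi-byte header fields are little-endian, which is why every `setUint16`/`setUint32` call passes `true`.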
The iOS Screen Time integration required 3 native Swift extension targets: ActivityMonitor (tracks app usage via DeviceActivity framework), ShieldAction (handles user interaction with shield overlays), and ShieldConfiguration (customizes shield appearance). All three share data with the main app through an App Group and UserDefaults — the React Native side writes configuration, the extensions read it.
The Inngest pipeline for weekly recaps is multi-step: CreateWeeklyRecap generates a narrative and frame props (mapped to Remotion composition frames), with concurrency limits and idempotent 'existing recap' handling. RenderWeeklyRecap triggers a Remotion Lambda render, which calls back via a signed webhook to FinalizeWeeklyRecap — storing stills and updating the recap status.
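The idempotent "existing recap" handling and the status updates can be illustrated with an in-memory stand-in. This sketch does not use the Inngest or Remotion APIs. `RecapStore`, the `userId:weekStart` key, and the status names are all assumptions made for the example.

```typescript
// Hypothetical recap lifecycle; the real status values are not given in the source.
type RecapStatus = "created" | "rendering" | "complete";

interface WeeklyRecap {
  userId: string;
  weekStart: string; // ISO date of the week's first day (assumed key component)
  status: RecapStatus;
  stills: string[];
}

class RecapStore {
  private readonly recaps = new Map<string, WeeklyRecap>();

  private key(userId: string, weekStart: string): string {
    return `${userId}:${weekStart}`;
  }

  // Idempotent create: a retried or duplicate event returns the existing recap.
  createIfMissing(userId: string, weekStart: string): { recap: WeeklyRecap; created: boolean } {
    const key = this.key(userId, weekStart);
    const existing = this.recaps.get(key);
    if (existing) {
      return { recap: existing, created: false };
    }
    const recap: WeeklyRecap = { userId, weekStart, status: "created", stills: [] };
    this.recaps.set(key, recap);
    return { recap, created: true };
  }

  // Called when the render is kicked off.
  markRendering(userId: string, weekStart: string): void {
    const recap = this.recaps.get(this.key(userId, weekStart));
    if (!recap) throw new Error("recap not found");
    recap.status = "rendering";
  }

  // Called from the webhook handler once the render has finished.
  finalize(userId: string, weekStart: string, stills: string[]): WeeklyRecap {
    const recap = this.recaps.get(this.key(userId, weekStart));
    if (!recap) throw new Error("recap not found");
    recap.stills = stills;
    recap.status = "complete";
    return recap;
  }
}
```

Keying on user and week means a retried step finds the existing record instead of generating a second narrative for the same week.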
The RAG system for 'Bestie' chat indexes memories extracted from journal entries: each memory is split with RecursiveCharacterTextSplitter, embedded with OpenAI text-embedding-3-large, and stored in a Pinecone index with userId and memoryId metadata. During chat, the system retrieves the top-K most similar memories to ground contextual, personalized responses.
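The retrieval step amounts to a similarity search scoped to one user. In the app this is performed by Pinecone; the sketch below is an in-memory stand-in that does not call Pinecone's API. It uses cosine similarity, which is an assumption (the source does not name the metric), and `MemoryVector` and `retrieveTopK` are hypothetical names.

```typescript
// Hypothetical record: an embedded memory chunk with the metadata named in the text.
interface MemoryVector {
  memoryId: string;
  userId: string;
  values: number[];
}

// Cosine similarity; returns 0 when either vector has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // a missing dimension is treated as 0
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Filter by userId first, then rank by similarity and keep the best k.
function retrieveTopK(
  query: number[],
  store: MemoryVector[],
  userId: string,
  k: number,
): { memoryId: string; score: number }[] {
  return store
    .filter((memory) => memory.userId === userId)
    .map((memory) => ({
      memoryId: memory.memoryId,
      score: cosineSimilarity(query, memory.values),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Filtering on userId before ranking reflects why that metadata is stored alongside each vector: one user's memories must never surface in another user's chat.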