Mobile, AI, Nodogoro

Confide

AI Daily Video Journaling App with Memory & Screen Time

3 Native iOS Extensions
Streaming TTS Model
Remotion Lambda Video Engine
RAG-powered Memory

Expo, React Native, Vision Camera, ElevenLabs, RevenueCat, Remotion, Inngest, Pinecone, LangChain, OpenAI

01 Overview

Confide is a daily video journaling app with an AI chatbot called 'Bestie' that remembers your entries, generates personalized weekly recap videos, and helps manage screen time. Users record daily video journal entries, and the AI extracts memories, builds a searchable knowledge base, and produces highlight reels.

The app features dual recording (video + speech transcript), streaming text-to-speech for AI responses, and iOS Screen Time integration through 3 native Swift extensions, which let users set app usage limits directly from the journal app.

The backend orchestrates a multi-stage pipeline: each journal entry triggers memory extraction, vector storage, recap generation, and weekly video compilation using Remotion Lambda, all coordinated through Inngest durable functions.

02 My Role & Impact

Built the Expo mobile app with dual recording (Vision Camera + SpeechRecorder), ElevenLabs streaming TTS via WebSocket, and RevenueCat subscription management.
Developed 3 iOS native Swift extensions (ActivityMonitor, ShieldAction, ShieldConfiguration) for Screen Time integration, bridged via App Group + UserDefaults.
Architected the Inngest-orchestrated backend pipeline: entry → memory extraction (OpenAI structured output) → Pinecone vectorization → recap generation → Remotion Lambda video rendering.
Implemented the Pinecone memory RAG for the AI chatbot using text-embedding-3-large, RecursiveCharacterTextSplitter, and top-K similarity retrieval.

03 Technical Deep Dive

The dual recording system composes a SpeechRecorder and VideoRecorder around a react-native-vision-camera ref. Speech recognition drives the UX: if no speech is detected after recording, the video is automatically deleted. Transcripts are extracted in real time and displayed as the user records.
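The "no speech, no entry" rule can be illustrated with a minimal sketch. The names RecordingResult, hasSpeech, and finalizeRecording are hypothetical and not taken from the app's code, and the whitespace-only check is an assumed stand-in for whatever signal the speech recognizer actually reports.

```typescript
// Hypothetical shape of a finished recording; field names are assumptions.
interface RecordingResult {
  videoPath: string;
  transcript: string;
}

// Treat an empty or whitespace-only transcript as "no speech detected".
function hasSpeech(transcript: string): boolean {
  return transcript.trim().length > 0;
}

// Returns the recording to keep, or null after asking the caller-supplied
// callback to discard a video that contains no speech.
function finalizeRecording(
  result: RecordingResult,
  discardVideo: (path: string) => void,
): RecordingResult | null {
  if (!hasSpeech(result.transcript)) {
    discardVideo(result.videoPath);
    return null;
  }
  return { videoPath: result.videoPath, transcript: result.transcript.trim() };
}
```

Passing the delete operation in as a callback keeps the decision logic independent of the file system API, so it can be exercised without a device.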

ElevenLabs streaming TTS uses a WebSocket connection to the stream-input endpoint with the eleven_turbo_v2_5 model, outputting PCM at 16 kHz. Base64 PCM chunks are accumulated, then assembled into a WAV file (with proper header construction) and played through expo-audio. An FFT-based volume analysis drives a visual waveform animation during playback.
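The WAV assembly step can be sketched as follows. This is an illustration, not the app's implementation: it assumes the PCM is 16-bit signed little-endian mono, and the function names decodeBase64 and pcmBase64ChunksToWav are hypothetical. A small hand-written base64 decoder is included only so the sketch has no runtime dependencies.

```typescript
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Dependency-free base64 decoder (standard alphabet, optional "=" padding).
function decodeBase64(b64: string): Uint8Array {
  const clean = b64.replace(/=+$/, "");
  const out = new Uint8Array(Math.floor((clean.length * 6) / 8));
  let acc = 0;
  let bits = 0;
  let o = 0;
  for (let i = 0; i < clean.length; i++) {
    const v = B64_ALPHABET.indexOf(clean.charAt(i));
    if (v < 0) throw new Error("invalid base64 character");
    acc = ((acc << 6) | v) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[o++] = (acc >> bits) & 0xff;
    }
  }
  return out;
}

// Decodes each chunk separately, then prepends a 44-byte RIFF/WAVE header.
// Assumes 16-bit PCM; sampleRate and channels default to 16 kHz mono.
function pcmBase64ChunksToWav(
  chunks: string[],
  sampleRate: number = 16000,
  channels: number = 1,
): Uint8Array {
  const parts = chunks.map(decodeBase64);
  const dataLength = parts.reduce((n, p) => n + p.length, 0);
  const bitsPerSample = 16;
  const blockAlign = channels * (bitsPerSample / 8);
  const wav = new Uint8Array(44 + dataLength);
  const view = new DataView(wav.buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };
  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true); // file size minus 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, dataLength, true);
  let offset = 44;
  parts.forEach((part) => {
    wav.set(part, offset);
    offset += part.length;
  });
  return wav;
}
```

Each chunk is decoded on its own because every WebSocket message carries an independently encoded base64 string; concatenating the strings before decoding would break on padding characters.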

The iOS Screen Time integration required 3 native Swift extension targets: ActivityMonitor (tracks app usage via the DeviceActivity framework), ShieldAction (handles user interaction with shield overlays), and ShieldConfiguration (customizes shield appearance). All three share data with the main app through an App Group and UserDefaults: the React Native side writes configuration, and the extensions read it.
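The write side of that bridge might look like the sketch below. Everything named here is an assumption made for illustration: the ScreenTimeConfig fields, the screenTimeConfig key, and the SharedDefaults interface, which stands in for whatever native module exposes the App Group's UserDefaults suite to JavaScript. Serializing to a versioned JSON string is one possible contract, not necessarily the one the app uses.

```typescript
// Hypothetical payload shared with the extensions; field names are illustrative.
interface ScreenTimeConfig {
  version: number;
  dailyLimitMinutes: number;
  shieldTitle: string;
}

// Stand-in for a native module wrapping the App Group's UserDefaults suite.
interface SharedDefaults {
  setString(key: string, value: string): void;
  getString(key: string): string | null;
}

const CONFIG_KEY = "screenTimeConfig"; // hypothetical key name

// Validates the limit, then stores the config as one JSON string.
function writeScreenTimeConfig(
  store: SharedDefaults,
  dailyLimitMinutes: number,
  shieldTitle: string,
): void {
  if (!(dailyLimitMinutes > 0) || dailyLimitMinutes % 1 !== 0) {
    throw new Error("dailyLimitMinutes must be a positive integer");
  }
  const config: ScreenTimeConfig = { version: 1, dailyLimitMinutes, shieldTitle };
  store.setString(CONFIG_KEY, JSON.stringify(config));
}

// Mirrors what a reader on the extension side would do with the same key.
function readScreenTimeConfig(store: SharedDefaults): ScreenTimeConfig | null {
  const raw = store.getString(CONFIG_KEY);
  if (raw === null) return null;
  const parsed = JSON.parse(raw) as ScreenTimeConfig;
  return parsed.version === 1 ? parsed : null;
}

// In-memory store so the sketch runs outside a device.
function createMemoryDefaults(): SharedDefaults {
  const data: Record<string, string> = {};
  return {
    setString: (key, value) => {
      data[key] = value;
    },
    getString: (key) => {
      const value = data[key];
      return value === undefined ? null : value;
    },
  };
}
```

Keeping the payload to a single versioned string means the Swift extensions only need to decode one key, and an older extension build can ignore a version it does not recognize.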

The Inngest pipeline for weekly recaps is multi-step. CreateWeeklyRecap generates a narrative and frame props (mapped to Remotion composition frames), with concurrency limits and idempotent 'existing recap' handling. RenderWeeklyRecap triggers a Remotion Lambda render, which calls back via a signed webhook to FinalizeWeeklyRecap, which stores stills and updates the recap status.
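The idempotent 'existing recap' handling can be sketched with an in-memory store. This is an assumed design, not the app's code: RecapStore, getOrCreate, and weekStartUtc are hypothetical names, the real pipeline persists recaps in a database, and keying a recap by user plus the UTC Monday of its week is one plausible choice among several.

```typescript
type RecapStatus = "pending" | "rendering" | "ready";

interface WeeklyRecap {
  id: string;
  userId: string;
  weekStart: string; // Monday of the week as YYYY-MM-DD (UTC)
  status: RecapStatus;
}

// Maps any date to the Monday of its week, in UTC.
function weekStartUtc(date: Date): string {
  const daysSinceMonday = (date.getUTCDay() + 6) % 7;
  const monday = new Date(
    Date.UTC(
      date.getUTCFullYear(),
      date.getUTCMonth(),
      date.getUTCDate() - daysSinceMonday,
    ),
  );
  return monday.toISOString().slice(0, 10);
}

// In-memory stand-in for the recap collection.
class RecapStore {
  private recaps: Record<string, WeeklyRecap> = {};
  private nextId = 1;

  // Returns the existing recap for this user and week, or creates one.
  getOrCreate(userId: string, date: Date): { recap: WeeklyRecap; created: boolean } {
    const weekStart = weekStartUtc(date);
    const key = userId + ":" + weekStart;
    const existing = this.recaps[key];
    if (existing !== undefined) {
      return { recap: existing, created: false };
    }
    const recap: WeeklyRecap = {
      id: "recap-" + this.nextId++,
      userId,
      weekStart,
      status: "pending",
    };
    this.recaps[key] = recap;
    return { recap, created: true };
  }
}
```

With this shape, a retried or duplicated trigger for the same week returns the recap that already exists, so the narrative generation and render steps are not repeated. The concurrency limit itself is not modeled here; in Inngest it would be configured on the function rather than in application code.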

The RAG system for 'Bestie' chat indexes memories extracted from journal entries: each memory is split with RecursiveCharacterTextSplitter, embedded with OpenAI text-embedding-3-large, and stored in a Pinecone index with userId and memoryId metadata. During chat, the system retrieves the top-K most similar memories to provide contextual, personalized responses.
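The retrieval step can be illustrated with an in-memory sketch. It is not the Pinecone client: MemoryVector, cosineSimilarity, and topKMemories are hypothetical names, cosine similarity is an assumed metric, and the userId filter stands in for a metadata filter on the index query. Short toy vectors replace real embeddings.

```typescript
// Hypothetical record mirroring a stored vector with its metadata.
interface MemoryVector {
  memoryId: string;
  userId: string;
  values: number[];
  text: string;
}

interface ScoredMemory {
  memory: MemoryVector;
  score: number;
}

// Cosine similarity of two equal-length vectors; 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores only the given user's memories and returns the k best, highest first.
function topKMemories(
  query: number[],
  index: MemoryVector[],
  userId: string,
  k: number,
): ScoredMemory[] {
  return index
    .filter((m) => m.userId === userId)
    .map((m) => ({ memory: m, score: cosineSimilarity(query, m.values) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Filtering by userId before scoring is what keeps one user's memories out of another user's chat context; the text of the returned memories would then be placed in the chatbot prompt.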

04 Tech Stack Breakdown

Mobile

Expo ~52 (React Native 0.76)
Vision Camera
expo-audio
expo-sensors
RevenueCat (subscriptions)
AppsFlyer (attribution)
PostHog + Sentry

Native iOS

ActivityMonitor extension (DeviceActivity)
ShieldAction extension
ShieldConfiguration extension
App Group + UserDefaults bridge

AI & Media

OpenAI (memory extraction, structured output)
ElevenLabs (streaming TTS via WebSocket)
Pinecone + LangChain (memory RAG)
Remotion Lambda (video generation)

Backend

Next.js (API)
Inngest (orchestration)
MongoDB + Mongoose
GCS + AWS S3
