Phantom is the full AI persona engineering pipeline. Face consistency, voice cloning, long-term memory, and orchestration — all running while you sleep. Four weeks instead of six months.
The character sheet. Backstory, voice rules, forbidden topics, emotional triggers. Written once, read before every reply. Never breaks character.
Photoreal face consistency. Lock 6–8 descriptors. Fine-tune across three lighting setups — bedroom, bathroom, kitchen 2am. Seed ranges locked per setup.
Ninety seconds of clean audio from a Fiverr voice actress. Clone in ten minutes. Voice notes at 11pm, yawn endings, audible breaths. Feels real because it is.
JSON memory per subscriber. Hooks: read .md files → reply → extract new facts → append to brain.md. Runs every 30 seconds. Fifty messages of consistent relationship context.
Lock 6–8 descriptors: eye color, jaw shape, skin tone, distinguishing marks. Fine-tune a LoRA on 47 variations across bedroom, bathroom, and kitchen setups. Seed ranges fixed per lighting condition so the face never drifts.
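A minimal sketch of what "seed ranges fixed per lighting condition" could look like in practice. The setup names, the numeric ranges, and the `pick_seed` helper are illustrative assumptions, not values from Phantom; real ranges would come out of your own tuning runs.

```python
import random

# Hypothetical seed ranges, one per lighting setup (assumed values).
SEED_RANGES = {
    "bedroom": (1000, 1999),
    "bathroom": (2000, 2999),
    "kitchen_2am": (3000, 3999),
}


def pick_seed(setup: str, rng: random.Random) -> int:
    """Return a seed that stays inside the range locked for this setup."""
    low, high = SEED_RANGES[setup]
    return rng.randint(low, high)  # randint is inclusive on both ends


if __name__ == "__main__":
    rng = random.Random()
    for setup in SEED_RANGES:
        print(setup, pick_seed(setup, rng))
```

Keeping the ranges in one table means a new lighting setup is a one-line change, and every generation call draws from a range that was already checked by eye.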
Ninety seconds of audio. ElevenLabs Instant Voice Cloning. Rules: no numbers as digits, audible breath before long sentences, yawn endings on 30% of voice notes, random giggles on tip confirmations. The voice has a body.
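The voice rules above could be applied as a text pre-processing pass before the script goes to the voice model. This is a sketch under assumptions: the `[breath]` and `[yawn]` markers are placeholders rather than ElevenLabs syntax, the 12-word threshold for a "long sentence" is a guess, and only standalone numbers from 0 to 999 are spelled out.

```python
import re

BREATH = "[breath]"  # placeholder marker, not a real TTS tag
YAWN = "[yawn]"      # placeholder marker, not a real TTS tag

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")


def spell_out_numbers(text: str) -> str:
    """Replace standalone 1-3 digit numbers with words; longer ones are left alone."""
    return re.sub(r"\b\d{1,3}\b", lambda m: number_to_words(int(m.group())), text)


def prepare_voice_script(text, rng, long_sentence_words=12, yawn_rate=0.3):
    """Apply the rules: no digits, a breath before long sentences, yawn on ~30%.

    `rng` is any object with a random() method, e.g. random.Random().
    """
    text = spell_out_numbers(text)
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    marked = [
        f"{BREATH} {s}" if len(s.split()) > long_sentence_words else s
        for s in sentences
    ]
    script = " ".join(marked)
    if rng.random() < yawn_rate:
        script = f"{script} {YAWN}"
    return script
```

Passing the random source in as an argument keeps the 30% rule testable: a seeded `random.Random` gives repeatable runs.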
JSON per subscriber: lifetime spend, tip triggers, facts to remember. The orchestrator reads persona.md + voice.md + brain.md before every reply. New facts extracted and appended after each message. Never the same conversation twice.
Poll inbox → read all four .md files → reply → extract new biographical facts, emotional state, objects of desire → append to brain.md under user_id → sleep → repeat. Runs on cron while the creator does not.
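The loop above can be sketched as a single pass that a scheduler calls repeatedly. Everything here is an assumption made for illustration: the `## user_id` section layout inside brain.md, the `append_facts` and `run_once` names, and the two injected callables (`generate_reply`, `extract_facts`) that stand in for the model calls. Inbox polling and the sleep between passes are left to the caller.

```python
from pathlib import Path


def append_facts(brain_path: Path, user_id: str, facts: list[str]) -> None:
    """Append facts as bullet lines under this user's '## user_id' section."""
    lines = brain_path.read_text().splitlines() if brain_path.exists() else []
    header = f"## {user_id}"
    if header not in lines:
        # New subscriber: open a section at the end of the file.
        lines.append(header)
        insert_at = len(lines)
    else:
        # Existing subscriber: insert just before the next section (or EOF).
        start = lines.index(header) + 1
        insert_at = next(
            (i for i in range(start, len(lines)) if lines[i].startswith("## ")),
            len(lines),
        )
    lines[insert_at:insert_at] = [f"- {fact}" for fact in facts]
    brain_path.write_text("\n".join(lines) + "\n")


def run_once(messages, context_paths, brain_path, generate_reply, extract_facts):
    """One pass: read context files, reply, extract facts, append to memory.

    messages       -- iterable of (user_id, text) pairs already pulled from the inbox
    context_paths  -- the .md files to read before every reply
    generate_reply -- callable(context, user_id, text) -> str   (stand-in for the model)
    extract_facts  -- callable(text) -> list[str]               (stand-in for the model)
    """
    replies = []
    for user_id, text in messages:
        # Re-read the files for every message so new facts are picked up.
        context = "\n\n".join(p.read_text() for p in context_paths if p.exists())
        replies.append((user_id, generate_reply(context, user_id, text)))
        facts = extract_facts(text)
        if facts:
            append_facts(brain_path, user_id, facts)
    return replies
```

Because each pass re-reads the files, anything appended for one message is visible to the next one without extra state.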
Voice notes drop at 11pm her time. Top fan in Berlin gets them at 6am. Scheduling logic built into the orchestrator. No human intervention required for consistent, timed content delivery.
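The 11pm-to-6am example implies a seven-hour gap between the persona's clock and Berlin. A small sketch of that conversion, assuming the persona lives in `America/Chicago` (the copy does not name her time zone, so this is a guess that happens to match the numbers) and using a hypothetical `fan_local_time` helper:

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo

# Assumption: the persona's home time zone. Not stated in the source.
PERSONA_TZ = ZoneInfo("America/Chicago")


def fan_local_time(drop_date: date, fan_tz: str, hour: int = 23) -> datetime:
    """Return when a drop at `hour` persona-time lands on the fan's clock."""
    drop = datetime(drop_date.year, drop_date.month, drop_date.day, hour,
                    tzinfo=PERSONA_TZ)
    return drop.astimezone(ZoneInfo(fan_tz))


if __name__ == "__main__":
    landed = fan_local_time(date(2025, 1, 15), "Europe/Berlin")
    print(landed.strftime("%Y-%m-%d %H:%M %Z"))
```

Using named zones rather than a fixed offset matters here: the US and EU change clocks on different dates, so for a few weeks each year the same 11pm drop would reach Berlin at 5am instead of 6am.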
Lowercase only. No periods. Max three emojis per message. "fr" max three times per day. Never "lol"; use "lmaooo" instead. Commas in place of question marks. The voice is the brand.
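The mechanical parts of these rules can be checked before a message goes out. A rough sketch, with assumptions stated up front: the `style_violations` and `fr_budget_left` names are made up for this example, the emoji pattern covers only the most common Unicode blocks, and the punctuation rule for questions is not checked.

```python
import re

# Approximate emoji matcher (assumption: common pictograph blocks only).
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def style_violations(message: str, max_emojis: int = 3) -> list[str]:
    """Return the list of style rules this message breaks (empty if clean)."""
    problems = []
    if message != message.lower():
        problems.append("uppercase")
    if "." in message:
        problems.append("period")
    if len(EMOJI.findall(message)) > max_emojis:
        problems.append("too many emojis")
    if re.search(r"\blol\b", message, re.IGNORECASE):
        problems.append("lol")
    return problems


def fr_budget_left(sent_today: list[str], cap: int = 3) -> int:
    """How many more times 'fr' may be used today, given messages already sent."""
    used = sum(len(re.findall(r"\bfr\b", m)) for m in sent_today)
    return max(0, cap - used)
```

A message that fails the check can be regenerated or fixed up, and the daily "fr" count only needs the list of messages already sent that day.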
"Aitana took eighteen months. Emily took six months.
Maya took four weeks.
The whole stack now fits on one MacBook."
The models caught up. Stable Diffusion for photorealism. ElevenLabs for voice. Claude's long context window for memory and orchestration. What used to require a team and eighteen months now requires one person and four weeks.
The moat is no longer access to the tools. It's knowing how to tune them — the seed ranges that lock a face, the voice timing rules that make it feel alive, the JSON structure that makes the memory actually work at scale.
Phantom gives you that stack. You bring the persona.
The orchestrator never stops. The voice never drifts. The face never changes. The memory compounds with every message. Build it once. Watch it work forever.