Platform

The full stack behind every SoulBox.

Cloud personality AI, voice, avatars, firmware, and fleet ops — built as one integrated platform. Use it as a service or partner with us to ship your own.

Personality AI Cloud

A managed AI runtime for bots with their own voice, mood, and memory. Streaming chat, tool-calling, and persistent personalities.

  • Multi-provider model routing with automatic failover
  • Persistent bot memory + mood
  • Streaming token/audio responses
  • Per-tenant isolation and quotas
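
As a rough illustration of multi-provider routing with automatic failover, the sketch below tries providers in priority order and falls through to the next one on error. It is a minimal sketch under stated assumptions, not SoulBox's actual router: `route_with_failover`, `AllProvidersFailed`, and the two stand-in providers are hypothetical names invented for this example.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has failed."""


def route_with_failover(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return the first successful reply."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real model providers.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("upstream timed out")


def backup_provider(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = route_with_failover(
    "hello",
    [("primary", flaky_provider), ("backup", backup_provider)],
)
print(used, reply)
```

Here the primary provider times out, so the call is answered by the backup. A production router would also need timeouts, retries, and per-tenant quota checks, which this sketch leaves out.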

Voice Stack

Voice cloning, streaming speech-to-text, and a managed TTS fleet, with the same voices on the web, on phones, and on firmware devices.

  • Voice cloning from uploaded samples
  • Streaming speech-to-text recognition
  • Multi-provider TTS fleet with failover
  • Avatar-driven talking-head playback

Firmware Runtime

A SoulBox runtime for ESP32-S3 boards. On-device wake words, low-latency audio, and OTA fleet updates from the cloud.

  • Custom microWakeWord (Modal-trained)
  • Streaming I²S audio in/out
  • OTA updates with staged rollout
  • Drop-in for AiPi Lite, DFR1221, custom boards
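
One common way to stage an OTA rollout is to hash each device ID into a stable bucket from 0 to 99 and update only the devices whose bucket falls below the current rollout percentage. The sketch below assumes that approach. The page does not say how SoulBox selects devices, and `rollout_bucket`, `should_update`, and the device and release strings are hypothetical.

```python
import hashlib


def rollout_bucket(device_id: str, release: str) -> int:
    """Map a device to a stable bucket in [0, 100) for a given release."""
    digest = hashlib.sha256(f"{release}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def should_update(device_id: str, release: str, rollout_percent: int) -> bool:
    """True if this device is inside the current rollout stage."""
    return rollout_bucket(device_id, release) < rollout_percent


# Hypothetical fleet: widening the stage only ever adds devices.
fleet = [f"soulbox-{n:04d}" for n in range(1000)]
stage_10 = {d for d in fleet if should_update(d, "fw-1.2.0", 10)}
stage_50 = {d for d in fleet if should_update(d, "fw-1.2.0", 50)}
print(len(stage_10), len(stage_50), stage_10 <= stage_50)
```

Because the bucket depends only on the device ID and the release name, a device that received the update at 10% is still included at 50%, and a different release reshuffles which devices go first.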

Avatar & Image Fleet

Generate avatars, talking-head video, and AI imagery on managed GPU fleets. Per-bot avatars, on-demand portraits, and image gen.

  • Talking-head video (SadTalker / EchoMimic)
  • Avatar pack management
  • Image generation fleet (Z-Image, custom)
  • Backend-routed asset storage in MinIO/S3

Fleet Operations

A control plane for multi-tenant device, voice, image, and inference fleets. Observe latency, scale instances, and manage providers.

  • TTS / STT / image / inference fleets
  • AWS EC2 + Modal scale-to-zero workers
  • Per-fleet metrics + health
  • Encrypted provider API keys

Auth & Billing

OAuth + SSO, multi-tenant accounts, Stripe billing, and roles. Everything you need to ship a real multi-tenant product.

  • Google + Apple SSO, email login
  • Multi-tenant servers, roles, invites
  • Stripe plans + subscriptions
  • reCAPTCHA + rate limiting

[Screenshots: External web device view · Usage accounting dashboard · Bot creation wizard · Voice library]

End to end

From wake word to spoken reply.

Every layer of the SoulBox platform is tuned for sub-second voice round trips — cloud, codec, and firmware moving as one.

  1. Wake

    The device wakes on a custom on-device wake word, with no server round trip required.

  2. Listen

    Audio streams to the SoulBox cloud and is transcribed live with faster-whisper.

  3. Think

    The transcript is routed to the right model, which replies with the bot's persistent personality, mood, and memory.

  4. Speak

    Streaming voice playback with avatar lip-sync returns to your device in under a second.
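
The four steps above can be pictured as a chain of streaming stages, each consuming the previous stage's output as it arrives. The sketch below is a toy model of that shape only. `listen`, `think`, `speak`, and `round_trip` are hypothetical stand-ins, and no real speech recognition, model call, or TTS happens here.

```python
from typing import Iterable, Iterator


def listen(audio_chunks: Iterable[bytes]) -> Iterator[str]:
    """Stand-in for streaming STT: pretend each chunk decodes to one word."""
    for chunk in audio_chunks:
        yield chunk.decode("utf-8")


def think(words: Iterable[str], persona: str) -> Iterator[str]:
    """Stand-in for the model: stream a reply token by token."""
    transcript = " ".join(words)
    reply = f"[{persona}] you said: {transcript}"
    for token in reply.split(" "):
        yield token


def speak(tokens: Iterable[str]) -> Iterator[bytes]:
    """Stand-in for streaming TTS: emit one 'audio frame' per token."""
    for token in tokens:
        yield token.encode("utf-8")


def round_trip(audio_chunks: Iterable[bytes], persona: str, wake_detected: bool) -> list[bytes]:
    """Run listen -> think -> speak only after the on-device wake step fires."""
    if not wake_detected:
        return []
    return list(speak(think(listen(audio_chunks), persona)))


frames = round_trip([b"turn", b"on", b"lights"], "sunny", wake_detected=True)
print(b" ".join(frames).decode("utf-8"))
```

Because each stage is a generator, later stages can start work before earlier ones finish, which is the general property that streaming designs rely on to keep round-trip latency low.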

Ship on the SoulBox platform.

Bring your own hardware, your own voices, or your own brand. We'll handle the cloud, the firmware, and the fleet underneath.