BLUE RHINO AI
Avatar Training Videos • uploadvideos.bluerhinoai.com

Record & Upload Your Training Videos

🧠 The #1 thing to know: Your AI avatar is trained on your voice tone, inflections, and body movements — not your words. It doesn't matter what you say in these videos. Talk about your weekend, your favorite restaurant, how your day is going — literally anything. The AI only cares about how you sound and how you move, not the words coming out of your mouth.
🎬 Goal: Record up to 5 short videos at different framings so we can build rotating AI twins that look and sound like you. Different camera distances give the avatar different "looks" so your content never feels repetitive. More videos = more variety, but you can start with as few as 2.
🎤 Why vocal energy matters: Even though the words don't matter, your energy and tone absolutely do. If you record all 5 videos in the same flat, monotone voice, your avatar will only have one gear. Mix it up — be excited in one, calm in another, fired up in the next. Give the AI a full range to work with.
📡 Tip: You can upload from your phone or computer. Make sure you're on Wi-Fi — these are large video files and cellular uploads can be slow or fail.

🎥 Watch This First

If the video doesn't load, refresh the page and try again.

🎯 The 5 Framings at a Glance

#  Framing                      What's in frame               Energy
1  Knees-Up                     Knees to head, NO feet        Excited & passionate
2  Knees-Up (different outfit)  Same as #1, different shirt   Calm & friendly
3  Waist-Up                     Head to waist, hands visible  Warm & empathetic
4  Waist-Up (different outfit)  Same as #3, different shirt   Direct & authoritative
5  Chest-Up                     Head to mid-chest             Confident & upbeat

Different framings = different avatar "looks." Different vocal energy = a natural-sounding avatar that doesn't feel robotic. Talk about anything you want — the AI only learns from how you sound and move.

Recording Guide

  • Use your phone's normal camera app (not TikTok, Instagram, etc.)
  • Set video quality to 4K if available. On iPhone: Settings → Camera → Record Video → "4K at 30 fps". On Android: camera app → video settings → "UHD 4K" or "3840x2160"
  • Have someone else hold the camera: a friend, family member, or staff member. They should hold it with both hands and keep their elbows tucked against their body for stability
  • Phone held in landscape (sideways), at your eye level
  • Face a window or use soft directional light — avoid overhead-only lighting
  • The person filming should tap your face on screen to focus — if possible, lock focus and exposure (on iPhone: long-press your face until you see "AE/AF Lock")
  • Record in a quiet room — no music, no background conversation. Your phone's built-in mic is fine if the room is quiet
  • Videos 1 & 2 — Knees-Up: Stand 6–8 feet from the camera. Frame yourself from your knees to above your head. Do NOT include your feet in the frame. This is the most important framing rule.
  • Videos 3 & 4 — Waist-Up: Stand 4–5 feet from the camera. Frame from your waist to above your head. Hands may be visible — keep them still or use small natural gestures.
  • Video 5 — Chest-Up: Stand 3–4 feet from the camera. Frame from mid-chest to above your head.

Between videos: Just step forward or back to change the framing. For Videos 2 and 4, change your shirt to a different solid color before recording.

  • Wear a solid-color shirt — no patterns, stripes, logos, or shiny fabric
  • Wear one shirt for Videos 1, 3, and 5
  • Change to a different solid-color shirt for Videos 2 and 4
  • Good colors: blue, teal, green, burgundy, light gray. Avoid pure white, and don't wear all black for every video
  • Each video should be 2–3 minutes long, recorded as one continuous take with no cuts or edits
  • Look directly into the camera lens the entire time, not at the screen or around the room
  • Talk about anything: your weekend plans, your favorite food, a funny story. The AI doesn't care about the words. It's learning your voice, your tone, and your movements
  • Don't read or recite. Just talk naturally, like you're having a real conversation
  • Keep hand movement minimal. Small natural gestures are fine, but avoid big arm movements

The words are irrelevant, but the energy is everything. Each video should sound noticeably different. Here's a guide:

  • Video 1 — Excited & passionate: Bring real energy. Talk like you're telling a friend an exciting story. Lean in, raise your voice, show some fire.
  • Video 2 — Calm & friendly: Relax. Talk like you're catching up with a neighbor. Easygoing, approachable, warm.
  • Video 3 — Warm & empathetic: Slow it down a bit. Talk like you're comforting someone who's having a rough day. Genuine warmth.
  • Video 4 — Direct & authoritative: Be straightforward and confident. Talk like you're the expert correcting bad information. Clear, strong, no hesitation.
  • Video 5 — Confident & upbeat: Be positive and welcoming. Talk like you're greeting someone you're happy to see. Upbeat and genuine.

The key: If all 5 videos sound the same, your avatar will only have one vocal gear. Give it range — that's what makes it sound human.

  • Don't record handheld by yourself — have someone else film you
  • Don't include your feet in the Knees-Up videos
  • Don't use filters
  • Don't sit or stand with a bright window behind you
  • Don't wear patterns, stripes, or logos
  • Don't move closer or farther during a take (only between videos)
  • Don't speak in a flat, monotone voice for every video — mix up the energy

Upload Your Videos

Select your clips below. Each one matches a specific framing. Uploading all 5 is ideal, but you can upload as few as 2 to get started. Max 2 GB per file. Best on Wi-Fi. Don't close this tab during upload.
