RobotRave

Make a Tony Pro humanoid robot dance autonomously to music at a robot rave.

Quick Setup

git clone https://github.com/YOUR_USERNAME/RobotRave.git
cd RobotRave
./setup.sh

The setup script will:

  1. Clone the MINT repository (Google's FACT model implementation)
  2. Download pre-trained model checkpoints (~500MB) from Google Drive
  3. Optionally install Python dependencies

Note: The model weights are not stored in this repo due to size. The setup script uses gdown to fetch them automatically.

The Approach

We use Google's FACT model (Full-Attention Cross-modal Transformer) from their AI Choreographer research to generate human dance movements from music. The model was trained on the AIST++ dataset containing 1,408 dance sequences across 10 genres.

Music Audio → FACT Model → SMPL Human Motion → Retarget → Tony Pro Servos → Robot Dances!

Why FACT?

We evaluated three approaches:

  • MDLT (Music to Dance Language Translation) - No pre-trained weights, requires ~140 hours training (paper)
  • DFKI Robot Dance - Physically feasible trajectories but code not publicly available
  • FACT (Google) - Pre-trained checkpoints available, state-of-the-art quality (paper)

FACT was the clear winner: Google already spent months training it on professional dancers, and we can just download and use it.

Fallback: Beat-Sync Dancing

If FACT inference is too slow or fails, we fall back to rule-based beat-synchronized dancing: aubio detects beats and onsets in real time, and each detected beat triggers one of our pre-defined dance moves.
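
The real loop lives in beat_sync_dance.py; below is only a minimal sketch of the idea, assuming standard aubio calls and the TonyProController API shown under Quick Reference (move names come from the pre-defined moves):

import random
import aubio
from tony_pro import TonyProController

HOP = 512
src = aubio.source("your_song.wav", 0, HOP)            # decoded audio stream
beat = aubio.tempo("default", 2 * HOP, HOP, src.samplerate)

controller = TonyProController(simulate=True)          # print moves, no motion
moves = ["arms_up", "celebrate"]                       # pre-defined moves

while True:
    samples, read = src()
    if beat(samples):                                  # nonzero on each beat
        controller.execute_move(random.choice(moves))
    if read < HOP:                                     # end of file
        break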


How It Works (Plain English)

What is the FACT Model?

FACT is a neural network that learned to dance by watching 1,408 videos of professional dancers. Google researchers trained it to understand the relationship between music and movement - things like:

  • "When there's a drum hit, raise your arms"
  • "During a bass drop, crouch down"
  • "When the melody rises, extend outward"

You feed it a song, and it outputs a sequence of body poses - one for every frame (60 per second). It's like having a professional choreographer that works instantly.

┌─────────────────────────────────────────────────────────┐
│  INPUT: Music (tempo, beats, melody, energy)            │
│         + 2-second "seed" motion to start from          │
│                          ↓                              │
│  ┌─────────────────────────────────────────────────┐    │
│  │         FACT Neural Network                      │    │
│  │                                                  │    │
│  │   Learned from watching 5.2 hours of dance      │    │
│  │   videos across 10 genres (pop, hip-hop,        │    │
│  │   breakdance, house, krump, etc.)               │    │
│  └─────────────────────────────────────────────────┘    │
│                          ↓                              │
│  OUTPUT: Body pose every 1/60th of a second             │
│          (72 numbers describing joint angles)           │
└─────────────────────────────────────────────────────────┘
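
Purely to pin down those shapes (the real inference code is in generate_dance.py), here is a tiny numpy sketch with placeholder data:

import numpy as np

FPS = 60                          # FACT emits one pose per 1/60 s
seed = np.zeros((2 * FPS, 72))    # the 2-second seed motion: 120 frames

# Stand-in for model output on a 30-second song:
dance = np.zeros((30 * FPS, 72))  # (frames, 72): 24 joints x 3 rotations

pose = dance[5 * FPS]             # the pose exactly 5 seconds in
assert pose.shape == (72,)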

What is AIST++?

AIST++ is a massive dance dataset that Google created:

  • 1,408 dance sequences performed by 30 professional dancers
  • 10 dance genres: break, pop, lock, waack, middle hip-hop, LA-style hip-hop, house, krump, street jazz, ballet jazz
  • 5.2 hours of motion capture data (10.1 million frames)
  • Every dance was recorded with 9 cameras and precise motion tracking

This is what FACT learned from. The quality of the output depends on the quality of the training data, and AIST++ contains genuinely skilled dancers.

What is SMPL?

SMPL (Skinned Multi-Person Linear Model) is a standard way to describe human body poses mathematically. Instead of describing movement in words ("raise your right arm"), SMPL uses numbers.

         head (joint 15)
           │
         neck (joint 12)
           │
    ┌──────┴──────┐
 L shoulder    R shoulder
 (joint 16)    (joint 17)
    │              │
 L elbow       R elbow
 (joint 18)    (joint 19)
    │              │
 L wrist       R wrist
 (joint 20)    (joint 21)
           │
        spine
           │
    ┌──────┴──────┐
  L hip        R hip
 (joint 1)    (joint 2)
    │              │
  L knee       R knee
 (joint 4)    (joint 5)
    │              │
 L ankle       R ankle
 (joint 7)    (joint 8)

Each of the 24 joints has 3 rotation values (x, y, z axes). So 72 numbers completely describe any human pose. FACT outputs these 72 numbers for every frame of the dance.
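
To pull out a single joint, reshape the 72-vector into a 24x3 array; a minimal numpy sketch, using the joint indices from the diagram above:

import numpy as np

pose = np.zeros(72)               # one SMPL pose (placeholder values)
joints = pose.reshape(24, 3)      # row i = joint i's rotation about x, y, z

R_SHOULDER = 17                   # SMPL joint index (see diagram)
rx, ry, rz = joints[R_SHOULDER]   # axis-angle components, in radians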

How We Translate 24 Joints → 16 Servos

The challenge: SMPL describes a human body with 24 joints, but Tony Pro only has 16 servos. They don't match up perfectly.

What we keep:

SMPL Joint            Tony Pro Servo                       Why
head                  head_pitch (8), head_yaw (16)        Direct match
r_shoulder            right_shoulder_pitch (7)             Take the forward/back rotation
r_elbow               right_elbow_pitch (5)                Take the bend angle
l_shoulder            left_shoulder_pitch (15)             Take the forward/back rotation
l_elbow               left_elbow_pitch (13)                Take the bend angle
r_hip                 right_hip_roll/yaw/pitch (1, 2, 4)   3 DOF match
r_knee                right_knee_pitch (3)                 Direct match
r_ankle               right_ankle_pitch (6)                Direct match
(same for left leg)   ...                                  ...

What we lose:

  • Shoulder roll (Tony Pro arms only move forward/back, not sideways)
  • Wrist rotation (no wrist servos)
  • Spine/torso twist (no torso servo)
  • Finger movements (no hand servos)

The conversion process:

# SMPL gives us: right_shoulder rotation = [0.5, -0.3, 0.2] radians (x, y, z)
# Tony Pro only has shoulder pitch (forward/back motion)

# 1. Extract the relevant axis (x = pitch = forward/back)
angle = smpl_right_shoulder[0]  # 0.5 radians

# 2. Convert radians to servo pulse (0-1000 scale, 500 = center)
pulse = 500 + (angle * 191)  # ~191 pulse units per radian

# 3. Clamp to safe range (don't break the servo!)
#    (Python has no clamp() builtin, so use min/max)
pulse = max(200, min(800, pulse))

# 4. Send to robot
servo.set_position(7, int(pulse))  # Servo 7 = right shoulder pitch

Safety scaling:

  • Arms and head: 100% of SMPL motion (safe to move freely)
  • Legs: 30% of SMPL motion (robot can fall over if legs move too much)
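
A sketch of how that damping might be applied per body group; the GROUP_SCALE table below is an assumption for illustration (the real limits live in SERVO_LIMITS in tony_pro.py):

# Illustrative scaling table; actual values live in tony_pro.py.
GROUP_SCALE = {
    "head": 1.0,        # full SMPL motion
    "right_arm": 1.0,
    "left_arm": 1.0,
    "right_leg": 0.3,   # damped so the robot keeps its balance
    "left_leg": 0.3,
}

def scaled_pulse(angle_rad, group):
    """Joint angle (radians) -> servo pulse, damped per body group."""
    pulse = 500 + angle_rad * GROUP_SCALE[group] * 191  # ~191 pulses/radian
    return int(max(200, min(800, pulse)))               # clamp to safe range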

The result isn't a perfect recreation of the human dance, but it captures the essence - when the dancer's arms go up, the robot's arms go up. When they bob their head, the robot bobs its head.

For detailed output format documentation, see MODEL_OUTPUT.md.


Tony Pro Servo Mapping

The correct 16-servo mapping, as determined with the discover_servos.py tool:

ID   Joint                  Group
1    right_hip_roll         right_leg
2    right_hip_yaw          right_leg
3    right_knee_pitch       right_leg
4    right_hip_pitch        right_leg
5    right_elbow_pitch      right_arm
6    right_ankle_pitch      right_leg
7    right_shoulder_pitch   right_arm
8    head_pitch             head
9    left_hip_roll          left_leg
10   left_hip_yaw           left_leg
11   left_knee_pitch        left_leg
12   left_hip_pitch         left_leg
13   left_elbow_pitch       left_arm
14   left_ankle_pitch       left_leg
15   left_shoulder_pitch    left_arm
16   head_yaw               head

Summary:

  • Head: 2 DOF (pitch + yaw)
  • Each arm: 2 DOF (shoulder pitch + elbow pitch)
  • Each leg: 5 DOF (hip roll/yaw/pitch + knee + ankle)
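
For scripting, the same grouping works as a Python dict; this literal is illustrative, with IDs copied from the table above (the SERVO namespace in tony_pro.py, shown under Quick Reference, is the canonical source):

SERVO_GROUPS = {
    "head":      [8, 16],              # pitch, yaw
    "right_arm": [7, 5],               # shoulder pitch, elbow pitch
    "left_arm":  [15, 13],             # shoulder pitch, elbow pitch
    "right_leg": [1, 2, 4, 3, 6],      # hip roll/yaw/pitch, knee, ankle
    "left_leg":  [9, 10, 12, 11, 14],  # hip roll/yaw/pitch, knee, ankle
}

# 2 + 2 + 2 + 5 + 5 = 16 servos in total
assert sum(len(ids) for ids in SERVO_GROUPS.values()) == 16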

Project Structure

RobotRave/
├── tony_pro.py              # Central servo config & controller module
├── tony_pro_config.json     # Servo mapping in JSON format
├── beat_sync_dance.py       # Fallback: beat-triggered dance moves
├── smart_dance.py           # Style-adaptive dancing (chill/pop/edm/hiphop)
├── play_dance.py            # Play pre-generated dance sequences
├── retarget_to_tonypi.py    # SMPL → servo command conversion
├── generate_dance.py        # Run FACT model on audio files
├── fact_server.py           # Cloud GPU server for FACT inference
├── fact_client.py           # Robot client for cloud inference
├── discover_servos.py       # Interactive servo discovery tool
├── setup.sh                 # Setup script (downloads model weights)
├── RESEARCH.md              # Detailed research notes & approach analysis
├── CLOUD_SETUP.md           # GPU cloud deployment guide
├── MODEL_OUTPUT.md          # FACT model output format documentation
└── (submodules - not tracked)
    ├── mint/                # Google FACT model
    ├── TonyPi/              # Hiwonder SDK
    └── MDLT/                # Alternative approach (unused)

Saturday Checklist

1. Initial Connection (~15 min)

# Connect to robot's WiFi network: HW-67B69FFB

# SSH into the robot's Raspberry Pi
ssh pi@192.168.149.1
# Password: raspberrypi

# Copy project files to robot
scp -r *.py tony_pro_config.json pi@192.168.149.1:~/RobotRave/

2. Verify Servo Mapping (~30 min)

# On the robot - test the servo map is correct
python tony_pro.py --map

# If servos aren't right, run discovery
python discover_servos.py

3. Test Dance Moves (~30 min)

# Simulation mode (prints moves, no robot motion)
python beat_sync_dance.py --test --simulate

# Real robot - test all moves
python beat_sync_dance.py --test

4. Beat-Sync Dancing (Fallback)

# Dance to an audio file
python beat_sync_dance.py --audio your_song.wav

# Live microphone input
python beat_sync_dance.py --live

5. Smart Style-Adaptive Dancing

# Detects music genre and adapts style
python smart_dance.py --live

# Or with a file
python smart_dance.py --audio song.wav

# Demo all styles
python smart_dance.py --demo --simulate

6. FACT Model Dancing (If Cloud Server Ready)

# On cloud GPU server
python fact_server.py --port 8765

# On robot
python fact_client.py --server ws://<server-ip>:8765

7. Pre-Generated Dances

# Generate dance from audio (requires FACT)
python generate_dance.py --audio song.mp3 --output dance.json

# Play on robot
python play_dance.py --input dance.json

Quick Reference

Python Imports

from tony_pro import SERVO, MOVES, TonyProController

# Access servo IDs
SERVO.HEAD_PITCH  # 8
SERVO.R_SHOULDER  # 7
SERVO.L_ELBOW     # 13

# Use pre-defined moves
controller = TonyProController(simulate=False)
controller.execute_move('arms_up')
controller.execute_move('celebrate')
controller.go_neutral()

Safety Notes

  • Leg servos are limited to 30% range to prevent falls
  • Start with --simulate flag to test without robot motion
  • Use --neutral-first to reset pose before playing sequences
  • Head movements are limited to prevent strain
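
For example, combining those flags (assuming play_dance.py is the script that accepts --neutral-first):

# Reset to a neutral pose, then replay a generated sequence
python play_dance.py --input dance.json --neutral-first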

Cloud GPU Setup

For real-time FACT inference, see CLOUD_SETUP.md.

Recommended: AWS g5.xlarge (~$1/hr) with your credits.


Dependencies

On Robot (Raspberry Pi):

pip install numpy aubio sounddevice soundfile
# HiwonderSDK is pre-installed

On Development Machine:

pip install numpy aubio sounddevice soundfile scipy

For FACT Model (Cloud GPU):

pip install tensorflow numpy scipy websockets soundfile

Troubleshooting

Problem                       Solution
Servo moves wrong direction   Flip sign in tony_pro.py scale factor
Movement too jerky            Increase time_ms parameter
Robot falls over              Reduce leg servo ranges in SERVO_LIMITS
No beat detection             Check microphone with --simulate flag
Can't connect to robot        Verify IP, try ping first

Built for a 24-hour robot rave hackathon.
