This project uses computer vision and machine learning to detect and classify sign language gestures in real time. The system leverages MediaPipe for hand landmark detection and scikit-learn for classification.
This application recognizes hand gestures corresponding to the 26 letters of the American Sign Language (ASL) alphabet using a webcam. The system extracts hand landmark features and employs a Random Forest classifier to predict the corresponding letter.
- 🎥 Real-time hand gesture recognition
- 🔠 Support for all 26 ASL alphabet letters
- 📈 99.6% classification accuracy
- 👀 Visual feedback with landmark visualization
- 🖥️ Easy-to-use interface
- Python 3.8 or higher
- Webcam
- Required Python packages (listed in `requirements.txt`)
1. Download the zip file and change into the project directory:

   ```bash
   cd sign-language-detector-python
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
```
sign-language-detection/
├── collect_imgs.py            # Collect hand gesture images
├── create_dataset.py          # Process images and extract features
├── train_classifier.py        # Train the Random Forest model
├── inference_classifier.py    # Real-time sign language detection
├── requirements.txt           # Required dependencies
├── data/                      # Collected image data
│   ├── 0/                     # Images for a specific gesture
│   ├── 1/
│   ├── 2/
│   └── ...
└── models/                    # Trained models
    └── random_forest_model.pkl  # Trained classifier
```
If you want to collect your own dataset (skip if using pre-existing data):

```bash
python collect_imgs.py
```

Follow the on-screen prompts to capture images for each gesture. Press `q` to stop collecting images for a gesture.
Process the collected images and extract hand landmark features:

```bash
python create_dataset.py
```

This will generate a structured dataset for training.
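At its core, the dataset step turns each detected hand into a flat feature row. Below is a minimal illustrative sketch of that flattening, assuming landmarks arrive as 21 `(x, y)` pairs (the function name and dummy data are hypothetical, not the actual `create_dataset.py` code):

```python
# Illustrative sketch: flatten 21 MediaPipe-style (x, y) hand landmarks
# into a single 42-value feature row suitable for a classifier.

def landmarks_to_features(landmarks):
    """Flatten a list of 21 (x, y) landmark tuples into 42 features."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    features = []
    for x, y in landmarks:
        features.extend([x, y])
    return features

# Example with dummy landmark coordinates:
dummy = [(i * 0.01, i * 0.02) for i in range(21)]
row = landmarks_to_features(dummy)
print(len(row))  # 42
```

Each image therefore contributes one row of 42 numbers plus a label for its gesture class.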
Train the Random Forest classifier:

```bash
python train_classifier.py
```

The trained model will be saved in the `models/` directory.
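The training step boils down to fitting a scikit-learn `RandomForestClassifier` on the 42-feature rows and pickling the result. Here is a hedged sketch of that flow using synthetic data in place of the real landmark dataset; the hyperparameters and file name are illustrative, not taken from `train_classifier.py`:

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset: 200 samples x 42 features,
# with labels for the 26 ASL letters.
rng = np.random.default_rng(0)
X = rng.random((200, 42))
y = rng.integers(0, 26, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.2f}")

# Persist the trained model with pickle, as the project does.
with open("random_forest_model.pkl", "wb") as f:
    pickle.dump(model, f)
```

On random data the accuracy is near chance; the real landmark features are what make the reported accuracy achievable.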
Run real-time sign language detection:

```bash
python inference_classifier.py
```

Make hand gestures in front of your webcam and see the detected letters displayed in real time.
- Hand Detection: MediaPipe extracts 21 key landmarks from each hand.
- Feature Extraction: Each landmark provides x and y coordinates, yielding 42 features per gesture.
- Model: Random Forest classifier with optimized hyperparameters.
- Preprocessing: Landmark coordinates are normalized for position-invariant features.
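A common way to make landmark features position-invariant is to subtract a reference landmark (the wrist) and scale by the hand's extent. The sketch below illustrates that idea; it is an assumption about the preprocessing, not necessarily the exact normalization used in this project:

```python
import numpy as np

def normalize_landmarks(coords):
    """Translate landmarks so the first (wrist) point is the origin,
    then scale by the largest absolute offset so values fit in [-1, 1].

    coords: array-like of shape (21, 2) with raw (x, y) positions.
    Returns a flat array of 42 position-invariant features.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords[0]   # wrist-relative positions
    scale = np.abs(centered).max()  # hand extent
    if scale > 0:
        centered /= scale
    return centered.flatten()

# Example: the same hand shape at two screen positions yields
# identical features after normalization.
raw = np.array([[0.5 + 0.01 * i, 0.4 + 0.02 * i] for i in range(21)])
features = normalize_landmarks(raw)
print(features.shape)  # (42,)
```

Because translation is removed and scale is divided out, the classifier sees the same feature vector wherever the hand appears in the frame.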
- ✅ 99.6% accuracy on the test dataset
- 🚀 Real-time inference at ~30 FPS (depends on hardware)
- 🖐️ Robust to variations in hand size and positioning
All dependencies are listed in `requirements.txt`:

```
opencv-python==4.7.0.68
mediapipe==0.9.0.1
scikit-learn==1.2.0
numpy>=1.20.0
matplotlib>=3.5.0
```
| Issue | Solution |
|---|---|
| ❌ No webcam detected | Ensure your webcam is connected and not used by another app. |
| 🤏 Poor recognition accuracy | Adjust lighting and ensure your hand is fully visible. |
| 📂 Model not found | Run `train_classifier.py` before inference. |
- Hand landmark detection powered by MediaPipe
- Classification powered by scikit-learn