AfriSignEncoder Exp1 — CASL Baseline (LandmarkTransformer)
Part of the AfriSignEncoder research project: a multilingual African sign language recognition benchmark. This checkpoint is the Experiment 1 single-language baseline for Central African Sign Language (CASL).
Model Description
A ViT-style transformer encoder that treats a clip of T=64 MediaPipe Holistic landmark frames as a token sequence, one token per frame.
| Component | Value |
|---|---|
| Architecture | LandmarkTransformer (custom) |
| Input | (B, 64, 225) float32 — 75 keypoints × 3 coords per frame |
| Embedding dim | 256 |
| Attention heads | 8 |
| Encoder layers | 4 |
| Feed-forward dim | 1,024 |
| Positional encoding | Learned |
| Classification head | Linear 256 → 60 |
| Parameters | ~3.25 M |
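The ~3.25 M figure can be sanity-checked from the table with a little arithmetic. The sketch below assumes a layout the card does not spell out: a biased linear projection from 225 to 256 features, a learned 64 × 256 positional table with no extra class token, encoder layers shaped like PyTorch's `nn.TransformerEncoderLayer` (fused QKV projection, two LayerNorms), and no final LayerNorm. Under those assumptions the total lands close to the reported number; the helper names are illustrative.

```python
# Rough parameter count for the configuration in the table above.
# ASSUMPTION: layers follow the nn.TransformerEncoderLayer layout;
# the actual LandmarkTransformer may differ slightly.

def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out

def layer_norm(dim: int) -> int:
    """Scale plus shift vectors."""
    return 2 * dim

def encoder_layer(d_model: int, d_ff: int) -> int:
    """Self-attention (fused QKV + output projection), FFN, two LayerNorms."""
    attention = linear(d_model, 3 * d_model) + linear(d_model, d_model)
    feed_forward = linear(d_model, d_ff) + linear(d_ff, d_model)
    return attention + feed_forward + 2 * layer_norm(d_model)

def count_parameters(n_features=225, seq_len=64, d_model=256,
                     d_ff=1024, n_layers=4, n_classes=60) -> int:
    input_projection = linear(n_features, d_model)
    positional_table = seq_len * d_model      # learned, one row per frame
    encoder = n_layers * encoder_layer(d_model, d_ff)
    head = linear(d_model, n_classes)
    return input_projection + positional_table + encoder + head

total = count_parameters()
print(f"{total:,} parameters (~{total / 1e6:.2f} M)")
# 3,248,700 parameters (~3.25 M)
```

The number of attention heads does not appear in the count because splitting the 256-dimensional projection into 8 heads does not change the weight shapes.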
Dataset
CASL-W60 — 60 word-level Central African Sign Language glosses.
| Split | Samples |
|---|---|
| Train | 3,667 |
| Test | 2,222 |
Source: the Kaggle dataset mwakalucky/casl-w60, converted to parquet at luciayen/CASL-W60-Landmarks.
Landmark format: MediaPipe Holistic (pose 33 + left hand 21 + right hand 21) × xyz = 225D per frame.
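To make the 225-dimensional layout concrete, the sketch below computes where each landmark group would sit in a flattened frame. It assumes the groups are concatenated in the order listed (pose, left hand, right hand) with x, y, z stored consecutively per keypoint; the card does not state the ordering, so check it against the parquet files before relying on these offsets. `PARTS` and `part_slices` are illustrative names.

```python
# Offsets of each landmark group inside one flattened 225-D frame.
# ASSUMPTION: groups are stored in the order pose, left hand, right hand,
# and each keypoint contributes (x, y, z) consecutively.

COORDS = 3  # x, y, z
PARTS = [("pose", 33), ("left_hand", 21), ("right_hand", 21)]

def part_slices(parts=PARTS, coords=COORDS) -> dict:
    """Map each group name to the slice it occupies in a flattened frame."""
    slices, start = {}, 0
    for name, n_keypoints in parts:
        stop = start + n_keypoints * coords
        slices[name] = slice(start, stop)
        start = stop
    return slices

slices = part_slices()
frame_dim = sum(n for _, n in PARTS) * COORDS

frame = [0.0] * frame_dim                  # placeholder for one real frame
left_hand = frame[slices["left_hand"]]     # 21 keypoints × 3 coords
print(frame_dim, slices["left_hand"], len(left_hand))
```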
Training
| Setting | Value |
|---|---|
| Optimiser | AdamW (lr=3e-4, wd=1e-4) |
| LR schedule | OneCycleLR cosine |
| Max epochs | 60 |
| Batch size | 64 |
| Loss | Cross-entropy with label_smoothing=0.1 |
| Early stopping | patience=12 on validation accuracy |
| Normalisation | Per-feature z-score (stats stored in checkpoint) |
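As an illustration of the loss setting, the sketch below spells out label-smoothed cross-entropy for a single example in plain Python. It follows the convention used by PyTorch's `CrossEntropyLoss(label_smoothing=...)`, in which the smoothing mass is spread uniformly over all classes including the true one; that the project uses exactly this PyTorch loss is an assumption based on the parameter name, and the function name is illustrative.

```python
import math

def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a label-smoothed target for one example.

    The target distribution puts (1 - smoothing) on the true class and
    spreads `smoothing` uniformly over all classes (true class included).
    """
    n_classes = len(logits)
    # Numerically stable log-softmax.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    log_probs = [z - log_norm for z in logits]

    uniform = smoothing / n_classes
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        weight = uniform + (1.0 - smoothing if k == target else 0.0)
        loss -= weight * log_p
    return loss

# With 60 classes the true class keeps 0.9 + 0.1/60 of the target mass.
n_classes = 60
true_class_weight = 1 - 0.1 + 0.1 / n_classes
uniform_logits = [0.0] * n_classes
print(round(true_class_weight, 4))
print(smoothed_cross_entropy(uniform_logits, target=3))
```

With all-equal logits the loss equals ln(60), the entropy of a uniform guess, whatever the smoothing value; smoothing only changes how strongly confident predictions are rewarded.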
Results
| Metric | Value |
|---|---|
| Best validation accuracy | 71.92% |
| Best checkpoint epoch | 35 |
| Final epoch (early stop) | 47 |
| Random-chance baseline | 1.67% |
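The epoch numbers above are consistent with the patience setting: if the best validation accuracy occurs at epoch 35 and no later epoch beats it, a patience of 12 ends training at epoch 47. The sketch below shows one common way such a rule is implemented and replays it on a made-up accuracy curve. The `EarlyStopping` class and the curve are hypothetical, not the project's training code or logs.

```python
class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 12):
        self.patience = patience
        self.best_value = float("-inf")
        self.best_epoch = None
        self.stale_epochs = 0

    def step(self, epoch: int, value: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if value > self.best_value:
            self.best_value, self.best_epoch = value, epoch
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience

# Synthetic curve (NOT real logs): rises until epoch 35, then plateaus lower.
def fake_val_acc(epoch: int) -> float:
    return 0.02 * epoch if epoch <= 35 else 0.65

stopper = EarlyStopping(patience=12)
for epoch in range(1, 61):              # max epochs = 60
    if stopper.step(epoch, fake_val_acc(epoch)):
        break

print(stopper.best_epoch, epoch)        # 35 47
chance = 1 / 60                         # uniform guessing over 60 glosses
print(f"{chance:.2%}")                  # 1.67%
```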
Checkpoint Contents
```python
import torch

ck = torch.load("pytorch_model.bin", map_location="cpu")
# Keys: epoch, val_acc, model (state_dict), l2i (label→index dict),
# mean (tensor 225,), std (tensor 225,)
```
The l2i dict maps 60 CASL gloss strings to integer class indices.
mean and std are the per-feature normalisation statistics computed on the training set.
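For inference, inputs need the same z-score normalisation, and the predicted index has to be mapped back to a gloss. The sketch below shows both steps in plain Python on toy data so that it runs without the checkpoint; with real tensors the same arithmetic applies elementwise. The helper names, the `eps` guard against zero variance, and the three-gloss `l2i` are illustrative assumptions, not values from the checkpoint.

```python
def normalise_frame(frame, mean, std, eps=1e-6):
    """Per-feature z-score: (x - mean) / std, guarding against std == 0."""
    return [(x - m) / max(s, eps) for x, m, s in zip(frame, mean, std)]

def decode_prediction(logits, l2i):
    """Return the gloss whose class index has the highest logit."""
    i2l = {index: label for label, index in l2i.items()}  # invert label→index
    best_index = max(range(len(logits)), key=logits.__getitem__)
    return i2l[best_index]

# Toy stand-ins; the real checkpoint has 225 features and 60 glosses.
mean, std = [0.5, 0.5, 0.0], [0.25, 0.5, 1.0]
l2i = {"HELLO": 0, "THANKS": 1, "WATER": 2}   # hypothetical glosses

print(normalise_frame([1.0, 0.0, 2.0], mean, std))   # [2.0, -1.0, 2.0]
print(decode_prediction([0.1, 2.3, -1.0], l2i))      # THANKS
```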
Limitations
- The model uses landmarks only (no RGB appearance), so the reported accuracy is best read as a lower bound for CASL recognition.
- The train/test split is taken from the original dataset; signer independence has not been verified.
- The 60 glosses cover only a small fraction of the full CASL vocabulary.
Citation / Project
AfriSignEncoder research project, CMU, 2026. GitHub: africansl_encoder.
Evaluation results
- Validation accuracy on CASL-W60-Landmarks (self-reported): 0.719