Ropedia Xperience-10M Task Baselines
This repo stores the minimal baseline weights, neural MLP task-head checkpoints, and metrics for the 12-task Xperience-10M episode suite, plus four lightweight direction-extension probes. It is a baseline-model artifact repo for research development, not a robot foundation model.
The source Xperience-10M sample spans video, audio, depth, pose, motion capture, inertial sensing, and language annotation. The committed minimal and neural task heads use the current 8,378-d feature manifest; audio is documented in the figures but is not yet extracted into a model input feature block.
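As a rough illustration of how a feature manifest with block boundaries might be consumed, the sketch below slices one flat feature vector into named modality blocks. The manifest layout used here (a `total_dim` field and a `blocks` list with `name`, `start`, `end`) and the toy sizes are assumptions made for illustration, not the schema of the committed feature_manifest.json; inspect that file before adapting this.

```python
from typing import Dict, List, Sequence

# Hypothetical manifest layout: half-open [start, end) index ranges per block.
# The real feature_manifest.json may use different keys and sizes.
toy_manifest = {
    "total_dim": 10,
    "blocks": [
        {"name": "video", "start": 0, "end": 4},
        {"name": "depth", "start": 4, "end": 7},
        {"name": "imu", "start": 7, "end": 10},
    ],
}


def check_blocks(manifest: dict) -> None:
    """Raise ValueError unless the blocks tile [0, total_dim) with no gap or overlap."""
    cursor = 0
    for block in sorted(manifest["blocks"], key=lambda b: b["start"]):
        if block["start"] != cursor or block["end"] <= block["start"]:
            raise ValueError(f"gap, overlap, or empty block at {block['name']!r}")
        cursor = block["end"]
    if cursor != manifest["total_dim"]:
        raise ValueError(f"blocks cover {cursor} dims, expected {manifest['total_dim']}")


def split_features(vector: Sequence[float], manifest: dict) -> Dict[str, List[float]]:
    """Slice one flat feature vector into its named blocks."""
    check_blocks(manifest)
    if len(vector) != manifest["total_dim"]:
        raise ValueError("feature vector length does not match the manifest")
    return {b["name"]: list(vector[b["start"]:b["end"]]) for b in manifest["blocks"]}


blocks = split_features(list(range(10)), toy_manifest)
print({name: len(values) for name, values in blocks.items()})
```

For the committed heads the total would be 8,378 dimensions, and no audio block would be expected because audio is not yet featurized.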
The tabbed research website, task-first 12-head map, responsive modality atlas,
interactive scrub/play storyboard, website HTML mirrors, brand_assets.json, and
scripts/build_brand_assets.py are included so this model repo stays aligned
with the public Space and artifact dataset.
Evidence Boundary
| Claim layer | Evidence | Boundary |
|---|---|---|
| Project status | PROJECT_STATUS.md, metrics/project_status.json | compact verified/data-gated/not-redistributed decision table |
| Baseline weights | artifacts/**/model.npz | lightweight heads only |
| Neural checkpoints | artifacts/episode_task_suite/neural_mlp/**/model.pt | same single-episode windows and splits |
| Metrics | artifacts/**/metrics.json, prediction CSV/NPZ files | debugging and task-contract evidence |
| Feature contract | artifacts/**/feature_manifest.json | audio documented but not featurized |
| Evaluation protocol | EVALUATION_PROTOCOL.md, metrics/evaluation_protocol.json | windowing, chronological split, leakage controls, and task metrics |
| Qwen3-Omni | companion blocker and access-status reports | readiness-only until 32 valid episodes are available |
| Source alignment | SOURCE_ALIGNMENT_AUDIT.md, metrics/source_alignment_audit.json, scripts/validate_source_alignment.py | validates full-dataset facts, sample-card facts, API-listing caveats, and public-card boundary markers |
| Task surface integrity | metrics/task_surface_integrity.json, scripts/validate_task_surface.py | task cards use human-readable research names, modality thumbnails, and the interactive storyboard data contract |
| Artifact index | metrics/artifact_index.json | compact catalog of project-critical supporting artifacts |
| Reproducibility | REPRODUCIBILITY.md, metrics/reproducibility_matrix.json | public commands, expected outputs, exact-match reproduction evidence, and non-reproducible boundaries |
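The evaluation protocol listed in the table above names windowing, a chronological split, and leakage controls. A minimal sketch of that idea follows; the split fractions and the `gap` parameter are assumptions chosen for illustration, and the repo's actual settings are the ones recorded in EVALUATION_PROTOCOL.md.

```python
from typing import Dict, List


def chronological_split(
    n_windows: int, train_frac: float = 0.7, val_frac: float = 0.15, gap: int = 0
) -> Dict[str, List[int]]:
    """Split time-ordered window indices into train, val, and test, in that order.

    `gap` drops that many windows after each boundary so that overlapping
    windows do not straddle two splits (one simple leakage control).
    """
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test split")
    if gap < 0:
        raise ValueError("gap must be non-negative")
    train_end = int(n_windows * train_frac)
    val_end = int(n_windows * (train_frac + val_frac))
    return {
        "train": list(range(0, train_end)),
        "val": list(range(min(train_end + gap, val_end), val_end)),
        "test": list(range(min(val_end + gap, n_windows), n_windows)),
    }


split = chronological_split(100, gap=2)
print({name: (idx[0], idx[-1]) for name, idx in split.items()})
```

Because every test window comes later in the episode than every training window, a split like this measures within-episode extrapolation in time, not generalization to unseen episodes.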
90-Second Research Project Path
| Step | Question | Primary artifacts |
|---|---|---|
| 1 | What has been implemented? | PROJECT_STATUS.md, metrics/project_status.json, EVIDENCE_CONTRACT.md, ARTIFACT_GUIDE.md, QUALITY_GATES.md, FIGURE_INDEX.md, metrics/artifact_index.json, metrics/figure_index.json, metrics/live_publication_status.json, metrics/quality_gates.json, metrics/mirror_parity.json, metrics/scope_claims_audit.json, metrics/publication_audit.json, metrics/task_surface_integrity.json, metrics/website_integrity.json, metrics/project_manifest.json |
| 2 | Are source facts consistently presented? | SOURCE_ALIGNMENT_AUDIT.md, metrics/source_alignment_audit.json, scripts/validate_source_alignment.py |
| 3 | How do I reproduce it? | REPRODUCIBILITY.md, metrics/reproducibility_matrix.json, companion GitHub notes/reproducibility_audit.md |
| 4 | What is one model input? | artifacts/episode_task_suite/feature_manifest.json, artifacts/episode_task_suite/available_modalities.json, companion artifact dataset windows.csv |
| 5 | Are the task results backed by files? | artifacts/episode_task_suite/summary_report.json, artifacts/episode_task_suite/neural_mlp/, metrics/summary_metrics.json |
| 6 | What is still pending? | companion GitHub results/omni_finetune/DATA_BLOCKER_REPORT.md and MULTI_EPISODE_ACCESS_STATUS.md |
Official Dataset Alignment
The model card mirrors the official-source alignment artifact at
metrics/xperience10m_dataset_card_alignment.json plus
XPERIENCE10M_DATASET_CARD_ALIGNMENT.md. That file records the official
ropedia-ai/xperience-10m card scope, manually gated access, full-scale
modalities, episode layout, intended uses, and the claims this small baseline
repo does not make. It also records the public sample card (cc-by-nc-4.0,
HOMIE Toolkit, Rerun 0.29.0 .rrd visualization) and the current HF API
listing snapshot: 803 session folders and 12,103 episode folders with
annotation.hdf5, plus the 31.9 TB file size displayed live on HF. The 31.9 TB
figure is tracked separately from the official card's statement of about 1 PB
of full-scale storage. These are upstream metadata facts, not evidence of
local downloads, raw-data redistribution, or model quality.
Qwen3-Omni LoRA Boundary
The companion GitHub repo includes scripts for Xperience-10M multi-episode access, staging, manifest building, and a Qwen3-Omni LoRA pilot path. The current LoRA checkpoint is a readiness artifact from one locally available episode and 128 train windows. It is not a full 32-episode result.
The next real model milestone is a 32-episode held-out-episode LoRA pilot after
access to ropedia-ai/xperience-10m is approved. The staging plan selects 32
complete episodes from 32 different top-level session UUIDs, then builds
held-out episode manifests for training and evaluation.
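The staging idea (one complete episode from each of 32 distinct sessions, then a split by whole episode) can be sketched as below. The function names, the `(session_uuid, episode_id)` tuple layout, the toy listing, and the choice of 8 evaluation episodes are all assumptions for illustration; the companion repo's staging scripts define the real procedure.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Episode = Tuple[str, str]  # (session_uuid, episode_id), both hypothetical names


def select_one_episode_per_session(episodes: List[Episode], n_sessions: int) -> List[Episode]:
    """Pick one episode from each of n_sessions distinct sessions.

    `episodes` is assumed to contain only complete episodes. Selection is
    deterministic: sessions and episodes are taken in sorted order.
    """
    by_session: Dict[str, List[str]] = defaultdict(list)
    for session_uuid, episode_id in episodes:
        by_session[session_uuid].append(episode_id)
    if len(by_session) < n_sessions:
        raise ValueError(f"need {n_sessions} sessions, found {len(by_session)}")
    chosen = sorted(by_session)[:n_sessions]
    return [(session, min(by_session[session])) for session in chosen]


def held_out_split(selected: List[Episode], n_eval: int, seed: int = 0) -> Dict[str, List[Episode]]:
    """Shuffle whole episodes and hold n_eval of them out for evaluation."""
    if not 0 < n_eval < len(selected):
        raise ValueError("n_eval must leave at least one training episode")
    shuffled = list(selected)
    random.Random(seed).shuffle(shuffled)
    return {"train": shuffled[n_eval:], "eval": shuffled[:n_eval]}


# Toy listing: 40 made-up sessions with 3 episodes each.
listing = [(f"session-{s:02d}", f"ep-{e}") for s in range(40) for e in range(3)]
selected = select_one_episode_per_session(listing, n_sessions=32)
manifest = held_out_split(selected, n_eval=8)
print(len(manifest["train"]), len(manifest["eval"]))
```

Splitting by episode, with each episode drawn from a different session, keeps every window of a held-out episode out of training, which is what the single-episode chronological split cannot provide.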
Minimal and Neural Architecture
The committed heads are intentionally small:
- z-score + linear softmax classifiers
- dual ridge regression/projection heads
- sigmoid multi-label logistic regression
- cosine ranking for retrieval tasks
- z-score + PyTorch MLP heads for all 12 human-readable task cards
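To make two of the head types above concrete, here is a small pure-Python sketch of z-score normalization followed by a linear softmax classifier, and of cosine ranking for retrieval. All weights, statistics, and vectors are toy values invented for the example; nothing here is read from the committed model.npz files, and the stored parameter names may differ.

```python
import math
from typing import List, Sequence


def zscore(x: Sequence[float], mean: Sequence[float], std: Sequence[float],
           eps: float = 1e-8) -> List[float]:
    """Standardize each feature with precomputed training statistics."""
    return [(v - m) / (s + eps) for v, m, s in zip(x, mean, std)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax (subtract the max before exponentiating)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear_softmax_predict(x, mean, std, weights, bias) -> List[float]:
    """Class probabilities from a linear layer; `weights` has one row per class."""
    z = zscore(x, mean, std)
    logits = [sum(w * v for w, v in zip(row, z)) + b for row, b in zip(weights, bias)]
    return softmax(logits)


def cosine_rank(query: Sequence[float], candidates: Sequence[Sequence[float]]) -> List[int]:
    """Candidate indices sorted from most to least cosine-similar to the query."""
    def cosine(a, b):
        dot = sum(p * q for p, q in zip(a, b))
        norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
        return dot / norm if norm else 0.0
    scores = [cosine(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


probs = linear_softmax_predict(
    x=[2.0, 0.0], mean=[1.0, 1.0], std=[1.0, 1.0],
    weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0],
)
ranking = cosine_rank([1.0, 0.0], [[0.0, 1.0], [2.0, 0.1], [-1.0, 0.0]])
print(probs, ranking)
```

The multi-label head would replace the softmax with an independent sigmoid per label, and the PyTorch MLP heads add hidden layers between the normalization and the output; neither is shown here.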
Metrics Snapshot
These are single-episode chronological-split metrics. They are useful for debugging task definitions and input contracts, not for claiming cross-episode generalization.
| Task and metric | Neural MLP | Minimal baseline |
|---|---|---|
| Action Recognition macro-F1 | 0.0263 | 0.0500 |
| Procedure Step Recognition macro-F1 | 0.0175 | 0.0495 |
| Action Boundary Detection macro-F1 | 0.6485 | 0.6552 |
| Next-Action Prediction macro-F1 | 0.0235 | 0.0593 |
| Hand Trajectory Forecasting MPJPE, lower is better | 0.1116 | 0.8223 |
| Contact State Prediction macro-F1 | 1.0000 | 1.0000 |
| Object Relevance Prediction micro-F1 | 0.1798 | 0.1839 |
| Language Grounding MRR | 0.0178 | 0.0172 |
| Cross-Modal Retrieval MRR | 0.1530 | 0.2634 |
| Cross-Modal Reconstruction R2 | -0.0102 | -0.0160 |
| Temporal Order Verification F1 | 0.8718 | 0.5487 |
| Multimodal Synchronization Detection F1 | 0.7335 | 0.4866 |
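For readers checking the table against prediction files, the sketch below shows one common way to compute two of the reported metric types, mean reciprocal rank and macro-F1. It is an illustrative reimplementation on toy inputs, under the assumption that the standard definitions apply; the repo's own scripts are the reference for how the committed numbers were produced.

```python
from typing import Hashable, List, Sequence


def mean_reciprocal_rank(ranked_lists: Sequence[Sequence[Hashable]],
                         targets: Sequence[Hashable]) -> float:
    """Average of 1/rank of the correct item; a missing target contributes 0."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (list(ranked).index(target) + 1)
    return total / len(targets)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores: List[float] = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


mrr = mean_reciprocal_rank([[3, 1, 2], [0, 2, 1]], targets=[1, 0])
f1 = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
print(round(mrr, 4), round(f1, 4))
```

Because macro-F1 weights every class equally, tasks with many rarely seen classes can score close to zero even when frequent classes are predicted reasonably well.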
Included
- artifacts/**/model.npz: minimal baseline weights, scalers, and labels
- artifacts/episode_task_suite/neural_mlp/**/model.pt: neural MLP task-head checkpoints
- artifacts/episode_task_suite/neural_mlp/**/history.json: neural training traces
- artifacts/**/metrics.json: committed metrics
- artifacts/**/feature_manifest.json: feature block boundaries where relevant
- assets/: mirrored figures, modality thumbnails, and brand assets
- metrics/: mirrored project status, protocol, source-alignment, publication, and scope-claim JSON files
- scripts/: reproduction, visualization, and validation scripts
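The glob patterns above can be walked directly with the standard library. The sketch below gathers every metrics.json under an artifacts/ tree into one dictionary; the helper name `collect_metrics` is hypothetical, and the demo directory, task name, and metric value are made up so the example runs without a clone of the repo.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def collect_metrics(root: Path) -> Dict[str, dict]:
    """Map each metrics.json path (relative to root) to its parsed JSON content."""
    found: Dict[str, dict] = {}
    for path in sorted(root.glob("artifacts/**/metrics.json")):
        with path.open(encoding="utf-8") as handle:
            found[path.relative_to(root).as_posix()] = json.load(handle)
    return found


# Demo on a throwaway directory; the task name and value are invented.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    task_dir = root / "artifacts" / "demo_task"
    task_dir.mkdir(parents=True)
    (task_dir / "metrics.json").write_text(json.dumps({"macro_f1": 0.5}), encoding="utf-8")
    metrics = collect_metrics(root)

print(metrics)
```

Pointing `root` at a local checkout of this repo would list the committed metrics files; their internal keys are not assumed here, which is why the helper returns the parsed JSON untouched.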
Data Notice
This repo does not redistribute raw Xperience-10M videos or raw
annotation.hdf5. Download the original sample from Ropedia / Hugging Face and
follow the dataset terms:
- https://huggingface.co/datasets/ropedia-ai/xperience-10m-sample
- https://huggingface.co/datasets/ropedia-ai/xperience-10m
- https://ropedia.com/dataset
Links
| Resource | URL |
|---|---|
| Hugging Face Space | https://huggingface.co/spaces/cy0307/ropedia-xperience-10m-task-suite |
| Live Hugging Face app | https://cy0307-ropedia-xperience-10m-task-suite.static.hf.space/ |
| Artifact dataset | https://huggingface.co/datasets/cy0307/ropedia-xperience-10m-task-suite-artifacts |
| GitHub repo | https://github.com/ChaoYue0307/ropedia-xperience-10m-task-suite |
| GitHub Pages dashboard | https://chaoyue0307.github.io/ropedia-xperience-10m-task-suite/ |
| Xperience-10M website | https://ropedia.com/dataset |
| Xperience-10M release page | https://ropedia.com/blog/20260316_xperience_10m |
| Ropedia GitHub organization | https://github.com/Ropedia |
| HOMIE Toolkit | https://github.com/Ropedia/HOMIE-toolkit |
Evaluation results
- top-5 retrieval accuracy on Xperience-10M public sample episode (self-reported): 0.376
- mean reciprocal rank on Xperience-10M public sample episode (self-reported): 0.263
- macro-F1 on Xperience-10M public sample episode (self-reported): 0.655
- neural MLP F1 on Xperience-10M public sample episode (self-reported): 0.872


