Ropedia Xperience-10M Task Baselines

This repo stores the minimal baseline weights, neural MLP task-head checkpoints, and metrics for the 12-task Xperience-10M episode suite, plus four lightweight direction-extension probes. It is a baseline-model artifact repo for research development, not a robot foundation model.

[Figure: 12-task suite with sample modalities]

The source Xperience-10M sample spans video, audio, depth, pose, motion capture, inertial sensing, and language annotation. The committed minimal and neural task heads use the current 8,378-dimensional feature manifest. Audio is documented in the figures but is not yet extracted into a model input feature block.

The tabbed research website, task-first 12-head map, responsive modality atlas, interactive scrub/play storyboard, website HTML mirrors, brand_assets.json, and scripts/build_brand_assets.py are included so this model repo stays aligned with the public Space and artifact dataset.

Evidence Boundary

| Claim layer | Evidence | Boundary |
| --- | --- | --- |
| Project status | PROJECT_STATUS.md, metrics/project_status.json | compact verified/data-gated/not-redistributed decision table |
| Baseline weights | artifacts/**/model.npz | lightweight heads only |
| Neural checkpoints | artifacts/episode_task_suite/neural_mlp/**/model.pt | same single-episode windows and splits |
| Metrics | artifacts/**/metrics.json, prediction CSV/NPZ files | debugging and task-contract evidence |
| Feature contract | artifacts/**/feature_manifest.json | audio documented but not featurized |
| Evaluation protocol | EVALUATION_PROTOCOL.md, metrics/evaluation_protocol.json | windowing, chronological split, leakage controls, and task metrics |
| Qwen3-Omni companion | blocker and access-status reports | readiness-only until 32 valid episodes are available |
| Source alignment | SOURCE_ALIGNMENT_AUDIT.md, metrics/source_alignment_audit.json, scripts/validate_source_alignment.py | validates full-dataset facts, sample-card facts, API-listing caveats, and public-card boundary markers |
| Task surface integrity | metrics/task_surface_integrity.json, scripts/validate_task_surface.py | task cards use human-readable research names, modality thumbnails, and the interactive storyboard data contract |
| Artifact index | metrics/artifact_index.json | compact catalog of project-critical supporting artifacts |
| Reproducibility | REPRODUCIBILITY.md, metrics/reproducibility_matrix.json | public commands, expected outputs, exact-match reproduction evidence, and non-reproducible boundaries |
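
The evaluation-protocol entry above mentions windowing, a chronological split, and leakage controls. As a rough illustration of what that can look like, the sketch below orders windows by start time, trains on the earliest ones, and drops a few windows at the boundary so overlapping windows cannot land on both sides. The function name, `train_frac`, and `gap` are assumptions for illustration; the values actually used are defined in EVALUATION_PROTOCOL.md, not here.

```python
import numpy as np

def chronological_split(start_times, train_frac=0.7, gap=0):
    """Split window indices by time: earliest windows train, latest test.

    train_frac and gap are illustrative assumptions, not the repo's values.
    gap drops that many windows right after the training block so that
    overlapping windows do not appear on both sides of the split.
    """
    order = np.argsort(np.asarray(start_times), kind="stable")
    n_train = int(len(order) * train_frac)
    train_idx = order[:n_train]
    test_idx = order[n_train + gap:]
    return train_idx, test_idx

# Ten hypothetical windows starting every 0.5 s.
starts = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
train_idx, test_idx = chronological_split(starts, train_frac=0.5, gap=1)
```

With these toy inputs the first five windows go to training, the sixth is discarded as a buffer, and the remaining four form the test set.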

90-Second Research Project Path

| Step | Question | Primary artifacts |
| --- | --- | --- |
| 1 | What has been implemented? | PROJECT_STATUS.md, metrics/project_status.json, EVIDENCE_CONTRACT.md, ARTIFACT_GUIDE.md, QUALITY_GATES.md, FIGURE_INDEX.md, metrics/artifact_index.json, metrics/figure_index.json, metrics/live_publication_status.json, metrics/quality_gates.json, metrics/mirror_parity.json, metrics/scope_claims_audit.json, metrics/publication_audit.json, metrics/task_surface_integrity.json, metrics/website_integrity.json, metrics/project_manifest.json |
| 2 | Are source facts consistently presented? | SOURCE_ALIGNMENT_AUDIT.md, metrics/source_alignment_audit.json, scripts/validate_source_alignment.py |
| 3 | How do I reproduce it? | REPRODUCIBILITY.md, metrics/reproducibility_matrix.json, companion GitHub notes/reproducibility_audit.md |
| 4 | What is one model input? | artifacts/episode_task_suite/feature_manifest.json, artifacts/episode_task_suite/available_modalities.json, companion artifact dataset windows.csv |
| 5 | Are the task results backed by files? | artifacts/episode_task_suite/summary_report.json, artifacts/episode_task_suite/neural_mlp/, metrics/summary_metrics.json |
| 6 | What is still pending? | companion GitHub results/omni_finetune/DATA_BLOCKER_REPORT.md and MULTI_EPISODE_ACCESS_STATUS.md |

Official Dataset Alignment

The model card mirrors the official-source alignment artifact at metrics/xperience10m_dataset_card_alignment.json plus XPERIENCE10M_DATASET_CARD_ALIGNMENT.md. The artifact records the scope of the official ropedia-ai/xperience-10m card, its manually gated access, full-scale modalities, episode layout, and intended uses, along with the claims this small baseline repo does not make. It also records the public sample card (cc-by-nc-4.0, HOMIE Toolkit, Rerun 0.29.0 .rrd visualization) and the current HF API listing snapshot: 803 session folders and 12,103 episode folders with annotation.hdf5, plus the 31.9 TB file size shown live on HF. The 31.9 TB figure is tracked separately from the official card's statement of about 1 PB of full-scale storage. These are upstream metadata facts; they are not local downloads, raw-data redistribution, or model-quality evidence.

Qwen3-Omni LoRA Boundary

The companion GitHub repo includes scripts for Xperience-10M multi-episode access, staging, manifest building, and a Qwen3-Omni LoRA pilot path. The current LoRA checkpoint is a readiness artifact trained on 128 windows from the one locally available episode. It is not a full 32-episode result.

The next real model milestone is a 32-episode held-out-episode LoRA pilot after access to ropedia-ai/xperience-10m is approved. The staging plan selects 32 complete episodes from 32 different top-level session UUIDs, then builds held-out episode manifests for training and evaluation.
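
The staging idea described above (one complete episode from each of 32 distinct sessions, then a held-out-episode split) can be sketched roughly as follows. This is not the companion repo's staging script: the `(session_id, episode_id)` input format, the function names, and the `n_eval=8` hold-out size are all assumptions made for illustration.

```python
import random

def select_episodes(episodes, n_sessions=32, seed=0):
    """Pick one episode from each of n_sessions distinct session IDs.

    `episodes` is a list of (session_id, episode_id) pairs; this input
    format is an assumption for illustration.
    """
    by_session = {}
    for session_id, episode_id in episodes:
        by_session.setdefault(session_id, []).append(episode_id)
    if len(by_session) < n_sessions:
        raise ValueError("not enough distinct sessions")
    rng = random.Random(seed)
    sessions = rng.sample(sorted(by_session), n_sessions)
    return [(s, rng.choice(sorted(by_session[s]))) for s in sessions]

def held_out_split(selected, n_eval=8):
    """Reserve the last n_eval selected episodes for evaluation only."""
    if not 0 < n_eval < len(selected):
        raise ValueError("n_eval must be between 1 and len(selected) - 1")
    return selected[:-n_eval], selected[-n_eval:]

# Hypothetical listing: 40 sessions with 3 episodes each.
listing = [(f"session-{i:02d}", f"ep-{j}") for i in range(40) for j in range(3)]
selected = select_episodes(listing)
train_eps, eval_eps = held_out_split(selected)
```

Because every selected episode comes from a different session, splitting by episode here also keeps sessions disjoint between training and evaluation.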

Minimal and Neural Architecture

[Figure: Minimal 12-task architecture]

The committed heads are intentionally small:

  • z-score + linear softmax classifiers
  • dual ridge regression/projection heads
  • sigmoid multi-label logistic regression
  • cosine ranking for retrieval tasks
  • z-score + PyTorch MLP heads for all 12 human-readable task cards
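
To make two of the head types listed above concrete, here is a minimal NumPy sketch of a z-score plus linear softmax forward pass and of cosine ranking for retrieval. It is a generic illustration under assumed shapes and names (`W`, `b`, `zscore_fit`, and so on), not the code or the weight layout stored in the committed checkpoints.

```python
import numpy as np

def zscore_fit(X, eps=1e-8):
    """Per-feature mean and standard deviation from training features."""
    return X.mean(axis=0), X.std(axis=0) + eps

def softmax_predict(X, mean, std, W, b):
    """Standardize features, apply a linear layer, return class probabilities."""
    Z = (X - mean) / std
    logits = Z @ W + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def cosine_rank(queries, candidates, eps=1e-8):
    """For each query, return candidate indices sorted by cosine similarity."""
    q = queries / (np.linalg.norm(queries, axis=1, keepdims=True) + eps)
    c = candidates / (np.linalg.norm(candidates, axis=1, keepdims=True) + eps)
    return np.argsort(-(q @ c.T), axis=1)

# Toy data: 6 windows, 4 features, 3 classes (all values are random).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
W, b = rng.normal(size=(4, 3)), np.zeros(3)
mean, std = zscore_fit(X)
probs = softmax_predict(X, mean, std, W, b)
ranking = cosine_rank(X, X)
```

Ranking a set of vectors against itself should place each vector first in its own list, which is a quick sanity check for the retrieval path.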

Metrics Snapshot

These are single-episode chronological-split metrics. They are useful for debugging task definitions and input contracts, not for claiming cross-episode generalization.

| Task | Metric | Neural MLP | Minimal |
| --- | --- | --- | --- |
| Action Recognition | macro-F1 | 0.0263 | 0.0500 |
| Procedure Step Recognition | macro-F1 | 0.0175 | 0.0495 |
| Action Boundary Detection | macro-F1 | 0.6485 | 0.6552 |
| Next-Action Prediction | macro-F1 | 0.0235 | 0.0593 |
| Hand Trajectory Forecasting | MPJPE, lower is better | 0.1116 | 0.8223 |
| Contact State Prediction | macro-F1 | 1.0000 | 1.0000 |
| Object Relevance Prediction | micro-F1 | 0.1798 | 0.1839 |
| Language Grounding | MRR | 0.0178 | 0.0172 |
| Cross-Modal Retrieval | MRR | 0.1530 | 0.2634 |
| Cross-Modal Reconstruction | R2 | -0.0102 | -0.0160 |
| Temporal Order Verification | F1 | 0.8718 | 0.5487 |
| Multimodal Synchronization Detection | F1 | 0.7335 | 0.4866 |
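
For readers less familiar with the metric names above, the following sketch shows one common way to compute macro-F1 and mean reciprocal rank (MRR) with NumPy. It is an illustrative implementation, not the repo's evaluation script; in particular, averaging macro-F1 only over the labels present in `y_true` is an assumption, and other tools may average over a different label set.

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the labels present in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

def mean_reciprocal_rank(ranked, targets):
    """ranked: (n_queries, n_candidates) candidate indices, best first."""
    ranked, targets = np.asarray(ranked), np.asarray(targets)
    hits = ranked == targets[:, None]
    positions = hits.argmax(axis=1) + 1      # 1-based rank of the first hit
    found = hits.any(axis=1)                 # queries whose target is ranked at all
    return float(np.mean(np.where(found, 1.0 / positions, 0.0)))

# Toy inputs: class 0 has F1 = 2/3, class 1 has F1 = 0.8.
f1 = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
# Targets are found at ranks 2 and 1, so MRR = (1/2 + 1) / 2.
mrr = mean_reciprocal_rank([[2, 0, 1], [1, 2, 0]], [0, 1])
```

Macro averaging weights every class equally, so a head that ignores rare classes scores low even when overall accuracy looks reasonable.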

Included

  • artifacts/**/model.npz: minimal baseline weights, scalers, and labels
  • artifacts/episode_task_suite/neural_mlp/**/model.pt: neural MLP task-head checkpoints
  • artifacts/episode_task_suite/neural_mlp/**/history.json: neural training traces
  • artifacts/**/metrics.json: committed metrics
  • artifacts/**/feature_manifest.json: feature block boundaries where relevant
  • assets/: mirrored figures, modality thumbnails, and brand assets
  • metrics/: mirrored project status, protocol, source-alignment, publication, and scope-claim JSON files
  • scripts/: reproduction, visualization, and validation scripts
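
A quick way to see what a model.npz file contains, without assuming anything about its key names, is to list every stored array with its shape and dtype. The helper below is a small sketch using `numpy.load`; the demo archive and its `weights` and `labels` keys are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile

import numpy as np

def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array in an .npz file.

    allow_pickle=False is the safe default; it raises ValueError if the
    archive holds object arrays, in which case inspect the file with care.
    """
    with np.load(path, allow_pickle=False) as archive:
        return {
            name: (archive[name].shape, str(archive[name].dtype))
            for name in archive.files
        }

# Build a throwaway archive with hypothetical keys, then describe it.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "model.npz")
    np.savez(demo_path, weights=np.zeros((4, 3)), labels=np.array(["a", "b", "c"]))
    summary = describe_npz(demo_path)
```

Pointing `describe_npz` at one of the committed artifacts/**/model.npz files should reveal the actual array names, which can then be cross-checked against the matching feature_manifest.json.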

Data Notice

This repo does not redistribute raw Xperience-10M videos or raw annotation.hdf5. Download the original sample from Ropedia / Hugging Face and follow the dataset terms.

Links

Evaluation results

  • top-5 retrieval accuracy on Xperience-10M public sample episode: 0.376 (self-reported)
  • mean reciprocal rank on Xperience-10M public sample episode: 0.263 (self-reported)
  • macro-F1 on Xperience-10M public sample episode: 0.655 (self-reported)
  • neural MLP F1 on Xperience-10M public sample episode: 0.872 (self-reported)