Amazing work!!

May 30

Hi, thanks for sharing your ForceVLA fork and Hugging Face resources.

I am trying to understand whether your checkpoint tshiamor/forcevla-sfp-sc-data24 can be used as a working ForceVLA inference/checkpoint-loading reference.

I have installed the official ft-robotic/ForceVLA repo and compared it with your fork. Your fork was very useful because it seems to fix the sample_actions() path by using prefix_out_fix instead of prefix_out, and by padding the LIMoE input sequence before calling self.limoe(...).
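To make sure we mean the same thing by "padding", here is a minimal NumPy sketch of how I understand that step. It is not the code from your fork: the function name `pad_tokens`, the (batch, seq, dim) layout, and the zero-fill with a boolean mask are all my assumptions.

```python
import numpy as np

def pad_tokens(tokens, target_len):
    """Right-pad a (batch, seq, dim) array with zeros up to target_len tokens.

    Returns the padded array and a boolean mask that is True for real tokens.
    Hypothetical helper, only meant to illustrate the idea of padding.
    """
    batch, seq_len, dim = tokens.shape
    if seq_len > target_len:
        raise ValueError(f"sequence length {seq_len} exceeds target {target_len}")
    # Pad only the token axis; batch and feature axes are left untouched.
    padded = np.pad(tokens, ((0, 0), (0, target_len - seq_len), (0, 0)))
    mask = np.zeros((batch, target_len), dtype=bool)
    mask[:, :seq_len] = True
    return padded, mask

tokens = np.ones((2, 5, 8), dtype=np.float32)
padded, mask = pad_tokens(tokens, target_len=8)
```

If the real change pads on a different axis or to a different length, please correct me.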

Could you clarify a few things?

1. Which GitHub branch and commit should be used with tshiamor/forcevla-sfp-sc-data24?
2. Which config name loads this checkpoint correctly?
3. Is this checkpoint a full OpenPI/ForceVLA checkpoint, a LoRA checkpoint, or a train-state checkpoint?
4. Does it require use_joint_state=True?
5. What exact observation schema does it expect? For example, image keys, state layout, and action layout.
6. Is the prefix_out_fix + LIMoE padding change required for inference with this checkpoint?
7. Do you have a minimal command or script for running sample_actions() on one example observation?
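For the observation-schema question, this is the kind of check I would like to run before calling the model. It is only a sketch: the keys and shapes in `EXAMPLE_SCHEMA` are placeholders I made up, not the checkpoint's real schema, and `check_observation` is a hypothetical helper.

```python
import numpy as np

# Placeholder schema: key -> expected array shape.
# These names and shapes are assumptions for illustration only.
EXAMPLE_SCHEMA = {
    "image": (224, 224, 3),
    "state": (7,),
    "wrench": (6,),
}

def check_observation(obs, schema):
    """Return a list of human-readable mismatches between obs and schema."""
    problems = []
    for key, shape in schema.items():
        if key not in obs:
            problems.append(f"missing key: {key}")
        elif np.shape(obs[key]) != shape:
            problems.append(
                f"{key}: expected shape {shape}, got {np.shape(obs[key])}"
            )
    for key in obs:
        if key not in schema:
            problems.append(f"unexpected key: {key}")
    return problems

# An observation with a wrong state size and no wrench entry.
obs = {"image": np.zeros((224, 224, 3)), "state": np.zeros(8)}
problems = check_observation(obs, EXAMPLE_SCHEMA)
```

With the real keys and shapes filled in, an empty list would tell me my observation matches what the checkpoint expects.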

I am trying to keep my local ForceVLA patch minimal and avoid copying dataset-specific changes unless they are required.

Thanks.

tshiamor

Owner, 29 days ago (edited 29 days ago)

I'm glad you find the work helpful.
It was my attempt at learning ForceVLA while rushing to put together a solution for a competition (the Intrinsic AI for Industry Challenge). However, the qualification event has passed and I missed it, as I did not have a good dataset by the deadline.
This is a work in progress and a bit rough, as I'm still fixing several items in my spare time. The branch to look at is develop ( https://github.com/tshiamor/ForceVLA/tree/develop ). The latest dataset I'm working on is https://huggingface.co/datasets/tshiamor/aic_gt_sfp_all_trimmed_v2 and its checkpoint is https://huggingface.co/tshiamor/forcevla-sfp-all-trimmed-v2 (I wouldn't use it until it is verified). There is also a version with joint states: https://huggingface.co/datasets/tshiamor/aic_gt_sfp_all_trimmed_v3 (ongoing). I think these two datasets are a bit more complete, and I would advise using them instead of forcevla-sfp-sc-data24.
I have created a guide that gives an overview and tries to answer some of your questions: https://github.com/tshiamor/ForceVLA/blob/develop/GUIDE.md .

argousaxthechaos

28 days ago

Thanks, this is really helpful. I am going through GUIDE.md and will test the v2 checkpoint as an inference reference. I am doing my Master's thesis on multi-geometry, multi-orientation peg-in-hole insertion with a Doosan M1013 robot. I have not started collecting data yet (I am planning to teleoperate the real robot), but your work gives me hope that ForceVLA is worth considering.

Three small clarifications:

1. For forcevla_sfp_all_trimmed_v2, are the 7D action deltas expressed in the robot base frame, TCP/local end-effector frame, or another task frame?
2. Is the wrench in observation.state.wrench expressed in the wrist sensor frame, TCP frame, or robot base frame?
3. Do you have a standalone non-ROS Python script that loads tshiamor/forcevla-sfp-all-trimmed-v2 and runs sample_actions() on one example observation/checkpoint? I want to test checkpoint loading without the AIC ROS stack first.
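The reason I ask about the wrench frame: my sensor will likely report in its own frame, so I would have to re-express the wrench before feeding it to the model. Below is a sketch of the standard rigid-body wrench transform I would use. The function name `transform_wrench`, the [fx, fy, fz, tx, ty, tz] ordering, and the example offset are my assumptions, not something taken from your dataset.

```python
import numpy as np

def transform_wrench(wrench_s, R_bs, p_bs):
    """Re-express a wrench given in sensor frame S in frame B.

    wrench_s: [fx, fy, fz, tx, ty, tz] in frame S (assumed ordering).
    R_bs:     3x3 rotation of frame S expressed in frame B.
    p_bs:     position of the origin of S expressed in B, shape (3,).
    """
    f_s, t_s = wrench_s[:3], wrench_s[3:]
    f_b = R_bs @ f_s
    # Torque about the origin of B gains the moment of the force acting at S.
    t_b = R_bs @ t_s + np.cross(p_bs, f_b)
    return np.concatenate([f_b, t_b])

# Sensor mounted 0.1 m along z of frame B, axes aligned, 10 N pushing along x.
wrench_s = np.array([10.0, 0.0, 0.0, 0.0, 0.0, 0.0])
wrench_b = transform_wrench(wrench_s, np.eye(3), np.array([0.0, 0.0, 0.1]))
# wrench_b is [10, 0, 0, 0, 1, 0]: the offset adds 1 N·m about y.
```

If the dataset already stores the wrench in the TCP or base frame, I would apply the matching transform on my side so that my inputs agree with the training data.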
