Add pipeline tag, library name and link to project page

#1
by nielsr (HF Staff) - opened
Files changed (1)
  1. README.md +41 -6
README.md CHANGED
@@ -1,13 +1,48 @@
  ---
- license: apache-2.0
  ---

- # RoboMaster

- It synthesizes realistic robotic manipulation video given an initial frame, a prompt, a user-defined object mask, and a collaborative trajectory describing the motion of both robotic arm and manipulated object in decomposed interaction phases. It supports diverse manipulation skills and can generalize to in-the-wild scenarios.

- ## Usage

- This is the implementation based on CogVideoX-5B. Please refer to our [github](https://github.com/KwaiVGI/RoboMaster) for details on usage.

- <video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/63aef2cafcca84593e6682db/M7xBPv-NmqZeCvLRoDlu6.mp4"></video>
  ---
+ license: cc-by-4.0
+ task_categories:
+ - image-text-to-text
+ configs:
+ - config_name: default
+   data_files:
+   - split: HCMAS_train
+     path: version_v4/HCMAS-train.json
+   - split: HCMAS_test
+     path: version_v4/HCMAS-test.json
+   - split: HCSHR_train
+     path: version_v4/HCSHR-train.json
+   - split: HCSHR_test
+     path: version_v4/HCSHR-test.json
  ---

+ # Aligning VLM Assistants with Personalized Situated Cognition (ACL 2025 main)

+ [![GitHub Stars](https://img.shields.io/github/stars/liyongqi2002/PCogAlign?style=social)](https://github.com/liyongqi2002/PCogAlign)
+ [![Hugging Face Dataset](https://img.shields.io/badge/dataset-PCogAlignBench-blue)](https://huggingface.co/datasets/YongqiLi/PCogAlignBench)
+ [![arXiv](https://img.shields.io/badge/arXiv-2506.00930-orange)](https://arxiv.org/abs/2506.00930)

+ This repository contains the benchmark constructed in our ACL 2025 (main conference) paper **"Aligning VLM Assistants with Personalized Situated Cognition"**.

+ > ⚠️ This project is for academic research only and is not intended for commercial use.

+ ## Abstract
+
+ Vision-language models (VLMs) aligned with general human objectives, such as being harmless and hallucination-free, have become valuable assistants to humans in managing visual tasks.
+ However, people with diverse backgrounds have different cognition even in the same situation. Consequently, they may have personalized expectations for VLM assistants.
+ This highlights the urgent need to align VLM assistants with personalized situated cognition for real-world assistance.
+ To study this problem, we first simplify it by characterizing individuals based on the sociological concept of Role-Set. Then, we propose to evaluate the individuals' actions to examine whether personalized alignment is achieved.
+ Further, we construct a benchmark named PCogAlignBench, which includes 18k instances and 20 individuals with different Role-Sets.
+ Finally, we present a framework called PCogAlign, which constructs a cognition-aware and action-based reward model for personalized alignment.
+ Experimental results and human evaluations demonstrate the reliability of PCogAlignBench and the effectiveness of our proposed PCogAlign.
+
+ ## 🙌 Acknowledgments
+
+ All datasets and models used in this work were obtained through legal and ethical means. For detailed ethical considerations, please refer to the Ethics Statement section of our paper.
+
+ ## 📬 Contact
+
+ For any questions or feedback, feel free to reach out to us at [liyongqi@whu.edu.cn](mailto:liyongqi@whu.edu.cn).
+
+ ---
+
+ ✨ Thank you for your interest in PCogAlign! Stay tuned for more updates.
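
For readers who want to try the split layout declared in the YAML `configs` block above, here is a minimal, untested sketch using the Hugging Face `datasets` library. The repository id `YongqiLi/PCogAlignBench` is taken from the dataset badge link, and the split names come from the YAML; the per-example field names are not shown in this diff, so the inspection step is only illustrative.

```python
from datasets import load_dataset

# Load one split of the default config; split names come from the YAML `configs` block.
train_ds = load_dataset("YongqiLi/PCogAlignBench", split="HCMAS_train")

print(train_ds)      # row count and column names
print(train_ds[0])   # peek at one example (schema not shown in the diff)

# The other splits (HCMAS_test, HCSHR_train, HCSHR_test) load the same way, e.g.:
test_ds = load_dataset("YongqiLi/PCogAlignBench", split="HCSHR_test")
```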