nicklashansen/tdmpc2

nicklashansen committed on Oct 26, 2023

Commit 8fb2a82 • 1 Parent(s): 9895c7c

Update README.md

Files changed (1)

README.md +3 -3

@@ -16,7 +16,7 @@ Official release of TD-MPC2 model checkpoints for the paper
 [Nicklas Hansen](https://nicklashansen.github.io), [Hao Su](https://cseweb.ucsd.edu/~haosu)\*, [Xiaolong Wang](https://xiaolonw.github.io)\* (UC San Diego)
-**Quick links:** [[Website]](https://nicklashansen.github.io/td-mpc2) [[Paper]](https://arxiv.org/abs/2310.16828) [[Dataset]](https://www.tdmpc2.com/dataset)
+**Quick links:** [[Website]](https://nicklashansen.github.io/td-mpc2) [[Paper]](https://arxiv.org/abs/2310.16828) [[Dataset]](https://huggingface.co/datasets/nicklashansen/tdmpc2)
 ## Model Details
@@ -33,7 +33,7 @@ We open-source a total of 324 TD-MPC2 model checkpoints, including 12 multi-task
 ### Model Sources
 - **Repository:** [https://github.com/nicklashansen/tdmpc2](https://github.com/nicklashansen/tdmpc2)
-- **Paper:** [https://www.tdmpc2.com](https://arxiv.org/abs/2310.16828)
+- **Paper:** [https://arxiv.org/abs/2310.16828](https://arxiv.org/abs/2310.16828)
 ## Uses
@@ -57,7 +57,7 @@ We describe the training procedure for single-task and multi-task model checkpoi
 ### Training Procedure (Single-task)
-Single-task checkpoints are trained using the [official implementation](https://github.com/nicklashansen/tdmpc2) with default hyperparameters. All models have 5M parameters. Most, but not all, models are trained until convergence. Refer to the individual task curves in our [paper](https://www.tdmpc2.com) for a detailed breakdown of model performance on each task.
+Single-task checkpoints are trained using the [official implementation](https://github.com/nicklashansen/tdmpc2) with default hyperparameters. All models have 5M parameters. Most, but not all, models are trained until convergence. Refer to the individual task curves in our [paper](https://arxiv.org/abs/2310.16828) for a detailed breakdown of model performance on each task.
 ### Training Procedure (Multi-task)