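# Hugging Robot Learning

[![Awesome](https://awesome.re/badge.svg)](https://awesome.re) [![LICENSE](https://img.shields.io/badge/license-Anti%20996-blue.svg)](https://github.com/996icu/996.ICU/blob/master/LICENSE)

This project surveys reinforcement learning, imitation learning, and offline reinforcement learning algorithms for continuous action-space control, as a foundation for further study.

After the first two rounds of organizing this material, it became clear that following survey papers alone leaves the knowledge too abstract. Going forward, the focus shifts to classic algorithms rather than surveys alone.

The content is under constant revision; the blog always holds the latest version :exclamation::exclamation::exclamation: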



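The project is planned in three phases:

- Knowledge organization :point_left: in progress :sparkles:
- Algorithm reproduction
- Project polishing

The knowledge organization phase focuses on building the knowledge framework; the algorithm reproduction phase focuses on reproducing classic algorithms in code; the project polishing phase focuses on the completeness and accuracy of the content, clean layout, and code correctness.

Criticism and corrections are welcome.

Collaborators are welcome too.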



## Contents

### Foundations

| Chapter | Content |
| ------ | ------------------------------------------------------------ |
| Chapter 1 | [DDPMs: Denoising Diffusion Probabilistic Models](https://www.robotech.ink/index.php/Foundation/172.html) |



### Online Reinforcement Learning Algorithms

| Chapter | Content |
| :----- | :----------------------------------------------------------- |
| Chapter 1 | [MCAC: Monte Carlo Augmented Actor-Critic](https://www.robotech.ink/index.php/RL/139.html) |



### Imitation Learning

| Chapter | Content |
| :----- | :----------------------------------------------------------- |
| Chapter 1 | Introduction to Imitation Learning |
| Chapter 2 | [GAIL: Generative Adversarial Imitation Learning](https://www.robotech.ink/index.php/AIL/187.html) |
| Chapter 3 | [The IBC Algorithm](https://www.robotech.ink/index.php/Robot-Learning/232.html) |
| Chapter 4 | [BeT: Cloning k Modes at Once](https://www.robotech.ink/index.php/Robot-Learning/224.html) |
| Chapter 5 | [Diffusion Policy: Visuomotor Policy Learning via Action Diffusion](https://www.robotech.ink/index.php/Robot-Learning/106.html) |



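### Offline Reinforcement Learning

| Chapter | Content |
| :----- | :----------------------------------------------------------- |
| Chapter 1 | Introduction to Offline Reinforcement Learning |
| Chapter 2 | [Policy-Constraint Methods and the BCQ Algorithm](https://www.robotech.ink/index.php/Policy-Constrained/181.html) |
| Chapter 3 | [Regularization-Based Methods and the CQL Algorithm](https://www.robotech.ink/index.php/Regularization/120.html) |
| Chapter 4 | [Uncertainty-Estimation Methods and the REM Algorithm](https://www.robotech.ink/index.php/Uncertainty/191.html) |
| Chapter 5 | Diffuser: A Diffusion Planner for Flexible Behavior Synthesis |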





## Follow Us

<div align=center>
<p>Scan the QR code below to follow the Datawhale WeChat official account</p>
<img src="https://raw.githubusercontent.com/datawhalechina/pumpkin-book/master/res/qrcode.jpeg" width = "180" height = "180">
</div>


## LICENSE

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://img.shields.io/badge/license-CC%20BY--NC--SA%204.0-lightgrey" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.

Note: CC 4.0 is the default license; a different license may be chosen to suit the project.
