# VLM-RLAIF: Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback

## Model Summary

This Hub repository contains a Hugging Face `transformers` implementation of the VLM-RLAIF model from the SNUMPR lab.

* VLM-RLAIF-7b [[HF]](https://huggingface.co/SNUMPR/vlm_rlaif_video_llava_7b): 7B RLAIF model
<!-- \| Model \| Model size \| Model Description \|
\| ------- \| ------------- \| ------------- \|
\| VLM-RLAIF-7b [[HF]](https://huggingface.co/SNUMPR/vlm_rlaif_video_llava_7b) \| 7B \| RLAIF model
-->
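
A minimal sketch of fetching the checkpoint files listed above, assuming the `huggingface_hub` package is installed. The repo id comes from the link in the list; the helper names `hub_url` and `download_checkpoint` are hypothetical and only for illustration. This README does not say which model class loads the weights, so the sketch stops at downloading the files.

```python
"""Sketch: fetch the VLM-RLAIF-7b checkpoint files from the Hub."""

# Repo id taken from the [[HF]] link in the model list.
REPO_ID = "SNUMPR/vlm_rlaif_video_llava_7b"


def hub_url(repo_id: str) -> str:
    """Return the model page URL for a Hub repo id (hypothetical helper)."""
    return f"https://huggingface.co/{repo_id}"


def download_checkpoint(repo_id: str = REPO_ID, local_dir=None) -> str:
    """Download every file in the repo and return the local folder path.

    Assumption: `huggingface_hub` is available (pip install huggingface_hub).
    It is imported lazily so the module can be imported without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage (left commented out because it downloads the full 7B checkpoint):
# path = download_checkpoint()
# print(f"Files for {hub_url(REPO_ID)} are in {path}")
```

Passing `local_dir` places the files in a folder of your choice; leaving it as `None` uses the default Hub cache.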