---
license: cc-by-nc-sa-4.0
---

This repository contains the weights of NExT-GPT for the text-image-video-audio (tiva) setting. NExT-GPT is built upon 1) Vicuna-7B version-0, 2) ImageBind, 3) Stable Diffusion v1.5, 4) AudioLDM-l-full, and 5) ZeroScope v2_576w. For details on how to use the model, please refer to our [code repository](https://github.com/NExT-GPT/NExT-GPT).