How can the model be commercial if it is using OpenAI CLIP?

#1 opened by dchichkov

I've looked at the repository at https://huggingface.co/fireworks-ai/FireLLaVA-13b and it's using LlavaForConditionalGeneration. I understand that the CLIP encoder you've used is "clip_vision_model" as per your config, which translates to "openai/clip-vit-base-patch32". And as per the model card at https://huggingface.co/openai/clip-vit-base-patch32, this is a research/non-commercial model.
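(For reference, one way to check which CLIP variant a LLaVA-style checkpoint's config points to; a minimal sketch, assuming the standard transformers AutoConfig / LlavaConfig API:)

```python
# Minimal sketch: read the vision-tower settings recorded in the checkpoint's config.
# Assumes the standard transformers AutoConfig / LlavaConfig API.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("fireworks-ai/FireLLaVA-13b")
vision = config.vision_config  # model_type should be "clip_vision_model"

# Rough fingerprints of the two OpenAI checkpoints being discussed:
#   openai/clip-vit-base-patch32      -> patch_size=32, image_size=224, hidden_size=768
#   openai/clip-vit-large-patch14-336 -> patch_size=14, image_size=336, hidden_size=1024
print(vision.model_type, vision.patch_size, vision.image_size, vision.hidden_size)
```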

I understand that, as per convention, any checkpoint that used research/non-commercial models in its pipeline is also considered non-commercial, and that a combination of models that includes non-commercial parts is also non-commercial.

Please, can you explain how your model checkpoint can be commercial, or deployed for commercial/API use, while complying with the CLIP license? Or correct the claim that the model is commercial?

Also, in general, it is good practice to include the rough composition of the datasets that went into a model advertised for commercial use. Otherwise it is not clear whether that data is "clean enough" for the model to be considered commercial. A quick query to your model reveals Vicuna data, for example.
[Attached screenshot: Screenshot from 2024-01-19 09-50-32.png]

websterbei (Fireworks AI org)

Hi Dmitry, thank you for testing out the model and raising the concern!
The underlying vision encoder being used is from https://huggingface.co/openai/clip-vit-large-patch14-336. While that page itself does not contain any license information, we believe CLIP itself is under the MIT license (https://github.com/openai/CLIP/blob/main/LICENSE). Models such as SDXL similarly use the text encoder from CLIP, if I understand correctly.

As for the composition of the data, we briefly mentioned it in our separate blog post, but here is a more thorough list: https://github.com/haotian-liu/LLaVA#train
The only difference we made is swapping out the GPT-generated portion with our own data.

Many of the underlying images in COCO used in the visual instruction finetuning stage are non-commercial.

It's unclear from "mixed from the permissive portion of the original LLaVA training data and Fireworks.ai generated training data" whether these were in fact removed or not. Do you have an explicit list of underlying images used?

Thanks

@websterbei The MIT license is for the code, no? The model's (weights) license is in the model card, as far as I understand:
https://github.com/openai/CLIP/blob/main/model-card.md

And it says: "any deployed use case of the model - whether commercial or not - is currently out of scope". Not legally binding?

Any chance of a CPU-only version?
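(Not an official answer, just a sketch of what I'd expect to work for CPU-only inference with the stock transformers LLaVA classes, assuming the 13B weights fit in RAM:)

```python
# Hypothetical CPU-only loading sketch; assumes the standard transformers LLaVA API
# and enough RAM to hold the 13B weights in float32.
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "fireworks-ai/FireLLaVA-13b"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float32,   # keep everything on CPU, no quantization
    low_cpu_mem_usage=True,
).to("cpu")
```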
