How to use lmms-lab-encoder/onevision-encoder-large-lang with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline(
    "image-feature-extraction",
    model="lmms-lab-encoder/onevision-encoder-large-lang",
    trust_remote_code=True,
)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "lmms-lab-encoder/onevision-encoder-large-lang",
    trust_remote_code=True,
    dtype="auto",
)
```
Add metadata and link to paper/code
PR #1, opened by nielsr (HF Staff)
This PR improves the model card by adding relevant YAML metadata (license, library name, and pipeline tag). It also links the repository to the primary research paper "LLaVA-OneVision-2: Towards Next-Generation Perceptual Intelligence" and the associated technical report "OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence." Additionally, it includes links to the official GitHub repository and project page to improve discoverability and provides sample usage for both images and video.
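As a rough illustration of the three metadata fields named above, the sketch below renders a YAML front-matter block of the kind that sits at the top of a model card README. The helper `render_front_matter` is hypothetical, and so are the values: `transformers` and `image-feature-extraction` are assumptions inferred from the usage snippet for this model, and the license string is a placeholder, since the PR description does not state which license was added.

```python
def render_front_matter(metadata: dict) -> str:
    """Render a flat dict of simple string values as a YAML front-matter block.

    Hypothetical helper for illustration only; it does not handle nested
    values or strings that need YAML quoting.
    """
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# Assumed values, not taken from the merged PR.
metadata = {
    "license": "LICENSE_ID_HERE",  # placeholder: the actual license is not stated here
    "library_name": "transformers",  # assumption based on the Transformers usage snippet
    "pipeline_tag": "image-feature-extraction",  # assumption based on the pipeline task
}

front_matter = render_front_matter(metadata)
print(front_matter)
```

The Hub reads this block from the top of `README.md`, so the rendered text would go before the card's first heading.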
xiangan changed pull request status to merged