** Model Details

** Training datasets

  • Pretrain: LLaVA 595k
  • Fine-tune: LLaVA 665k

** Evaluation datasets

We have currently evaluated RWKV7 SigLIP2 on four benchmarks proposed for instruction-following LMMs. Results on more benchmarks will be released soon.

  • Benchmarks

    | Encoder | LLM      | VQAv2 | TextVQA | GQA   | ScienceQA |
    |---------|----------|-------|---------|-------|-----------|
    | SigLIP2 | RWKV7-3B | 78.30 | 51.09   | 60.75 | 70.93     |
  • Inference

    ```python
    from infer.worldmodel import Worldinfer
    from PIL import Image

    llm_path = 'WorldRWKV/RWKV7-3B-siglip2/rwkv-0'  # local model path
    encoder_path = 'google/siglip2-base-patch16-384'
    encoder_type = 'siglip'

    model = Worldinfer(model_path=llm_path, encoder_type=encoder_type, encoder_path=encoder_path)

    img_path = './docs/03-Confusing-Pictures.jpg'
    image = Image.open(img_path).convert('RGB')

    # \x16 and \x17 are control characters delimiting the user turn
    # in the model's chat format
    text = '\x16User: What is unusual about this image?\x17Assistant:'

    result = model.generate(text, image)
    print(result)
    ```
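The control characters `\x16` and `\x17` in the prompt above delimit the user turn, leaving the model to continue generating after `Assistant:`. A minimal helper for building such prompts might look like the sketch below; the function name `build_prompt` is a hypothetical illustration, not part of the WorldRWKV API.

```python
def build_prompt(question: str) -> str:
    # \x16 opens the user turn and \x17 closes it; the model then
    # continues the text after 'Assistant:'.
    return f'\x16User: {question}\x17Assistant:'

prompt = build_prompt('What is unusual about this image?')
print(repr(prompt))
# → '\x16User: What is unusual about this image?\x17Assistant:'
```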
    