llmware
/

dragon-llama2-ov

Model card Files Files and versions Community

dragon-llama2-ov

dragon-llama2-ov is a high-quality, fact-based question-answering model, designed for retrieval augmented generation (RAG) with complex business documents, quantized and packaged in OpenVino int4 for AI PCs using Intel GPU, CPU and NPU.

This model provides a good combination of accuracy and inference performance.

Model Description

Developed by: llmware
Model type: llama2
Parameters: 7 billion
Quantization: int4
Model Parent: llmware/dragon-llama-7b-v0
Language(s) (NLP): English
License: Llama2 Community License
Uses: Fact-based question-answering, RAG
RAG Benchmark Accuracy Score: 97.25

Model Card Contact

llmware on github
llmware on hf
llmware website

Downloads last month: 5

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for llmware/dragon-llama2-ov

Base model

llmware/dragon-llama-7b-v0

Quantized

(1)

this model

Collections including llmware/dragon-llama2-ov

DRAGON Models

Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models • 23 items • Updated Feb 23 • 46

Model Depot

Leading generative models packaged in OpenVino format optimized for use on AI PCs • 51 items • Updated Feb 11 • 6