Commit 5a13b18 (verified), committed by singtan · 1 Parent(s): 5586bf7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +17 -20
README.md CHANGED
@@ -1,17 +1,20 @@
  ---
- {}
  ---

- # 📂 Solvrays Llm (Ground-Truth Precise)

  ## 🌟 Overview
- This is a specialized fine-tuned version of **Gemma 2B**, optimized for **High-Precision Document Retrieval**. It has been trained using strict grounding templates to ensure zero-hallucination and deterministic factual responses.
-
- ## 🛠 Key Advanced Features
- - **Zero-Hallucination Mode**: Deterministic greedy decoding by default.
- - **Negative Constraint Awareness**: Trained to avoid guessing when information is missing.
- - **Domain Agnostic**: Works for any technical or non-technical PDF provided as context.
- - **Standalone Conversion**: Fully merged FP16 weights for production deployment.

  ## 💻 Quick Start (Inference)
  ```python
@@ -22,22 +25,16 @@
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

- instruction = "Analyze the following document and provide a precise, factual response based strictly on the content provided. If the information is not present, you must state that it is not documented."
  prompt = f"### Instruction: {instruction}
- ### Source: Document_Name.pdf
- ### Content: Your Query Here
  ### Verified Response:"

  inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
- outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, repetition_penalty=1.5)
- print(tokenizer.decode(outputs[0], skip_special_tokens=True).split("### Verified Response:")[-1].strip())
  ```

- ## 📊 Training methodology
- - **Base Model**: google/gemma-2b
- - **Quantization**: 4-bit (NormalFloat4)
- - **LoRA Config**: r=16, alpha=32, target_modules=All linears
- - **Epochs**: 5 (Intensive Reinforcement)
-
  ---
  **Fine-tuned by Bibek Lama Singtan**

  ---
+ base_model: google/gemma-2b
+ language: en
+ library_name: transformers
+ license: apache-2.0
+ pipeline_tag: text-generation
+ tags:
+ - fine-tuned
+ - pdf-grounded
+ - zero-hallucination
+ - precise-retrieval
  ---

+ # 📂 Solvrays Llm (Ground-Truth Precise)

  ## 🌟 Overview
+ This is a specialized fine-tuned version of **Gemma 2B**, optimized for **High-Precision Retrieval**. It uses deterministic grounding templates to minimize hallucinations.

  ## 💻 Quick Start (Inference)
  ```python
 
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.float16)

+ instruction = "Analyze the following document and provide a precise, factual response based strictly on the content provided."
  prompt = f"### Instruction: {instruction}
+ ### Source: Document.pdf
+ ### Content: Query
  ### Verified Response:"

  inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```

  ---
  **Fine-tuned by Bibek Lama Singtan**
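
The Quick Start prompt in this diff has four `###` sections, and the removed `print` line kept only the text after `### Verified Response:`. As rendered here, the prompt literal spans several lines, which a single-quoted f-string does not allow. The sketch below is one possible way to build the same template and recover the answer without loading the model. The helper names `build_prompt` and `extract_response`, the `RESPONSE_MARKER` constant, and the use of `\n` between sections are assumptions for illustration; none of them appear in the README.

```python
# Hypothetical helpers; the section labels are copied from the README prompt.
RESPONSE_MARKER = "### Verified Response:"


def build_prompt(instruction: str, source: str, content: str) -> str:
    """Assemble the four-section prompt.

    Assumption: sections are separated by a single newline, which is how
    the template appears in the diff.
    """
    return (
        f"### Instruction: {instruction}\n"
        f"### Source: {source}\n"
        f"### Content: {content}\n"
        f"{RESPONSE_MARKER}"
    )


def extract_response(decoded: str) -> str:
    """Return the text after the last response marker, stripped.

    Mirrors the split-and-strip step from the removed print line. If the
    marker is absent, the whole string is returned stripped.
    """
    return decoded.split(RESPONSE_MARKER)[-1].strip()


if __name__ == "__main__":
    prompt = build_prompt(
        "Analyze the following document and provide a precise, factual "
        "response based strictly on the content provided.",
        "Document.pdf",
        "Query",
    )
    # Stand-in for tokenizer.decode(...): decoded text echoes the prompt,
    # followed by the model's continuation (made up here for the demo).
    simulated_output = prompt + " Example answer text."
    print(extract_response(simulated_output))
```

Because the decoded output of a causal language model includes the prompt tokens, printing it unfiltered (as the revised snippet does) shows the full template followed by the answer. Applying `extract_response` to the decoded string would keep only the answer portion.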