mostafaamiri committed on
Commit
daf97fa
1 Parent(s): 2714f76

Update README.md

Files changed (1)
  1. README.md +62 -1
README.md CHANGED
@@ -8,4 +8,65 @@ tags:
  - 7B
  - Alpaca
  - Quantize
- ---
+ ---
+ # Model Card for Persian LLaMA 7B
+
+ <!-- Provide a quick summary of what the model is/does. -->
+ ## How to run in `llama.cpp`
+
+ ```
+ ./main -t 10 -ngl 32 -m persian_llama_7b.f32.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: یک شعر حماسی در مورد کوه دماوند بگو ### Input: ### Response:"
+ ```
+ Change `-t 10` to the number of physical CPU cores you have. For example, if your system has 8 cores/16 threads, use `-t 8`.
+
+ Change `-ngl 32` to the number of layers to offload to the GPU. Remove it if you don't have GPU acceleration.
+
+ To have a chat-style conversation, replace the `-p <PROMPT>` argument with `-i -ins`.
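For instance, an interactive run keeps the same flags and simply swaps the `-p` prompt for `-i -ins` (a minimal sketch based on the command above; the Persian prompt in that example asks for an epic poem about Mount Damavand):

```
./main -t 10 -ngl 32 -m persian_llama_7b.f32.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -i -ins
```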
+
+ ## How to run in `text-generation-webui`
+
+ Further instructions here: [text-generation-webui/docs/llama.cpp-models.md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md).
+
+ ## How to run using `LangChain`
+
+ ##### Installation on CPU
+ ```
+ pip install llama-cpp-python
+ ```
+ ##### Installation on GPU
+ ```
+ CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
+ ```
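If `llama-cpp-python` was already installed without CUDA support, the CUDA build flags only take effect when the package is rebuilt; a minimal sketch using standard pip options (`--upgrade --force-reinstall --no-cache-dir` are general pip flags, not taken from this card):

```
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```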
+
+ ```python
+ from langchain.llms import LlamaCpp
+ from langchain import PromptTemplate, LLMChain
+ from langchain.callbacks.manager import CallbackManager
+ from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
+
+ n_gpu_layers = 40  # Change this value based on your model and your GPU VRAM pool.
+ n_batch = 512  # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU.
+ n_ctx = 2048
+
+ callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
+
+ # Make sure the model path is correct for your system!
+ llm = LlamaCpp(
+     model_path="./persian_llama_7b.f32.gguf",
+     n_gpu_layers=n_gpu_layers,
+     n_batch=n_batch,
+     callback_manager=callback_manager,
+     verbose=True,
+     n_ctx=n_ctx,
+ )
+
+ llm("""### Instruction:
+ یک شعر حماسی در مورد کوه دماوند بگو
+
+ ### Input:
+
+ ### Response:""")
+ ```
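The imports above bring in `PromptTemplate` and `LLMChain`, which the snippet never uses; a minimal sketch of wiring the same Alpaca-style prompt format into a chain, reusing the `llm` object created above (the `{instruction}` template variable is introduced here only for illustration), could look like:

```python
from langchain import PromptTemplate, LLMChain

# Alpaca-style prompt format used elsewhere in this card.
template = """### Instruction:
{instruction}

### Input:

### Response:"""

prompt = PromptTemplate(template=template, input_variables=["instruction"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

# Same Persian instruction as above: "recite an epic poem about Mount Damavand".
llm_chain.run("یک شعر حماسی در مورد کوه دماوند بگو")
```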
+ For more information, refer to the [LangChain LlamaCpp documentation](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/llamacpp).