jujbob committed
Commit d7f6c77 • 1 Parent(s): 31f16eb

Update README.md

Files changed (1): README.md (+31 -104)

README.md CHANGED
@@ -2,12 +2,18 @@
  language:
  - en
  - ko
- license: llama3
+ license: apache-2.0
  library_name: transformers
+ tags:
+ - llama-cpp
+ - gguf-my-repo
  base_model:
  - meta-llama/Meta-Llama-3-8B
+ - jeiku/Average_Test_v1
+ - ResplendentAI/RP_Format_QuoteAsterisk_Llama3
  ---

+
  <a href="https://github.com/MLP-Lab/Bllossom">
  <img src="https://github.com/teddysum/bllossom/blob/main//bllossom_icon.png?raw=true" width="40%" height="50%">
  </a>
@@ -43,6 +49,9 @@ The Bllossom language model is a Korean-English bilingual language model based o
  * **Vision-Language Alignment**: Aligning the vision transformer with this language model

  **This model was developed by [MLPLab at Seoultech](http://mlp.seoultech.ac.kr), [Teddysum](http://teddysum.ai/) and [Yonsei Univ](https://sites.google.com/view/hansaemkim/hansaem-kim)**
+ This model was converted to GGUF format from [`ResplendentAI/SOVL_Llama3_8B`](https://huggingface.co/ResplendentAI/SOVL_Llama3_8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
+ Refer to the [original model card](https://huggingface.co/ResplendentAI/SOVL_Llama3_8B) for more details on the model.
+

  ## Demo Video

@@ -76,119 +85,37 @@ The Bllossom language model is a Korean-English bilingual language model based o

  ## Example code

- ### Colab Tutorial
- - [Inference-Code-Link](https://colab.research.google.com/drive/1fBOzUVZ6NRKk_ugeoTbAOokWKqSN47IG?usp=sharing)
-
- ### Install Dependencies
- ```bash
- pip install torch transformers==4.40.0 accelerate
- ```
-
- ### Python code with Pipeline
- ```python
- import transformers
- import torch
-
- model_id = "MLP-KTLim/llama-3-Korean-Bllossom-8B"
-
- pipeline = transformers.pipeline(
-     "text-generation",
-     model=model_id,
-     model_kwargs={"torch_dtype": torch.bfloat16},
-     device_map="auto",
- )
-
- pipeline.model.eval()
-
- PROMPT = '''당신은 유용한 AI 어시스턴트입니다. 사용자의 질의에 대해 친절하고 정확하게 답변해야 합니다.
- You are a helpful AI assistant, you'll need to answer users' queries in a friendly and accurate manner.'''
- instruction = "서울과학기술대학교 MLP연구실에 대해 소개해줘"
-
- messages = [
-     {"role": "system", "content": f"{PROMPT}"},
-     {"role": "user", "content": f"{instruction}"}
- ]
-
- prompt = pipeline.tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
-
- terminators = [
-     pipeline.tokenizer.eos_token_id,
-     pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
- ]
-
- outputs = pipeline(
-     prompt,
-     max_new_tokens=2048,
-     eos_token_id=terminators,
-     do_sample=True,
-     temperature=0.6,
-     top_p=0.9,
-     repetition_penalty=1.1
- )
-
- print(outputs[0]["generated_text"][len(prompt):])
-
- # 서울과학기술대학교 MLP연구실은 멀티모달 자연어처리 연구를 하고 있습니다. 구성원은 임경태 교수와 김민준, 김상민, 최창수, 원인호, 유한결, 임현석, 송승우, 육정훈, 신동재 학생이 있습니다.
- ```
-
- ### Python code with AutoModel
- ```python
- import os
- import torch
- from transformers import AutoTokenizer, AutoModelForCausalLM
-
- model_id = 'MLP-KTLim/llama-3-Korean-Bllossom-8B'
-
- tokenizer = AutoTokenizer.from_pretrained(model_id)
- model = AutoModelForCausalLM.from_pretrained(
-     model_id,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
- )
-
- model.eval()
-
- PROMPT = '''당신은 유용한 AI 어시스턴트입니다. 사용자의 질의에 대해 친절하고 정확하게 답변해야 합니다.
- You are a helpful AI assistant, you'll need to answer users' queries in a friendly and accurate manner.'''
- instruction = "서울과학기술대학교 MLP연구실에 대해 소개해줘"
-
- messages = [
-     {"role": "system", "content": f"{PROMPT}"},
-     {"role": "user", "content": f"{instruction}"}
- ]
-
- input_ids = tokenizer.apply_chat_template(
-     messages,
-     add_generation_prompt=True,
-     return_tensors="pt"
- ).to(model.device)
-
- terminators = [
-     tokenizer.eos_token_id,
-     tokenizer.convert_tokens_to_ids("<|eot_id|>")
- ]
-
- outputs = model.generate(
-     input_ids,
-     max_new_tokens=2048,
-     eos_token_id=terminators,
-     do_sample=True,
-     temperature=0.6,
-     top_p=0.9,
-     repetition_penalty=1.1
- )
-
- print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
- # 서울과학기술대학교 MLP연구실은 멀티모달 자연어처리 연구를 하고 있습니다. 구성원은 임경태 교수와 김민준, 김상민, 최창수, 원인호, 유한결, 임현석, 송승우, 육정훈, 신동재 학생이 있습니다.
- ```
+
+ ## Use with llama.cpp
+
+ Install llama.cpp through brew.
+
+ ```bash
+ brew install ggerganov/ggerganov/llama.cpp
+ ```
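
If the tap installs cleanly, the `llama-cli` and `llama-server` binaries used below should be on your `PATH`; a quick sanity check (binary names assumed from the commands in this README):

```bash
llama-cli --help
```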
+ Invoke the llama.cpp server or the CLI.
+
+ CLI:
+
+ ```bash
+ llama-cli --hf-repo jeiku/SOVL_Llama3_8B-Q4_K_M-GGUF --model sovl_llama3_8b.Q4_K_M.gguf -p "The meaning to life and the universe is"
+ ```
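
The one-shot prompt above can be tuned with llama.cpp's standard sampling flags; a minimal sketch (flag names assumed from current llama.cpp, e.g. `-n` for the number of tokens to predict and `--temp` for sampling temperature):

```bash
llama-cli --hf-repo jeiku/SOVL_Llama3_8B-Q4_K_M-GGUF --model sovl_llama3_8b.Q4_K_M.gguf \
  -p "The meaning to life and the universe is" -n 256 --temp 0.8
```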
+
+ Server:
+
+ ```bash
+ llama-server --hf-repo jeiku/SOVL_Llama3_8B-Q4_K_M-GGUF --model sovl_llama3_8b.Q4_K_M.gguf -c 2048
+ ```
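
Once the server is running, it can be queried over HTTP; a minimal sketch, assuming llama-server's default bind address of `127.0.0.1:8080` and its `/completion` endpoint:

```bash
# POST a prompt and read back at most 128 generated tokens as JSON
curl --request POST \
  --url http://127.0.0.1:8080/completion \
  --header "Content-Type: application/json" \
  --data '{"prompt": "The meaning to life and the universe is", "n_predict": 128}'
```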
+
+ Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.
+
+ ```bash
+ git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make && ./main -m sovl_llama3_8b.Q4_K_M.gguf -n 128
+ ```
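
On a build with GPU support (e.g. CUDA or Metal), generation can be accelerated by offloading layers; a sketch, assuming the usual `-ngl` flag and that 33 layers covers the full 8B Llama stack:

```bash
# Offload all transformer layers (plus the output layer) to the GPU
./main -m sovl_llama3_8b.Q4_K_M.gguf -n 128 -ngl 33 -p "The meaning to life and the universe is"
```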

+
  ## Citation
  **Language Model**
  ```text