---
base_model:
- NousResearch/Hermes-2-Pro-Llama-3-8B
- cognitivecomputations/dolphin-2.9-llama3-8b
- NousResearch/Meta-Llama-3-8B
- winglian/llama-3-8b-256k-PoSE
- maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B-Instruct
- nvidia/Llama3-ChatQA-1.5-8B
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- aaditya/Llama3-OpenBioLLM-8B
library_name: transformers
tags:
- mergekit
- merge
- llama
---

# YACHT-Llama-3-Ko-8B

[![DALL-E Yacht](https://i.ibb.co/hHr5xnh/DALL-E-2024-05-05-11-57-02-A-futuristic-yacht-boat-on-a-calm-ocean-at-dawn-featuring-sleek-curves-an.png)](https://ibb.co/92BXmfz)

🎡 *[JayLee LLMs Signature Tag] : ✍️ "I need a Jay Jay chat boy"* 🎡

✨ *Navigating the High Seas of Data: Crafting the Ultimate Yacht Insights with Merged LLMs* ✨

✨ *Aren't you sometimes tired of building yet another LLM & RAG & plain chat app? I'll soon show you a cool app that integrates this merged model (a tuned car, if you will). It wouldn't be fun if we only built cars; life is ultimately about driving them and socializing with people.* ✨

🧨 *Take great care when using a merged model for commercial purposes. Mixing many models can be powerful, but it can also pose many risks.* 🧨

## 🏟️ Merged Model Series Yacht Features

Welcome aboard the merged model series yacht! This section gives an overview of the features and functionality this series brings together, akin to a sleek, modern yacht sailing across the digital ocean.

### 1. Function Calling & JSON Outputs
- Offers precise function calling and structured JSON outputs via specialized tokens such as `<tools>`, `<tool_call>`, and `<tool_response>` (inherited from Hermes-2-Pro), streamlining system communication for developers.

### 2. Conversational Interaction
- Avoids excessive "SYSTEM MESSAGE" chatter while delivering seamless, friendly dialogue.
- Specializes in answering questions with precision, handling arithmetic and tabular data effortlessly.

### 3. Expanded Context Length
- Extends the context length to 256k tokens using PoSE, offering a broader field of data analysis.

### 4. Multilingual Capabilities
- Transfers instruction-following from English to Korean for reliable interaction across languages.

### 5. Optimized Dialogue & Safety
- Aligns with human preferences using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF), ensuring helpful and safe dialogues.

### 6. Precision Merging
- Merges foundational and preview models for the Korean language through task arithmetic, providing seamless integration.

### 7. Specialized Biomedical Knowledge
- Specializes in biomedical tasks with accurate responses for healthcare professionals and researchers.

### 8. Novel Training & Collaboration
- Combines the [ORPO method](https://arxiv.org/pdf/2403.07691) and Dolphin preference datasets for high-quality conversation and collaboration.

The merged model series yacht offers unparalleled functionality, drawing together a fleet of specialized models. Whether you need precise function calling, multilingual capabilities, or conversational AI, this yacht has every deck optimized to navigate the digital ocean with style and precision.

## 👘 Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
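For intuition, `dare_ties` merges *task vectors* (each fine-tune's weights minus the base): DARE randomly drops a fraction of each delta (controlled by `density`) and rescales the survivors, then TIES elects a majority sign per parameter and sums only the agreeing contributions. The toy sketch below illustrates the idea on a single tensor; it is a simplification for intuition, not mergekit's actual implementation:

```python
import torch

def dare_ties_merge(base, tuned, densities, weights):
    """Toy DARE + TIES merge of one weight tensor from several fine-tunes.

    base:      the base model's tensor
    tuned:     list of fine-tuned tensors with the same shape
    densities: fraction of each task vector to KEEP (DARE drops 1 - density)
    weights:   merge weight for each fine-tune
    """
    deltas = []
    for t, density, weight in zip(tuned, densities, weights):
        delta = t - base                           # task vector of this fine-tune
        mask = torch.rand_like(delta) < density    # DARE: randomly drop parameters...
        delta = delta * mask / density             # ...and rescale the survivors
        deltas.append(weight * delta)

    stacked = torch.stack(deltas)
    majority_sign = torch.sign(stacked.sum(dim=0))     # TIES: elect a sign per parameter
    agrees = torch.sign(stacked) == majority_sign      # keep only agreeing deltas
    merged_delta = torch.where(agrees, stacked, torch.zeros_like(stacked)).sum(dim=0)
    return base + merged_delta

# Toy example standing in for a single weight matrix:
base = torch.randn(4, 4)
tuned = [base + 0.1 * torch.randn(4, 4) for _ in range(3)]
merged = dare_ties_merge(base, tuned, densities=[0.60, 0.55, 0.55], weights=[0.25, 0.20, 0.10])
print(merged.shape)  # torch.Size([4, 4])
```

In the configuration further down, `density: 0.55` therefore means roughly 45% of each model's delta is dropped (and the rest rescaled) before the sign election.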
## 🩱 Models Merged

The following models were included in the merge:

* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [winglian/llama-3-8b-256k-PoSE](https://huggingface.co/winglian/llama-3-8b-256k-PoSE)
* [maum-ai/Llama-3-MAAL-8B-Instruct-v0.1](https://huggingface.co/maum-ai/Llama-3-MAAL-8B-Instruct-v0.1)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [nvidia/Llama3-ChatQA-1.5-8B](https://huggingface.co/nvidia/Llama3-ChatQA-1.5-8B)
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO)
* [aaditya/Llama3-OpenBioLLM-8B](https://huggingface.co/aaditya/Llama3-OpenBioLLM-8B)

## 💃 ModelFile

The parameters below depend on the capabilities of the machine running Ollama, so these settings will not necessarily perform equally well on every setup.

```
ollama create yacht -f ./Modelfile_Q5_K_M
```

```
FROM yacht-llama-3-ko-8b-Q5_K_M.gguf
TEMPLATE """
{{- if .System }}
<s>system
{{ .System }}
</s>
{{- end }}
<s>user
Human:
{{ .Prompt }}
</s>
<s>assistant
Assistant:
"""

SYSTEM """
친절한 챗봇으로서 상대방의 요청에 최대한 자세하고 친절하게 답하자. 모든 대답은 한국어(Korean)으로 대답해줘.
"""

PARAMETER temperature 0.7
PARAMETER num_predict 3000
PARAMETER num_ctx 250000
PARAMETER stop "<s>"
PARAMETER stop "</s>"
```

The `SYSTEM` prompt tells the model to answer every request as kindly and in as much detail as possible, and to reply to everything in Korean.

## 🪭 Configuration & Computer System

```
Hardware:
    Hardware Overview:
      Model Name: MacBook Pro
      Model Identifier: MacBookPro18,2
      Chip: Apple M1 Max
      Total Number of Cores: 10 (8 performance and 2 efficiency)
      Memory: 64 GB
      System Firmware Version: 10151.101.3
```

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.25
  - model: winglian/llama-3-8b-256k-PoSE
    parameters:
      density: 0.55
      weight: 0.15
  - model: nvidia/Llama3-ChatQA-1.5-8B
    parameters:
      density: 0.55
      weight: 0.1
  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55
      weight: 0.2
  - model: maum-ai/Llama-3-MAAL-8B-Instruct-v0.1
    parameters:
      density: 0.55
      weight: 0.1
  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.1
  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.05
  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      density: 0.55
      weight: 0.05
  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.1
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```

## ⌨️ Codes

### Summary Code

TXT, PDF, and Wiki summarization code supporting both Korean and English, by BrianHan (my follower).

Because several models are mixed, if you attach a large file and ask for a summary in Korean, the answer sometimes comes back in English for large contexts, presumably because most of the merged models are English-based (observed while testing the various models). In that case the script translates the output into Korean with Google Translate. Even when the answer came back in English, I confirmed that it was interpreted accurately before being translated and returned.
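Judging from the import list in the script below, the environment needs roughly these packages before the run shown next (a hedged sketch; exact pins are untested, and `googletrans` is notoriously version-sensitive, with the `4.0.0-rc1` pre-release matching this synchronous API):

```
pip install requests beautifulsoup4 PyPDF2 langchain langchain-community googletrans==4.0.0-rc1
```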
```
(.venv) jaylee@lees-MacBook-Pro-2 tmp % python han.py -m yacht:latest -s steve.txt -l en
Model: yacht:latest, Language: en
Source : steve.txt
Processing input...
Reading text file...
Summarizing content...using ollama model : yacht:latest
```

```python
import os
import argparse
import logging

import requests
from bs4 import BeautifulSoup
import PyPDF2
from googletrans import Translator
from langchain_community.chat_models import ChatOllama
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from langchain.chains.summarize import load_summarize_chain
from langchain_core.prompts import PromptTemplate
from langchain.docstore.document import Document


def clean_output(text):
    """Strip the model's stop token and surrounding whitespace."""
    text = text.replace("</s>", "").strip()
    return text


def translate_text(text, src_lang, dest_lang):
    """Translates text from source language to destination language using Google Translate."""
    if src_lang == dest_lang:
        return text
    translator = Translator()
    try:
        translation = translator.translate(text, src=src_lang, dest=dest_lang)
        return translation.text
    except Exception as e:
        logging.error(f"Translation failed: {e}")
        return text


def detect_language(text):
    """Detects the language of the given text."""
    translator = Translator()
    try:
        detected = translator.detect(text)
        return detected.lang
    except Exception as e:
        logging.error(f"Language detection failed: {e}")
        return None


def refine_summary(data, target_lang, ollama_model):
    if target_lang == 'ko':
        prompt_template = """
        다음 텍스트에 대한 전문적인 요약을 제공하여 주세요.
        요약은 한국어의 언어적 뉘앙스에 맞게 최고 수준의 명확성과 세부 사항을 준수해야 합니다.

        Text: `{text}`

        CONCISE SUMMARY :
        """

        refine_template = """
        최종 요약을 제공하여 주세요.
        중간 요약은 다음과 같습니다 : {existing_answer}.
        최종 요약은 한국어의 언어적 뉘앙스에 맞게 최고 수준의 명확성과 세부 사항을 준수해야 합니다.
        -------------
        {text}
        -------------
        최종 요약은 도입부가 있어야 합니다. 도입부는 전체 주제를 제공하여야 합니다.
        본문은 BULLET POINT로 시작하여야 합니다.
        마지막은 결론부가 있어야 합니다. 결론부는 최종 요약의 결론이어야 합니다.
        """
    else:  # default to English if not Korean
        prompt_template = """
        Write a concise summary of the following, extracting the key information:

        Text: `{text}`

        CONCISE SUMMARY :
        """

        refine_template = """
        Your job is to produce a final summary.
        I have provided an existing summary up to a certain point : {existing_answer}.
        Please refine the existing summary with some more context below.
        -------------
        {text}
        -------------
        Start the final summary with an INTRODUCTION PARAGRAPH that gives an overview of the topic,
        FOLLOWED by BULLET POINTS if possible, AND end the summary with a CONCLUSION PHRASE.
""" docs = [Document(page_content=data)] initial_prompt = PromptTemplate(template=prompt_template, input_variables=['text']) refine_prompt = PromptTemplate( template=refine_template, input_variables=['existing_answer', 'text'] ) llm = ChatOllama(model=ollama_model) chain = load_summarize_chain( llm=llm, chain_type='refine', question_prompt=initial_prompt, refine_prompt=refine_prompt, return_intermediate_steps=False ) output_summary = chain.invoke(docs) output_summary['output_text'] cleaned_content = clean_output(output_summary['output_text']) # print(cleaned_content) content_lang = detect_language(cleaned_content) print(f"Current content language: {content_lang}, Target language to be translated to: {target_lang}") if content_lang != target_lang: return translate_text(cleaned_content, content_lang, target_lang) return cleaned_content def invoke_model(text,target_lang,ollama_model): if target_lang == 'ko': messages = [ SystemMessage(content='λ¬Έμ„œμ˜ 핡심 μš”μ•½μ„ μƒμ„Έν•˜κ²Œ μ œκ³΅ν•΄ μ£Όμ‹€ μ „λ¬Έκ°€λ‘œμ„œ, λ‹€μŒ λ¬Έμ„œλ₯Ό μš”μ•½ν•΄ μ£Όμ„Έμš”.'), HumanMessage(content=f'λ‹€μŒ ν…μŠ€νŠΈμ— λŒ€ν•œ 전문적 μš”μ•½μ„ μ œκ³΅ν•΄ μ£Όμ„Έμš”. μš”μ•½μ€ ν•œκ΅­μ–΄μ˜ 언어적 λ‰˜μ•™μŠ€μ— 맞게 졜고 μˆ˜μ€€μ˜ λͺ…ν™•μ„±κ³Ό μ„ΈλΆ€ 사항을 μ€€μˆ˜ν•΄μ•Ό ν•©λ‹ˆλ‹€:\n\nTEXT: {text}') ] else: # default to English if not Korean messages = [ SystemMessage(content='As an adept summarizer, your expertise is required to condense the following document into its essential points in detail.'), HumanMessage(content=f'Kindly provide an expert summary of the text below, adhering to the highest standards of clarity and detail. Ensure the response is tailored to the linguistic nuances of English:\n\nTEXT: {text}') ] try: llm = ChatOllama(model=ollama_model) summary_output = llm.invoke(messages) if isinstance(summary_output, AIMessage): cleaned_content = clean_output(summary_output.content) # print(cleaned_content) content_lang = detect_language(cleaned_content) print(f"Current content language: {content_lang}, Target language to be translated to: {target_lang}") if content_lang != target_lang: return translate_text(cleaned_content, content_lang, target_lang) return cleaned_content else: return "Unexpected data type for model output." 
    except Exception as e:
        print(f"An error occurred while processing the model output: {str(e)}")
        return None


def fetch_text_from_url(url):
    try:
        response = requests.get(url)
        response.raise_for_status()
        soup = BeautifulSoup(response.text, 'html.parser')
        # Look for the main content blocks used by MediaWiki and similar layouts.
        main_content = soup.select_one('#mw-content-text, #bodyContent, .content')
        if not main_content:
            logging.error("No content found in the expected sections.")
            return None
        text_content = ' '.join(p.get_text() for p in main_content.find_all(['p', 'li']))
        return text_content
    except requests.RequestException as e:
        print(f"Failed to fetch data from URL: {str(e)}")
        return None


def read_text_file(file_path):
    with open(file_path, "r", encoding="utf-8") as file:
        return file.read()


def read_pdf(file_path):
    try:
        with open(file_path, "rb") as file:
            reader = PyPDF2.PdfReader(file)
            return ' '.join(page.extract_text() for page in reader.pages if page.extract_text())
    except Exception as e:
        logging.error(f"Error reading PDF file: {e}")
        return None


def summarize_content(source, language, model):
    print("Processing input...")
    text_content = None
    if source.startswith(('http://', 'https://')):
        print("Fetching content from URL...")
        text_content = fetch_text_from_url(source)
    elif os.path.isfile(source):
        _, file_extension = os.path.splitext(source)
        if file_extension.lower() == '.pdf':
            print("Reading PDF...")
            text_content = read_pdf(source)
        elif file_extension.lower() in ['.txt', '.text']:
            print("Reading text file...")
            text_content = read_text_file(source)
        else:
            print("Unsupported file type")
            return
    else:
        print("Unsupported source")
        return

    if text_content:
        print(f"Summarizing content...using ollama model : {model}")
        summary = refine_summary(text_content, language, model)
        print("\n--- Summary of the document ---\n")
        print(summary)
    else:
        print("No text found or unable to extract text from source.")


if __name__ == '__main__':
    defSrc = 'https://en.wikipedia.org/wiki/Britney_Spears'
    defModel = 'llama3:latest'
    defLang = 'en'

    parser = argparse.ArgumentParser()
    parser.add_argument('-s', dest='source', default=defSrc, help='text file or URL')
    parser.add_argument('-m', dest='model', default=defModel, help='ollama model')
    parser.add_argument('-l', dest='language', default=defLang, help='en for English, ko for Korean')
    arg = parser.parse_args()

    print(f"Model: {arg.model}, Language: {arg.language}")
    print(f"Source : {arg.source}")
    summarize_content(arg.source, arg.language, arg.model)
```

## 🚴 Test Result (Streamlit + Ollama Server)

- Normal Question

[![Normal 1](https://i.ibb.co/JqnKm9M/Screenshot-2024-05-05-at-3-53-19-PM.png)](https://ibb.co/KD6XbJ3)

[![Normal 2](https://i.ibb.co/C019WD4/Screenshot-2024-05-05-at-3-54-26-PM.png)](https://ibb.co/s3m1tzc)

- Tech Question

[![Tech](https://i.ibb.co/tMw8R9C/Screenshot-2024-05-05-at-3-39-06-PM.png)](https://ibb.co/JRW5Z9k)

[![Tech](https://i.ibb.co/JkHz1cZ/Screenshot-2024-05-05-at-3-39-18-PM.png)](https://ibb.co/GPRV6dm)

[![Tech](https://i.ibb.co/992TsfJ/Screenshot-2024-05-05-at-3-50-42-PM.png)](https://ibb.co/3yWNB8w)

[![Tech](https://i.ibb.co/4prZRys/Screenshot-2024-05-05-at-3-50-52-PM.png)](https://ibb.co/dDY049b)
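The screenshots above come from a simple Streamlit front end talking to a local Ollama server. The exact app is not included here, but a minimal sketch along the same lines (assuming `pip install streamlit ollama`, a running `ollama serve`, and the `yacht` model created as shown earlier) could look like this:

```python
# Minimal Streamlit chat front end for a local Ollama server (illustrative sketch).
import ollama
import streamlit as st

st.title("YACHT-Llama-3-Ko-8B chat")

# Keep the conversation across Streamlit reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay the conversation so far.
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Ask something..."):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Send the whole history to Ollama and display the reply.
    response = ollama.chat(model="yacht:latest", messages=st.session_state.messages)
    reply = response["message"]["content"]
    st.session_state.messages.append({"role": "assistant", "content": reply})
    with st.chat_message("assistant"):
        st.markdown(reply)
```

Saved as, say, `app.py` (a hypothetical name), it runs with `streamlit run app.py`.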
## 🚵‍♂️ Test Result (Summary Code)

[![summary](https://i.ibb.co/zFYsLXS/Screenshot-2024-05-05-at-4-06-52-PM.png)](https://ibb.co/NrPxk9L)
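The run behind this screenshot follows the same CLI pattern as the English example earlier; a Korean-output invocation would look like this (the source file name is just illustrative):

```
python han.py -m yacht:latest -s steve.txt -l ko
```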