---
license: cc-by-4.0
---

# Introducing Mermaid-Llama-6.7B-RAG

Powered by 6.7 billion parameters, this model sets the bar for AI-driven code comprehension and narrative visualization, now with a further reduction in hallucinations inspired by the "Context-Obedient" chat template created by https://huggingface.co/jondurbin. We stand on the shoulders of giants, so thank you, Jon Durbin, the original RAG pioneer for LLMs.

Special thanks to Eric Hartford for personally sharing his intuition on prompt templates; your shared wisdom has helped me develop my own style for my specialized Mermaid models.

Beyond turning input into flow diagrams, this RAG model excels at producing formatted knowledge graphs in Mermaid JS syntax. See more on Mermaid here: https://www.mermaidchart.com

![MermaidLlama GIF](Mermaid_ShowCase/MermaidLlama.webp)

---

```
Note: Over the past two months I have been informed that my models are being used in production. Based on insights into how my models are being used effectively in business environments, I have tailored this model to the needs of those who reached out to me. So please enjoy, and feedback is always welcome, good or bad. I actually prefer bad.

- Current issue: lack of compute. I will solve this once I get a job / money to train. A context length of 4096 is very limiting for those who want full system diagrams without using aggregation strategies.
```

### Key Features

1. **Code Understanding:**
   - Masters Python's intricacies.
   - Generates accurate Mermaid diagram flow charts.
   - Ideal for developers visualizing code logic.
2. **Storytelling Capabilities:**
   - Converts narratives into captivating Mermaid diagrams.
   - Maps character interactions, plot developments, and narrative arcs.
3. **Unmatched Performance:**
   - Surpasses GPT-4 in generating well-organized Mermaid diagrams.
4. **Enhanced Adherence to Context (New):**
   - Incorporates contextual prompts to improve adherence and reduce hallucinations.
   - Supports the airoboros context-obedient format.

### Collaboration

For collaboration opportunities to enhance Mermaid's capabilities, contact troydoesai@gmail.com.

### Use Cases

- **Retrieval-Augmented Generation (RAG):** Creates condensed knowledge graphs that improve retrieval from vector databases, combining knowledge graphs with context-aware RAG capabilities for better knowledge condensation.
- **Code Documentation:** Generates automatic visual flow charts from Python code.
- **Storyboarding:** Creates visually appealing diagrams for storytelling.
- **Project Planning:** Generates visual project flow maps for effective team communication.
- **Learning Python:** Assists students in visualizing Python code structures.
- **Game Design:** Visualizes game storylines for coherent narrative structure.

### Dataset Format (New)

To enhance contextual adherence and reduce hallucinations, the dataset follows the format below:

```
BEGININPUT
BEGINCONTEXT
[key0: value0]
[key1: value1]
ENDCONTEXT
[insert your text blocks here]
ENDINPUT
BEGININSTRUCTION
[insert your instruction(s)]
ENDINSTRUCTION
```

This structure, while verbose, helps the model tie its responses to specific sources.

### Example

**Prompt:**

```
BEGININPUT
BEGINCONTEXT
date: 2021-01-01
url: https://web.site/123
ENDCONTEXT
Blueberries are now green.
ENDINPUT
BEGININSTRUCTION
What color are blueberries? Source?
ENDINSTRUCTION
```

**Expected Response:**

```
Blueberries are now green.

Source:
date: 2021-01-01
url: https://web.site/123
```

### Proof of Concept

A VSCode extension is forthcoming that will provide a live flow map after you pause typing for more than 10 seconds.
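The context-obedient prompt in the example above can be assembled programmatically. A minimal sketch, assuming a hypothetical `build_rag_prompt` helper (not part of any released tooling):

```python
def build_rag_prompt(context_blocks, instruction):
    """Assemble a prompt in the context-obedient format shown above:
    each text block carries its own metadata so the model can cite
    sources instead of hallucinating."""
    parts = []
    for metadata, text in context_blocks:
        meta = "\n".join(f"{k}: {v}" for k, v in metadata.items())
        parts.append(
            f"BEGININPUT\nBEGINCONTEXT\n{meta}\nENDCONTEXT\n{text}\nENDINPUT"
        )
    parts.append(f"BEGININSTRUCTION\n{instruction}\nENDINSTRUCTION")
    return "\n".join(parts)

# Reproduce the blueberry example from above.
prompt = build_rag_prompt(
    [({"date": "2021-01-01", "url": "https://web.site/123"},
      "Blueberries are now green.")],
    "What color are blueberries? Source?",
)
print(prompt)
```

Passing multiple `(metadata, text)` pairs yields one `BEGININPUT ... ENDINPUT` section per retrieved chunk, which is how the format scales to multi-document RAG.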
### Training Specifications

- **LoRA Rank:** 2048
- **LoRA Alpha:** 4096
- **Batch Size:** 1
- **Micro Batch Size:** 1
- **Cutoff Length:** 4096
- **Save every n steps:** 1000
- **Epochs:** 3
- **Learning Rate:** 1e-6
- **LR Scheduler:** Cosine

**Target Modules:**

- Enable q_proj
- Enable v_proj
- Enable k_proj
- Enable o_proj
- Enable gate_proj
- Enable down_proj
- Enable up_proj

---

## Getting Started

Start by downloading one of my models.

![0 TroyDoesAI GIF](Mermaid_ShowCase/0_TroyDoesAI.gif)

Load the model.

![1 Load Model in 4-bit Show Example Use GIF](Mermaid_ShowCase/1_LoadModel_in_4bit_Show_Example_Use.gif)

Use my prompt template to generate a Mermaid code block, which can be viewed in the Mermaid Live Editor or rendered with the Mermaid CLI tool.

![2 Loaded Model in Full Precision 16-bit Show Inference and Mermaid Live Editor GIF](Mermaid_ShowCase/2_Loaded_Model_in_Full_Precision_16bit_Show_Inference_and_Mermaid_Live_editor.gif)

Here we open the VLLM GUI program while Mermaid-Llama-8B is still running in VRAM, to compare the flow diagram to the actual program and show the lightweight capabilities of small models on consumer hardware.

![3 Open The Program VLLM Program With Full Precision Mermaid-Llama-8B Running to Evaluate Flow Map GIF](Mermaid_ShowCase/3_Open_The_Program_VLLM_Program_With_Full_Precision_Mermaid-Llama-8B-Running_to_evaluate_flow_map.gif)

## More on my VLLM class and inference GUI: https://github.com/Troys-Code/VLLM

![Python RtdBsaz8gy GIF](Mermaid_ShowCase/python_RtdBsaz8gy.gif)

---

Note: This model should be treated as an auto-complete model. Do not try talking to it in chat; you will get garbage. Those layers have been pruned and replaced, and that is all you will hear of my secret sauce for training on small (< 1000 entry) datasets.

```
ԅ(≖‿≖ԅ) STAY TUNED: THERE'S MORE TO COME. SOON MERMAID MODELS WILL BE ABLE TO TURN "MERMAID" --> "CODE"

This new dataset is going to be a game changer for refactoring code blocks if it works.
I am interviewing like crazy, so this may take some time; my days have been hectic. Imagine studying for finals week every week.
```

Video on how to use the Colab notebook and run inference on the model in the simplest example:

https://m.youtube.com/watch?v=fdwoOmiA2d0

Colab notebook:

https://colab.research.google.com/github/Troys-Code/MermaidEngine/blob/main/Mermaid_Llama_RAG_Colab_TextGen_GPU.ipynb
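Once the model responds, the Mermaid code block still has to be pulled out of the raw output before pasting it into the Mermaid Live Editor or piping it to the Mermaid CLI mentioned above. A minimal sketch, assuming the output wraps the diagram in a fenced `mermaid` block (the `extract_mermaid` helper is hypothetical, not part of any released tooling):

```python
import re

def extract_mermaid(text):
    """Pull the first fenced mermaid block out of raw model output so it
    can be pasted into the Mermaid Live Editor or fed to the Mermaid CLI."""
    match = re.search(r"```mermaid\s*\n(.*?)```", text, re.DOTALL)
    return match.group(1).strip() if match else None

# A made-up sample of model output for illustration.
sample = "Here is your diagram:\n```mermaid\ngraph TD\n    A[Start] --> B[End]\n```\nDone."
print(extract_mermaid(sample))
```

If the model emits the diagram bare (no fence), skip the extraction step and paste the output directly.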