--- license: llama2 tags: - moe - merge base_model: codellama/CodeLlama-13b-Instruct-hf model-index: - name: Winged-Lagomorph-2x13B results: - task: type: text-generation name: Text Generation dataset: name: AI2 Reasoning Challenge (25-Shot) type: ai2_arc config: ARC-Challenge split: test args: num_few_shot: 25 metrics: - type: acc_norm value: 47.95 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: HellaSwag (10-Shot) type: hellaswag split: validation args: num_few_shot: 10 metrics: - type: acc_norm value: 69.39 name: normalized accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: MMLU (5-Shot) type: cais/mmlu config: all split: test args: num_few_shot: 5 metrics: - type: acc value: 44.5 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: TruthfulQA (0-shot) type: truthful_qa config: multiple_choice split: validation args: num_few_shot: 0 metrics: - type: mc2 value: 44.54 source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: Winogrande (5-shot) type: winogrande config: winogrande_xl split: validation args: num_few_shot: 5 metrics: - type: acc value: 67.4 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B name: Open LLM Leaderboard - task: type: text-generation name: Text Generation dataset: name: GSM8k (5-shot) type: gsm8k config: main split: test args: num_few_shot: 5 metrics: - type: acc value: 25.63 name: accuracy source: url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B name: Open LLM Leaderboard --- Grafted WhiteRabbitNeo-13B-v1 and NexusRaven-V2-13B with mergekit. Use the WhiteRabbitNeo template for regular code, and the NR template for function calling. SYSTEM: Answer the Question by exploring multiple reasoning paths as follows: - First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree. - For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts. - Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher. - Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order. - If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts. - Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal. - Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer. - Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process. In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers. USER: Write a pwntools script that overflows a 64 character buffer from stdin. ASSISTANT: To write a pwntools script that overflows a 64-character buffer from stdin, we need to follow these steps: 1. **Identify the vulnerable program**: Find the program that has a buffer overflow vulnerability. This can be done by analyzing the program's source code or by using tools like `gdb` or `valgrind`. 2. **Set up the environment**: Install pwntools and any other necessary dependencies. 3. **Create a pwntools script**: Start by importing the `pwn` module and setting up the connection to the vulnerable program. 4. **Find the offset**: Use the `cyclic` function from pwntools to generate a pattern that will help us find the offset. 5. **Overflow the buffer**: Send the generated pattern to the vulnerable program, and then send additional data to overwrite the return address on the stack. 6. **Control the execution flow**: Overwrite the return address with the address of a shellcode or a function that will give us a shell. 7. **Execute the script**: Run the script and observe the results. Here's an example pwntools script that overflows a 64-character buffer: ```python from pwn import * # Set up the connection to the vulnerable program p = process('./vulnerable_program') # Find the offset using cyclic offset = cyclic_find(b'A'*64) # Generate shellcode shellcode = asm(shellcraft.sh()) # Overflow the buffer payload = b'A' * offset + shellcode p.sendline(payload) # Execute the script p.interactive() ``` In this example, `vulnerable_program` is the name of the vulnerable program, and `shellcraft.sh()` is a function from pwntools that generates shellcode for a Linux/x86-64 system. The `cyclic_find` function is used to find the offset at which the buffer overflow occurs. Please note that this script is a basic example and may not work in all cases. It's important to tailor the script to the specific vulnerability and environment you are working with. # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lodrick-the-lafted__Winged-Lagomorph-2x13B) | Metric |Value| |---------------------------------|----:| |Avg. |49.90| |AI2 Reasoning Challenge (25-Shot)|47.95| |HellaSwag (10-Shot) |69.39| |MMLU (5-Shot) |44.50| |TruthfulQA (0-shot) |44.54| |Winogrande (5-shot) |67.40| |GSM8k (5-shot) |25.63|