azhx committed on
Commit
fc0acaf
1 Parent(s): c054478

Create README.md

  1. README.md +94 -0
README.md ADDED
@@ -0,0 +1,94 @@
+ ---
+ license: mit
+ datasets:
+ - TIGER-Lab/SKGInstruct
+ language:
+ - en
+ ---
+ # 🏗️ StructLM: Towards Building Generalist Models for Structured Knowledge Grounding
+
+
+
+ Project Page: [https://tiger-ai-lab.github.io/StructLM/](https://tiger-ai-lab.github.io/StructLM/)
+
+ Paper: arXiv link not yet announced
+
+ Code: [https://github.com/TIGER-AI-Lab/StructLM](https://github.com/TIGER-AI-Lab/StructLM)
+
+
+ ![StructLM thumbnail](https://raw.githubusercontent.com/TIGER-AI-Lab/StructLM/gh-pages/static/images/thumbnail.drawio%20(1).png)
+
+
+ ## Introduction
+ StructLM is a series of open-source large language models (LLMs) fine-tuned for structured knowledge grounding (SKG) tasks. We release 3 models:
+
+ 7B | [StructLM-7B](https://huggingface.co/TIGER-Lab/StructLM-7B)
+
+ 13B | [StructLM-13B](https://huggingface.co/TIGER-Lab/StructLM-13B)
+
+ 34B | [StructLM-34B](https://huggingface.co/TIGER-Lab/StructLM-34B)
+
+
+ ## Training Data
+ These models are trained on the 🤗 [SKGInstruct Dataset](https://huggingface.co/datasets/TIGER-Lab/SKGInstruct), an instruction-tuning dataset containing a mixture of 19 SKG tasks combined with 🤗 [SlimOrca](https://huggingface.co/datasets/Open-Orca/SlimOrca). Check out the dataset card for more details.
+
+
+ ## Training Procedure
+ The models are fine-tuned from CodeLlama-Instruct-hf base models. Each model is trained for 3 epochs, and the best checkpoint is selected.
+
+ ## Evaluation
+ Here is a subset of the model evaluation results:
+
+ ### Held in
+
+ | **Model** | **ToTTo** | **GrailQA** | **CompWebQ** | **MMQA** | **Feverous** | **Spider** | **TabFact** | **Dart** |
+ |------------------|------|------|------|------|------|------|------|------|
+ | **StructLM-7B** | 49.4 | 80.4 | 78.3 | 85.2 | 84.4 | 72.4 | 80.8 | 62.2 |
+ | **StructLM-13B** | 49.3 | 79.2 | 80.4 | 86.0 | 85.0 | 74.1 | 84.7 | 61.4 |
+ | **StructLM-34B** | 50.2 | 82.2 | 81.9 | 88.1 | 85.7 | 74.6 | 86.6 | 61.8 |
+
+
+ ### Held out
+ | **Model** | **BIRD** | **InfoTabs** | **FinQA** | **SQA** |
+ |------------------|------|------|------|------|
+ | **StructLM-7B** | 22.3 | 55.3 | 27.3 | 49.7 |
+ | **StructLM-13B** | 22.8 | 58.1 | 25.6 | 36.1 |
+ | **StructLM-34B** | 24.7 | 61.8 | 36.2 | 44.2 |
+
+
+ ## Usage
+ You can use the models through Hugging Face's Transformers library.
+ Check our GitHub repo for the evaluation code: [https://github.com/TIGER-AI-Lab/StructLM](https://github.com/TIGER-AI-Lab/StructLM)
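
As a minimal sketch of loading a checkpoint with Transformers: the model ID is the 13B repository linked in the Introduction, while the helper name `generate_answer`, the `max_new_tokens` default, and greedy decoding are illustrative assumptions, not settings taken from the evaluation code. The `prompt` argument is expected to already follow the template given under Prompt Format.

```python
MODEL_ID = "TIGER-Lab/StructLM-13B"  # repository linked in the Introduction


def generate_answer(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: generate a completion for an already formatted prompt."""
    # Imported lazily so that defining the helper does not require the library.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # torch_dtype="auto" uses the dtype stored in the checkpoint config.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding is an assumption; the evaluation code may use other settings.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

Calling `generate_answer(...)` downloads the full checkpoint on first use, so for the larger models you will likely want a GPU and a device placement strategy suited to your hardware.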
+
+
+ ## Prompt Format
+
+ **IMPORTANT GOTCHA**
+
+ **For this 13B model, the prompt format is as follows (note the doubled `[INST]` and `[/INST]` tags):**
+ ```
+ [INST] [INST] <<SYS>>
+ You are an AI assistant that specializes in analyzing and reasoning
+ over structured information. You will be given a task, optionally
+ with some structured knowledge input. Your answer must strictly
+ adhere to the output format, if specified.
+ <</SYS>>
+ {instruction} [/INST] [/INST]
+ ```
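
The template above can be assembled programmatically. The sketch below is one way to do it; the names `SYSTEM_PROMPT` and `build_prompt` are hypothetical, and joining the wrapped system-prompt lines into a single line is an assumption, since the card does not say whether those line breaks are literal.

```python
# System prompt copied from the template; joining the wrapped lines with
# single spaces is an assumption about how the line breaks should be read.
SYSTEM_PROMPT = (
    "You are an AI assistant that specializes in analyzing and reasoning "
    "over structured information. You will be given a task, optionally "
    "with some structured knowledge input. Your answer must strictly "
    "adhere to the output format, if specified."
)


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the 13B prompt template (hypothetical helper)."""
    # The doubled [INST] / [/INST] tags mirror the template shown in the card.
    return (
        "[INST] [INST] <<SYS>>\n"
        f"{SYSTEM_PROMPT}\n"
        "<</SYS>>\n"
        f"{instruction} [/INST] [/INST]"
    )
```

The `instruction` string should contain both the task description and any linearized structured input.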
+
+ During training, we linearize structured inputs of all types following the procedures from [UnifiedSKG](https://arxiv.org/pdf/2201.05966.pdf), so prompts that use the same linearization at inference time will be most effective.
+ For concrete examples of this linearization, see the 🤗 [SKGInstruct Dataset](https://huggingface.co/datasets/TIGER-Lab/SKGInstruct).
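
To give a rough idea of what a linearized table can look like, here is a small sketch. The `col : ... row 1 : ...` layout is an assumption modeled on the row-indexed table format commonly associated with UnifiedSKG, and the helper name `linearize_table` is hypothetical; verify the exact separators against examples in the SKGInstruct dataset before relying on it.

```python
def linearize_table(header, rows):
    """Flatten a table into one string (assumed UnifiedSKG-style layout)."""
    # Header cells are joined with " | " after a "col : " prefix.
    parts = ["col : " + " | ".join(str(cell) for cell in header)]
    # Each row is prefixed with its 1-based index.
    for index, row in enumerate(rows, start=1):
        parts.append(f"row {index} : " + " | ".join(str(cell) for cell in row))
    return " ".join(parts)


table_text = linearize_table(
    ["city", "population"],
    [["Oslo", 700000], ["Bergen", 285000]],
)
print(table_text)
# col : city | population row 1 : Oslo | 700000 row 2 : Bergen | 285000
```

Other structured inputs, such as knowledge graph triples or database schemas, use their own linearization schemes, which this sketch does not cover.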
+
+ ## Intended Uses
+ These models are trained for research purposes. They are designed to be proficient in interpreting linearized structured input. Potential downstream uses include applications that require interpreting structured data.
+
+ ## Limitations
+ While we've tried to build an SKG-specialized model capable of generalizing, we have shown that this is a challenging domain, and the models may lack the performance characteristics needed for direct use in chat or other applications.
+
+
+ ## Citation
+ If you use the models, data, or code from this project, please cite the original paper:
+
+ ```
+ to be updated
+ ```