nunonmg committed on
Commit
923de31
·
1 Parent(s): b47c7ad

Update README.md

Files changed (1)
  1. README.md +22 -17
README.md CHANGED
@@ -25,27 +25,32 @@ This modelcard aims to be a base template for new models. It has been generated
25
 
26
  ### Model Description
27
 
28
- <!-- Provide a longer summary of what this model is. -->
29
-
 
30
 
31
 
32
  - **Developed by:** Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
33
- - **Model type:** [More Information Needed]
34
- - **Language(s) (NLP):** [More Information Needed]
35
  - **License:** CC-BY-NC-4.0
36
- - **Finetuned from model [optional]:** LLaMA2
37
-
38
- ### Model Sources [optional]
39
-
40
- <!-- Provide the basic links for the model. -->
41
-
42
- - **Repository:** TBA
43
- - **Paper [optional]:** TBA
44
- - **Demo [optional]:** TBA
45
-
46
- ## Uses
47
-
48
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
 
 
 
49
 
50
  ### Direct Use
51
 
 
25
 
26
  ### Model Description
27
 
28
+ TowerInstruct is a language model that results from fine-tuning TowerBase on the TowerBricks supervised fine-tuning dataset. TowerInstruct v0.1 is the first model in the series.
29
+ The model is trained to handle several translation-related tasks, such as general machine translation (e.g., sentence- and document-level translation, terminology-aware translation, context-aware translation), automatic post edition, named-entity recognition, grammatical error correction, and paraphrase generation.
30
+ We will release more details in the upcoming technical report.
31
 
32
 
33
  - **Developed by:** Unbabel, Instituto Superior Técnico, CentraleSupélec University of Paris-Saclay
34
+ - **Model type:** A 7B-parameter model fine-tuned on a mix of publicly available, synthetic datasets on translation-related tasks, as well as conversational datasets and code instructions.
35
+ - **Language(s) (NLP):** English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, Russian
36
  - **License:** CC-BY-NC-4.0
37
+ - **Finetuned from model [optional]:** TowerBase
38
+
39
+ ## Intended uses & limitations
40
+
41
+ The model was initially fine-tuned on a filtered and preprocessed supervised fine-tuning dataset (TowerBricks), which spans a diverse range of tasks and data sources:
42
+ - Translation
43
+ - Automatic Post Edition
44
+ - Machine Translation Evaluation
45
+ - Context-aware Translation
46
+ - Terminology-aware Translation
47
+ - Multi-reference Translation
48
+ - Named-entity Recognition
49
+ - Paraphrase Generation
50
+ - Synthetic Chat data
51
+ - Code instructions
52
+
53
+ You can find the dataset and all data sources of TowerBricks here.
54
 
55
  ### Direct Use
56