Pankaj Mathur committed
Commit 4c98e7e
1 Parent(s): 55dc209

Update README.md

Files changed (1): README.md +7 -2
README.md CHANGED
@@ -1,12 +1,16 @@
-# alpaca_orca_open_llama: A Instruction-Following OpeLLaMA Model using Orca approaches on Alpaca dataset
+# alpaca_orca_open_llama: An Open_LLaMA-3B model trained on Alpaca dataset using Orca Research paper approaches
 
 
 # Dataset and Training
 
 We train OpenLLaMa-3B model on the custom Alpaca dataset created using Orca Research Paper approaches.
+
 Please pay attention how System prompt is added and used for each instruction.
+
 The training configurations are provided in the table below.
+
 The training takes on 4 x A600(50G) GPUs and lasts for around 20 Hours for cost of $66.
+
 We used DeepSpeed with Zero-3 approaches for parallel gpu training.
 
 |||
@@ -62,4 +66,5 @@ with torch.no_grad():
 
     output = rest[0][length:]
     string = tokenizer.decode(output, skip_special_tokens=True)
-    print(f'[!] Generation results: {string}')
+    print(f'[!] Generation results: {string}')
+```
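
The README lines in this diff stress that a system prompt is prepended to every instruction, but the commit itself does not show the template. A minimal sketch of an Orca-style prompt builder, assuming a `### System:` / `### User:` / `### Response:` layout (the exact field names and wording are assumptions, not taken from this commit):

```python
# Hypothetical Orca-style prompt assembly. The template layout below is an
# assumption for illustration; the actual template is not shown in this diff.
def build_prompt(system: str, instruction: str, input_text: str = "") -> str:
    prompt = f"### System:\n{system}\n\n### User:\n{instruction}\n\n"
    if input_text:
        # Optional extra context, mirroring the Alpaca-style "input" field.
        prompt += f"### Input:\n{input_text}\n\n"
    return prompt + "### Response:\n"

print(build_prompt(
    "You are an AI assistant that follows instructions extremely well.",
    "Summarize the ZeRO-3 memory-partitioning idea in one sentence.",
))
```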
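The second hunk shows only the tail of the README's generation example. A self-contained sketch of how that tail plausibly fits into a full `transformers` generation loop; the repo id, sampling parameters, and dtype are assumptions:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint id; substitute the actual model repo.
model_name = "psmathur/alpaca_orca_open_llama_3b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

prompt = "### System:\nYou are a helpful assistant.\n\n### User:\nName the planets.\n\n### Response:\n"
tokens = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
length = tokens.shape[1]  # prompt length in tokens, used to strip the echoed prompt below

with torch.no_grad():
    rest = model.generate(tokens, max_new_tokens=128, do_sample=True, top_p=0.9)

# These last three lines are the ones visible in the diff's second hunk.
output = rest[0][length:]
string = tokenizer.decode(output, skip_special_tokens=True)
print(f'[!] Generation results: {string}')
```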
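The diff also records that training used DeepSpeed with ZeRO-3 across 4 GPUs, but the configuration file is not part of this commit. A minimal ZeRO stage-3 config of the usual shape, with every numeric value an assumption:

```python
import json

# Sketch of a DeepSpeed ZeRO-3 config; batch sizes and dtype flags here are
# assumptions, not the values actually used to train this model.
ds_config = {
    "train_batch_size": 16,
    "gradient_accumulation_steps": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                   # partition optimizer state, gradients, and parameters
        "overlap_comm": True,         # overlap communication with backward compute
        "contiguous_gradients": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

# Hypothetical launch, assuming an HF Trainer-style script named train.py:
#   deepspeed --num_gpus 4 train.py --deepspeed ds_config.json
```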