zolekode committed on
Commit
51d30ac
1 Parent(s): b5ba718

updated read me

Files changed (2)
  1. .gitattributes +6 -0
  2. README.md +3 -3
.gitattributes CHANGED
@@ -14,3 +14,9 @@
  *.pb filter=lfs diff=lfs merge=lfs -text
  *.pt filter=lfs diff=lfs merge=lfs -text
  *.pth filter=lfs diff=lfs merge=lfs -text
+ t5-small-wav2vec2-grammar-fixer/spiece.model filter=lfs diff=lfs merge=lfs -text
+ t5-small-wav2vec2-grammar-fixer/tf_model.h5 filter=lfs diff=lfs merge=lfs -text
+ t5-small-wav2vec2-grammar-fixer/tokenizer_config.json filter=lfs diff=lfs merge=lfs -text
+ t5-small-wav2vec2-grammar-fixer/config.json filter=lfs diff=lfs merge=lfs -text
+ t5-small-wav2vec2-grammar-fixer/pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
+ t5-small-wav2vec2-grammar-fixer/special_tokens_map.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,7 +1,7 @@
  # flexudy-pipe-question-generation-v2
  After transcribing your audio with Wav2Vec2, you might be interested in a post processor.
 
- I trained it with only 42K paragraphs from the SQUAD dataset. All paragraphs had at most 128 tokens (separated by white spaces)
+ All paragraphs had at most 128 tokens (separated by white spaces)
 
  ```python
  from transformers import T5Tokenizer, T5ForConditionalGeneration
@@ -38,7 +38,7 @@ BEFORE HE HAD TIME TO ANSWER A MUCH ENCUMBERED VERA BURST INTO THE ROOM WITH THE
  ```
  OUTPUT 1:
  ```
- Before he had time to answer a much-enumbered era burst into the room with the question, I say, "Can I leave these here?" In 2002, these were a small black pig and a dusty specimen of black red game cock.
+ Before he had time to answer a much encumbered vara burst into the room with the question, I say, can I leave these here. In 2002, these were a small black pig and a lusty specimen of black red game cock.
  ```
 
  INPUT 2:
@@ -48,7 +48,7 @@ GOING ALONG SLUSHY COUNTRY ROADS AND SPEAKING TO DAMP AUDIENCES IN DRAUGHTY SCHO
 
  OUTPUT 2:
  ```
- Going along Slushy Country Roads and speaking to damp audiences in Droughty School rooms day after day for a fortnight, he'll have to put in an appearance at some place of worship on Sunday morning and he can come to us immediately afterwards.
+ Going along Slushy Country Roads and speaking to damp audiences in Draughty School Rooms Day After day for a weekend, he'll have to put in an appearance at some place of worship on Sunday morning and he can come to us immediately afterwards.
  ```
  I strongly recommend improving the performance via further fine-tuning or by training more examples.
  - Possible Quick Rule based improvements: Align the transcribed version and the generated version. If the similarity of two words (case-insensitive) vary by more than some threshold based on some similarity metric (e.g. Levenshtein), then keep the transcribed word.
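
The rule-based improvement described in the last README line could be sketched roughly as follows. This is a minimal illustration and not the author's implementation: the helper names (`levenshtein`, `normalize`, `similarity`, `keep_close_words`), the normalized-distance formula, and the 0.5 threshold are all assumptions made here. Word alignment uses `difflib.SequenceMatcher` from the Python standard library, and only one-to-one word replacements are checked.

```python
from difflib import SequenceMatcher


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, row by row)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(word: str) -> str:
    """Case-insensitive comparison form: lowercase, outer punctuation removed."""
    return word.strip(".,;:!?\"'").lower()


def similarity(a: str, b: str) -> float:
    """Hypothetical metric: 1 minus the length-normalized edit distance."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def keep_close_words(transcribed: str, generated: str, threshold: float = 0.5) -> str:
    """Align both texts word by word; where the model swapped a word for a
    very different one, fall back to the (lowercased) transcribed word."""
    source = transcribed.split()
    target = generated.split()
    matcher = SequenceMatcher(a=[normalize(w) for w in source],
                              b=[normalize(w) for w in target],
                              autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            # One-to-one substitutions: compare each aligned word pair.
            for src, tgt in zip(source[i1:i2], target[j1:j2]):
                result.append(tgt if similarity(src, tgt) >= threshold else src.lower())
        else:
            # Equal spans, insertions, deletions, uneven replacements:
            # trust the generated text as-is.
            result.extend(target[j1:j2])
    return " ".join(result)


print(keep_close_words("DAY AFTER DAY FOR A FORTNIGHT", "Day after day for a weekend"))
```

With these assumptions, a substitution such as "FORTNIGHT" to "weekend" is reverted to the transcribed word, while a small spelling change such as "VERA" to "vara" is left as generated. Punctuation attached to a reverted word is lost in this sketch, and the threshold would need tuning on real transcripts.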