Shobhank-iiitdwd committed on
Commit 30846b9 (1 parent: 9c3f3fb)

Update README.md

Files changed (1): README.md (+2 −33)
README.md CHANGED
@@ -185,7 +185,7 @@ parameters:
  encoder_no_repeat_ngram_size: 3
  num_beams: 4
  model-index:
- - name: pszemraj/long-t5-tglobal-base-16384-book-summary
+ - name: Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary
  results:
  - task:
      type: summarization
@@ -499,7 +499,7 @@ from transformers import pipeline
 
  summarizer = pipeline(
      "summarization",
-     "pszemraj/long-t5-tglobal-base-16384-book-summary",
+     "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary",
      device=0 if torch.cuda.is_available() else -1,
  )
  long_text = "Here is a lot of text I don't want to read. Replace me"
@@ -508,37 +508,6 @@ result = summarizer(long_text)
  print(result[0]["summary_text"])
  ```
 
- Pass [other parameters related to beam search textgen](https://huggingface.co/blog/how-to-generate) when calling `summarizer` to get even higher-quality results.
-
- ## Intended uses & limitations
-
- - The current checkpoint is fairly well converged but will be updated if further improvements can be made.
- - Compare performance to [LED-base](https://huggingface.co/pszemraj/led-base-book-summary) trained on the same dataset (API gen parameters are the same).
- - While this model seems to improve factual consistency, **do not take summaries to be foolproof; check anything that seems odd**.
-
- ## Training and evaluation data
-
- The `kmfoda/booksum` dataset on HuggingFace - read [the original paper here](https://arxiv.org/abs/2105.08209). Summaries longer than 1024 LongT5 tokens were filtered out to prevent the model from learning to generate "partial" summaries.
-
-
-
- ### How to run inference over a very long (30k+ token) document in batches?
-
- See `summarize.py` in [the code for my hf space Document Summarization](https://huggingface.co/spaces/pszemraj/document-summarization/blob/main/summarize.py) :)
-
- You can also use the same code to split a document into batches of 4096 tokens, etc., and run the model over those batches. This is useful when CUDA memory is limited.
-
- ### How to fine-tune further?
-
- See [train with a script](https://huggingface.co/docs/transformers/run_scripts) and [the summarization scripts](https://github.com/huggingface/transformers/tree/main/examples/pytorch/summarization).
-
- This model was originally tuned on Google Colab with a heavily modified variant of the [longformer training notebook](https://github.com/patrickvonplaten/notebooks/blob/master/Fine_tune_Longformer_Encoder_Decoder_(LED)_for_Summarization_on_pubmed.ipynb), the key enabler being DeepSpeed. You can try this as an alternative route to fine-tuning the model without using the command line.
-
- * * *
-
- ## Training procedure
-
-
  ### Training hyperparameters
 
  _NOTE: early checkpoints of this model were trained on a "smaller" subsection of the dataset as it was filtered for summaries of **1024 characters**. This was subsequently caught and adjusted to **1024 tokens** and then trained further for 10+ epochs._
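
After this rename, the README's usage snippet points at the new checkpoint ID. A minimal end-to-end sketch is shown below; the explicit generation kwargs are illustrative assumptions that simply mirror the `num_beams: 4` and `encoder_no_repeat_ngram_size: 3` values in the card's `parameters` block, following the removed note about passing beam-search parameters to `summarizer`.

```python
# Minimal sketch of the renamed pipeline usage (not part of the diff itself).
# The generation kwargs are illustrative; they mirror the card's `parameters` block.
import torch
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)

long_text = "Here is a lot of text I don't want to read. Replace me"

# Extra keyword arguments are forwarded to `model.generate()` by the pipeline.
result = summarizer(
    long_text,
    num_beams=4,
    encoder_no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```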
 
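The removed "How to run inference over a very long (30k+ token) document in batches?" section defers to `summarize.py` in the pszemraj/document-summarization space. As a rough stand-in (not that script), here is a minimal sketch that splits a document into ~4096-token chunks with the model's tokenizer, summarizes each chunk, and joins the partial summaries; the `summarize_long` helper and the 4096-token chunk size are illustrative assumptions taken from the removed text.

```python
# Hypothetical helper (not the space's summarize.py): chunked summarization for
# documents too long to process in one pass on limited CUDA memory.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "Shobhank-iiitdwd/long-t5-tglobal-base-16384-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_id)
summarizer = pipeline(
    "summarization",
    model_id,
    tokenizer=tokenizer,
    device=0 if torch.cuda.is_available() else -1,
)


def summarize_long(text: str, chunk_tokens: int = 4096) -> str:
    """Split `text` into ~chunk_tokens-sized pieces, summarize each, and join."""
    # Tokenize once, slice the token ids into fixed-size windows, decode back to text.
    ids = tokenizer(text, truncation=False)["input_ids"]
    chunks = [
        tokenizer.decode(ids[i : i + chunk_tokens], skip_special_tokens=True)
        for i in range(0, len(ids), chunk_tokens)
    ]
    # The pipeline accepts a list of inputs and returns one summary per chunk.
    partial = summarizer(chunks, num_beams=4, encoder_no_repeat_ngram_size=3)
    return "\n".join(p["summary_text"] for p in partial)
```

Smaller chunks lower peak memory at the cost of cross-chunk context, which matches the removed note about memory-constrained setups.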