Training Data

Chart-to-text: Kanthara, S., Leong, R. T. K., Lin, X., Masry, A., Thakkar, M., Hoque, E., & Joty, S. (2022). Chart-to-Text: A Large-Scale Benchmark for Chart Summarization. arXiv preprint arXiv:2203.06486.

GitHub link for the data: https://github.com/vis-nlp/Chart-to-text

Example use:

Prepend the prefix 'C2T: ' to every input to the model:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained('saadob12/t5_C2T_big')
model = AutoModelForSeq2SeqLM.from_pretrained('saadob12/t5_C2T_big')

data = 'Breakdown of coronavirus  ( COVID-19 ) deaths in South Korea as of March 16 , 2020 , by chronic disease x-y labels Response - Share of cases, x-y values Circulatory system disease* 62.7% , Endocrine and metabolic diseases** 46.7% , Mental illness*** 25.3% , Respiratory diseases*** 24% , Urinary and genital diseases 14.7% , Cancer 13.3% , Nervous system diseases 4% , Digestive system diseases 2.7% , Blood and hematopoietic diseases 1.3%'

prefix = 'C2T: '  # task prefix required on every input
# Tokenize the prefixed input, truncating and padding to the maximum length
tokens = tokenizer.encode(prefix + data, truncation=True, padding='max_length', return_tensors='pt')
# Generate the summary with beam search
generated = model.generate(tokens, num_beams=4, max_length=256)
# Decode the top sequence back into text
tgt_text = tokenizer.decode(generated[0], skip_special_tokens=True, clean_up_tokenization_spaces=True)
summary = str(tgt_text).strip('[]""')
# Summary: As of March 16, 2020, around 62.7 percent of all deaths due to the coronavirus ( COVID-19 ) in South Korea were related to circulatory system diseases. Other chronic diseases include endocrine and metabolic diseases, mental illness, and cancer. South Korea confirmed 30,017 cases of infection including 501 deaths. For further information about the coronavirus ( COVID-19 ) pandemic, please visit our dedicated Facts and Figures page.

Intended Use and Limitations

You can use the model to generate text summaries of tabular chart data. It works well for general statistics such as the following:

Year Children born per woman
2018 1.14
2017 1.45
2016 1.49
2015 1.54
2014 1.6
2013 1.65

For the following kind of data, it produces an okay summary at best, and may not manage even that:

Model BLEU score BLEURT
t5-small 25.4 -0.11
t5-base 28.2 0.12
t5-large 35.4 0.34
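
The input string in the usage example appears to follow a linearized layout: a title, then 'x-y labels', then 'x-y values' with entries separated by ' , '. The sketch below shows one way a two-column table like the first one above could be flattened into that layout. The layout is inferred from the single example on this card rather than from a documented specification, linearize_table is a hypothetical helper, and the chart title is a placeholder, so treat this as an assumption to check against the Chart-to-text data.

```python
def linearize_table(title, x_label, y_label, rows, prefix="C2T: "):
    """Flatten a two-column table into a single model input string.

    Assumed layout (inferred from the usage example, not an official spec):
    '<title> x-y labels <x> - <y>, x-y values <x1> <y1> , <x2> <y2> , ...'
    """
    values = " , ".join(f"{x} {y}" for x, y in rows)
    return f"{prefix}{title} x-y labels {x_label} - {y_label}, x-y values {values}"


# First three rows of the fertility table; the title is a placeholder.
rows = [("2018", "1.14"), ("2017", "1.45"), ("2016", "1.49")]
text = linearize_table("Fertility rate by year", "Year", "Children born per woman", rows)
print(text)
```

The resulting string already carries the 'C2T: ' prefix, so it can be passed to tokenizer.encode directly in place of prefix + data.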

Citation

Kindly cite my work. Thank you.

  @misc{obaid ul islam_2022, 
      title={saadob12/t5_C2T_big Hugging Face}, 
      url={https://huggingface.co/saadob12/t5_C2T_big}, 
      journal={Huggingface.co}, 
      author={Obaid ul Islam, Saad}, 
      year={2022} 
  }