RichardErkhov/Narrativaai_-_bloom-560m-finetuned-totto-table-to-text-8bits

Quantization made by Richard Erkhov.

bloom-560m-finetuned-totto-table-to-text - bnb 8bits

Model creator: https://huggingface.co/Narrativaai/
Original model: https://huggingface.co/Narrativaai/bloom-560m-finetuned-totto-table-to-text/

Original model description:

language:
- en
tags:
- table-to-text
- tabular
- Narratable
datasets:
- totto
widget:
text: " John Higgins Minor-ranking finals: 6 (3 titles, 3 runners-up) Outcome No. Outcome Year Outcome No. Championship Outcome No. Year Opponent in the final Outcome No. Year Championship Score Outcome No. Year Championship Opponent in the final Winner Outcome 1. No. 2010 Year Ruhr Championship Championship England Shaun Murphy Opponent in the final 4–2 Score Runner-up Outcome 1. No. 2010 Year Prague Classic Championship England Michael Holt Opponent in the final 3–4 Score Runner-up Outcome 2. No. 2011 Year Players Tour Championship – Event 5 Championship England Andrew Higginson Opponent in the final 1–4 Score Winner Outcome 2. No. 2012 Year Kay Suzanne Memorial Trophy Championship England Judd Trump Opponent in the final 4–2 Score Runner-up Outcome 3. No. 2012 Year Bulgarian Open Championship England Judd Trump Opponent in the final 0–4 Score Winner Outcome 3. No. 2013 Year Bulgarian Open Championship Australia Neil Robertson Opponent in the final 4–1 Score

\n\n"

inference:
  parameters:
    max_length: 500

BLOOM (0.56B) fine-tuned on ToTTo for Table-to-text 📋 ➡️ 🔤 aka NARRATABLE

This model is a fine-tuned version of bigscience/bloom-560m on the ToTTo dataset.

The model 🧠

It is the 560M-parameter version of BLOOM 🌸

The dataset 📚

ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.

During the dataset creation process, tables from English Wikipedia are matched with (noisy) descriptions. Each table cell mentioned in the description is highlighted and the descriptions are iteratively cleaned and corrected to faithfully reflect the content of the highlighted cells.
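To make the task input concrete, here is a minimal sketch of how a table with highlighted cells might be flattened into a single tagged string. This is not the `preprocess` function shipped with the original repo. The tag names (`<page_title>`, `<cell>`, `<col_header>`, and so on) and the rule for finding column headers are assumptions made for illustration, and row/column spans are ignored. The field names follow the ToTTo example layout (`table`, `highlighted_cells`, `table_page_title`, `table_section_title`).

```python
def linearize_highlighted(example):
    """Flatten a ToTTo-style example into one tagged string.

    Only highlighted cells are kept. Each one is paired with the header
    cells found at the same column index (a simplification: spans are
    ignored, and the tag vocabulary is an assumption).
    """
    table = example["table"]
    parts = [
        "<page_title> " + example["table_page_title"] + " </page_title>",
        "<section_title> " + example["table_section_title"] + " </section_title>",
        "<table>",
    ]
    for row_idx, col_idx in example["highlighted_cells"]:
        cell = table[row_idx][col_idx]
        parts.append("<cell> " + cell["value"])
        # Attach every header cell that sits in the same column.
        for row in table:
            if col_idx < len(row):
                other = row[col_idx]
                if other["is_header"] and other is not cell:
                    parts.append("<col_header> " + other["value"] + " </col_header>")
        parts.append("</cell>")
    parts.append("</table>")
    return " ".join(parts)


# Hypothetical toy example, loosely based on the widget text above.
toy_example = {
    "table_page_title": "John Higgins",
    "table_section_title": "Minor-ranking finals",
    "table": [
        [{"value": "Year", "is_header": True},
         {"value": "Championship", "is_header": True}],
        [{"value": "2010", "is_header": False},
         {"value": "Ruhr Championship", "is_header": False}],
    ],
    "highlighted_cells": [[1, 0], [1, 1]],
}

print(linearize_highlighted(toy_example))
```

The model is then asked to continue such a string with a one-sentence description of the highlighted cells.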

Evaluation results

Metric	Value
rouge1	0.56
rouge2	0.33
rougeL	0.48
rougeLsum	0.48
sacrebleu	20.87
meteor	0.49
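For readers unfamiliar with these metrics, the sketch below shows the idea behind rouge1: an F1 score over unigrams shared by a reference sentence and a generated one. The card does not say which scorer or settings produced the numbers above, so this is a simplified illustration with lowercased whitespace tokens, not a reimplementation of that scorer. The function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between two strings (simplified ROUGE-1)."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as it
    # appears in both strings.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("a b c d", "a b x y"))
```

Real scorers usually add their own tokenization and optional stemming, so values from this sketch are not directly comparable to the table.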

Usage

from datasets import load_dataset
from transformers import BloomTokenizerFast, BloomForCausalLM

from preprocess import preprocess  # This file is included in the repo

valid_dataset = load_dataset('totto', split='validation')

# Now we linearize the tables
valid_dataset = valid_dataset.map(preprocess)

model_ckpt = "mrm8488/bloom-560m-finetuned-totto-table-to-text"

tokenizer = BloomTokenizerFast.from_pretrained(model_ckpt)
model = BloomForCausalLM.from_pretrained(model_ckpt).to("cuda")


def explain_hl_cells(text):
    # Tokenize the linearized table and move the tensors to the GPU
    inputs = tokenizer(text, return_tensors='pt')
    input_ids = inputs.input_ids.to("cuda")
    attention_mask = inputs.attention_mask.to("cuda")
    output = model.generate(input_ids, attention_mask=attention_mask, max_length=2048, eos_token_id=tokenizer.eos_token_id)

    return tokenizer.decode(output[0], skip_special_tokens=False)


example = valid_dataset[1]

print(explain_hl_cells(example['linearized_table']))
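The usage snippet above loads the original full-precision checkpoint. For the bnb 8-bit weights in this repo, loading might look like the sketch below. It assumes a CUDA GPU and that `transformers`, `accelerate` and `bitsandbytes` are installed, and that the quantization settings are stored in the repo's config so no explicit quantization argument is needed. The helper names `load_quantized` and `describe` are hypothetical, and the default `max_length=500` mirrors the inference parameter listed in the metadata.

```python
# Repo id taken from the title of this card.
QUANT_REPO = "RichardErkhov/Narrativaai_-_bloom-560m-finetuned-totto-table-to-text-8bits"


def load_quantized(repo_id=QUANT_REPO):
    """Load the tokenizer and the 8-bit model.

    Assumes a CUDA GPU plus transformers, accelerate and bitsandbytes.
    """
    # Imported lazily so defining this function has no heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # Assumption: the 8-bit settings are read from the repo's config.
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return tokenizer, model


def describe(text, tokenizer, model, max_length=500):
    """Generate a description for one linearized table string."""
    # Keep inputs on the same device as the model.
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_length=max_length,
        eos_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=False)
```

A typical call would be `tokenizer, model = load_quantized()` followed by `describe(example['linearized_table'], tokenizer, model)`. An 8-bit model should not be moved with `.to("cuda")` after loading, which is why the sketch relies on `device_map="auto"` instead.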

Framework versions

Transformers 4.21.2

Pytorch 1.12.1+cu113

Datasets 2.4.0

Tokenizers 0.12.1
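To compare a local environment with the versions listed above, a small standard-library check such as the following can help. The mapping from the listed names to PyPI package names (for example Pytorch to `torch`) is an assumption, and the helper names are hypothetical. These are the versions reported for the original fine-tuned model, so the 8-bit checkpoint may well need newer releases, which is worth verifying separately.

```python
from importlib import metadata

# Versions listed in the card; the package names are an assumed mapping.
LISTED = {
    "transformers": "4.21.2",
    "torch": "1.12.1+cu113",
    "datasets": "2.4.0",
    "tokenizers": "0.12.1",
}


def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(listed=LISTED):
    """Map each package name to a (listed, installed) pair."""
    return {name: (want, installed_version(name)) for name, want in listed.items()}


for name, (want, have) in compare_versions().items():
    print(f"{name}: listed {want}, installed {have}")
```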

Created by: Narrativa

About Narrativa:

Narrativa is an internationally recognized content services company that uses its proprietary artificial intelligence and machine learning platforms to build and deploy digital content solutions for enterprises. Its technology suite consists of data extraction, data analysis, natural language processing (NLP) and natural language generation (NLG) tools, which work together to power a lineup of smart content creation, automated business intelligence reporting and process optimization products for a variety of industries. Contact us to learn more about our solutions!
