	---
	license: mit
	datasets:
	- google-research-datasets/go_emotions
	language:
	- en
	tags:
	- text-classification
	- onnx
	- fp16
	- roberta
	- emotions
	- multi-class-classification
	- multi-label-classification
	- optimum
	inference: false
	---

	This model is an FP16-optimized version of [SamLowe/roberta-base-go_emotions](https://huggingface.co/SamLowe/roberta-base-go_emotions). It runs exclusively on the GPU.

	On an RTX 4090, it is about 2x faster than the base ONNX version ([SamLowe/roberta-base-go_emotions-onnx](https://huggingface.co/SamLowe/roberta-base-go_emotions-onnx)) and 3x faster than the PyTorch version. The speedup depends chiefly on your GPU's FP16:FP32 throughput ratio. For more benchmarks and sample code, see [https://github.com/joaopn/gpu_benchmark_goemotions](https://github.com/joaopn/gpu_benchmark_goemotions).


	Accuracy: On a test set of 10K Reddit comments, the mean label probability difference from the PyTorch version was ~1E-4. Metrics (accuracy, F1) are essentially identical to the original model.
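
The accuracy figure can be read as follows. This is a minimal sketch, assuming "mean label probability difference" means the mean absolute difference between the two models' per-label probabilities, averaged over all samples and labels. The function name and the toy values are hypothetical and are not taken from the benchmark repo.

```python
from statistics import fmean

def mean_label_probability_difference(probs_a, probs_b):
    """Mean absolute difference between two equally shaped lists of per-label probability rows."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both inputs must contain the same number of samples")
    diffs = []
    for row_a, row_b in zip(probs_a, probs_b):
        if len(row_a) != len(row_b):
            raise ValueError("each sample must have the same number of labels")
        # Absolute per-label gap between the two models for this sample
        diffs.extend(abs(a - b) for a, b in zip(row_a, row_b))
    return fmean(diffs)

# Toy values for illustration only (not benchmark data)
reference_probs = [[0.50, 0.25], [0.75, 0.125]]
fp16_probs = [[0.25, 0.25], [0.75, 0.125]]
difference = mean_label_probability_difference(reference_probs, fp16_probs)
```

In practice the two inputs would be the probability matrices produced by the PyTorch and FP16 ONNX models on the same texts.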

	### Usage

	The model was generated with:

	```python
	from optimum.onnxruntime import ORTOptimizer, ORTModelForSequenceClassification, AutoOptimizationConfig

	model_id_onnx = "SamLowe/roberta-base-go_emotions-onnx"
	file_name = "onnx/model.onnx"
	model = ORTModelForSequenceClassification.from_pretrained(model_id_onnx, file_name=file_name, provider="CUDAExecutionProvider", provider_options={'device_id': 0})

	optimizer = ORTOptimizer.from_pretrained(model)
	optimization_config = AutoOptimizationConfig.O4()
	optimizer.optimize(save_dir='roberta-base-go_emotions-onnx-fp16', optimization_config=optimization_config)
	```

	You will need the GPU version of ONNX Runtime. It can be installed with:

	```
	pip install optimum[onnxruntime-gpu] --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple/
	```
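
Since the model only runs on the GPU, it can help to confirm that the installed ONNX Runtime build exposes the CUDA execution provider before loading the model. A minimal sketch: `onnxruntime.get_available_providers()` is an existing ONNX Runtime call, while the helper `cuda_provider_available` is a hypothetical wrapper added here for illustration.

```python
def cuda_provider_available(providers):
    """Return True if the CUDA execution provider is in the given provider list."""
    return "CUDAExecutionProvider" in providers

try:
    import onnxruntime as ort
except ImportError:
    print("onnxruntime is not installed")
else:
    providers = ort.get_available_providers()
    print("Available providers:", providers)
    if not cuda_provider_available(providers):
        # Likely cause (an assumption): the CPU-only onnxruntime package is installed
        print("CUDAExecutionProvider not found; check the onnxruntime-gpu install")
```

If the provider is missing, a CPU-only `onnxruntime` package shadowing the GPU build is a common cause worth checking first.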

	For convenience, the [benchmark repo](https://github.com/joaopn/gpu_benchmark_goemotions) provides an `environment.yml` file to create a conda env with all the requirements. Below is an optimized, batched usage example:

	```python
	import pandas as pd
	import torch
	from tqdm import tqdm
	from transformers import AutoTokenizer
	from optimum.onnxruntime import ORTModelForSequenceClassification

	def sentiment_analysis_batched(df, batch_size, field_name):
	    """Score df[field_name] in batches; return df with one probability column per emotion."""

	    model_id = 'joaopn/roberta-base-go_emotions-onnx-fp16'
	    file_name = 'model.onnx'
	    gpu_id = 0

	    model = ORTModelForSequenceClassification.from_pretrained(model_id, file_name=file_name, provider="CUDAExecutionProvider", provider_options={'device_id': gpu_id})
	    device = torch.device(f"cuda:{gpu_id}")

	    tokenizer = AutoTokenizer.from_pretrained(model_id)

	    results = []

	    # Precompute id2label mapping
	    id2label = model.config.id2label

	    total_samples = len(df)
	    with tqdm(total=total_samples, desc="Processing samples") as pbar:
	        for start_idx in range(0, total_samples, batch_size):
	            end_idx = start_idx + batch_size
	            texts = df[field_name].iloc[start_idx:end_idx].tolist()

	            inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt", max_length=512)
	            input_ids = inputs['input_ids'].to(device)
	            attention_mask = inputs['attention_mask'].to(device)

	            with torch.no_grad():
	                outputs = model(input_ids, attention_mask=attention_mask)
	            predictions = torch.sigmoid(outputs.logits)  # Use sigmoid for multi-label classification

	            # Collect predictions on GPU
	            results.append(predictions)

	            # Count the rows actually processed; the last batch may be shorter
	            pbar.update(len(texts))

	    # Concatenate all results on GPU, then move them to the CPU once
	    all_predictions = torch.cat(results, dim=0).cpu().numpy()

	    # Convert to DataFrame
	    predictions_df = pd.DataFrame(all_predictions, columns=[id2label[i] for i in range(all_predictions.shape[1])])

	    # Add prediction columns to the original DataFrame
	    combined_df = pd.concat([df.reset_index(drop=True), predictions_df], axis=1)

	    return combined_df

	df = pd.read_csv('https://github.com/joaopn/gpu_benchmark_goemotions/raw/main/data/random_sample_10k.csv.gz')
	df = sentiment_analysis_batched(df, batch_size=8, field_name='body')
	```
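
The example above yields one probability per emotion for each text. Because the model is multi-label, each score comes from an independent sigmoid, so several labels (or none) can apply to the same text. Below is a minimal sketch of turning raw logits into a list of labels, assuming a cutoff of 0.5. The threshold, the function names, and the three-label mapping are illustrative assumptions, not part of the model or the benchmark repo.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def labels_above_threshold(logits, id2label, threshold=0.5):
    """Return the labels whose sigmoid probability is >= threshold, most probable first."""
    probs = {id2label[i]: sigmoid(z) for i, z in enumerate(logits)}
    selected = [label for label, p in probs.items() if p >= threshold]
    return sorted(selected, key=probs.get, reverse=True)

# Hypothetical three-label mapping and logits, for illustration only
toy_id2label = {0: "joy", 1: "anger", 2: "neutral"}
toy_labels = labels_above_threshold([2.0, -3.0, 0.5], toy_id2label)
```

A single global threshold is the simplest choice; per-label thresholds tuned on validation data are a possible refinement.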