	---
	license: apache-2.0
	datasets:
	- gair-prox/RedPajama-pro
	language:
	- en
	tags:
	- math
	- reasoning
	---

	# RedPJ-ProX-0.3B

	<p align="center">
	<img src="prox-teaser.png" width="200">
	</p>

	[ArXiv](http://arxiv.org/abs/xxxx) | [Models](https://huggingface.co/gair-prox/RedPJ-ProX-0.3B) | [Data](https://huggingface.co/datasets/gair-prox/RedPajama-pro) | [Code](https://github.com/GAIR-NLP/program-every-example)

	RedPJ-ProX-0.3B is a tiny language model. It was trained on [RedPajama-V2-pro](https://huggingface.co/datasets/gair-prox/RedPajama-pro) for 25B tokens.
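
A minimal sketch of how the model might be loaded for text generation. It assumes the checkpoint is compatible with the generic `AutoTokenizer` and `AutoModelForCausalLM` classes from the `transformers` library, which this card does not state explicitly. The helper name `generate` and the default of 32 new tokens are illustrative choices, not part of the release.

```python
# Assumption: the repository hosts a causal LM that the transformers
# Auto* classes can load. MODEL_ID is taken from the links in this card.
MODEL_ID = "gair-prox/RedPJ-ProX-0.3B"


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Greedy-decode a continuation of `prompt` (hypothetical helper).

    The weights are downloaded from the Hub on the first call.
    """
    # Imported lazily so that defining this helper does not require
    # transformers or torch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and decode greedily.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `print(generate("The capital of France is"))` would then print the prompt followed by the model's continuation. Since this is a small base model rather than an instruction-tuned one, plain text completion prompts are the most sensible input.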

	## Evaluations

	ProX models are evaluated on 10 language model benchmarks in a zero-shot setting.

	|      | ARC-c | ARC-e | CSQA | HellaS | MMLU | OBQA | PiQA | SIQA | WinoG | SciQ | AVG  |
	|------|-------|-------|------|--------|------|------|------|------|-------|------|------|
	| raw  | 22.6  | 41.9  | 29.7 | 32.8   | 26.2 | 26.4 | 62.2 | 39.3 | 51.3  | 63.3 | 39.6 |
	| ours | 25.9  | 47.5  | 29.2 | 36.7   | 28.1 | 30.2 | 64.6 | 38.0 | 51.7  | 71.4 | 42.3 |
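
As a quick check on the table above, the AVG column can be reproduced as the unweighted mean of the ten benchmark scores, rounded to one decimal place. The sketch below assumes that is how AVG was computed (the card does not say so). The helper names `average` and `per_benchmark_delta` are illustrative.

```python
# Per-benchmark zero-shot scores copied from the table (AVG column excluded).
BENCHMARKS = ["ARC-c", "ARC-e", "CSQA", "HellaS", "MMLU",
              "OBQA", "PiQA", "SIQA", "WinoG", "SciQ"]
SCORES = {
    "raw":  [22.6, 41.9, 29.7, 32.8, 26.2, 26.4, 62.2, 39.3, 51.3, 63.3],
    "ours": [25.9, 47.5, 29.2, 36.7, 28.1, 30.2, 64.6, 38.0, 51.7, 71.4],
}


def average(scores):
    """Unweighted mean, rounded to one decimal like the table."""
    return round(sum(scores) / len(scores), 1)


def per_benchmark_delta(baseline, improved):
    """Difference (improved - baseline) for each benchmark, in points."""
    return {
        name: round(new - old, 1)
        for name, old, new in zip(BENCHMARKS, baseline, improved)
    }


if __name__ == "__main__":
    for row, scores in SCORES.items():
        print(row, average(scores))
    print(per_benchmark_delta(SCORES["raw"], SCORES["ours"]))
```

Under that assumption the computed means match the reported 39.6 and 42.3. The per-benchmark differences show gains on eight of the ten tasks, with small drops on CSQA and SIQA.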

	### Citation
	```
	@misc{TBD
	}
	```