steviel
/

gemma-2b-stem-qa-v0.1

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

gemma-2b-stem-qa-v0.1 / README.md

steviel's picture

Create README.md (#1)

1bf5d42 verified 5 months ago

|

history blame contribute delete

No virus

2.64 kB

	---
	# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
	# Doc / guide: https://huggingface.co/docs/hub/model-cards
	{}
	---

	# Model Card for Model ID

	<!-- Provide a quick summary of what the model is/does. -->

	This modelcard aims to be a base template for new models. It has been generated using [this raw template](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/templates/modelcard_template.md?plain=1).

	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->



	- Developed by: [More Information Needed]
	- Funded by [optional]: [More Information Needed]
	- Shared by [optional]: [More Information Needed]
	- Model type: [More Information Needed]
	- Language(s) (NLP): [More Information Needed]
	- License: [More Information Needed]
	- Finetuned from model [optional]: [More Information Needed]

	### Model Sources [optional]

	<!-- Provide the basic links for the model. -->

	- Repository: [More Information Needed]
	- Paper [optional]: [More Information Needed]
	- Demo [optional]: [More Information Needed]


	## Training Details

	### Training Data

	<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	[More Information Needed]

	### Training Procedure
	Supervised Fine-Tuning (SFT) on chosen examples and Direct Preference Optimiazion (DPO) on preference data.

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

	#### Preprocessing [optional]

	[More Information Needed]


	#### Training Hyperparameters

	DPO hyperparameters
	* `beta=0.1`
	* `learning_rate=5e-6`
	* `gradient_accumulation=8`
	* `num_train_epochs=2`

	### Testing Data, Factors & Metrics

	#### Testing Data

	<!-- This should link to a Dataset Card if possible. -->

	[More Information Needed]

	#### Metrics

	<!-- These are the evaluation metrics being used, ideally with a description of why. -->

	[More Information Needed]

	### Results

	[More Information Needed]

	#### Summary

	[More Information Needed]



	## Technical Specifications

	### Compute Infrastructure

	[More Information Needed]

	#### Hardware

	[More Information Needed]

	#### Software

	[More Information Needed]



	## Model Card Authors and Contacts
	DebuggingFace
	Antonio Mari (antonio.mari@epfl.ch)
	Matteo Santelmo (matteo.santelmo@epfl.ch)
	Stefano Viel (stefano.viel@epfl.ch)