	---
	license: mit
	datasets:
	- cerebras/SlimPajama-627B
	- oscar-corpus/OSCAR-2301
	- bigcode/starcoderdata
	language:
	- fr
	- en
	pipeline_tag: text-generation
	tags:
	- legal
	- art
	- code
	- finance
	- medical
	- text-generation-inference
	---

	# CroissantLLM: A not so flaky bilingual 1.3B model

	An experimental model trained on a small subset of the final data.

	### Usage

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "croissantllm/base_50k"

	# Load the weights in half precision; device_map="auto" requires the accelerate package.
	model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	# Few-shot translation prompt (English -> French).
	inputs = tokenizer("His name is Bob. -> Il s'appelle Bob.\nHe is heading to the market. -> Il va au marché.\nWe are heading to the beach, let's go together. ->", return_tensors="pt").to(model.device)
	tokens = model.generate(**inputs, max_length=100, do_sample=True, top_p=0.95, top_k=60, temperature=0.5)
	print(tokenizer.decode(tokens[0]))

	# Few-shot completion prompt, tokenized without special tokens (no BOS token is prepended).
	inputs = tokenizer("France -> Paris, Italie -> Rome, Allemagne -> Berlin, Espagne ->", return_tensors="pt", add_special_tokens=False).to(model.device)
	tokens = model.generate(**inputs, max_length=250, do_sample=True, top_p=0.95, top_k=60)
	print(tokenizer.decode(tokens[0]))
	```
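
The prompts in the usage snippet follow a few-shot pattern: each line pairs a source with its target using ` -> `, and the last line ends with a bare arrow for the model to complete. A minimal sketch of building such a prompt and cutting the decoded output down to the first completed line is given below. The helper names `build_few_shot_prompt` and `first_completion` are hypothetical and not part of this repository or of `transformers`, and the sketch assumes the decoded text contains the prompt verbatim.

```python
def build_few_shot_prompt(examples, query, sep=" -> "):
    """Format (source, target) pairs plus an open query, one per line.

    Hypothetical helper: it only reproduces the prompt layout used above.
    """
    lines = [f"{src}{sep}{tgt}" for src, tgt in examples]
    # The final line ends with a bare arrow so the model continues with the target.
    lines.append(f"{query}{sep.rstrip()}")
    return "\n".join(lines)


def first_completion(decoded, prompt):
    """Return the text that follows the prompt, up to the first newline.

    Assumption: `decoded` contains `prompt` verbatim, possibly preceded by a
    BOS marker. If the prompt is not found, the first line of `decoded` is returned.
    """
    start = decoded.find(prompt)
    continuation = decoded[start + len(prompt):] if start != -1 else decoded
    return continuation.split("\n", 1)[0].strip()


examples = [
    ("His name is Bob.", "Il s'appelle Bob."),
    ("He is heading to the market.", "Il va au marché."),
]
prompt = build_few_shot_prompt(examples, "We are heading to the beach, let's go together.")
```

The resulting `prompt` string can be passed to `tokenizer(...)` as in the usage snippet, and `first_completion(tokenizer.decode(tokens[0]), prompt)` keeps only the answer to the open query, since a base model sampled up to `max_length` may keep generating further lines after it.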