ahmedheakl
/

arazn-llama3-english

Inference Endpoints

Model card Files Files and versions Metrics Training metrics Community

arazn-llama3-english / README.md

ahmedheakl's picture

Update README.md

b15b265 verified 1 day ago

|

history blame contribute delete

2.64 kB

	---
	license: mit
	datasets:
	- ahmedheakl/arzen-llm-dataset
	language:
	- ar
	- en
	metrics:
	- bleu
	- ecody726/bertscore
	- meteor
	library_name: transformers
	pipeline_tag: translation
	---

	## How to use
	Just install `peft`, `transformers`, 'accelerate', 'bitsandbytes' and `pytorch` first.

	```bash
	pip install peft accelerate bitsandbytes transformers torch
	```

	Then login with your huggingface token to get access to base models
	```bash
	huggingface-cli login --token <YOUR_HF_TOKEN>
	```

	Then load the model.
	```python
	from peft import PeftConfig, PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	peft_model_id = "ahmedheakl/arazn-llama3-english"
	peft_config = PeftConfig.from_pretrained(peft_model_id)
	base_model_name = peft_config.base_model_name_or_path
	base_model = AutoModelForCausalLM.from_pretrained(base_model_name, device_map="auto", torch_dtype=torch.bfloat16)
	model = PeftModel.from_pretrained(base_model, peft_model_id, device_map="auto")
	tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
	```

	Then do inference
	```python
	import torch

	raw_prompt = """<\|begin_of_text\|><\|start_header_id\|>system<\|end_header_id\|>

	Translate the following code-switched Arabic-English-mixed text to English only.<\|eot_id\|><\|start_header_id\|>user<\|end_header_id\|>

	{source}<\|eot_id\|><\|start_header_id\|>assistant<\|end_header_id\|>

	"""
	def inference(prompt) -> str:
	prompt = raw_prompt.format(source=prompt)
	inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
	generated_ids = model.generate(
	**inputs,
	use_cache=True,
	num_return_sequences=1,
	max_new_tokens=100,
	# do_sample=True,
	num_beams=1,
	# temperature=0.7,
	eos_token_id=tokenizer.eos_token_id,
	pad_token_id=tokenizer.pad_token_id,
	)
	outputs = tokenizer.batch_decode(generated_ids)[0]
	torch.cuda.empty_cache()
	torch.cuda.synchronize()
	return outputs.split("assistant<\|end_header_id\|>\n\n")[-1].split("<\|eot_id\|>")[0]
	print(inference("أنا أحب الbanana")) # I love bananas
	```

	Please see paper & code for more information:
	- https://github.com/ahmedheakl/arazn-llm
	- https://arxiv.org/abs/2406.18120


	## Citation

	BibTeX:
	```
	@article{heakl2024arzen,
	title={ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs},
	author={Heakl, Ahmed and Zaghloul, Youssef and Ali, Mennatullah and Hossam, Rania and Gomaa, Walid},
	journal={arXiv preprint arXiv:2406.18120},
	year={2024}
	}
	```


	## Model Card Authors

	- Email: ahmed.heakl@ejust.edu.eg
	- Linkedin: https://linkedin.com/in/ahmed-heakl