Yosef Worku Alemneh (rasyosef)
AI & ML interests
Pretraining, Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), Retrieval-Augmented Generation (RAG), Function Calling
Recent Activity
updated a model 22 days ago: rasyosef/roberta-amharic-reranker-medium
updated a collection 23 days ago: Llama 3.2 Amharic
rasyosef's activity
Using hard negatives vs. (query, positive) pairs to train embedding models · 4 comments · #2 opened about 1 month ago by rasyosef
Adding Evaluation Results · #1 opened 7 months ago by leaderboard-pr-bot
Adding Evaluation Results · #3 opened 7 months ago by leaderboard-pr-bot
Phi-2-Instruct-APO: aligned with Anchored Preference Optimization · 16 comments · #3 opened 7 months ago by rasyosef
[Query/Issue] tokenizer.vocab_size is 128000, but len(tokenizer) is 128256, which prevents me from using the extra tokens · 1 comment · #34 opened 5 months ago by HV-Khurdula
What are the start and stop tokens of this model? · 1 comment · #40 opened 5 months ago by aryaash
Is the BOS token id of 128000 hardcoded into the Llama 3.2 tokenizer? · 2 comments · #17 opened 6 months ago by rasyosef
Mistral-NeMo-Minitron-8B-Chat · 5 comments · #5 opened 8 months ago by rasyosef
APO Trainer in TRL? · 1 comment · #2 opened 7 months ago by rasyosef
ChatML template does not work properly · 10 comments · #2 opened 8 months ago by WasamiKirua
Collaboration · 1 comment · #1 opened 8 months ago by [deleted user]
Error when trying to run · 1 comment · #1 opened 8 months ago by ctranslate2-4you
What changed for people using this model in English? · 3 comments · #3 opened 8 months ago by migueltalka
Phi 2 Instruct: an instruction-following Phi 2 SLM that has undergone SFT and DPO · #132 opened 8 months ago by rasyosef
Phi 1.5 Instruct: an instruction-following Phi 1.5 model that has undergone SFT and DPO · #89 opened 8 months ago by rasyosef
Update README.md · 1 comment · #2 opened 9 months ago by seyyaw
Duplicate? · 1 comment · #2 opened 11 months ago by israel