Amir Hossein Kargaran

kargaranamir

https://kargaranamir.github.io

AI & ML interests

#NLP, check out https://huggingface.co/cis-lmu

Recent Activity

new activity 3 days ago

cis-lmu/glotlid-corpus:taq_Tfng is contaminated with zgh_Tfng data

updated a dataset 3 days ago

cis-lmu/glotlid-corpus

liked a model 3 days ago

Alibaba-NLP/gte-Qwen2-7B-instruct

kargaranamir's activity

upvoted a paper 25 days ago

On Relation-Specific Neurons in Large Language Models

Paper • 2502.17355 • Published 27 days ago • 6

upvoted a collection about 1 month ago

MMTEB

Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 2

upvoted a paper about 1 month ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published Feb 19 • 33

upvoted a collection about 1 month ago

CommonCrawl

Large web-mined general corpus based on CommonCrawl. • 7 items • Updated Dec 8, 2024 • 2

upvoted a paper about 1 month ago

NoLiMa: Long-Context Evaluation Beyond Literal Matching

Paper • 2502.05167 • Published Feb 7 • 15

upvoted 4 collections 4 months ago

LLM Training

46 items • Updated 25 days ago • 4

reading list

1 item • Updated Nov 4, 2024 • 1

Text Datasets

13 items • Updated 16 days ago • 1

OpenCoder

OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23, 2024 • 80

upvoted a paper 5 months ago

How Transliterations Improve Crosslingual Alignment

Paper • 2409.17326 • Published Sep 25, 2024 • 1

upvoted 2 collections 5 months ago

LLMs

418 items • Updated 10 days ago • 31

cool datasets

158 items • Updated 4 days ago • 15

upvoted a paper 5 months ago

GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages

Paper • 2410.23825 • Published Oct 31, 2024 • 4

upvoted a collection 5 months ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated Jan 15 • 120

upvoted a paper 5 months ago

MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

Paper • 2410.05873 • Published Oct 8, 2024 • 3

upvoted a collection 9 months ago

LLM Spaces

189 items • Updated 19 days ago • 15