kiliangoto committed
Commit • ccca593
Parent(s): 144613d
Update README.md
README.md CHANGED
@@ -37,7 +37,6 @@ For the evaluation of general language capabilities, we employed the
 [SEA HELM (also known as BHASA) evaluation benchmark](https://arxiv.org/abs/2309.06085v2) across a variety of tasks.
 These tasks include Question Answering (QA), Sentiment Analysis (Sentiment), Toxicity Detection (Toxicity), Translation in both directions (Eng>Lang & Lang>Eng), Abstractive Summarization (Summ), Causal Reasoning (Causal) and Natural Language Inference (NLI).
 We also added support for Javanese and Sundanese for the BHASA tasks whenever applicable.
-These tasks include examination questions on Humanities, Indonesian language, Local languages and cultures, Social science and STEM across primary, middle, and high school levels.
 and the common English tasks from the [HuggingFace LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard).
 These tasks consist of [IFEval, BBH, Math Lvl 5, GPQA, MuSR, and MMLU-PRO](https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about).
 **Caveat**: Our results differ from the HuggingFace LLM Leaderboard because we have used [vLLM](https://docs.vllm.ai/en/latest/) as our inference platform. vLLM caps the context size at **4096 tokens** while HuggingFace was set to **8192 tokens**.
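For readers who want to see what the caveat's context cap looks like in practice, here is a minimal sketch (not part of this commit) of launching a vLLM engine with the context window bounded at 4096 tokens via `max_model_len`; the model id and prompt are placeholders, not the checkpoint evaluated here.

```python
# Minimal sketch (assumption, not part of this commit): inference through
# vLLM with the context window capped at 4096 tokens, as the caveat
# describes. The model id and prompt are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/your-model",  # placeholder; substitute the evaluated checkpoint
    max_model_len=4096,           # the 4096-token cap; the HF leaderboard ran with 8192
)

params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(["Jakarta is the capital of"], params)
print(outputs[0].outputs[0].text)
```

Because `max_model_len` bounds prompt plus generated tokens, long few-shot prompts that fit under an 8192-token window can be truncated or rejected under a 4096-token cap, which is one plausible source of the score differences the caveat notes.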