Llama-2-ko-DPO-13B

Based on the updated criteria of the Open Ko-LLM Leaderboard, this model's average score exceeded 50 points for the first time. I am quite proud of that, even though the score will soon fade into the background: I am simply testing a hypothesis rather than competing, and many strong 7B models are coming out. Since my day job is technical support, not R&D, I could not spend much time on this, so I processed only about 1,000 samples and tuned the model with DPO (Direct Preference Optimization) to reduce hallucination. The infrastructure was the same as before (AWS g5.12xlarge), and no additional prompts were given.
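For reference, this is roughly what a DPO run looks like with the TRL library. This is a minimal sketch only: the hyperparameters, dataset contents, and the choice of trl's DPOTrainer are illustrative assumptions, not my exact training setup.

```python
# Minimal DPO sketch with trl (~0.7-era API). All values below are
# illustrative assumptions, not the configuration actually used.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "daekeun-ml/Llama-2-ko-instruct-13B"  # SFT model as the starting point
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
ref_model = AutoModelForCausalLM.from_pretrained(base)  # frozen reference policy

# ~1,000 preference triples: prompt, chosen (preferred), rejected (dispreferred)
train_dataset = Dataset.from_dict({
    "prompt": ["..."],
    "chosen": ["..."],
    "rejected": ["..."],
})

trainer = DPOTrainer(
    model,
    ref_model,
    args=TrainingArguments(
        output_dir="llama-2-ko-dpo-13b",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=5e-7,
    ),
    beta=0.1,  # strength of the KL penalty toward the reference model
    train_dataset=train_dataset,
    tokenizer=tokenizer,
)
trainer.train()
```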

I think the potential of the base LLM is enormous, given how much hallucination is reduced with very little data and without much effort. When I meet customers, many of them find it difficult to implement GenAI features, but the implementation itself does not take much effort, since plenty of template code and APIs are readily available; a minimal loading example is shown below. It is a world where anyone willing to process data can easily and quickly create their own quality model.
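As an example of how little glue code is needed, here is a minimal inference sketch using the standard transformers API. The prompt and generation parameters are placeholders, not a recommended configuration.

```python
# Minimal inference sketch; prompt and sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "daekeun-ml/Llama-2-ko-DPO-13B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "..."  # plain instruction; no special system prompt was used during tuning
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```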

Model Details

Datasets

  • 1,000 samples that I generated myself
  • Sentences generated by Claude-2 on Amazon Bedrock were adopted as the chosen responses, and sentences generated by the SFT-tuned Llama-2-13B model were adopted as the rejected responses (see the sketch after this list).
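A hedged sketch of how such chosen/rejected pairs could be collected via the Bedrock runtime API. The helper name, region, and generation parameters are illustrative assumptions, not my actual data pipeline.

```python
# Illustrative sketch of collecting preference pairs; not the author's pipeline.
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

def generate_chosen(prompt: str) -> str:
    """Query Claude-2 on Amazon Bedrock for the preferred ('chosen') response."""
    body = json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": 512,
        "temperature": 0.5,
    })
    resp = bedrock.invoke_model(modelId="anthropic.claude-v2", body=body)
    return json.loads(resp["body"].read())["completion"]

record = {
    "prompt": "...",
    "chosen": generate_chosen("..."),
    "rejected": "...",  # the SFT-tuned Llama-2-13B model's output for the same prompt
}
```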

Benchmark

| Model | Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
|---|---|---|---|---|---|---|
| daekeun-ml/Llama-2-ko-DPO-13B (Ours) | 51.03 | 47.53 | 58.28 | 43.59 | 51.91 | 53.84 |
| daekeun-ml/Llama-2-ko-instruct-13B | 49.52 | 46.50 | 56.90 | 43.76 | 42.00 | 58.44 |
| kyujinpy/Korean-OpenOrca-13B | 48.79 | 43.09 | 54.13 | 40.24 | 45.22 | 61.28 |


License

  • Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License (CC BY-NC-SA 4.0), subject to the LLAMA 2 COMMUNITY LICENSE AGREEMENT

This model was created as a personal experiment, unrelated to the organization I work for.
