Papers
arxiv:2309.11385

Safurai 001: New Qualitative Approach for Code LLM Evaluation

Published on Sep 20, 2023
Authors:

Abstract

This paper presents Safurai-001, a new Large Language Model (LLM) with significant potential in the domain of coding assistance. Driven by recent advancements in coding LLMs, Safurai-001 competes in performance with the latest models like WizardCoder [Xu et al., 2023], PanguCoder [Shen et al., 2023] and Phi-1 [Gunasekar et al., 2023] but aims to deliver a more conversational interaction. By capitalizing on the progress in data engineering (including latest techniques of data transformation and prompt engineering) and instruction tuning, this new model promises to stand toe-to-toe with recent closed and open source developments. Recognizing the need for an efficacious evaluation metric for coding LLMs, this paper also introduces GPT4-based MultiParameters, an evaluation benchmark that harnesses varied parameters to present a comprehensive insight into the models functioning and performance. Our assessment shows that Safurai-001 can outperform GPT-3.5 by 1.58% and WizardCoder by 18.78% in the Code Readability parameter and more.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2309.11385 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2309.11385 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2309.11385 in a Space README.md to link it from this page.

Collections including this paper 2