CoTaEval_leaderboard/versions/llama2-70b-chat-hf_news_rag_mean.csv
model_name,method,rouge1,rougeL,semantic_sim,LCS(character),LCS(word),ACS(word),Levenshtein Distance,Minhash Similarity,MMLU,MT-Bench,Blocklisted F1,In-Domain F1,Efficiency
llama2-70b-chat-hf_news_rag,vanilla,0.6822865714014262,0.6199068209453332,0.840910555485636,383.537,76.68,89.165,461.589,0.5946875,0.619,7.1,0.595,0.624,1.00
llama2-70b-chat-hf_news_rag,sys_prompt_bing,0.630698567215431,0.5610633998461577,0.7809992222869768,329.287,63.72,78.661,508.58,0.53940625,0.614,7.2,0.594,0.616,1.00
llama2-70b-chat-hf_news_rag,top_k_3,0.45021398015786224,0.33722548268443875,0.7241460259128362,112.794,21.733,36.267,678.657,0.35540625,0.361,4.8,0.120,0.077,0.99
llama2-70b-chat-hf_news_rag,memfree_6,0.5686107281633525,0.49311296818266537,0.7844266164638102,83.406,11.907,61.525,553.886,0.46215625,0.619,6.6,0.514,0.601,0.99