dahara1/gemma-2-27b-it-gguf-japanese-imatrix

dahara1 committed on Jul 20

Commit 6d66ebc • 1 Parent(s): 86dfd05

Update README.md

Files changed (1)

README.md +12 -0

@@ -3,6 +3,18 @@ tags:
 - gemma
 - llm
 ---
+## 更新履歴 update history
+2024/07/20
+llama.cppに不具合[llama : fix pre-tokenization of non-special added tokens #8228](https://github.com/ggerganov/llama.cpp/pull/8228)が見つかり、Gemma2モデルは再変換が必要になり対応しました。HTMLタグの処理などが不正確になっていたとの事です。
+A bug was found in llama.cpp ([llama : fix pre-tokenization of non-special added tokens #8228](https://github.com/ggerganov/llama.cpp/pull/8228)), so Gemma2 models had to be reconverted, and I have done so. Reportedly, the handling of HTML tags and similar input had become inaccurate.
+単純に再変換するのは面白みがなかったので4bit以上の版は更に精度向上するという説もあるoutput tensorとembeddingをf16にするタイプの変換をしてみました。
+Simply reconverting was not very interesting, so I tried a conversion that keeps the output tensor and the embedding in f16, which some say further improves accuracy for versions of 4 bits and above.
+念の為、4bit版は従来の変換とf16タイプの変換の両方をアップロードしてあります。
+Just to be safe, for the 4-bit version I have uploaded both the conventional conversion and the f16-type conversion.
 gemma-2-27b-itを日本語が多く含まれる重要度行列(iMatrix)を使って量子化したgguf版です。日本語対応能力が多めに保持されている事を期待していますが確かめる事はまだ出来ていません
 This is a GGUF version of gemma-2-27b-it, quantized using an importance matrix (iMatrix) built from text that contains a lot of Japanese.
 I hope it retains more of its Japanese ability, but I have not been able to verify this yet.
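
The README does not show the exact commands used for the f16-type conversion mentioned in the update history. As a rough sketch only, the helper below assembles a `llama-quantize` command line that applies an importance matrix and keeps the output tensor and token embeddings in f16. It assumes a llama.cpp build whose `llama-quantize` tool accepts `--imatrix`, `--output-tensor-type`, and `--token-embedding-type`; the function name and all file names are hypothetical and are not taken from this repository.

```python
import shlex


def build_quantize_command(src_gguf, dst_gguf, quant_type="Q4_K_M",
                           imatrix_path=None, f16_output_and_embedding=True,
                           binary="llama-quantize"):
    """Build (but do not run) the argument list for one quantization run.

    All paths are placeholders chosen for illustration; the author's actual
    file names and settings are not stated in the README.
    """
    cmd = [binary]
    if imatrix_path is not None:
        # Importance matrix, e.g. one computed from Japanese-heavy text.
        cmd += ["--imatrix", imatrix_path]
    if f16_output_and_embedding:
        # Keep these two tensors in f16 instead of quantizing them further.
        cmd += ["--output-tensor-type", "f16",
                "--token-embedding-type", "f16"]
    # Positional arguments: input model, output model, quantization type.
    cmd += [src_gguf, dst_gguf, quant_type]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the call.
    command = build_quantize_command(
        "gemma-2-27b-it-f16.gguf",
        "gemma-2-27b-it-Q4_K_M-fp16.gguf",
        quant_type="Q4_K_M",
        imatrix_path="imatrix.dat",
    )
    # Print a shell-ready string; pass `command` to subprocess.run to execute.
    print(shlex.join(command))
```

Calling the helper with `f16_output_and_embedding=False` yields the conventional variant, which corresponds to the two 4-bit uploads described above.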