Bllossom
/

llama-3.2-Korean-Bllossom-3B-gguf-Q4_K_M

Inference Endpoints

Model card Files Files and versions Community

사용 관련 문의드립니다!

#1

by scj0709 - opened Nov 14, 2024

scj0709

Nov 14, 2024

안녕하세요~

llama-3.2 3B 모델이 정말 훌륭하게 동작 되는것을 확인하엿습니다!!
너무 좋은 성능이라 놀랐습니다.

그래서 quantization된 해당 모델도 사용해보고 싶은데 Model card에 있는 사용법은 llama-3.2 3B 모델과 같아서요!

혹시 사용하려면 어떻게 해야하는지 알 수 있을까요?

PerRing

Bllossom org Nov 14, 2024

llama cpp python을 이용한 코드로 model card를 수정했습니다.
model card나 아래의 ollama modelfile을 참조해서 사용하시길 바랍니다.

https://huggingface.co/Bllossom/llama-3.2-Korean-Bllossom-3B/discussions/1

scj0709

Nov 14, 2024

답변 감사드립니다.!! model card를 사용하려고 하는데
"ValueError: Model path does not exist: llama-3.2-Korean-Bllossom-3B-gguf-Q4_K_M.gguf"
이런 에러가 발생하네요ㅠㅠ

PerRing

Bllossom org Nov 14, 2024

gguf 파일을 다운받은뒤 gguf 파일 경로를 입력해야됩니다.

scj0709

Nov 25, 2024

감사합니다!
덕분에 infenece 하는 부분은 성공적으로 진행 되었습니다.
성능도 기존 llama보다 훨씬 좋은 부분이 놀랍네요!
한가지 질문이 있습니다.
혹시, kv cache 같은 부분이 적용된 것인가요??
kv cache라던가 prompt cache를 적용하기 위한 가이드가 있을까요??
감사합니다!

PerRing changed discussion status to closed Jan 3

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment