2 contributors

History: 31 commits

wangzihan99

Merge branch 'main' of https://huggingface.co/Qwen/Qwen-7B-Chat-Int4 into pr/6

7fa16ca over 1 year ago

| File | Size | Last commit | Updated |
| --- | --- | --- | --- |
| assets | | update | over 1 year ago |
| .gitattributes | 1.59 kB | update model | over 1 year ago |
| LICENSE | 6.9 kB | update readme | over 1 year ago |
| NOTICE | 15.3 kB | update | over 1 year ago |
| README.md | 30.6 kB | update | over 1 year ago |
| cache_autogptq_cuda_256.cpp | 8.4 kB | update kernels | over 1 year ago |
| cache_autogptq_cuda_kernel_256.cu | 52 kB | update kernels | over 1 year ago |
| config.json | 1.22 kB | Merge branch 'main' of https://huggingface.co/Qwen/Qwen-7B-Chat-Int4 into pr/6 | over 1 year ago |
| configuration_qwen.py | 2.41 kB | Add fused ApplyRoPE and RMSNorm kernels written in OpenAI Triton. | over 1 year ago |
| cpp_kernels.py | 1.92 kB | update kernels | over 1 year ago |
| generation_config.json | 273 Bytes | update | over 1 year ago |
| model-00001-of-00003.safetensors | 2.04 GB (LFS) | update model | over 1 year ago |
| model-00002-of-00003.safetensors | 2.05 GB (LFS) | update model | over 1 year ago |
| model-00003-of-00003.safetensors | 1.77 GB (LFS) | update model | over 1 year ago |
| model.safetensors.index.json | 65.7 kB | update model | over 1 year ago |
| modeling_qwen.py | 56.9 kB | Merge branch 'main' of https://huggingface.co/Qwen/Qwen-7B-Chat-Int4 into pr/6 | over 1 year ago |
| quantize_config.json | 214 Bytes | update model | over 1 year ago |
| qwen.tiktoken | 2.56 MB | update model | over 1 year ago |
| qwen_generation_utils.py | 14.6 kB | update model | over 1 year ago |
| tokenization_qwen.py | 9.62 kB | update tokenization.py | over 1 year ago |
| tokenizer_config.json | 174 Bytes | update | over 1 year ago |
| triton_kernels.py | 3.96 kB | Improve performance with Triton 2.0 and adapt to latest Qwen releases. | over 1 year ago |