THUDM/chatglm2-6b-int4

Adding `safetensors` variant of this model

#24 opened 27 days ago by

chatglm2-6b-int4 error: RuntimeError: expected m1 and m2 to have the same dtype, but got: c10::Half != float

#23 opened 5 months ago by

Adding `safetensors` variant of this model

#22 opened 8 months ago by

Create handler.py

#21 opened 11 months ago by

Update tokenization_chatglm.py

#20 opened about 1 year ago by

Deploying in CPU mode on Linux

#19 opened about 1 year ago by

Error when fine-tuning the chatglm2-6b-int4 model with AdaLora

#18 opened about 1 year ago by

test_rh

#17 opened over 1 year ago by

'NoneType' object has no attribute 'int4WeightExtractionHalf'

#16 opened over 1 year ago by

AttributeError: 'NoneType' object has no attribute 'int4WeightExtractionHalf'

#15 opened over 1 year ago by

Error when running the quantized model: RuntimeError: CUDA Error: no kernel image is available for execution on the device

#14 opened over 1 year ago by

Why does the INT4 model perform worse than FP32 on an Intel CPU?

#13 opened over 1 year ago by

ModuleNotFoundError: No module named 'transformers_modules.chatglm2-6b-int4'

#12 opened over 1 year ago by

Suspect Hugging Face no longer allows the int4 version; it was still usable a while ago

#11 opened over 1 year ago by

The int4 version worked a while ago but no longer works now. What is going on?

#10 opened over 1 year ago by

English only version?

#9 opened over 1 year ago by

"addmm_impl_cpu_" not implemented for 'Half'

#8 opened over 1 year ago by

Inference gets stuck in repetition

#6 opened over 1 year ago by

Are quantization_kernels.c and quantization_kernels_parallel.c taken directly from the chatglm1-6b-int4 project?

#5 opened over 1 year ago by

The example code still uses THUDM/chatglm2-6b

#4 opened over 1 year ago by

Perplexity between quantized and original?

#3 opened over 1 year ago by

Error: Failed to load cpm_kernels:Unknown platform: darwin

#2 opened over 1 year ago by

If this was quantized with bitsandbytes NF4, can training be continued directly on top of it with QLoRA?

#1 opened over 1 year ago by