
	---
	license: apache-2.0
	---

	# BaoLuo-LawAssistant-sftglm-6b: BaoLuo Legal LLM 1.0

	<p align="center">
	🌐 <a href="https://baoluo.dahole.com" target="_blank">WEB</a> • 💻 <a href="https://github.com/xuanxuanzl/BaoLuo-LawAssistant" target="_blank">BaoLuo Legal Assistant V1.0</a>
	</p>

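	## Introduction
	BaoLuo Legal LLM is an open-source Chinese legal dialogue language model based on an Encoder-Decoder design. It is fine-tuned on open-source legal-domain data and can provide services such as statute and regulation retrieval, legal consultation, case analysis, and charge prediction. It builds on the [General Language Model (GLM)](https://github.com/THUDM/GLM) architecture and is a fine-tune of ChatGLM, and it can be deployed locally on consumer-grade GPUs.
	This project may not be used commercially; it is available for research use.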

	## Software dependencies

	```shell
	pip install protobuf==3.20.0 "transformers>=4.27.1" icetk cpm_kernels torch==2.0.1
	```
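After installing, it can help to confirm that the environment matches the pins in the command above. The following is a minimal sketch using only the standard library. The names `REQUIREMENTS`, `parse_version`, and `check_requirements` are illustrative and not part of this project, and the version parsing is deliberately simplified (numeric components only, local suffixes such as `+cu118` ignored).

```python
from importlib import metadata

# Pins copied from the pip command above; None means "any version".
REQUIREMENTS = {
    "protobuf": ("==", "3.20.0"),
    "transformers": (">=", "4.27.1"),
    "icetk": None,
    "cpm_kernels": None,
    "torch": ("==", "2.0.1"),
}


def parse_version(text):
    """Turn '2.0.1+cu118' into (2, 0, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(requirements, get_version=metadata.version):
    """Return a list of human-readable problems; an empty list means all pins hold."""
    problems = []
    for name, pin in requirements.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if pin is None:
            continue
        op, wanted = pin
        have, want = parse_version(installed), parse_version(wanted)
        ok = have == want if op == "==" else have >= want
        if not ok:
            problems.append(f"{name}: found {installed}, expected {op}{wanted}")
    return problems


if __name__ == "__main__":
    for line in check_requirements(REQUIREMENTS) or ["all pinned dependencies look fine"]:
        print(line)
```

For stricter comparisons (pre-releases, differing component counts), a dedicated version parser would be a better fit than the simplified tuple comparison used here.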

	## Usage

	The following code calls the BaoLuo-LawAssistant-sftglm-6b model to generate a conversation:

	```ipython

	>>> import os
	>>> import torch
	>>> from transformers import AutoTokenizer, AutoModel, AutoConfig

	>>> # Load the ChatGLM-6B base model; pre_seq_len=256 enables its prefix encoder.
	>>> tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
	>>> config = AutoConfig.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=256)
	>>> model = AutoModel.from_pretrained("THUDM/chatglm-6b", config=config, trust_remote_code=True).half().cuda()

	>>> # The two paths below are file-system paths, so this assumes the files of this
	>>> # repository are available locally under xuanxuanzl/BaoLuo-LawAssistant-sftglm-6b.
	>>> model = model.quantize(bits=8, kernel_file="xuanxuanzl/BaoLuo-LawAssistant-sftglm-6b/quantization_kernels.so")
	>>> prefix_state_dict = torch.load(os.path.join("xuanxuanzl/BaoLuo-LawAssistant-sftglm-6b", "pytorch_model.bin"))
	>>> # Keep only the prefix-encoder weights and strip the key prefix.
	>>> new_prefix_state_dict = {}
	>>> for k, v in prefix_state_dict.items():
	...     if k.startswith("transformer.prefix_encoder."):
	...         new_prefix_state_dict[k[len("transformer.prefix_encoder."):]] = v
	...
	>>> model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)
	>>> model.transformer.prefix_encoder.float()
	>>> model = model.eval()

	>>> # "你好" means "Hello".
	>>> response, history = model.chat(tokenizer, "你好", history=[])
	>>> print(response)
	```
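The loop in the snippet above keeps only the checkpoint entries whose keys start with `transformer.prefix_encoder.` and strips that prefix, so the remaining keys match the prefix encoder's own parameter names. A minimal sketch of that step as a reusable function follows. It works on a plain dict so it runs without `torch`; the function name `extract_prefix_weights` and the toy checkpoint keys are illustrative assumptions, not part of this repository.

```python
PREFIX = "transformer.prefix_encoder."


def extract_prefix_weights(state_dict, prefix=PREFIX):
    """Return only the entries stored under `prefix`, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Toy checkpoint with made-up entries; real values would be torch tensors.
checkpoint = {
    "transformer.prefix_encoder.embedding.weight": "prefix-weights",
    "transformer.word_embeddings.weight": "ignored",
}

print(extract_prefix_weights(checkpoint))
# {'embedding.weight': 'prefix-weights'}
```

With a real checkpoint, the returned dict is what would be passed to `model.transformer.prefix_encoder.load_state_dict(...)`, as in the snippet above.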

	## License

	The code in this repository is open-sourced under the [Apache-2.0](LICENSE) license. Use of the ChatGLM-6B model weights must comply with the [Model License](https://huggingface.co/THUDM/chatglm-6b/blob/main/LICENSE).

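	## Areas for improvement

	- The base model is not very efficient, which leads to long response times; the next step is to adopt a more efficient base model.
	- The training data is unevenly distributed across the service functions.
	- The key instructions for each service's data are not designed thoroughly enough.
	- The model falls short in using external knowledge augmentation to improve the accuracy of its output.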

	## Changelog

	- July 10, 2023: BaoLuo Legal LLM V1.0 released; the [BaoLuo Legal AI Assistant](https://github.com/xuanxuanzl/BaoLuo-LawAssistant/tree/main) was released the same day.

	<p align="center">
	<br>
	<img src="https://github.com/xuanxuanzl/BaoLuo-LawAssistant/raw/main/leizi.png" width="20%"/>
	<br>
	</p>