
CoolShell LLM

[ English | 中文 ]

We express our deepest gratitude to Mr. Chen Hao for his selfless sharing with the internet community, especially in the field of technology.

An orchid deep in the forest does not cease to give off its fragrance just because no one appreciates it.
A virtuous man who cultivates his character does not abandon his principles just because of poverty.

Model description

This model is a fine-tuned version of ChatGLM3-6B, trained on the coolshell-llm dataset with 4-bit QLoRA. For detailed usage instructions and more information, please visit the coolshell-llm GitHub page.
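Because the fine-tuning produces a LoRA adapter on top of ChatGLM3-6B, loading it for inference can look roughly like the sketch below. This is a hedged example, not the project's official usage: `adapter_path` is a placeholder for this repository's id or a local adapter directory, and the exact steps may differ from those documented on the coolshell-llm GitHub page.

```python
# Minimal sketch: apply the CoolShell LoRA adapter to the ChatGLM3-6B base model.
# "adapter_path" is a placeholder -- point it at this repository or a local copy
# of the adapter weights.
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_id = "THUDM/chatglm3-6b"
adapter_path = "path/to/coolshell-llm-adapter"  # placeholder, not a real repo id

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModel.from_pretrained(base_id, trust_remote_code=True, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_path)
model.eval()

# ChatGLM3 exposes a chat() helper; PeftModel forwards the call to the base model.
response, history = model.chat(tokenizer, "酷壳网有哪些内容", history=[])  # "What does CoolShell cover?"
print(response)
```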

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the equivalent `TrainingArguments` follows the list):

  • learning_rate: 0.002
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • num_epochs: 25.0
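For reference, these values map onto Hugging Face `TrainingArguments` roughly as sketched below; the effective batch size is train_batch_size × gradient_accumulation_steps = 4 × 4 = 16 on a single device. The actual run was driven through LLaMA-Factory, so this is illustrative rather than the exact configuration used; the output directory is a placeholder.

```python
# Illustrative only: the hyperparameters above expressed as Hugging Face
# TrainingArguments. The actual run used LLaMA-Factory, which builds an
# equivalent configuration internally.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="coolshell-llm-qlora",  # placeholder output directory
    learning_rate=2e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,     # effective train batch size: 4 * 4 = 16
    seed=42,
    lr_scheduler_type="cosine",
    num_train_epochs=25.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```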

Framework versions

  • PEFT 0.7.1
  • Transformers 4.36.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.15.0
  • Tokenizers 0.15.0
  • LLaMA-Factory 0.4.0

Demo

Welcome to MegaEase CoolShell LLM demo. Use `exit` to exit the application.

User: What kind of content does CoolShell (酷壳网) offer?
CoolShell (酷壳网) offers a wide range of technology-related content, mainly covering the following areas:
  • Programming languages and frameworks: usage tips and best practices for languages such as Python, Java, Go, and JavaScript, together with their related frameworks and tools.
  • Algorithms and data structures: introductions to commonly used algorithms and data structures, including their theoretical foundations and practical applications.
  • System architecture: discussions of software system design and architecture, covering microservices, distributed systems, cloud computing, and related areas.
  • Software development methodology: modern methods and practices such as agile development, DevOps, and continuous integration.
  • Front-end and back-end technology: the latest trends in front-end development, as well as back-end architectural topics such as server-side rendering and RESTful APIs.
  • Database technology: database system design, SQL, database optimization, and database security.
  • Software engineering and project management: fundamental principles and methods of software engineering, along with practices such as risk management and test-driven development in project work.
  • Internet industry trends and insights: analysis of the latest industry developments, touching on artificial intelligence, the Internet of Things, blockchain, and more.
  • Technology trends and news: analysis and discussion of current developments and news in technology.
With accessible explanations and rich examples, CoolShell offers readers practical technical knowledge and is a trustworthy source of information in the programming field.

User: exit
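The interaction above can be reproduced with a simple read-eval loop around the loaded model. The sketch below is an assumption about how such a demo might be wired up, reusing the `model` and `tokenizer` from the loading example earlier; it is not the actual demo application's code.

```python
# Minimal interactive loop mirroring the demo: read a question, print the reply,
# and stop when the user types "exit". Assumes `model` and `tokenizer` were
# created as in the loading sketch above.
def run_demo(model, tokenizer):
    print("Welcome to MegaEase CoolShell LLM demo. Use `exit` to exit the application.")
    history = []
    while True:
        query = input("User: ").strip()
        if query.lower() == "exit":
            break
        response, history = model.chat(tokenizer, query, history=history)
        print(response)

run_demo(model, tokenizer)
```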

Statement

The CoolShell LLM model aims to perpetuate the spirit of Mr. Chen Hao. Do not use the open-source model, code, or any derivatives of this open-source project for any purpose that may harm the nation or society, or for any service that has not undergone safety assessment and registration.

Although every effort has been made to ensure the compliance and accuracy of the data at every stage of model training, the model's output is inherently probabilistic, so its accuracy cannot be guaranteed, and it can easily be misled by user input. This project does not assume any responsibility for data security, public opinion risks, or any risks and liabilities arising from the model being misled, abused, disseminated, or improperly utilized.

Special Thanks

We are immensely grateful to LLaMA-Factory for providing such a feature-rich and easy-to-use LLM fine-tuning framework. Similarly, we would like to thank Zhipu AI and the KEG Laboratory of Tsinghua University for their open-source contribution to the ChatGLM3 model. Without their exceptional work, the establishment of this repository would not have been possible.
