---
license: apache-2.0
base_model:
- deepseek-ai/deepseek-llm-7b-chat
pipeline_tag: text-generation
tags:
- silk
- eco-friendly
- sustainable
- agriculture
- deepseek
- llama
- fine-tuned
- zh
- chinese
---

# EcoSilkModel

## Model Overview

**EcoSilkModel** is a fine-tuned language model based on [DeepSeek-LLM-7B-Chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat), designed for sustainable agriculture, silk production, and eco-friendly practices. The model excels at the following tasks:

- **Silk Production Guidance**: Provides recommendations for sustainable silk farming and sericulture.
- **Eco-friendly Practices**: Offers suggestions for environmentally friendly agricultural practices.
- **Chinese Language Support**: Optimized for Chinese tasks while also supporting English.

This model is part of the **EcoSilk Project**, which aims to promote sustainable development in the silk industry.

## Applicable Scenarios

The model is suitable for:

- **Researchers** studying sustainable agriculture and silk production.
- **Farmers** seeking guidance on eco-friendly agricultural practices.
- **Educators** teaching sustainable agricultural practices.
- **Developers** building applications for the silk and agricultural industries.

## Training Data

The model was fine-tuned on the following dataset:

1. [**zhanxu/ecosilk-chat**](https://huggingface.co/datasets/zhanxu/ecosilk-chat): a Chinese instruction-following dataset for fine-tuning language models (a loading sketch appears at the end of this card). The dataset is derived from professional literature across several fields; question-answer pairs were generated automatically with open-source tooling and then manually cleaned and filtered. For details, see [GitHub - ConardLi/easy-dataset: A powerful tool for creating fine-tuning datasets for LLM](https://github.com/ConardLi/easy-dataset).

## Fine-tuning Details

- **Base Model**: [DeepSeek-LLM-7B-Chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat)
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Template**: `llama3`
- **Languages**: Chinese (`zh`) and English (`en`)
- **Truncation Length**: 1024 tokens

A configuration sketch for a comparable LoRA setup appears at the end of this card.

# How to Use Our Model

Here are some examples of how to use our model. The snippet below loads the DeepSeek base chat model directly; to use EcoSilkModel itself, apply the LoRA adapter as shown in the section that follows.

## Chat Completion

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name = "deepseek-ai/deepseek-llm-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)
model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id

messages = [
    {"role": "user", "content": "Who are you?"}
]
# Build the prompt with the model's chat template and generate a reply.
input_tensor = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_tensor.to(model.device), max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
print(result)
```
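
## Applying the LoRA Adapter

Because EcoSilkModel is a LoRA fine-tune, its adapter weights are applied on top of the base chat model. The sketch below uses the `peft` library; `path/to/ecosilk-lora-adapter` is a placeholder for wherever the adapter weights actually live, since this card does not state a separate adapter repository.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base_model_name = "deepseek-ai/deepseek-llm-7b-chat"
adapter_path = "path/to/ecosilk-lora-adapter"  # placeholder: point this at the real adapter weights

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_path)

# Optionally fold the low-rank updates into the base weights so
# inference runs without the adapter indirection.
model = model.merge_and_unload()
```

After merging, the model can be used exactly as in the chat-completion example above.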
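
## Inspecting the Training Data

The `zhanxu/ecosilk-chat` dataset referenced above can be explored with the `datasets` library. A minimal sketch, assuming the dataset exposes a default `train` split (the exact column names may differ):

```python
from datasets import load_dataset

# Download the EcoSilk instruction-following dataset from the Hub.
dataset = load_dataset("zhanxu/ecosilk-chat")

print(dataset)              # available splits and row counts
print(dataset["train"][0])  # one example; column names may differ
```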
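
## Fine-tuning Sketch

For readers who want to reproduce a comparable fine-tune, the details listed under "Fine-tuning Details" translate roughly into the `peft` configuration below. This is a minimal sketch, not the project's actual training script; the rank, alpha, dropout, and target modules are illustrative assumptions the card does not specify.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-llm-7b-chat")

# Illustrative LoRA hyperparameters; the card only states that LoRA was
# used with a 1024-token truncation length, not these exact values.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights train
```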