|
--- |
|
license: apache-2.0 |
|
base_model: |
|
- deepseek-ai/deepseek-llm-7b-chat |
|
pipeline_tag: text-generation |
|
tags: |
|
- silk |
|
- eco-friendly |
|
- sustainable |
|
- agriculture |
|
- deepseek |
|
- llama |
|
- fine-tuned |
|
- zh |
|
- chinese |
|
--- |
|
# EcoSilkModel |
|
|
|
## Model Overview |
|
|
|
**EcoSilkModel** is a language model fine-tuned from [DeepSeek-LLM-7B-Chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat), specialized for sustainable agriculture, silk production, and eco-friendly practices.
|
|
|
 |
|
The model excels in the following tasks: |
|
|
|
- **Silk Production Guidance**: Provides recommendations for sustainable silk farming and sericulture. |
|
- **Eco-friendly Practices**: Offers suggestions for environmentally friendly agricultural practices. |
|
- **Chinese Language Support**: Optimized for Chinese-language tasks, with English also supported.
|
|
|
This model is part of the **EcoSilk Project**, aimed at promoting sustainable development in the silk industry. |
|
|
|
## Intended Use Cases
|
|
|
The model is suitable for the following scenarios: |
|
|
|
- **Researchers**: Studying sustainable agriculture and silk production. |
|
- **Farmers**: Seeking guidance on eco-friendly agricultural practices. |
|
- **Educators**: Teaching sustainable agricultural practices. |
|
- **Developers**: Building applications for the silk and agricultural industries. |
|
|
|
## Training Data |
|
|
|
The model was fine-tuned on the following datasets: |
|
|
|
1. **[zhanxu/ecosilk-chat](https://huggingface.co/datasets/zhanxu/ecosilk-chat)**: A Chinese instruction-following dataset used for fine-tuning language models.

   The dataset is derived from professional literature across various fields. Question-answer pairs were automatically annotated with open-source tools, then manually cleaned and filtered.

   For more details, see [GitHub - ConardLi/easy-dataset: A powerful tool for creating fine-tuning datasets for LLM](https://github.com/ConardLi/easy-dataset).
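To make the data format concrete, the sketch below shows the instruction–answer shape that an easy-dataset-style export typically has. The field names are an assumption for illustration, not the documented schema of `zhanxu/ecosilk-chat`, and the Chinese text is an invented sample, not a record from the dataset.

```python
import json

# Hypothetical record illustrating an instruction-answer pair; field names
# ("instruction", "input", "output") are an assumption, not the dataset's
# documented schema.
record = {
    "instruction": "如何以环保的方式养蚕？",  # "How can silkworms be raised in an eco-friendly way?"
    "input": "",
    "output": "可选择无农药桑园并控制养殖密度。",  # illustrative answer, not from the dataset
}

# Such records are commonly stored one JSON object per line (JSONL).
line = json.dumps(record, ensure_ascii=False)
parsed = json.loads(line)
print(parsed["instruction"])
```

Manual cleaning and filtering would then operate on records of this shape, e.g. dropping pairs with empty or malformed `output` fields.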
|
|
|
## Fine-tuning Details |
|
|
|
- **Base Model**: [DeepSeek-LLM-7B-Chat](https://huggingface.co/deepseek-ai/deepseek-llm-7b-chat)
|
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation) |
|
- **Template**: `llama3` |
|
- **Languages**: Chinese (`zh`) and English (`en`) |
|
- **Truncation Length**: 1024 tokens |
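The `template` and truncation fields above correspond to settings of a LLaMA-Factory LoRA run. As a rough sketch only, assuming LLaMA-Factory was the training framework, a config covering these details might look like the following; the LoRA rank, learning rate, and dataset key are illustrative placeholders, not the values actually used:

```yaml
# Illustrative LLaMA-Factory-style SFT config (hyperparameters are assumptions)
model_name_or_path: deepseek-ai/deepseek-llm-7b-chat
stage: sft
finetuning_type: lora
lora_rank: 8               # placeholder; actual rank not reported
template: llama3           # matches the template listed above
dataset: ecosilk_chat      # placeholder dataset key
cutoff_len: 1024           # matches the truncation length listed above
learning_rate: 1.0e-4      # placeholder; actual value not reported
```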
|
|
|
## How to Use Our Model
|
|
|
Here are some examples of how to use our model. |
|
|
|
### Chat Completion
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

# The base chat model is shown here; substitute this model's repository id
# to load the fine-tuned EcoSilkModel weights.
model_name = "deepseek-ai/deepseek-llm-7b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)
model.generation_config = GenerationConfig.from_pretrained(model_name)
model.generation_config.pad_token_id = model.generation_config.eos_token_id

messages = [
    {"role": "user", "content": "Who are you?"}
]
input_tensor = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_tensor.to(model.device), max_new_tokens=100)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
print(result)
```