	---
	library_name: peft
	base_model: meta-llama/Llama-2-7b-chat-hf
	---



	![image/png](https://cdn-uploads.huggingface.co/production/uploads/652250487bf8cc2dd2bd425d/EVXOPW10F1NnGTkbSMqkN.png)

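	`adapter_config.json` is a JSON file that holds the settings used to configure an adapter model. An adapter is a module added to a pretrained model so that part of the model can be modified at low computational cost. The file can include adapter-related options such as the dimensions of the adapter layers and activation functions; training hyperparameters such as the learning rate are normally set in the training code rather than in this file.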
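As a rough illustration, the snippet below reads an `adapter_config.json` and pulls out a few fields. The key names (`peft_type`, `r`, `lora_alpha`, `target_modules`, `base_model_name_or_path`) follow what PEFT writes for LoRA adapters. That this adapter is a LoRA adapter is an assumption, and every sample value except the base model name is hypothetical, not taken from this repository.

```python
import json
import tempfile
from pathlib import Path

# Fields that PEFT typically stores for a LoRA adapter (adapter type is assumed).
FIELDS = ("base_model_name_or_path", "peft_type", "r", "lora_alpha", "target_modules")


def summarize_adapter_config(path):
    """Return the selected fields from an adapter_config.json file (None if absent)."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return {key: config.get(key) for key in FIELDS}


# Illustrative config: only the base model name comes from this model card.
sample = {
    "base_model_name_or_path": "meta-llama/Llama-2-7b-chat-hf",
    "peft_type": "LORA",                     # assumption
    "r": 8,                                  # hypothetical rank
    "lora_alpha": 16,                        # hypothetical scaling factor
    "target_modules": ["q_proj", "v_proj"],  # hypothetical module names
}

# Write the sample to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "adapter_config.json"
    config_path.write_text(json.dumps(sample), encoding="utf-8")
    summary = summarize_adapter_config(config_path)

print(summary)
```

Pointing `summarize_adapter_config` at the real file in this repository would show the values that were actually used.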

	- True Positives (TP): 257 samples were correctly classified as positive.
	- True Negatives (TN): 284 samples were correctly classified as negative.
	- False Positives (FP): 208 samples were negative but incorrectly classified as positive.
	- False Negatives (FN): 251 samples were positive but incorrectly classified as negative.
	- Accuracy: computed as (TP + TN) / (TP + TN + FP + FN), which gives 54.1% here.

	At 54.1%, the model performs only marginally better than random guessing on this binary task, so substantial improvement is still needed. The high FP and FN counts indicate that the model has trouble separating the two classes.
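The accuracy figure can be reproduced directly from the four counts. The sketch below also derives precision, recall, and F1, which the original text does not report; they are computed here from the same counts using the standard definitions.

```python
# Confusion-matrix counts reported above.
tp, tn, fp, fn = 257, 284, 208, 251

total = tp + tn + fp + fn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)  # of the predicted positives, the share that were correct
recall = tp / (tp + fn)     # of the actual positives, the share that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The counts sum to 1,000 test samples, and the accuracy comes out to 0.541, matching the figure above.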

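	- NSMC (Naver Sentiment Movie Corpus): `nsmc` is a sentiment analysis dataset of Naver movie reviews. It contains roughly 200,000 reviews, each labeled positive or negative.
	- Data usage: the dataset is mainly used for sentiment analysis of Korean text and is useful for training and validating a model's natural language understanding. Training uses the first 2,000 samples of the `train` split, and testing uses the first 1,000 samples of the `test` split.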
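One way to select those subsets is the split-slicing syntax of the `datasets` library, sketched below. The dataset identifier `"nsmc"` is the name quoted above; the helper names are hypothetical, and depending on the installed `datasets` version, loading this dataset may need additional arguments. The original training code is not shown in this card, so this is not necessarily how the author did it.

```python
def split_spec(split, n):
    """Build a slice expression such as 'train[:2000]' for the first n rows."""
    return f"{split}[:{n}]"


def load_nsmc_subsets(train_n=2000, test_n=1000):
    """Download NSMC and keep only the leading rows of each split."""
    # Imported lazily so the helper above works without the library installed.
    from datasets import load_dataset

    train = load_dataset("nsmc", split=split_spec("train", train_n))
    test = load_dataset("nsmc", split=split_spec("test", test_n))
    return train, test


print(split_spec("train", 2000), split_spec("test", 1000))
```

Calling `load_nsmc_subsets()` requires network access and the `datasets` package.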

	Test conditions:
	- Sequence length: the input sequence length is configurable in the code; it was set to 200 because of limited GPU memory.
	- Batch size: 1 for both training and evaluation, which is very small.
	- Gradient accumulation: gradients are accumulated over 2 steps.
	- Learning rate: 1e-4 by default, with a cosine learning rate scheduler.
	- Epochs: the model is trained for one epoch.
	- Optimizer: paged AdamW 32-bit (`paged_adamw_32bit`).
	- Precision: the model is trained in half precision (fp16).
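These settings map onto `transformers.TrainingArguments` fields roughly as sketched below. The original training script is not included in this card, so the use of `TrainingArguments`, the `output_dir` value, and the single-GPU assumption behind the step count are all assumptions.

```python
# Hyperparameters listed above, keyed by transformers.TrainingArguments field names.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "gradient_accumulation_steps": 2,
    "learning_rate": 1e-4,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 1,
    "optim": "paged_adamw_32bit",
    "fp16": True,
}
MAX_SEQ_LENGTH = 200      # applied at tokenization time, not a TrainingArguments field
NUM_TRAIN_SAMPLES = 2000  # first 2,000 rows of the NSMC train split


def build_training_arguments(output_dir="outputs"):  # hypothetical output path
    """Create TrainingArguments; needs transformers installed and a GPU for fp16."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


# Samples consumed per optimizer update, assuming a single GPU.
effective_batch = (
    TRAINING_KWARGS["per_device_train_batch_size"]
    * TRAINING_KWARGS["gradient_accumulation_steps"]
)
optimizer_steps = NUM_TRAIN_SAMPLES // effective_batch
print(effective_batch, optimizer_steps)
```

With a batch size of 1 and 2 accumulation steps, each optimizer update sees 2 samples, so one epoch over 2,000 samples is about 1,000 updates on a single device.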

	# Model Card for Model ID

	<!-- Provide a quick summary of what the model is/does. -->



	## Model Details

	### Model Description

	<!-- Provide a longer summary of what this model is. -->



	- Developed by: [More Information Needed]
	- Funded by [optional]: [More Information Needed]
	- Shared by [optional]: [More Information Needed]
	- Model type: [More Information Needed]
	- Language(s) (NLP): [More Information Needed]
	- License: [More Information Needed]
	- Finetuned from model [optional]: [More Information Needed]

	### Model Sources [optional]

	<!-- Provide the basic links for the model. -->

	- Repository: [More Information Needed]
	- Paper [optional]: [More Information Needed]
	- Demo [optional]: [More Information Needed]

	## Uses

	<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

	### Direct Use

	<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

	[More Information Needed]

	### Downstream Use [optional]

	<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

	[More Information Needed]

	### Out-of-Scope Use

	<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

	[More Information Needed]

	## Bias, Risks, and Limitations

	<!-- This section is meant to convey both technical and sociotechnical limitations. -->

	[More Information Needed]

	### Recommendations

	<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

	Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

	## How to Get Started with the Model

	Use the code below to get started with the model.

	[More Information Needed]

	## Training Details

	### Training Data

	<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

	[More Information Needed]

	### Training Procedure

	<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

	#### Preprocessing [optional]

	[More Information Needed]


	#### Training Hyperparameters

	- Training regime: [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

	#### Speeds, Sizes, Times [optional]

	<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

	[More Information Needed]

	## Evaluation

	<!-- This section describes the evaluation protocols and provides the results. -->

	### Testing Data, Factors & Metrics

	#### Testing Data

	<!-- This should link to a Dataset Card if possible. -->

	[More Information Needed]

	#### Factors

	<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

	[More Information Needed]

	#### Metrics

	<!-- These are the evaluation metrics being used, ideally with a description of why. -->

	[More Information Needed]

	### Results

	[More Information Needed]

	#### Summary



	## Model Examination [optional]

	<!-- Relevant interpretability work for the model goes here -->

	[More Information Needed]

	## Environmental Impact

	<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	- Hardware Type: [More Information Needed]
	- Hours used: [More Information Needed]
	- Cloud Provider: [More Information Needed]
	- Compute Region: [More Information Needed]
	- Carbon Emitted: [More Information Needed]

	## Technical Specifications [optional]

	### Model Architecture and Objective

	[More Information Needed]

	### Compute Infrastructure

	[More Information Needed]

	#### Hardware

	[More Information Needed]

	#### Software

	[More Information Needed]

	## Citation [optional]

	<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

	BibTeX:

	[More Information Needed]

	APA:

	[More Information Needed]

	## Glossary [optional]

	<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

	[More Information Needed]

	## More Information [optional]

	[More Information Needed]

	## Model Card Authors [optional]

	[More Information Needed]

	## Model Card Contact

	[More Information Needed]


	## Training procedure

	The following `bitsandbytes` quantization config was used during training:
	- quant_method: bitsandbytes
	- load_in_8bit: False
	- load_in_4bit: True
	- llm_int8_threshold: 6.0
	- llm_int8_skip_modules: None
	- llm_int8_enable_fp32_cpu_offload: False
	- llm_int8_has_fp16_weight: False
	- bnb_4bit_quant_type: nf4
	- bnb_4bit_use_double_quant: False
	- bnb_4bit_compute_dtype: bfloat16
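The list above corresponds to the arguments of `transformers.BitsAndBytesConfig`. A minimal sketch of rebuilding that config follows; the helper name is hypothetical, and the original loading code is not shown in this card.

```python
# Values copied from the list above; keys match BitsAndBytesConfig argument names.
BNB_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
}


def build_bnb_config():
    """Create the quantization config; needs torch, transformers, and bitsandbytes."""
    import torch  # lazy imports so the dict above can be inspected without them
    from transformers import BitsAndBytesConfig

    # The compute dtype is a torch dtype, so it is added here rather than in the dict.
    return BitsAndBytesConfig(bnb_4bit_compute_dtype=torch.bfloat16, **BNB_KWARGS)


print(BNB_KWARGS["bnb_4bit_quant_type"], BNB_KWARGS["load_in_4bit"])
```

The resulting object would typically be passed as `quantization_config` when loading the base model with `from_pretrained`.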

	### Framework versions

	- PEFT 0.7.0