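adapter_config.json is a JSON file that holds the settings used to configure an adapter model in machine learning. An adapter is a module added to a pre-trained model so that part of the model's behavior can be modified at low computational cost. The file can contain adapter-specific options such as the dimensions of the adapter layers and the activation function. For PEFT adapters such as this one, training hyperparameters like the learning rate are set in the training arguments rather than in this file.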
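As a rough illustration, the sketch below parses a hypothetical adapter_config.json of the kind the PEFT library writes for a LoRA adapter. The key names (peft_type, r, lora_alpha, lora_dropout, target_modules) are standard PEFT LoRA fields, and the base model name is the one stated in this card. Every other value is an assumption made for the example, not this adapter's actual configuration.

```python
import json

# Hypothetical file contents. Only the base model name comes from this card;
# the rank, alpha, dropout and target modules are illustrative values.
EXAMPLE_CONFIG = """
{
  "base_model_name_or_path": "meta-llama/Llama-2-7b-chat-hf",
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "r": 8,
  "lora_alpha": 16,
  "lora_dropout": 0.05,
  "target_modules": ["q_proj", "v_proj"]
}
"""


def summarize_adapter_config(text: str) -> dict:
    """Parse the JSON text and extract the fields that define the adapter."""
    cfg = json.loads(text)
    return {
        "base_model": cfg["base_model_name_or_path"],
        "method": cfg["peft_type"],
        "rank": cfg["r"],
        # Standard LoRA scales the low-rank update by lora_alpha / r.
        "scaling": cfg["lora_alpha"] / cfg["r"],
        "targets": sorted(cfg["target_modules"]),
    }


summary = summarize_adapter_config(EXAMPLE_CONFIG)
print(summary)
```

To inspect the real file, the same function could be given the text of the adapter_config.json shipped with the adapter.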

True Positives (TP): 257 samples were correctly classified as positive.
True Negatives (TN): 284 samples were correctly classified as negative.
False Positives (FP): 208 negative samples were incorrectly classified as positive.
False Negatives (FN): 251 positive samples were incorrectly classified as negative.
Accuracy: computed as (TP + TN) / (TP + TN + FP + FN) = (257 + 284) / 1000 = 54.1%.

The test set is nearly balanced (508 positive and 492 negative samples), so an accuracy of 54.1% is only slightly better than random guessing and leaves substantial room for improvement. The high FP and FN counts in particular indicate that the model struggles to separate the two classes.
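The accuracy figure can be checked directly from the four counts. The sketch below does that, and also derives precision, recall, and F1 for the positive class. Those three values are not reported in the original evaluation; they are computed here from the same confusion matrix.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute binary classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts reported above for the 1,000-sample test subset.
metrics = classification_metrics(tp=257, tn=284, fp=208, fn=251)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.541
# precision: 0.553
# recall: 0.506
# f1: 0.528
```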

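NSMC (Naver Sentiment Movie Corpus): 'nsmc' is a sentiment analysis dataset of Naver movie reviews. It contains roughly 200,000 reviews, each labeled as positive or negative.
Data usage: The dataset is mainly used for sentiment analysis of Korean text and is useful for training and validating a model's natural language understanding. Training uses the first 2,000 samples of the 'train' split, and testing uses the first 1,000 samples of the 'test' split.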
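One way to obtain these subsets is the split-slicing syntax of the Hugging Face datasets library. This is a minimal sketch, assuming the dataset is reachable on the Hub under the id "nsmc" used above (depending on the installed datasets version, a different id or extra arguments may be needed). The helper names split_spec and load_nsmc_subsets are hypothetical and not part of the original training code.

```python
def split_spec(split: str, n: int) -> str:
    """Build a slicing expression selecting the first n rows of a split."""
    return f"{split}[:{n}]"


def load_nsmc_subsets(n_train: int = 2000, n_test: int = 1000):
    """Download NSMC and return the first n_train / n_test samples.

    The import is done lazily so the module can be loaded without the
    datasets package installed; calling this function needs network access.
    """
    from datasets import load_dataset

    train = load_dataset("nsmc", split=split_spec("train", n_train))
    test = load_dataset("nsmc", split=split_spec("test", n_test))
    return train, test


print(split_spec("train", 2000))  # train[:2000]
print(split_spec("test", 1000))   # test[:1000]
```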

Test conditions:
Sequence length: the input sequence length is configurable in the code; it was set to 200 because of limited GPU memory.
Batch size: 1 for both training and evaluation, which is very small.
Gradient accumulation: gradients are accumulated over 2 steps before each optimizer update.
Learning rate: 1e-4 (the default setting), with a cosine learning rate scheduler.
Epochs: the model is trained for one epoch.
Optimizer: paged AdamW 32-bit (paged_adamw_32bit).
Precision: the model is trained in half precision (fp16).
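These settings map onto keyword arguments of transformers.TrainingArguments. The sketch below collects them in a plain dictionary and works out the effective batch size and the number of optimizer updates per epoch. It assumes a single GPU; output_dir and the helper names are hypothetical, and the original training script may differ.

```python
import math

# Hyperparameters as listed in the test conditions above.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "gradient_accumulation_steps": 2,
    "learning_rate": 1e-4,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 1,
    "optim": "paged_adamw_32bit",
    "fp16": True,
}
MAX_SEQ_LENGTH = 200      # applied when tokenizing, not a TrainingArguments field
NUM_TRAIN_SAMPLES = 2000  # first 2,000 rows of the NSMC train split


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer update."""
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def optimizer_steps_per_epoch(num_samples: int, kwargs: dict) -> int:
    """Optimizer updates needed to see every sample once (single device)."""
    return math.ceil(num_samples / effective_batch_size(kwargs))


def build_training_arguments(output_dir: str = "outputs"):
    """Create TrainingArguments; needs transformers and, for fp16, a GPU."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size(TRAINING_KWARGS))                          # 2
print(optimizer_steps_per_epoch(NUM_TRAIN_SAMPLES, TRAINING_KWARGS))  # 1000
```

With a per-device batch size of 1, gradient accumulation is what raises the effective batch size to 2, so one epoch over 2,000 samples amounts to 1,000 optimizer updates.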

Model Card for Model ID

Model Details

Model Description

Developed by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Model type: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Finetuned from model [optional]: [More Information Needed]

Model Sources [optional]

Repository: [More Information Needed]
Paper [optional]: [More Information Needed]
Demo [optional]: [More Information Needed]

Uses

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
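Until the author supplies official instructions, the following is a minimal sketch of how a PEFT adapter is typically attached to its base model. It assumes the adapter is published as cheonyumin/lora-llama-2-7b-food-order-understanding, that the base model is meta-llama/Llama-2-7b-chat-hf (both names appear on this page), and that it was trained as a causal language model. Access to the gated Llama 2 weights and the accelerate package (for device_map="auto") are also assumed.

```python
BASE_MODEL_ID = "meta-llama/Llama-2-7b-chat-hf"
ADAPTER_ID = "cheonyumin/lora-llama-2-7b-food-order-understanding"


def load_model_with_adapter(base_id: str = BASE_MODEL_ID,
                            adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the adapter weights on top of it.

    Imports are lazy because calling this downloads several GB of weights
    and requires transformers, peft and accelerate to be installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling load_model_with_adapter() returns the tokenizer and the adapted model, ready for generation with model.generate.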

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Training procedure

The following bitsandbytes quantization config was used during training:

quant_method: bitsandbytes
load_in_8bit: False
load_in_4bit: True
llm_int8_threshold: 6.0
llm_int8_skip_modules: None
llm_int8_enable_fp32_cpu_offload: False
llm_int8_has_fp16_weight: False
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: False
bnb_4bit_compute_dtype: bfloat16
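For reference, the 4-bit settings listed above correspond to a transformers.BitsAndBytesConfig along the following lines. This is a sketch rather than the original training code; the llm_int8_* entries are left out because the values shown are the library defaults, and the helper name is hypothetical.

```python
# 4-bit settings copied from the list above.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
}
COMPUTE_DTYPE_NAME = "bfloat16"


def build_quantization_config():
    """Create the BitsAndBytesConfig; needs torch and transformers installed."""
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(
        bnb_4bit_compute_dtype=getattr(torch, COMPUTE_DTYPE_NAME),
        **QUANT_KWARGS,
    )
```

The resulting object would be passed as quantization_config when loading the base model with from_pretrained.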

Framework versions

PEFT 0.7.0

cheonyumin/lora-llama-2-7b-food-order-understanding

Adapter for meta-llama/Llama-2-7b-chat-hf
