---
library_name: peft
base_model: meta-llama/Llama-2-7b-chat-hf
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/652250487bf8cc2dd2bd425d/EVXOPW10F1NnGTkbSMqkN.png)
`adapter_config.json` is a JSON file that holds the settings used to configure an adapter model. An adapter is a module added to a pretrained model so that part of the model can be modified at low computational cost. The configuration file may include adapter-related options such as the dimensions of the adapter layers, the learning rate, and the activation function.
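As a rough illustration of what such a file can look like, the sketch below parses an example configuration and pulls out a few fields. It assumes a LoRA adapter saved with PEFT, which this card does not state explicitly; the field names (`peft_type`, `r`, `lora_alpha`, `lora_dropout`, `target_modules`) are standard PEFT LoRA fields, but every value except `base_model_name_or_path` is a placeholder and not this model's actual setting. The helper `summarize_adapter_config` is hypothetical.

```python
import json

# Illustrative adapter_config.json contents.
# Only base_model_name_or_path comes from this card; all other values are
# placeholders chosen for the example, NOT the settings of this adapter.
EXAMPLE_ADAPTER_CONFIG = """
{
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "base_model_name_or_path": "meta-llama/Llama-2-7b-chat-hf",
  "r": 8,
  "lora_alpha": 16,
  "lora_dropout": 0.05,
  "target_modules": ["q_proj", "v_proj"]
}
"""


def summarize_adapter_config(raw_json: str) -> dict:
    """Parse an adapter config string and return a few key fields."""
    config = json.loads(raw_json)
    return {
        "adapter_type": config.get("peft_type"),
        "base_model": config.get("base_model_name_or_path"),
        "rank": config.get("r"),
        "target_modules": config.get("target_modules", []),
    }


summary = summarize_adapter_config(EXAMPLE_ADAPTER_CONFIG)
print(summary)
```

To inspect the real settings, read the `adapter_config.json` file shipped alongside the adapter weights instead of the inline example string.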
- **True Positives (TP):** 257 samples were correctly classified as positive.
- **True Negatives (TN):** 284 samples were correctly classified as negative.
- **False Positives (FP):** 208 negative samples were incorrectly classified as positive.
- **False Negatives (FN):** 251 positive samples were incorrectly classified as negative.
- **Accuracy:** computed as (TP + TN) / (TP + TN + FP + FN), which here is (257 + 284) / 1000 = 54.1%.

An accuracy of 54.1% is only slightly better than random guessing on a binary task, so the model still needs substantial improvement. The high FP and FN counts in particular indicate that the model has trouble telling the two classes apart.
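The accuracy figure can be reproduced directly from the four counts above. The sketch below also derives precision, recall, and F1 for the positive class, which the card does not report; they are computed here only from the stated TP, TN, FP, and FN values. The function name `confusion_matrix_metrics` is a hypothetical helper.

```python
def confusion_matrix_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts reported in this card for the 1000 test samples.
metrics = confusion_matrix_metrics(tp=257, tn=284, fp=208, fn=251)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```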
**NSMC (Naver Sentiment Movie Corpus):** 'nsmc' is a sentiment-analysis dataset of Naver movie reviews. It contains roughly 200,000 reviews, each labeled as positive or negative.

**Data usage:** The dataset is used mainly for sentiment analysis of Korean text and is useful for training and validating a model's natural-language understanding. The training data is the first 2,000 samples of the 'train' split, and the test data is the first 1,000 samples of the 'test' split.
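One way to take the first N samples of each split is the split-slicing syntax of the Hugging Face `datasets` library. The sketch below is an assumption about how the subsets could be selected, not the author's confirmed code. It assumes the Hub dataset ID is `nsmc`, as the quoted name above suggests; recent `datasets` releases may require a different ID or extra arguments. The helpers `first_n_split` and `load_nsmc_subsets` are hypothetical.

```python
def first_n_split(split: str, n: int) -> str:
    """Build a `datasets` split-slicing string such as 'train[:2000]'."""
    if n <= 0:
        raise ValueError("n must be positive")
    return f"{split}[:{n}]"


# Subset sizes stated in this card.
TRAIN_SPLIT = first_n_split("train", 2000)
TEST_SPLIT = first_n_split("test", 1000)


def load_nsmc_subsets(dataset_id: str = "nsmc"):
    """Load the two subsets; `dataset_id` is an assumed Hub ID."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    train = load_dataset(dataset_id, split=TRAIN_SPLIT)
    test = load_dataset(dataset_id, split=TEST_SPLIT)
    return train, test


print(TRAIN_SPLIT, TEST_SPLIT)
```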
Test conditions:

- **Sequence length:** The input sequence length can be set in the code; it was set to 200 because of limited GPU memory.
- **Batch size:** The training and evaluation batch sizes are both set to 1, which is very small.
- **Gradient accumulation:** Gradients are accumulated over 2 steps.
- **Learning rate:** The default setting uses a learning rate of 1e-4 with a cosine learning rate scheduler.
- **Epochs:** The model is trained for one epoch.
- **Optimizer:** The paged AdamW 32-bit optimizer (`paged_adamw_32bit`) is used.
- **Precision:** The model is trained in half precision (fp16).
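The settings above map naturally onto keyword arguments of `transformers.TrainingArguments`. The sketch below is one plausible way to express them, not the author's confirmed training script. It also works out the effective batch size and the number of optimizer steps, assuming a single device and the 2,000 training samples mentioned earlier. The `output_dir` value and the helper `build_training_arguments` are hypothetical.

```python
import math

# Hyperparameters as listed in this card, keyed by TrainingArguments parameter names.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "gradient_accumulation_steps": 2,
    "learning_rate": 1e-4,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 1,
    "optim": "paged_adamw_32bit",
    "fp16": True,
}
MAX_SEQ_LENGTH = 200      # input sequence length from the card
NUM_TRAIN_SAMPLES = 2000  # first 2,000 samples of the train split

# Assumption: training runs on a single device.
effective_batch_size = (
    TRAINING_KWARGS["per_device_train_batch_size"]
    * TRAINING_KWARGS["gradient_accumulation_steps"]
)
optimizer_steps = (
    math.ceil(NUM_TRAIN_SAMPLES / effective_batch_size)
    * TRAINING_KWARGS["num_train_epochs"]
)


def build_training_arguments(output_dir: str = "outputs"):
    """Create TrainingArguments; `output_dir` is a placeholder path."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


print(effective_batch_size, optimizer_steps)
```

With a batch size of 1 and accumulation over 2 steps, each optimizer update sees only 2 samples, so gradient estimates are noisy; this is a common trade-off when GPU memory is the limiting factor.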
# Model Card for Model ID
<!-- Provide a quick summary of what the model is/does. -->
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** meta-llama/Llama-2-7b-chat-hf
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]
## Uses
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
[More Information Needed]
### Downstream Use [optional]
<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
[More Information Needed]
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
[More Information Needed]
## Bias, Risks, and Limitations
<!-- This section is meant to convey both technical and sociotechnical limitations. -->
[More Information Needed]
### Recommendations
<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
## How to Get Started with the Model
Use the code below to get started with the model.
[More Information Needed]
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
[More Information Needed]
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Preprocessing [optional]
[More Information Needed]
#### Training Hyperparameters
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
#### Speeds, Sizes, Times [optional]
<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
[More Information Needed]
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
### Testing Data, Factors & Metrics
#### Testing Data
<!-- This should link to a Dataset Card if possible. -->
[More Information Needed]
#### Factors
<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
[More Information Needed]
#### Metrics
<!-- These are the evaluation metrics being used, ideally with a description of why. -->
[More Information Needed]
### Results
[More Information Needed]
#### Summary
## Model Examination [optional]
<!-- Relevant interpretability work for the model goes here -->
[More Information Needed]
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## Technical Specifications [optional]
### Model Architecture and Objective
[More Information Needed]
### Compute Infrastructure
[More Information Needed]
#### Hardware
[More Information Needed]
#### Software
[More Information Needed]
## Citation [optional]
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
**BibTeX:**
[More Information Needed]
**APA:**
[More Information Needed]
## Glossary [optional]
<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
[More Information Needed]
## More Information [optional]
[More Information Needed]
## Model Card Authors [optional]
[More Information Needed]
## Model Card Contact
[More Information Needed]
## Training procedure
The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: bfloat16
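The values listed above correspond to keyword arguments of `transformers.BitsAndBytesConfig`. The sketch below shows one way such a config could be rebuilt for loading the base model in 4-bit; it is an illustration under that assumption, not the exact code used for training. `quant_method` is omitted because it is not a constructor argument. The helpers `describe_quantization` and `build_bnb_config` are hypothetical.

```python
# Quantization settings as listed in this card.
BNB_CONFIG_KWARGS = {
    "load_in_8bit": False,
    "load_in_4bit": True,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "bfloat16",
}


def describe_quantization(kwargs: dict) -> str:
    """Return a short human-readable summary of the quantization mode."""
    bits = "4-bit" if kwargs["load_in_4bit"] else "8-bit" if kwargs["load_in_8bit"] else "none"
    double = "on" if kwargs["bnb_4bit_use_double_quant"] else "off"
    return (
        f"{bits} {kwargs['bnb_4bit_quant_type']}, "
        f"compute dtype {kwargs['bnb_4bit_compute_dtype']}, "
        f"double quantization {double}"
    )


def build_bnb_config():
    """Create a BitsAndBytesConfig from the settings above."""
    # Imported lazily so the summary helper runs without torch/transformers installed.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(BNB_CONFIG_KWARGS)
    # Convert the dtype name into the corresponding torch dtype object.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)


print(describe_quantization(BNB_CONFIG_KWARGS))
```

The resulting object would typically be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained` when loading `meta-llama/Llama-2-7b-chat-hf`, which requires a CUDA-capable GPU and the `bitsandbytes` package.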
### Framework versions
- PEFT 0.7.0