brildev7/gemma-7b-summarization-ko-sft-qlora

Tags: Summarization, PEFT, Safetensors, Korean, gemma

brildev7 committed on Mar 21

Commit 3d5441e • 1 Parent(s): 047f7c6

Update README.md

Files changed (1)

README.md +55 -198

@@ -9,209 +9,66 @@ tags:
 ---
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
 - **Developed by:** [Kang Seok Ju]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-'''
-import os
-from dataclasses import dataclass, field
-from typing import Optional
-import torch
-from transformers import AutoTokenizer, HfArgumentParser, AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
-from datasets import load_dataset
-from peft import LoraConfig, PeftModel
-from transformers import BitsAndBytesConfig
-'''
 ## Training Details
 ### Training Data
 https://huggingface.co/datasets/raki-1203/ai_hub_summarization
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]
-### Framework versions
-- PEFT 0.8.2

 ---
 # Model Card for Model ID
 ## Model Details
 ### Model Description
+This model summarizes Korean text concisely.
 - **Developed by:** [Kang Seok Ju]
+- **Contact:** [brildev7@gmail.com]
 ## Training Details
 ### Training Data
 https://huggingface.co/datasets/raki-1203/ai_hub_summarization
+## Inference Examples
+```
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
+from peft import PeftModel
+# Base model and the QLoRA adapter that is applied on top of it
+model_id = "google/gemma-7b"
+peft_model_id = "brildev7/gemma_7b_summarization_ko_sft_qlora"
+# Load the base weights in 4-bit NF4 precision via bitsandbytes
+quantization_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_compute_dtype=torch.float32,
+    bnb_4bit_quant_type="nf4"
+)
+model = AutoModelForCausalLM.from_pretrained(model_id,
+                                             quantization_config=quantization_config,
+                                             torch_dtype=torch.float32,
+                                             low_cpu_mem_usage=True,
+                                             attn_implementation="sdpa",
+                                             device_map="auto")
+# Attach the fine-tuned adapter weights to the quantized base model
+model = PeftModel.from_pretrained(model, peft_model_id)
+tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
+# Reuse the EOS token for padding
+tokenizer.pad_token_id = tokenizer.eos_token_id
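+# Example 1
+# The Korean prompt reads: "Summarize the following text.:{passage}" followed by a newline and "Summary:"
+prompt_template = "다음 글을 요약하세요.:{}\n요약:"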
+passage = "기획재정부는 20일 이 같은 내용의 '주류 면허 등에 관한 법률 시행령' 개정안을 입법 예고했다. 개정안에는 주류 판매업 면허 취소의 예외에 해당하는 주류의 단순가공·조작의 범위를 술잔 등 빈 용기에 주류를 나눠 담아 판매하는 경우 등이 포함됐다. 식당·주점 등에서 주류를 판매할 때 술을 잔에 나눠 판매할 수 있다는 의미다. 종합주류도매업자가 주류제조자 등이 제조·판매하는 비알코올 음료 또는 무알코올 음료를 주류와 함께 음식점 등에 공급할 수 있도록 주류판매 전업의무 면허요건도 완화했다. 현재 알코올 도수가 0%인 음료는 '무알코올 음료'로, 0% 이상 1% 미만인 것은 '비알코올 음료'로 구분된다. 현행 규정상 무알코올·비알코올 주류는 주류 업자가 유통할 수 없는데 이 규정을 완화한다는 것이다. 기재부는 다음 달 29일까지 의견 수렴을 거쳐 이르면 다음 달 말부터 시행할 예정이다．"
+prompt = prompt_template.format(passage)
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs,
+                        max_new_tokens=512,
+                        temperature=0.2,
+                        top_p=0.95,
+                        do_sample=True,
+                        use_cache=False)
+print(tokenizer.decode(outputs[0]))
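+# Example output:
+# - 기획재정부는 20일 주류 판매업 면허 취소의 예외에 해당하는 주류의 단순가공·조작의 범위를 술잔 등 빈 용기에 주류를 나눠 담아 판매하는 경우 등이 포함된 '주류 면허 등에 관한 법률 시행령' 개정안을 입법 예고했다.
+# English gloss: On the 20th, the Ministry of Economy and Finance gave advance notice of an amendment to the Enforcement Decree of the Liquor License Act, under which the simple processing and handling of liquor that is exempt from license revocation includes selling liquor divided into empty containers such as glasses.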
+# Example 2 (same prompt template as above)
+prompt_template = "다음 글을 요약하세요.:{}\n요약:"
+passage = "지난 1월 일본 오사카 우메다의 뷰티샵 ‘앳코스메’에서 진행된 CJ올리브영의 메이크업 브랜드(PB) ‘바이오힐 보’의 팝업 스토어 현장.  오사카 최대 규모를 자랑하는 앳코스메 매장 한 가운데 꾸며진 팝업 스토어에는 한국에서 인기 높은 화장품을 실제로 경험해보려는 고객들로 발 디딜 틈 없이 북적거렸다.  타이완 국적자이지만 오사카에서 거주하고 있다는 32살 쿠이잉씨는 이날 팝업 스토어를 찾아 바이오힐 보의 ‘탄탄크림’을 구매했다.  사회관계망서비스(SNS)와 유튜브를 통해 한국 화장품이 좋다는 평을 들어본 터라 이번 기회에 구매해 사용해보기로 결심했다고 한다. 쿠이잉씨는 한국 화장품을 쓰면 한국 여성처럼 예뻐지지 않을까 기대가 된다고 말했다.  이날 앳코스메는 바이오힐 보 팝업 뿐만 아니라 눈에 잘 띄는 메인 진열대 상당수가 한국 브랜드 차지였다.  대부분 한국에서도 인기가 높은 브랜드들로, 입구에서 바로 보이는 진열대에는 ‘웨이크메이크’와 ‘피치씨’, ‘어뮤즈’가, 해외 명품 브랜드 존 정중앙에는 ‘헤라’가 자리하고 있었다.  일본 내 K뷰티의 인기가 예사롭지 않다. ‘제 3차 한류붐’이라고까지 일컬어지는 한류열풍을 타고 일본 내 K뷰티의 입지가 나날이 치솟고 있다.  과거에는 일본 내에서 한국 문화를 좋아하는 일부 소비자들 사이에서만 유행하는 수준이었다면, 지금은 일본 뷰티 시장에 하나의 카테고리로 K뷰티가 자리를 잡았다는 평가다.   21일 베인앤드컴퍼니와 유로모니터에 따르면 K뷰티의 일본 지역별 침투율(특정 기간 동안 특정 상품 소비 규모 비중)은 2017년 1%에서 2022년 4.9%로 5년 만에 5배가 증가했다. 최근 3년간 연평균 성장률은 20%가 넘는다.  지난해에는 일본 수입 화장품 국가별 비중에서 한국이 처음으로 프랑스를 제치고 1위에 오르기도 했다. 서효주 베인앤드컴퍼니 파트너는 지금보다 3~4배 이상 성장할 여력이 충분하다고 말했다.  일본 여성들이 K뷰티에 매료된 이유는 무엇일까. 가장 큰 이유로는 ‘높은 가성비(가격 대비 성능)’가 꼽힌다.  업계에 따르면 실제 일본에서 많이 판매되는 한국 화장품 브랜드의 기초제품들은 일본 브랜드에 비해 제품 가격이 10~20% 가량 저렴한 편이다.  이는 한국콜마와 코스맥스 같은 국내 화장품 OEM(주문자 상표 부착 생산)·ODM(주문자 개발생산) 제조사들의 성장 덕이 크다. 이들의 기술력은 세계 최고 수준으로, 세계 최대 화장품 기업인 로레알도 고객사일 정도다.  이들은 단순 제품 제조를 넘어 신제품을 개발해 브랜드에 먼저 제안하고 또 필요시 마케팅까지 지원해 브랜드를 키우는 서비스를 제공하고 있다. 한국 뷰티 브랜드 대부분이 이들을 통해 제품을 만들고 있어 중소 규모 K뷰티 브랜드도 품질이 보장된다는 얘기다.  또 K뷰티 제품의 강점으로는 △독특하고 트렌디한 컨셉 △발빠른 신제품 출시 △예쁜 패키지 등이 거론된다.  이를 방증하듯 최근 일본에선 위의 강점들을 갖춘 한국의 신진 메이크업 브랜드들이 인기다.  실제로 일본 내 트위터와 유튜브 등 SNS에서는 수십~수백만 팔로워를 보유한 현지 인플루언서들도 일명 ‘내돈내산’(내 돈 주고 내가 산 물건) 영상에서 자발적으로 K뷰티 메이크업 브랜드 제품을 소개하고 있다.   지난 1월 일본 오사카에 소재한 뷰티 랭킹샵 ‘앳코스메 우메다점’에서 일본 여성들이 한국 코스메틱 브랜드 ‘라카(Laka)’의 제품을 살펴보고 있는 모습. [김효혜 기자] 대표적인 예가 ‘라카’다. 한국보다 일본에서 더 유명한 라카는 100만 구독자를 보유하고 있는 메이크업 아티스트이자 유튜버 ‘히로’(오다기리 히로)가 영상에서 제품을 추천해 홍보 효과를 톡톡히 봤다.  이민미 라카 대표는 일본에서 특정 제품이 갑자기 하루에 수천개가 팔려 무슨 일인가 봤는데, 현지 유명 유튜버가 추천한 영상이 올라왔더라며 협찬이나 광고가 아니어서 더 놀랐다고 말했다.  이에 지난 2020년 처음 일본에 진출한 라카는 올해 1월 말 일본 전역 약 350여개 매장에 입점하는 성과를 올렸다. 2021년 47억원에 불과했던 라카의 매출도 지난해 4배가 넘게 상승해 200억원에 육박한다.  일본 시장에서 두각을 보이는 국내 화장품 브랜드들이 늘면서 새롭게 진출을 타진하거나 준비하고 있는 업체들도 늘고 있다.  그동안 한국 화장품의 가장 큰 시장이었던 중국이 경기 침체 및 정치적 이슈 등으로 쪼그라들고 있는 상황에서 일본이 이를 대체할 새로운 시장으로 부상한 것이다.  일본 화장품 판매 채널들도 K뷰티 유치에 적극적이다. 앳코스메의 경우 거의 매달 K뷰티 팝업이 열리고 있는 수준으로, 오는 5월에는 도쿄점에서 K뷰티 페스티벌도 열 계획이다. 로프트와 프라자 등도 K뷰티 유치 경쟁이 뜨겁다.  CJ올리브영 관계자는 한국 화장품에 대한 반응이 좋고 특히 올리브영에서 인기 있는 브랜드에 대한 수요가 높다 보니 플랫폼에서 먼저 팝업 요청이 왔다며 앞으로도 일본 시장 유통에 더욱 적극적으로 나서려 한다고 전했다."
+prompt = prompt_template.format(passage)
+inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+outputs = model.generate(**inputs,
+                        max_new_tokens=512,
+                        temperature=0.2,
+                        top_p=0.95,
+                        do_sample=True,
+                        use_cache=False)
+print(tokenizer.decode(outputs[0]))
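+# Example output:
+# - 일본 내 K뷰티의 인기가 예사롭지 않은 가운데, 일본 내에서 한국 문화를 좋아하는 일부 소비자들 사이에서만 유행하는 수준이었던 K뷰티가 지금은 일본 뷰티 시장에 하나의 카테고리로 자리 잡았다는 평가를 받고 있다.
+# English gloss: With K-beauty's popularity in Japan running unusually high, K-beauty, once popular only among some consumers in Japan who liked Korean culture, is now regarded as having established itself as a category of its own in the Japanese beauty market.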
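
Review note: for a causal language model, `generate` returns the prompt tokens followed by the newly generated ones, so `tokenizer.decode(outputs[0])` in the examples above will most likely print the whole passage again before the summary. Below is a minimal sketch of one way to keep only the summary. The helper name `generated_only` is hypothetical and not part of the README or of any library; it assumes a single, unpadded prompt as in the examples.

```python
from typing import List, Sequence


def generated_only(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Return only the token ids that follow the first `prompt_length` ids.

    Hypothetical helper: `output_ids` is one generated sequence
    (prompt + continuation) and `prompt_length` is the number of
    prompt tokens that were fed to the model.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return list(output_ids[prompt_length:])


# Intended use with the objects from the README snippet (not executed here):
#   prompt_length = inputs["input_ids"].shape[1]
#   summary_ids = generated_only(outputs[0].tolist(), prompt_length)
#   print(tokenizer.decode(summary_ids, skip_special_tokens=True))

if __name__ == "__main__":
    # Toy ids: four prompt tokens followed by three generated tokens.
    demo_output = [2, 101, 102, 103, 900, 901, 1]
    print(generated_only(demo_output, 4))
```

Passing `skip_special_tokens=True` to `decode` should also drop markers such as the end-of-sequence token from the printed summary.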