DopeorNope/Ko-Mixtral-v1.4-MoE-7Bx2

@@ -1,201 +1,104 @@
 ---
 library_name: transformers
-tags: []
 ---
 # Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
 #### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+license: cc-by-nc-sa-4.0
 ---
 # Model Card for Model ID
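+This is version 1.4, an improved release of the existing DopeorNope/Ko-Mixtral-v1.3-MoE-7Bx2 model.
+The changes are as follows.
+1. The corpus used for training was reviewed manually, and anomalous entries were corrected and cleaned.
+2. A near-dedup algorithm was applied to remove duplicated corpus entries.
+3. One task was added to the existing three tasks.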
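The card does not say which near-dedup algorithm was used. As a minimal sketch only, the idea can be illustrated with Jaccard similarity over character n-grams; the function names, the 5-character shingle size, and the 0.8 threshold are all assumptions chosen for illustration, not values taken from this model's pipeline.

```python
from __future__ import annotations


def shingles(text: str, n: int = 5) -> set[str]:
    """Return the set of character n-grams of a whitespace-normalized text."""
    normalized = " ".join(text.split())
    if len(normalized) <= n:
        return {normalized}
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_dedup(docs: list[str], threshold: float = 0.8, n: int = 5) -> list[str]:
    """Keep a document only if it is below the threshold against every kept one."""
    kept: list[str] = []
    kept_shingles: list[set[str]] = []
    for doc in docs:
        current = shingles(doc, n)
        if all(jaccard(current, seen) < threshold for seen in kept_shingles):
            kept.append(doc)
            kept_shingles.append(current)
    return kept


# Hypothetical three-document corpus; the second entry is a near duplicate of the first.
corpus = [
    "MIMD is a parallelization technique in computing.",
    "MIMD is a parallelization technique in computing!",
    "GreenBrowser is a free web browser.",
]
deduped = near_dedup(corpus)
```

This pairwise comparison is quadratic in the number of documents, so a real corpus would more likely use an approximate method such as MinHash with locality-sensitive hashing.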
+## Model Details
+### Model Description
+- **Developed by:** DopeorNope (Seungyoo Lee), kyujinpy (Kyujin Han)
+- **Model type:** Mixtral
+- **Language:** English-based model, fine-tuned on a Korean corpus
+- **License:** cc-by-nc-sa-4.0
+- **Finetuned from model:** DopeorNope/Ko-Mixtral-v1.3-MoE-7Bx2
+- **Funded by:** the Ministry of Science and ICT (MSIT, Korea) & Gwangju Metropolitan City
+## Training
 #### Testing Data
+Using the corpus provided by AI-HUB, the following four tasks were constructed through text mining and applied to training.
+- **1. Mask prediction task**
+```python
+# Mask prediction
+# Mask one Korean word in a sentence, then have the model predict the masked word.
+Text = '지능(智能) 또는 인텔리전스(intelligence)는 인간의 <MASK> 능력을 말한다.'
+Response = '지적'
+Complete_text = '지능(智能) 또는 인텔리전스(intelligence)는 인간의 지적 능력을 말한다.'
+```
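How the masked samples were actually mined is not described in the card. The sketch below is one plausible way to build such a triple: it masks a randomly chosen whitespace-delimited token made only of Hangul syllables. The helper name `make_mask_example`, the Hangul-only filter, and the seeded random generator are assumptions for illustration.

```python
import random
import re

# A token counts as a maskable Korean word here if it is made only of Hangul syllables.
HANGUL_WORD = re.compile(r"[가-힣]+")


def make_mask_example(sentence: str, rng: random.Random, mask_token: str = "<MASK>") -> dict:
    """Mask one Hangul-only token and return the Text/Response/Complete_text triple."""
    words = sentence.split(" ")
    candidates = [i for i, word in enumerate(words) if HANGUL_WORD.fullmatch(word)]
    if not candidates:
        raise ValueError("sentence has no Hangul-only token to mask")
    index = rng.choice(candidates)
    masked = words.copy()
    masked[index] = mask_token
    return {
        "Text": " ".join(masked),
        "Response": words[index],
        "Complete_text": sentence,
    }


# Sentence taken from the sample above; the seed only makes the choice repeatable.
sentence = "지능(智能) 또는 인텔리전스(intelligence)는 인간의 지적 능력을 말한다."
example = make_mask_example(sentence, random.Random(0))
```

Substituting `Response` back into `Text` reproduces `Complete_text`, which gives a simple consistency check on every generated sample.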
+- **2. Text-align task**
+```python
+# Text-align task
+# Extract each sentence from a paragraph, shuffle the extracted sentences randomly,
+# then have the model arrange them in a contextually appropriate order.
+Text_list = ['복수명령-복수자료(MIMD,Multiple Instruction, Multiple Data)은 전산에서 병렬화의 한 기법이다.',
+             '분산 메모리의 예는 MPP(massively parallel processors)와 COW (Clusters of Workstations)이다.',
+             'MIMD기계는 공유 메모리이거나 분산 메모리이며 이러한 분류는 MIMD가 어떻게 메모리를 이용하느냐에 따라 나뉜다.']
+Response = ('복수명령-복수자료(MIMD,Multiple Instruction, Multiple Data)은 전산에서 병렬화의 한 기법이다. '
+            'MIMD기계는 공유 메모리이거나 분산 메모리이며 이러한 분류는 MIMD가 어떻게 메모리를 이용하느냐에 따라 나뉜다. '
+            '분산 메모리의 예는 MPP(massively parallel processors)와 COW (Clusters of Workstations)이다.')
+```
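A sample like the one above could be produced roughly as follows. This is a sketch under stated assumptions: the naive regex splitter (break after `.`, `!` or `?` followed by whitespace) and the helper names `split_sentences` and `make_ordering_example` are illustrative, and the card does not say how sentences were actually segmented.

```python
import random
import re


def split_sentences(paragraph: str) -> list:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def make_ordering_example(paragraph: str, rng: random.Random) -> dict:
    """Shuffle the sentences for the input and keep the original order as the target."""
    sentences = split_sentences(paragraph)
    shuffled = sentences.copy()
    rng.shuffle(shuffled)
    return {"Text_list": shuffled, "Response": " ".join(sentences)}


# Sentences taken from the sample above, in their original order.
original = [
    "복수명령-복수자료(MIMD,Multiple Instruction, Multiple Data)은 전산에서 병렬화의 한 기법이다.",
    "MIMD기계는 공유 메모리이거나 분산 메모리이며 이러한 분류는 MIMD가 어떻게 메모리를 이용하느냐에 따라 나뉜다.",
    "분산 메모리의 예는 MPP(massively parallel processors)와 COW (Clusters of Workstations)이다.",
]
paragraph = " ".join(original)
example = make_ordering_example(paragraph, random.Random(0))
```

A shuffle can occasionally return the original order, so a production pipeline might reshuffle or drop such samples.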
+- **3. Text completion task**
+```python
+# Text completion
+# Extract the last sentence of a paragraph, then use the paragraph up to that sentence
+# as input and have the model predict the last sentence.
+Text = ('그린브라우저(GreenBrowser)는 인터넷 익스플로러에서 사용하는 트라이던트 레이아웃 엔진을 바탕으로 하며 중국에 기반을 둔 소프트웨어 회사인 모어퀵(morequick)에서 만든 무료 웹 브라우저다. 간체자 중국어가 웹 브라우저에 내장되어 있다. '
+        '맥스톤 웹 브라우저와 비슷하여 MyIE와 밀접하게 관련되어 있다. 맥스톤용의 일부 플러그인이 그린브라우저에서도 작동할 것이다.')
+Response = '자동 스크롤, 자동 리프레시, 자동 저장, 자동 폼 채우기와 같은 많은 자동화 기능이 있다.'
+```
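The split described above can be sketched as below. The helper name `make_completion_example` and the naive sentence boundary rule (a `.`, `!` or `?` followed by whitespace) are assumptions for illustration; the paragraph is a shortened excerpt of the sample above.

```python
import re


def make_completion_example(paragraph: str) -> dict:
    """Use everything before the last sentence as input and the last sentence as target."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    if len(sentences) < 2:
        raise ValueError("need at least two sentences to build a completion sample")
    return {"Text": " ".join(sentences[:-1]), "Response": sentences[-1]}


# Shortened excerpt of the sample paragraph, followed by its final sentence.
paragraph = (
    "간체자 중국어가 웹 브라우저에 내장되어 있다. "
    "맥스톤용의 일부 플러그인이 그린브라우저에서도 작동할 것이다. "
    "자동 스크롤, 자동 리프레시, 자동 저장, 자동 폼 채우기와 같은 많은 자동화 기능이 있다."
)
example = make_completion_example(paragraph)
```

Paragraphs with a single sentence are rejected because they leave no context for the model to condition on.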
+- **4. Sentence generation task**
+```python
+# Sentence generation
+# Extract all words from a sentence, shuffle them randomly, and remove duplicated words,
+# then have the model generate a complete sentence from the given word list.
+Word_List = ['φ의', '제어에서는', '제어와', '표현이', 'ψ', '로봇', '쓰인다', 'θ', '같은', '자주', '기기']
+# Note: this response repeats the Task 3 example and does not use the words in the list above.
+response = '자동 스크롤, 자동 리프레시, 자동 저장, 자동 폼 채우기와 같은 많은 자동화 기능이 있다.'
+```
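The word-list construction described above (extract words, shuffle, remove duplicates) might look like the following sketch. The helper name `make_generation_example` is hypothetical, and stripping surrounding punctuation is an assumption made because the sample word list contains none. The sentence used here is the one from the Task 3 sample, chosen because it repeats a word.

```python
import random


def make_generation_example(sentence: str, rng: random.Random) -> dict:
    """Build a shuffled, duplicate-free word list whose target is the full sentence."""
    # Assumption: punctuation around each word is stripped before shuffling.
    words = [word.strip(".,!?") for word in sentence.split()]
    words = [word for word in words if word]
    rng.shuffle(words)
    # dict.fromkeys drops repeated words while keeping the shuffled order.
    unique_words = list(dict.fromkeys(words))
    return {"Word_List": unique_words, "Response": sentence}


sentence = "자동 스크롤, 자동 리프레시, 자동 저장, 자동 폼 채우기와 같은 많은 자동화 기능이 있다."
example = make_generation_example(sentence, random.Random(0))
```

Because duplicates are removed, the model has to decide how often each word appears in the generated sentence, which makes this task harder than a plain reordering.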
+### Environments
+- **Hardware Type:** NVIDIA A100 × 4
+- **Training time:** 3 days