---
title: AI Tutor BERT
emoji: 📈
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 4.1.2
app_file: app.py
pinned: false
license: apache-2.0
---

# AI Tutor BERT
This model is a BERT model fine-tuned on artificial intelligence (AI) related terms and explanations.

With interest in artificial intelligence growing, many people are taking AI-related courses and working on AI projects. However, speaking as a graduate student in the field, I find that resources AI beginners can readily understand are scarce. Lessons personalized to an individual's level and field are also often lacking, which makes it hard for many people to get started. To address this, our team built a language model that acts as a tutor for AI terminology. The model type, training dataset, and usage are described below, so please read on and be sure to try it out.


## How to use?


<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/45afcd24-7ef9-4149-85d4-2236e23fbf69" width="1400" height="700"/>
https://huggingface.co/spaces/pseudolab/AI_Tutor_BERT


As shown above, you can input passages (context) related to artificial intelligence and questions about terms. Upon pressing "Submit," you will receive corresponding explanations and answers on the right side. (This model only supports English.)


## Model
https://huggingface.co/bert-base-uncased


The base model is BERT, one of the best-known natural language processing models, developed by Google; see the link above for details. Because a tutor's main job is answering questions, I used the BERT variant specialized for question answering, `BertForQuestionAnswering`. You can load it as follows:


```
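from transformers import BertForQuestionAnswering

# Load the pretrained BERT encoder with a question-answering head on top.
# The QA head is newly initialized here and still needs fine-tuning.
model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")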
```


https://huggingface.co/CountingMstar/ai-tutor-bert-model


I then fine-tuned the original BertForQuestionAnswering model on the AI-related datasets described below to turn it into an AI tutoring model. The fine-tuned AI Tutor BERT model is available at the link above and can be loaded in Python as follows:


```
from transformers import BertForQuestionAnswering

# Load the fine-tuned AI Tutor BERT weights from the Hugging Face Hub.
model = BertForQuestionAnswering.from_pretrained("CountingMstar/ai-tutor-bert-model")
```
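As a rough illustration of how the loaded model might be queried, the sketch below tokenizes a question and a context, runs the model, and decodes the highest-scoring answer span. It assumes the tokenizer comes from `bert-base-uncased` (this README does not say which tokenizer to pair with the fine-tuned weights), and `best_span` and `answer_question` are hypothetical helper names, not part of the project.

```python
from typing import Sequence, Tuple


def best_span(start_logits: Sequence[float], end_logits: Sequence[float],
              max_answer_len: int = 30) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score,
    requiring start <= end and a span of at most max_answer_len tokens."""
    best, best_score = (0, 0), float("-inf")
    for s, s_score in enumerate(start_logits):
        last = min(len(end_logits), s + max_answer_len)
        for e in range(s, last):
            score = s_score + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


def answer_question(question: str, context: str) -> str:
    """Extract an answer span from the context using the fine-tuned model."""
    # Imported lazily so best_span can be used without these packages installed.
    import torch
    from transformers import BertForQuestionAnswering, BertTokenizerFast

    # Assumption: the bert-base-uncased tokenizer matches the fine-tuned weights.
    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    model = BertForQuestionAnswering.from_pretrained("CountingMstar/ai-tutor-bert-model")
    model.eval()

    # BERT accepts at most 512 tokens, so longer inputs are truncated.
    inputs = tokenizer(question, context, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)

    start, end = best_span(outputs.start_logits[0].tolist(),
                           outputs.end_logits[0].tolist())
    return tokenizer.decode(inputs["input_ids"][0][start:end + 1],
                            skip_special_tokens=True)
```

Calling `answer_question("What is feature engineering?", context)` with an AI-related passage as `context` would return a substring of that passage. Because the model is extractive, it can only answer questions whose answer appears in the context.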


## Dataset
### Wikipedia
https://en.wikipedia.org/wiki/Main_Page
### activeloop
https://www.activeloop.ai/resources/glossary/arima-models/
### Adrien Beaulieu
https://product.house/100-ai-glossary-terms-explained-to-the-rest-of-us/


```
Context: 'Feature engineering or feature extraction or feature discovery is the process of extracting features (characteristics, properties, attributes) from raw data. Due to deep learning networks, such as convolutional neural networks, that are able to learn features by themselves, domain-specific-based feature engineering has become obsolete for vision and speech processing. Other examples of features in physics include the construction of dimensionless numbers such as Reynolds number in fluid dynamics; then Nusselt number in heat transfer; Archimedes number in sedimentation; construction of first approximations of the solution such as analytical strength of materials solutions in mechanics, etc..'

Question: 'What is large language model?'

Answer: 'A large language model (LLM) is a type of language model notable for its ability to achieve general-purpose language understanding and generation.'
```

The training dataset has three components, all related to artificial intelligence: contexts, questions, and answers. Each answer (the correct response) is contained within its context, and the dataset was augmented by rearranging the order of sentences in each context. Each question asks about the AI term that is the topic of the example; the example above shows the format. In total there are over 3,300 data points, stored as pickle files in the 'data' folder. The data was extracted from the HTML of Wikipedia and the other websites listed above and then processed.
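The sentence-reordering augmentation described above could look roughly like the sketch below. It is not the project's actual preprocessing code: the naive regex sentence splitter, the dictionary keys, and the function names (`split_sentences`, `augment_by_shuffling`, `save_examples`) are all assumptions made for illustration.

```python
import pickle
import random
import re
from typing import Dict, List


def split_sentences(context: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", context.strip()) if s]


def augment_by_shuffling(example: Dict[str, str], n_copies: int = 3,
                         seed: int = 0) -> List[Dict[str, object]]:
    """Create shuffled-context copies of one example and recompute answer_start."""
    rng = random.Random(seed)
    sentences = split_sentences(example["context"])
    augmented = []
    for _ in range(n_copies):
        shuffled = sentences[:]
        rng.shuffle(shuffled)
        context = " ".join(shuffled)
        start = context.find(example["answer"])
        if start == -1:
            # The answer spanned a sentence boundary and was broken by the shuffle.
            continue
        augmented.append({
            "context": context,
            "question": example["question"],
            "answer": example["answer"],
            "answer_start": start,
        })
    return augmented


def save_examples(examples: List[Dict[str, object]], path: str) -> None:
    """Store examples as a pickle file (the file name is up to the caller)."""
    with open(path, "wb") as f:
        pickle.dump(examples, f)
```

Recomputing `answer_start` after each shuffle matters because an extractive QA model is trained on the position of the answer inside the context, and that position changes whenever the sentences move.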


## Training and Result
https://github.com/CountingMstar/AI_BERT/blob/main/MY_AI_BERT_final.ipynb


The training process involves loading data from the 'data' folder and utilizing the BERT Question and Answering model. Detailed instructions for model training and usage can be found in the link provided above.


```
N_EPOCHS = 10
optim = AdamW(model.parameters(), lr=5e-5)
```


I trained for 10 epochs using the AdamW optimizer with a learning rate of 5e-5.
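For readers who want to see where these two settings fit, here is a minimal sketch of a fine-tuning loop. It is not the notebook's code: it assumes PyTorch's `torch.optim.AdamW`, batches that already contain `input_ids`, `attention_mask`, `start_positions`, and `end_positions`, and the helper names `build_optimizer` and `train` are hypothetical.

```python
from typing import Any, Dict, Iterable, List

N_EPOCHS = 10
LEARNING_RATE = 5e-5


def build_optimizer(model: Any, lr: float = LEARNING_RATE) -> Any:
    """Create an AdamW optimizer over the model's parameters."""
    from torch.optim import AdamW  # assumption: PyTorch's AdamW implementation
    return AdamW(model.parameters(), lr=lr)


def train(model: Any, optimizer: Any, batches: Iterable[Dict[str, Any]],
          n_epochs: int = N_EPOCHS) -> List[float]:
    """Run n_epochs passes over batches and return the mean loss per epoch."""
    epoch_losses = []
    for _ in range(n_epochs):
        model.train()
        total, count = 0.0, 0
        for batch in batches:
            optimizer.zero_grad()
            # BertForQuestionAnswering returns a loss when start_positions
            # and end_positions are included in the batch.
            loss = model(**batch).loss
            loss.backward()
            optimizer.step()
            total += loss.item()
            count += 1
        epoch_losses.append(total / max(count, 1))
    return epoch_losses
```

`batches` must be re-iterable across epochs, for example a list or a PyTorch `DataLoader`. This sketch reports the mean loss per epoch, which may differ from how the notebook aggregates its loss.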


<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/72142ff8-f5c8-47ea-9f19-1e6abb4072cd" width="500" height="400"/>
<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/2dd78573-34eb-4ce9-ad4d-2237fc7a5b1e" width="500" height="400"/>


As the graphs above show, at the last epoch the loss is about 6.917 and the accuracy about 0.982, which suggests the model trained effectively.
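This README does not say how the accuracy figure was computed. One common convention for extractive question answering is to average the fraction of correctly predicted start positions and the fraction of correctly predicted end positions. The sketch below shows that convention, as an assumption rather than the notebook's actual metric; `span_accuracy` is a hypothetical name.

```python
from typing import Sequence


def span_accuracy(pred_starts: Sequence[int], pred_ends: Sequence[int],
                  true_starts: Sequence[int], true_ends: Sequence[int]) -> float:
    """Average of start-position accuracy and end-position accuracy."""
    if not true_starts:
        raise ValueError("at least one example is required")
    n = len(true_starts)
    start_hits = sum(p == t for p, t in zip(pred_starts, true_starts))
    end_hits = sum(p == t for p, t in zip(pred_ends, true_ends))
    return (start_hits / n + end_hits / n) / 2
```

Under this definition, a prediction with the right start token but the wrong end token counts as half correct, so the score is more lenient than exact-match accuracy.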


Thank you.


---
# AI Tutor BERT (AI Private Tutor BERT)
This model is a BERT model fine-tuned on artificial intelligence (AI) terms and their explanations.


As interest in artificial intelligence has grown recently, many people are taking AI-related classes and carrying out AI projects. However, as a graduate student in the field, I see that useful materials AI beginners can easily follow are rare compared with this demand. Lectures personalized to each person's level and field are also lacking, so many people find it hard to start learning AI. To solve these problems, our team built a language model that acts as a private tutor in the domain of AI terminology. The model type, training dataset, and usage are explained below, so please read them carefully and be sure to try it out.


## How to use?


<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/45afcd24-7ef9-4149-85d4-2236e23fbf69" width="1400" height="700"/>
https://huggingface.co/spaces/pseudolab/AI_Tutor_BERT


As in the figure above, enter an AI-related passage (context) and a question about a term, then press Submit; the explanation for that term appears on the right.


## Model
https://huggingface.co/bert-base-uncased


For the model, I used BERT, developed by Google and among the best-known natural language processing models. See the site above for a detailed explanation. Since a private tutor's main job is answering questions, I used the BERT variant specialized for question answering. It can be loaded as follows.

```
from transformers import BertForQuestionAnswering

model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")
```


https://huggingface.co/CountingMstar/ai-tutor-bert-model


I then fine-tuned the original BertForQuestionAnswering model on the AI-related datasets below to turn it into the AI tutor model that is the subject of this project. The fine-tuned AI Tutor BERT model can be found at the link above, and it is used in Python as follows.


```
from transformers import BertForQuestionAnswering

model = BertForQuestionAnswering.from_pretrained("CountingMstar/ai-tutor-bert-model")
```


## Dataset
### Wikipedia
https://en.wikipedia.org/wiki/Main_Page
### activeloop
https://www.activeloop.ai/resources/glossary/arima-models/
### Adrien Beaulieu
https://product.house/100-ai-glossary-terms-explained-to-the-rest-of-us/


```
Context: 'Feature engineering or feature extraction or feature discovery is the process of extracting features (characteristics, properties, attributes) from raw data. Due to deep learning networks, such as convolutional neural networks, that are able to learn features by themselves, domain-specific-based feature engineering has become obsolete for vision and speech processing. Other examples of features in physics include the construction of dimensionless numbers such as Reynolds number in fluid dynamics; then Nusselt number in heat transfer; Archimedes number in sedimentation; construction of first approximations of the solution such as analytical strength of materials solutions in mechanics, etc..'

Question: 'What is large language model?'

Answer: 'A large language model (LLM) is a type of language model notable for its ability to achieve general-purpose language understanding and generation.'
```


The training dataset consists of three parts: AI-related contexts, questions, and answers. Each answer (the correct response) is contained within its context, and the data was augmented by changing the order of the sentences in each context. Each question is set to the AI term that is the topic of the example. The example above should make this easier to understand. There are about 3,300 data points in total, stored as pickle files in the data folder; the data was extracted from the HTML of Wikipedia and other sites and then processed. The sources are listed above.


## Training and Result
https://github.com/CountingMstar/AI_BERT/blob/main/MY_AI_BERT_final.ipynb


Training loads the data in the data folder together with the BERT Question and Answering model. Detailed instructions for training and using the model are in the link above.

```
N_EPOCHS = 10
optim = AdamW(model.parameters(), lr=5e-5)
```


Training ran for 10 epochs with the AdamW optimizer and a learning rate of 5e-5.



<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/72142ff8-f5c8-47ea-9f19-1e6abb4072cd" width="500" height="400"/>
<img src="https://github.com/CountingMstar/AI_BERT/assets/90711707/2dd78573-34eb-4ce9-ad4d-2237fc7a5b1e" width="500" height="400"/>


As the graphs above show, at the last epoch loss = 6.917126256477786 and accuracy = 0.9819078947368421, indicating that the model trained quite well.


Thank you.



Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference