ryantrisnadi committed on
Commit
55f8f0c
1 Parent(s): 9af275a

Upload 2 files

P1M2_Ryan_Trisnadi.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
P1M2_Ryan_Trisnadi_inf.ipynb ADDED
@@ -0,0 +1,318 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "code",
5
+ "execution_count": 1,
6
+ "metadata": {},
7
+ "outputs": [
8
+ {
9
+ "name": "stderr",
10
+ "output_type": "stream",
11
+ "text": [
12
+ "/opt/miniconda3/lib/python3.12/site-packages/threadpoolctl.py:1214: RuntimeWarning: \n",
13
+ "Found Intel OpenMP ('libiomp') and LLVM OpenMP ('libomp') loaded at\n",
14
+ "the same time. Both libraries are known to be incompatible and this\n",
15
+ "can cause random crashes or deadlocks on Linux when loaded in the\n",
16
+ "same Python program.\n",
17
+ "Using threadpoolctl may cause crashes or deadlocks. For more\n",
18
+ "information and possible workarounds, please see\n",
19
+ " https://github.com/joblib/threadpoolctl/blob/master/multiple_openmp.md\n",
20
+ "\n",
21
+ " warnings.warn(msg, RuntimeWarning)\n"
22
+ ]
23
+ }
24
+ ],
25
+ "source": [
26
+ "import json\n",
27
+ "import pickle\n",
28
+ "\n",
29
+ "# Load the saved list of numerical columns\n",
30
+ "with open('list_num_cols.txt', 'r') as file_1:\n",
31
+ " combined_columns = json.load(file_1)\n",
32
+ "\n",
33
+ "# Load the saved model\n",
34
+ "with open('model.pkl', 'rb') as file_2:\n",
35
+ " lr = pickle.load(file_2)"
36
+ ]
37
+ },
38
+ {
39
+ "cell_type": "markdown",
40
+ "metadata": {},
41
+ "source": [
42
+ "The cell above loads the column list and the model saved earlier so they can be used for inference."
43
+ ]
44
+ },
45
+ {
46
+ "cell_type": "code",
47
+ "execution_count": 2,
48
+ "metadata": {},
49
+ "outputs": [
50
+ {
51
+ "name": "stdout",
52
+ "output_type": "stream",
53
+ "text": [
54
+ "Original Dummy Data:\n",
55
+ " Suburb Rooms Price Distance Bathroom Car Landsize \\\n",
56
+ "0 Abbotsford 2 1035000.0 2.5 1.0 0.0 156.0 \n",
57
+ "1 Abbotsford 3 1465000.0 2.5 2.0 0.0 134.0 \n",
58
+ "2 Abbotsford 4 1600000.0 2.5 1.0 2.0 120.0 \n",
59
+ "3 Abbotsford 3 1876000.0 2.5 2.0 0.0 245.0 \n",
60
+ "4 Abbotsford 2 1636000.0 2.5 1.0 2.0 256.0 \n",
61
+ "\n",
62
+ " BuildingArea YearBuilt Propertycount \n",
63
+ "0 79.0 1900.0 4019.0 \n",
64
+ "1 150.0 1900.0 4019.0 \n",
65
+ "2 142.0 2014.0 4019.0 \n",
66
+ "3 210.0 1910.0 4019.0 \n",
67
+ "4 107.0 1890.0 4019.0 \n"
68
+ ]
69
+ }
70
+ ],
71
+ "source": [
72
+ "import pandas as pd\n",
73
+ "\n",
74
+ "# Build a small dummy dataset to run inference on\n",
75
+ "df_data_dummy = pd.DataFrame({\n",
76
+ "\n",
77
+ " \"Suburb\": [\"Abbotsford\", \"Abbotsford\", \"Abbotsford\", \"Abbotsford\", \"Abbotsford\"],\n",
78
+ " \"Rooms\": [2, 3, 4, 3, 2],\n",
79
+ " \"Price\": [1035000.0, 1465000.0, 1600000.0, 1876000.0, 1636000.0],\n",
80
+ " \"Distance\": [2.5, 2.5, 2.5, 2.5, 2.5],\n",
81
+ " \"Bathroom\": [1.0, 2.0, 1.0, 2.0, 1.0],\n",
82
+ " \"Car\": [0.0, 0.0, 2.0, 0.0, 2.0],\n",
83
+ " \"Landsize\": [156.0, 134.0, 120.0, 245.0, 256.0],\n",
84
+ " \"BuildingArea\": [79.0, 150.0, 142.0, 210.0, 107.0],\n",
85
+ " \"YearBuilt\": [1900.0, 1900.0, 2014.0, 1910.0, 1890.0],\n",
86
+ " \"Propertycount\": [4019.0, 4019.0, 4019.0, 4019.0, 4019.0]\n",
87
+ "\n",
88
+ "})\n",
89
+ "\n",
90
+ "df_dummy_data = df_data_dummy.copy()  # already a DataFrame, so no need to re-wrap it\n",
91
+ "print(\"Original Dummy Data:\")\n",
92
+ "print(df_dummy_data)\n",
93
+ "\n"
94
+ ]
95
+ },
96
+ {
97
+ "cell_type": "markdown",
98
+ "metadata": {},
99
+ "source": [
100
+ "We create a new \"dummy\" dataset and store it in a DataFrame named \"df_dummy_data\". We will run it through the linear regression model later."
101
+ ]
102
+ },
103
+ {
104
+ "cell_type": "code",
105
+ "execution_count": 3,
106
+ "metadata": {},
107
+ "outputs": [
108
+ {
109
+ "data": {
110
+ "text/html": [
111
+ "<div>\n",
112
+ "<style scoped>\n",
113
+ " .dataframe tbody tr th:only-of-type {\n",
114
+ " vertical-align: middle;\n",
115
+ " }\n",
116
+ "\n",
117
+ " .dataframe tbody tr th {\n",
118
+ " vertical-align: top;\n",
119
+ " }\n",
120
+ "\n",
121
+ " .dataframe thead th {\n",
122
+ " text-align: right;\n",
123
+ " }\n",
124
+ "</style>\n",
125
+ "<table border=\"1\" class=\"dataframe\">\n",
126
+ " <thead>\n",
127
+ " <tr style=\"text-align: right;\">\n",
128
+ " <th></th>\n",
129
+ " <th>Suburb</th>\n",
130
+ " <th>Rooms</th>\n",
131
+ " <th>Price</th>\n",
132
+ " <th>Distance</th>\n",
133
+ " <th>Bathroom</th>\n",
134
+ " <th>Car</th>\n",
135
+ " <th>Landsize</th>\n",
136
+ " <th>BuildingArea</th>\n",
137
+ " <th>YearBuilt</th>\n",
138
+ " <th>Propertycount</th>\n",
139
+ " </tr>\n",
140
+ " </thead>\n",
141
+ " <tbody>\n",
142
+ " <tr>\n",
143
+ " <th>0</th>\n",
144
+ " <td>Abbotsford</td>\n",
145
+ " <td>2</td>\n",
146
+ " <td>1035000.0</td>\n",
147
+ " <td>2.5</td>\n",
148
+ " <td>1.0</td>\n",
149
+ " <td>0.0</td>\n",
150
+ " <td>156.0</td>\n",
151
+ " <td>79.0</td>\n",
152
+ " <td>1900.0</td>\n",
153
+ " <td>4019.0</td>\n",
154
+ " </tr>\n",
155
+ " <tr>\n",
156
+ " <th>1</th>\n",
157
+ " <td>Abbotsford</td>\n",
158
+ " <td>3</td>\n",
159
+ " <td>1465000.0</td>\n",
160
+ " <td>2.5</td>\n",
161
+ " <td>2.0</td>\n",
162
+ " <td>0.0</td>\n",
163
+ " <td>134.0</td>\n",
164
+ " <td>150.0</td>\n",
165
+ " <td>1900.0</td>\n",
166
+ " <td>4019.0</td>\n",
167
+ " </tr>\n",
168
+ " <tr>\n",
169
+ " <th>2</th>\n",
170
+ " <td>Abbotsford</td>\n",
171
+ " <td>4</td>\n",
172
+ " <td>1600000.0</td>\n",
173
+ " <td>2.5</td>\n",
174
+ " <td>1.0</td>\n",
175
+ " <td>2.0</td>\n",
176
+ " <td>120.0</td>\n",
177
+ " <td>142.0</td>\n",
178
+ " <td>2014.0</td>\n",
179
+ " <td>4019.0</td>\n",
180
+ " </tr>\n",
181
+ " <tr>\n",
182
+ " <th>3</th>\n",
183
+ " <td>Abbotsford</td>\n",
184
+ " <td>3</td>\n",
185
+ " <td>1876000.0</td>\n",
186
+ " <td>2.5</td>\n",
187
+ " <td>2.0</td>\n",
188
+ " <td>0.0</td>\n",
189
+ " <td>245.0</td>\n",
190
+ " <td>210.0</td>\n",
191
+ " <td>1910.0</td>\n",
192
+ " <td>4019.0</td>\n",
193
+ " </tr>\n",
194
+ " <tr>\n",
195
+ " <th>4</th>\n",
196
+ " <td>Abbotsford</td>\n",
197
+ " <td>2</td>\n",
198
+ " <td>1636000.0</td>\n",
199
+ " <td>2.5</td>\n",
200
+ " <td>1.0</td>\n",
201
+ " <td>2.0</td>\n",
202
+ " <td>256.0</td>\n",
203
+ " <td>107.0</td>\n",
204
+ " <td>1890.0</td>\n",
205
+ " <td>4019.0</td>\n",
206
+ " </tr>\n",
207
+ " </tbody>\n",
208
+ "</table>\n",
209
+ "</div>"
210
+ ],
211
+ "text/plain": [
212
+ " Suburb Rooms Price Distance Bathroom Car Landsize \\\n",
213
+ "0 Abbotsford 2 1035000.0 2.5 1.0 0.0 156.0 \n",
214
+ "1 Abbotsford 3 1465000.0 2.5 2.0 0.0 134.0 \n",
215
+ "2 Abbotsford 4 1600000.0 2.5 1.0 2.0 120.0 \n",
216
+ "3 Abbotsford 3 1876000.0 2.5 2.0 0.0 245.0 \n",
217
+ "4 Abbotsford 2 1636000.0 2.5 1.0 2.0 256.0 \n",
218
+ "\n",
219
+ " BuildingArea YearBuilt Propertycount \n",
220
+ "0 79.0 1900.0 4019.0 \n",
221
+ "1 150.0 1900.0 4019.0 \n",
222
+ "2 142.0 2014.0 4019.0 \n",
223
+ "3 210.0 1910.0 4019.0 \n",
224
+ "4 107.0 1890.0 4019.0 "
225
+ ]
226
+ },
227
+ "execution_count": 3,
228
+ "metadata": {},
229
+ "output_type": "execute_result"
230
+ }
231
+ ],
232
+ "source": [
233
+ "df_dummy_data"
234
+ ]
235
+ },
236
+ {
237
+ "cell_type": "code",
238
+ "execution_count": 4,
239
+ "metadata": {},
240
+ "outputs": [],
241
+ "source": [
242
+ "df_dummy_data_new = df_dummy_data[combined_columns]"
243
+ ]
244
+ },
245
+ {
246
+ "cell_type": "markdown",
247
+ "metadata": {},
248
+ "source": [
249
+ "Select the saved columns from the dummy data and assign the result to a new variable."
250
+ ]
251
+ },
252
+ {
253
+ "cell_type": "code",
254
+ "execution_count": 5,
255
+ "metadata": {},
256
+ "outputs": [],
257
+ "source": [
258
+ "predictions = lr.predict(df_dummy_data_new)"
259
+ ]
260
+ },
261
+ {
262
+ "cell_type": "markdown",
263
+ "metadata": {},
264
+ "source": [
265
+ "Generate predictions for the dummy test data with the linear regression model."
266
+ ]
267
+ },
268
+ {
269
+ "cell_type": "code",
270
+ "execution_count": 6,
271
+ "metadata": {},
272
+ "outputs": [
273
+ {
274
+ "data": {
275
+ "text/plain": [
276
+ "array([1101277.02454045, 1665725.95948649, 1297970.1974852 ,\n",
277
+ " 1639455.71625785, 1297855.42299958])"
278
+ ]
279
+ },
280
+ "execution_count": 6,
281
+ "metadata": {},
282
+ "output_type": "execute_result"
283
+ }
284
+ ],
285
+ "source": [
286
+ "predictions"
287
+ ]
288
+ },
289
+ {
290
+ "cell_type": "markdown",
291
+ "metadata": {},
292
+ "source": [
293
+ "The model outputs predicted house prices for the next few months. All of the predicted prices are above AU$1 million."
294
+ ]
295
+ }
296
+ ],
297
+ "metadata": {
298
+ "kernelspec": {
299
+ "display_name": "base",
300
+ "language": "python",
301
+ "name": "python3"
302
+ },
303
+ "language_info": {
304
+ "codemirror_mode": {
305
+ "name": "ipython",
306
+ "version": 3
307
+ },
308
+ "file_extension": ".py",
309
+ "mimetype": "text/x-python",
310
+ "name": "python",
311
+ "nbconvert_exporter": "python",
312
+ "pygments_lexer": "ipython3",
313
+ "version": "3.12.2"
314
+ }
315
+ },
316
+ "nbformat": 4,
317
+ "nbformat_minor": 2
318
+ }
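
The inference notebook reads two artifacts, `list_num_cols.txt` (a JSON list of column names) and `model.pkl` (a pickled model), and then indexes the dummy data with that column list. The training-side code is not part of this diff, so the following is only a minimal sketch of how such a save/load round trip could look, plus a check that new data contains every saved column before it is passed to `predict`. The helpers `save_artifacts`, `load_artifacts`, and `missing_columns`, the column names in `demo_columns`, and the placeholder `demo_model` are all assumptions made for illustration; they are not taken from the notebook.

```python
import json
import pickle
import tempfile
from pathlib import Path


def save_artifacts(directory, columns, model):
    """Write the column list as JSON and the model as a pickle (hypothetical helper)."""
    directory = Path(directory)
    with open(directory / "list_num_cols.txt", "w") as f:
        json.dump(columns, f)
    with open(directory / "model.pkl", "wb") as f:
        pickle.dump(model, f)


def load_artifacts(directory):
    """Read back the column list and the model, mirroring the notebook's first cell."""
    directory = Path(directory)
    with open(directory / "list_num_cols.txt", "r") as f:
        columns = json.load(f)
    with open(directory / "model.pkl", "rb") as f:
        model = pickle.load(f)
    return columns, model


def missing_columns(record, columns):
    """Return the saved column names that are absent from the new record."""
    return [name for name in columns if name not in record]


# Assumed column list; the real contents of list_num_cols.txt are not shown in the diff.
demo_columns = ["Rooms", "Distance", "Bathroom"]
# Placeholder object standing in for the trained model (any picklable object works here).
demo_model = {"intercept": 0.0, "coef": [1.0, 2.0, 3.0]}

with tempfile.TemporaryDirectory() as tmp:
    save_artifacts(tmp, demo_columns, demo_model)
    loaded_columns, loaded_model = load_artifacts(tmp)

# First row of the notebook's dummy data, restricted to a few fields.
first_row = {"Suburb": "Abbotsford", "Rooms": 2, "Distance": 2.5, "Bathroom": 1.0}

# An empty list means every saved column is present in the new data.
print(missing_columns(first_row, loaded_columns))
```

Running a check like `missing_columns` before selecting columns gives a readable message instead of a `KeyError` when the inference data lacks a column the model was trained on. Note also that `pickle.load` can execute arbitrary code, so it should only be used on model files from a trusted source.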