Commit d636394 · verified · 1 parent: a538c2b

Upload 6 files

Credit Card Clustering with Machine Learning.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
README.md CHANGED
@@ -1,3 +1,137 @@
  ---
- license: mit
- ---
+ # Credit Card Clustering with Machine Learning
+
+ This project clusters credit card customers by usage behavior with unsupervised machine learning. The goal is to segment customers for better targeting, offers, and personalized financial services.
+
+ ## 📌 Objective
+
+ - Understand customer behavior from credit card usage.
+ - Segment customers into clusters with similar patterns.
+ - Help financial institutions create targeted marketing strategies.
+
+ ## 📊 Dataset
+
+ - Source: [Aman Kharwal’s GitHub Dataset](https://raw.githubusercontent.com/amankharwal/Website-data/master/credit_card.csv)
+ - Contains features like:
+   - `BALANCE`: Average balance
+   - `PURCHASES`: Total purchases
+   - `CREDIT_LIMIT`: Assigned credit limit
+   - `PAYMENTS`: Amount paid
+   - `TENURE`: Months as a customer
+   - `ONEOFF_PURCHASES`, `INSTALLMENTS_PURCHASES`, etc.
+
+ ## 🧹 Data Preprocessing
+
+ - Checked for null values and handled them
+ - Dropped irrelevant columns (e.g., `CUST_ID`)
+ - Scaled data using `StandardScaler`
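
The three preprocessing steps above can be sketched roughly as follows. This is an illustration, not the notebook's actual code: the data is a small synthetic stand-in for the CSV, and filling nulls with the column median is an assumption, since the README does not say how missing values were handled.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the credit card data (all values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "CUST_ID": [f"C{i:05d}" for i in range(200)],
    "BALANCE": rng.gamma(2.0, 800.0, 200),
    "PURCHASES": rng.gamma(1.5, 600.0, 200),
    "CREDIT_LIMIT": rng.choice([1000.0, 3000.0, 6000.0, 12000.0], 200),
    "PAYMENTS": rng.gamma(2.0, 700.0, 200),
    "TENURE": rng.integers(6, 13, 200),
})
df.loc[[3, 17], "CREDIT_LIMIT"] = np.nan  # simulate a few missing values

# 1. Check for nulls per column.
print(df.isnull().sum())

# 2. Drop the identifier column, which carries no behavioral signal.
features = df.drop(columns=["CUST_ID"])

# Assumption: impute missing values with the column median.
features = features.fillna(features.median())

# 3. Standardize every feature to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(features)
print(X_scaled.shape)
```

Scaling matters here because KMeans relies on Euclidean distance, so an unscaled column such as `CREDIT_LIMIT` would otherwise dominate small-range columns such as `TENURE`.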
+
+ ## 🧠 Clustering Algorithm
+
+ - Used the **KMeans** algorithm
+ - Determined the optimal number of clusters using:
+   - Elbow Method
+   - Silhouette Score
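
A minimal sketch of how the two selection criteria can be computed side by side. The data comes from `make_blobs` as a placeholder for the scaled credit card features, and the range of `k` values is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the scaled customer features.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

inertias = {}     # elbow method: within-cluster sum of squares per k
silhouettes = {}  # silhouette score per k (higher is better, range [-1, 1])

for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    inertias[k] = km.inertia_
    silhouettes[k] = silhouette_score(X_scaled, km.labels_)

for k in inertias:
    print(f"k={k}  inertia={inertias[k]:.1f}  silhouette={silhouettes[k]:.3f}")

# One simple rule: pick the k with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)
print("best k by silhouette:", best_k)
```

The elbow is read off the inertia values by eye (the point where further increases in `k` stop paying off), while the silhouette score gives a single number per `k` that can be compared directly.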
+
+ ## 📉 Dimensionality Reduction
+
+ - Applied **PCA** to visualize clusters in 2D space
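
As a rough sketch of this step, the scaled features can be projected onto their first two principal components and paired with the cluster labels. The seven-feature synthetic data and the choice of four clusters are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder data: 7 features, standing in for the scaled customer table.
X, _ = make_blobs(n_samples=300, n_features=7, centers=4, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full feature space first.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X_scaled)

# Project to 2D only for visualization; the clustering itself is unchanged.
pca = PCA(n_components=2)
coords = pca.fit_transform(X_scaled)

print("explained variance ratio:", pca.explained_variance_ratio_)
print("first point:", coords[0], "cluster:", labels[0])
```

The `coords` array can then be passed to a scatter plot (for example `seaborn.scatterplot` with `hue=labels`). The explained variance ratio is worth checking, because a low value means the 2D picture hides much of the structure the clusters were built on.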
+
+ ## 📈 Results & Analysis
+
+ - Clusters represent different types of customers:
+   - High spenders
+   - Low-activity users
+   - Customers using mostly installments
+ - Visualized clusters using `matplotlib` and `seaborn`
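
One common way to arrive at labels like those above is to compare per-cluster averages of the original, unscaled features. The sketch below assumes synthetic data and three clusters; it does not reproduce the notebook's actual results.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a few of the spending features (values are made up).
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "PURCHASES": rng.gamma(1.5, 600.0, 300),
    "INSTALLMENTS_PURCHASES": rng.gamma(1.2, 300.0, 300),
    "PAYMENTS": rng.gamma(2.0, 700.0, 300),
})

# Cluster on scaled values, but profile on the original units.
X_scaled = StandardScaler().fit_transform(df)
df["CLUSTER"] = KMeans(n_clusters=3, n_init=10, random_state=1).fit_predict(X_scaled)

profile = df.groupby("CLUSTER").mean().round(1)     # average behavior per cluster
sizes = df["CLUSTER"].value_counts().sort_index()   # how many customers in each

print(profile)
print(sizes)
```

Reading the `profile` table row by row is what turns an anonymous cluster index into a description such as "high spender" or "mostly installments".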
+
+ ## 📦 Libraries Used
+
+ - `pandas`
+ - `numpy`
+ - `matplotlib`, `seaborn`
+ - `scikit-learn`
+
+ ## 🔍 Future Improvements
+
+ - Try alternative clustering algorithms such as DBSCAN and GMM
+ - Add deeper feature engineering
+ - Include time-based features for trend analysis
+
+ ## 💻 How to Run
+
+ 1. Clone the repo:
+    ```bash
+    git clone https://github.com/handecrkc/credit-card-clustering.git
+    ```
+
+ 2. Install requirements:
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. Run the notebook:
+    Open `Credit Card Clustering with Machine Learning.ipynb` in Jupyter Notebook or VS Code
+
  ---
+
+ ## 🧑‍💻 Author
+
+ - **Hande Çarkcı**
+ - GitHub: [github.com/handecrkc](https://github.com/handecrkc)
+
+
+ # 💳 Credit Card Clustering – Streamlit App
+
+ This project is a **Machine Learning** application that segments customers by their credit card usage habits.
+ The app, built with Streamlit, predicts which cluster a customer belongs to from the data the user enters.
+
+ ## 🎯 Project Goal
+
+ - Split credit card users into **groups with similar behavior**
+ - Give financial institutions **targeted marketing strategies**
+ - Predict a user's segment in real time
+
+ ## 🧠 Method
+
+ - **KMeans Clustering**
+ - Data scaling with **StandardScaler**
+ - Web app built with **Streamlit**
+
+ ## 🗃️ Dataset
+
+ - Source: [`CC GENERAL.csv`](https://raw.githubusercontent.com/amankharwal/Website-data/master/credit_card.csv)
+ - Columns: `BALANCE`, `PURCHASES`, `CREDIT_LIMIT`, `PAYMENTS`, `TENURE`, etc.
+
+ ## 🚀 Running the App
+
+ ```bash
+ git clone https://github.com/kullanici_adin/credit-card-clustering-streamlit.git
+ cd credit-card-clustering-streamlit
+ pip install -r requirements.txt
+ streamlit run app.py
+ ```
+
+ ## 🖼️ App Preview
+
+ ## 🔍 Cluster Descriptions
+
+ | Cluster | Description |
+ |---------|-------------|
+ | 0 | 🟢 Low-spending, low-risk customer |
+ | 1 | 🟡 Moderate-spending customer |
+ | 2 | 🔴 High-spending, active customer |
+ | 3 | 🔵 Customer with high installment purchases |
+
+ ## 🛠️ Required Libraries
+
+ - `streamlit`
+ - `pandas`
+ - `numpy`
+ - `scikit-learn`
+ - `joblib`
+
+ ## 📜 License
+
+ This project is open-source under the MIT License.
Screenshot_2.png ADDED
app.py ADDED
@@ -0,0 +1,49 @@
+ import streamlit as st
+ import pandas as pd
+ import numpy as np
+ import joblib
+
+ # Load the scaler and the model (model.pkl holds both as one tuple)
+ scaler, kmeans = joblib.load("model.pkl")
+
+ st.title("💳 Credit Card Customer Segmentation")
+ st.markdown("Enter customer details to find out which cluster they belong to.")
+
+ # Collect input from the user (arguments: label, min, max, default)
+ def get_user_input():
+     balance = st.number_input("BALANCE", 0.0, 100000.0, 2000.0)
+     purchases = st.number_input("PURCHASES", 0.0, 100000.0, 3000.0)
+     oneoff = st.number_input("ONEOFF_PURCHASES", 0.0, 50000.0, 1000.0)
+     installments = st.number_input("INSTALLMENTS_PURCHASES", 0.0, 50000.0, 2000.0)
+     credit_limit = st.number_input("CREDIT_LIMIT", 100.0, 100000.0, 5000.0)
+     payments = st.number_input("PAYMENTS", 0.0, 100000.0, 2500.0)
+     tenure = st.slider("TENURE (months as a customer)", 0, 12, 6)
+
+     data = {
+         'BALANCE': balance,
+         'PURCHASES': purchases,
+         'ONEOFF_PURCHASES': oneoff,
+         'INSTALLMENTS_PURCHASES': installments,
+         'CREDIT_LIMIT': credit_limit,
+         'PAYMENTS': payments,
+         'TENURE': tenure
+     }
+
+     return pd.DataFrame([data])
+
+ # Make the prediction
+ input_df = get_user_input()
+
+ if st.button("Predict"):
+     scaled_input = scaler.transform(input_df)
+     cluster = int(kmeans.predict(scaled_input)[0])  # plain int for display and lookup
+
+     st.subheader(f"🔍 Predicted Cluster: {cluster}")
+     yorumlar = {
+         0: "🟢 Low-spending, low-risk customer.",
+         1: "🟡 Moderate-spending customer.",
+         2: "🔴 High-spending, active customer.",
+         3: "🔵 Customer with high installment purchases."
+     }
+
+     st.write(yorumlar.get(cluster, "Unknown cluster"))
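
The app unpacks `model.pkl` as a `(scaler, kmeans)` tuple, but the commit does not show how that file was produced. The sketch below is one plausible way to create a compatible file. It rests on assumptions: the training data is synthetic, the seven features and their order are taken from the dictionary in `app.py`, and four clusters are inferred from the four descriptions in `yorumlar`.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Feature order mirrors the dictionary built in app.py.
FEATURES = [
    "BALANCE", "PURCHASES", "ONEOFF_PURCHASES", "INSTALLMENTS_PURCHASES",
    "CREDIT_LIMIT", "PAYMENTS", "TENURE",
]

# Synthetic training data (made up); the real project would load the CSV here.
rng = np.random.default_rng(0)
train = pd.DataFrame(
    rng.gamma(2.0, 1000.0, size=(500, len(FEATURES))), columns=FEATURES
)

# Fit the scaler, then fit KMeans on the scaled values.
scaler = StandardScaler().fit(train)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(scaler.transform(train))

# Save both objects as one tuple, matching `scaler, kmeans = joblib.load(...)`.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
joblib.dump((scaler, kmeans), path)

# Round trip: load the file and predict for one customer, as the app does.
loaded_scaler, loaded_kmeans = joblib.load(path)
sample = pd.DataFrame([{
    "BALANCE": 2000.0, "PURCHASES": 3000.0, "ONEOFF_PURCHASES": 1000.0,
    "INSTALLMENTS_PURCHASES": 2000.0, "CREDIT_LIMIT": 5000.0,
    "PAYMENTS": 2500.0, "TENURE": 6,
}])[FEATURES]
cluster = int(loaded_kmeans.predict(loaded_scaler.transform(sample))[0])
print("predicted cluster:", cluster)
```

Two caveats follow from this setup. The column order at prediction time has to match the order used when the scaler was fitted. KMeans also assigns cluster indices arbitrarily, so a fixed mapping from index to description only stays valid for the specific model file it was written against.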
model.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2b60f9e959e6ac0efa1dabc07e019fad4495f6bdc1e1c3da7d54ca656b61ee0b
+ size 37266
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ streamlit
+ pandas
+ numpy
+ scikit-learn
+ joblib