---
license: apache-2.0
datasets:
- CiferAI/Cifer-Fraud-Detection-Dataset-AF
language:
- en
metrics:
- accuracy
- precision
- recall
---

# 🧠 Cifer Fraud Detection Model
`(cifer-fraud-detection-k1-a)`

## 🧾 Overview

This model is a binary classifier trained to detect fraudulent transactions using the **Cifer Fraud Detection Dataset** (6 million synthetic rows). It is designed to operate in **federated learning environments**, where data is split across clients or organizations without centralized access.

The training data is split into **four partitions of 1.5 million records each**. You can train the model **independently on each of the four partitions**, then **aggregate the results using FedAvg (Federated Averaging)** to achieve performance **comparable to centralized training**, as validated in Cifer’s internal lab benchmarks.

This model is part of Cifer’s **laboratory-validated framework for privacy-preserving machine learning**, enabling secure, consent-first collaboration without exposing raw data. It is fully compatible with **Cifer’s no-code workspace** and **federated orchestration engine**.

---

## 📊 Training Data

- **Dataset:** CiferAI/Cifer-Fraud-Detection-Dataset-AF
- **Total rows:** 6,000,000 (split into 4 federated parts)
- **Type:** Fully synthetic tabular data modeled after real-world financial fraud scenarios
- **Fields:** transaction type, amount, sender/receiver balance, fraud flags, and step-based timestamps
- **Generated with:** Cifer Simulation Engine, modeled after the PaySim simulator

---

## 🧠 Model Architecture

- **Framework:** TensorFlow / Keras
- **Architecture:** Multi-layer Perceptron (MLP)
- **Layers:**
  - Input layer (shape = number of features)
  - `Dense(64, activation="relu")`
  - `Dense(32, activation="relu")`
  - `Dense(2, activation="softmax")`
- **Loss Function:** `sparse_categorical_crossentropy`
- **Optimizer:** `adam`
- **Output:** Two-class softmax probabilities (0 = normal, 1 = fraud)

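
The layer list above could be written in Keras roughly as follows. This is a minimal sketch, not the exact training script: `NUM_FEATURES = 8` and the `build_model` helper are assumptions made for illustration, since this card does not state the number of input features. Because the output layer has two softmax units, integer labels (0 or 1) pair with `sparse_categorical_crossentropy`.

```python
from tensorflow import keras

# Assumption: the model card does not state the feature count; 8 is a placeholder.
NUM_FEATURES = 8


def build_model(num_features: int) -> keras.Model:
    """Build and compile an MLP matching the layer list in this section."""
    model = keras.Sequential([
        keras.Input(shape=(num_features,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(32, activation="relu"),
        # Two softmax units: index 0 = normal, index 1 = fraud.
        keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model(NUM_FEATURES)
model.summary()
```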
---
## ✅ Use Cases

- Fraud detection in fintech, mobile money, and digital banking
- Federated AI research across decentralized clients
- Privacy-preserving machine learning benchmarking
- Fairness and bias mitigation testing under distribution shift
- Integration with Cifer's federated orchestration engine and no-code workspace

---

## 📈 Performance

The model was trained on a synthetic dataset **benchmarked against real-world financial logs**.
It achieves **99.93% accuracy**, closely matching the **99.98% benchmark** of models trained on real financial data.
Performance remains consistent across federated nodes when client updates are aggregated with **FedAvg**.

---

## 🔐 Privacy & Federated Context

- Designed for federated training across 4 dataset partitions
- No raw data is shared between clients or with a central server
- Supports Cifer’s asynchronous training and client coordination
- Compatible with Cifer’s blockchain-based contribution tracking and aggregation module

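
As an illustration of the aggregation step, FedAvg combines the clients' model weights layer by layer, weighting each client by the number of samples it trained on. The sketch below uses plain NumPy and is not Cifer's orchestration engine: the `fedavg` helper, the random stand-in weights, and the layer shapes (which assume 8 input features) are all hypothetical. With Keras, each client's weight list would typically come from `model.get_weights()`, and the averaged result would be applied with `model.set_weights()`.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Return the sample-weighted average of per-layer weights across clients.

    client_weights: one list of NumPy arrays per client (same shapes across clients).
    client_sizes:   number of training samples held by each client.
    """
    total = float(sum(client_sizes))
    num_layers = len(client_weights[0])
    averaged = []
    for layer in range(num_layers):
        # Each client's layer contributes in proportion to its share of the data.
        layer_avg = sum(
            weights[layer] * (size / total)
            for weights, size in zip(client_weights, client_sizes)
        )
        averaged.append(layer_avg)
    return averaged


# Four hypothetical clients, one per 1.5M-row partition, with random stand-in weights.
rng = np.random.default_rng(0)
layer_shapes = [(8, 64), (64,), (64, 32), (32,), (32, 2), (2,)]  # assumes 8 features
client_weights = [[rng.normal(size=shape) for shape in layer_shapes] for _ in range(4)]
client_sizes = [1_500_000] * 4

global_weights = fedavg(client_weights, client_sizes)
print([w.shape for w in global_weights])
```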
---
## 🔧 File Info

- **Format:** `.h5` (Keras model file)
- **Input:** Preprocessed numeric tabular data (features scaled with `StandardScaler`, transaction type label-encoded)
- **Target:** `isFraud` binary label (0 or 1)
- **Recommended loader:** `keras.models.load_model("client_model.h5")`

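
A rough sketch of how inputs might be prepared and how the two-column softmax output maps back to the `isFraud` label. Several things here are assumptions rather than details from this card: the example rows and their column order are hypothetical PaySim-style values, the encoder and scaler are fitted on the spot (in practice the ones fitted at training time should be reused), the encoded type column is scaled along with the rest, and the probabilities are stand-in values rather than real model output.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical rows: transaction type plus (amount, sender balance, receiver balance).
types = np.array(["PAYMENT", "TRANSFER", "CASH_OUT", "TRANSFER"])
numeric = np.array([
    [120.0, 1000.0, 0.0],
    [9000.0, 9000.0, 50.0],
    [300.0, 500.0, 200.0],
    [45.5, 80.0, 10.0],
])


def preprocess(types, numeric):
    """Label-encode the transaction type, then standardize every column."""
    type_ids = LabelEncoder().fit_transform(types)
    features = np.column_stack([type_ids, numeric]).astype("float32")
    return StandardScaler().fit_transform(features)


def to_labels(probabilities):
    """Pick the most probable class per row: 0 = normal, 1 = fraud."""
    return np.argmax(probabilities, axis=1)


X = preprocess(types, numeric)

# With TensorFlow installed and the model file available, probabilities would come from:
#   model = keras.models.load_model("client_model.h5")
#   probabilities = model.predict(X)
# Stand-in values are used here so the sketch runs without the model file.
probabilities = np.array([[0.98, 0.02], [0.10, 0.90], [0.60, 0.40], [0.99, 0.01]])
print(to_labels(probabilities))
```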
---
## 📜 License

Apache 2.0

---

## 🙌 Citation

If you use this model or dataset in your work, please cite:

> CiferAI (2025). Cifer Fraud Detection Dataset & Federated Model – Privacy-Preserving AI for Financial Risk.

---