Implementing Privacy-Preserving Federated Fraud Detection

Simulate fraud detection using Federated Learning without heavy frameworks.

Overview

In this tutorial, we demonstrate how to simulate a privacy-preserving fraud detection system using Federated Learning, without relying on heavyweight frameworks or complex infrastructure. We build a clean, CPU-friendly setup that mimics ten independent banks, each training a local fraud-detection model on its own highly imbalanced transaction data.

Coordination of Local Updates

We coordinate these local updates through a simple FedAvg aggregation loop, allowing us to improve a global model while ensuring that no raw transaction data ever leaves a client. Alongside this, we integrate OpenAI to support post-training analysis and risk-oriented reporting, demonstrating how federated learning outputs can be translated into decision-ready insights.
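Concretely, FedAvg forms the next global model as a data-size-weighted average of the client updates. With $K$ clients, where client $k$ holds $n_k$ of the $n$ training samples and returns locally trained weights $w_k$:

$$w_{\text{global}} = \sum_{k=1}^{K} \frac{n_k}{n}\, w_k$$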

Setting Up the Environment

We set up the execution environment and import all required libraries for data generation, modeling, evaluation, and reporting. We also fix random seeds to ensure our federated simulation remains deterministic.

!pip -q install torch scikit-learn numpy openai

import time, random, json, os, getpass
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import roc_auc_score, average_precision_score, accuracy_score
from openai import OpenAI

SEED = 7
random.seed(SEED); np.random.seed(SEED); torch.manual_seed(SEED)

DEVICE = torch.device("cpu")
print("Device:", DEVICE)

Data Generation and Splitting

We generate a highly imbalanced, credit-card-like fraud dataset and split it into training and test sets. We standardize the server-side data and prepare a global test loader that enables consistent evaluation of the aggregated model after each federated round.

X, y = make_classification(
   n_samples=60000,  
   n_features=30,  
   n_informative=18,  
   n_redundant=8,  
   weights=[0.985, 0.015],  
   class_sep=1.5,  
   flip_y=0.01,  
   random_state=SEED
)
X = X.astype(np.float32)  
y = y.astype(np.int64)
X_train_full, X_test, y_train_full, y_test = train_test_split(
   X, y, test_size=0.2, stratify=y, random_state=SEED
)
server_scaler = StandardScaler()
X_train_full_s = server_scaler.fit_transform(X_train_full).astype(np.float32)
X_test_s = server_scaler.transform(X_test).astype(np.float32)
test_loader = DataLoader(
   TensorDataset(torch.from_numpy(X_test_s), torch.from_numpy(y_test)),  
   batch_size=1024,  
   shuffle=False
)

Simulating Non-IID Behavior

We simulate realistic non-IID behavior by partitioning the training data across ten clients using a Dirichlet distribution, so each simulated bank ends up with a different class mix. Each bank then standardizes and batches its own shard locally, as sketched after the partition helper below.

def dirichlet_partition(y, n_clients=10, alpha=0.35):
    classes = np.unique(y)
    idx_by_class = [np.where(y == c)[0] for c in classes]
    client_idxs = [[] for _ in range(n_clients)]
    for idxs in idx_by_class:
        np.random.shuffle(idxs)
        # Draw per-client proportions for this class from a Dirichlet prior.
        props = np.random.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props) * len(idxs)).astype(int)
        cuts[-1] = len(idxs)  # guard: float truncation must not drop trailing indices
        prev = 0
        for cid, cut in enumerate(cuts):
            client_idxs[cid].extend(idxs[prev:cut].tolist())
            prev = cut
    return [np.array(ci, dtype=np.int64) for ci in client_idxs]
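The training loop later indexes client_loaders[cid][0] as each client's train loader, so the partitions need to be turned into locally scaled loaders. Below is a minimal sketch of that step; the tuple layout and the batch size of 256 are our assumptions, chosen to match how the loop consumes them.

NUM_CLIENTS = 10
client_idxs = dirichlet_partition(y_train_full, n_clients=NUM_CLIENTS, alpha=0.35)

client_loaders = []
for idxs in client_idxs:
    # Each bank standardizes its own shard locally; raw data never leaves the client.
    scaler = StandardScaler()
    Xc = scaler.fit_transform(X_train_full[idxs]).astype(np.float32)
    yc = y_train_full[idxs]
    train_loader = DataLoader(
        TensorDataset(torch.from_numpy(Xc), torch.from_numpy(yc)),
        batch_size=256,
        shuffle=True
    )
    client_loaders.append((train_loader, len(idxs)))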

Model Definition and Training

We define the neural network used for fraud detection, along with the utility functions for weight exchange, local training, and evaluation that the federated loop depends on (sketched after the model definition).

class FraudNet(nn.Module):
   def __init__(self, in_dim):  
       super().__init__()
       self.net = nn.Sequential(
           nn.Linear(in_dim, 64),  
           nn.ReLU(),  
           nn.Dropout(0.1),  
           nn.Linear(64, 32),  
           nn.ReLU(),  
           nn.Dropout(0.1),  
           nn.Linear(32, 1)
       )
   def forward(self, x):
       return self.net(x).squeeze(-1)
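The round loop below calls get_weights, set_weights, fedavg, train_local, and evaluate, which this excerpt does not show. Here is a minimal sketch of plausible implementations, assuming a pos_weight-weighted BCE loss to offset the ~1.5% fraud rate (the weight of 30.0 and single local epoch are illustrative, not prescribed by the original):

def get_weights(model):
    # Detach and copy parameters so client updates don't alias the global model.
    return [p.detach().clone() for p in model.parameters()]

def set_weights(model, weights):
    with torch.no_grad():
        for p, w in zip(model.parameters(), weights):
            p.copy_(w)

def fedavg(client_weights, client_sizes):
    # Data-size-weighted average of each parameter tensor across clients.
    total = float(sum(client_sizes))
    avg = []
    for tensors in zip(*client_weights):
        stacked = torch.stack([t * (n / total) for t, n in zip(tensors, client_sizes)])
        avg.append(stacked.sum(dim=0))
    return avg

def train_local(model, loader, lr, epochs=1):
    model.train()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    # pos_weight compensates for heavy class imbalance; value is illustrative.
    loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(30.0))
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss = loss_fn(model(xb), yb.float())
            loss.backward()
            opt.step()

@torch.no_grad()
def evaluate(model, loader):
    model.eval()
    scores, labels = [], []
    for xb, yb in loader:
        scores.append(torch.sigmoid(model(xb)).numpy())
        labels.append(yb.numpy())
    s, l = np.concatenate(scores), np.concatenate(labels)
    return {
        "auroc": round(float(roc_auc_score(l, s)), 4),
        "auprc": round(float(average_precision_score(l, s)), 4),
        "acc": round(float(accuracy_score(l, (s >= 0.5).astype(int))), 4),
    }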

Federated Learning Process

We orchestrate the federated learning process by iteratively training local client models and aggregating their parameters using FedAvg, evaluating the global model after each round to monitor convergence.
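The loop assumes a few globals are set before it runs; a minimal sketch (the specific hyperparameter values are illustrative):

ROUNDS = 8    # number of federated communication rounds
LR = 1e-3     # local learning rate for each client
global_model = FraudNet(X_train_full.shape[1])
global_weights = get_weights(global_model)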

for r in range(1, ROUNDS + 1):
   client_weights, client_sizes = [], []
   for cid in range(NUM_CLIENTS):
       local = FraudNet(X_train_full.shape[1])
       set_weights(local, global_weights)
       train_local(local, client_loaders[cid][0], LR)
       client_weights.append(get_weights(local))
       client_sizes.append(len(client_loaders[cid][0].dataset))
   global_weights = fedavg(client_weights, client_sizes)
   set_weights(global_model, global_weights)
   metrics = evaluate(global_model, test_loader)
   print(f"Round {r}: {metrics}")

Generating Risk Reports with OpenAI

We transform the results into concise analytical reports using OpenAI. The report summarizes performance, risks, and recommended next steps to aid decision-making.
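The snippet below assumes OPENAI_API_KEY and prompt are already defined. One plausible way to set them up, reading the key interactively and building the prompt from the final round's metrics (the prompt wording is our own):

# Read the API key without echoing it; leave blank to skip the reporting step.
OPENAI_API_KEY = getpass.getpass("Enter OpenAI API key (or leave blank to skip): ").strip()

prompt = (
    "You are a risk analyst. Summarize the following federated fraud-detection "
    "results for a non-technical audience, covering model performance, residual "
    "risks, and recommended next steps:\n" + json.dumps(metrics, indent=2)
)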

if OPENAI_API_KEY:
   os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
   client = OpenAI()
   resp = client.responses.create(model="gpt-5.2", input=prompt)
   print(resp.output_text)

Through this framework, we have shown a practical method for implementing federated fraud detection: raw transaction data stays on each client, yet the banks still obtain a shared global model and decision-ready reporting.
