DEEP LEARNING MODEL

Neural Churn Predictor

A multi-layer feedforward neural network built in PyTorch that analyzes behavioral telemetry to predict which customers of a SaaS platform are likely to churn.

github.com/Thet9354/NeuralChurning

The Business Thesis & The Dataset

For modern SaaS and subscription services, retaining existing customers is significantly cheaper than acquiring new ones. Churn prediction acts as a critical early-warning mechanism: if a model can accurately identify accounts demonstrating early signs of disengagement, retention teams can deploy custom promotions before the user hits "Cancel".

We set out to build a specialized deep-learning classification pipeline. Rather than relying on logistic regression or standard decision trees, we engineered a deep feedforward network designed to learn non-linear patterns within complex behavioral profiles (e.g. usage decline, frequency drops, payment delays).

Core Challenge (Imbalanced Data): Subscription datasets are heavily skewed, usually with 90%+ active users and under 10% churn instances. To keep the network from taking a lazy shortcut (predicting "not churned" 100% of the time), we combined **SMOTE** oversampling of the churn class with a custom class-weighted loss.
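The idea behind SMOTE can be illustrated with a minimal standard-library sketch: each synthetic churn row is placed at a random point on the line segment between a real churn row and one of its nearest churn-class neighbours. The function name `smote_like_oversample`, the feature names, and the toy rows are hypothetical and only for illustration; this is not the project's actual pipeline, which may well use a library implementation such as `SMOTE` from imbalanced-learn.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Made-up churn rows: [weekly_sessions, days_since_last_login]
churned = [[1.0, 30.0], [2.0, 25.0], [0.0, 40.0], [1.5, 28.0]]
new_rows = smote_like_oversample(churned, n_new=6)
print(len(new_rows))
```

Oversampling of this kind is normally applied to the training split only, so that validation metrics are still measured on the real class distribution.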

Network Architecture Detail

The network comprises five fully connected (linear) layers, with batch normalization after the first three to stabilize training:

Layer 1 (Input): Maps the processed feature vector (usage frequency, session lengths, billing type, support ticket volume) to 128 units; the input dimension is set by the number of features.
Layers 2-4 (Hidden): Widths of 64, 32, and 16 units with LeakyReLU activations (negative slope 0.1). Batch Normalization follows layers 1-3, and Dropout (p = 0.3) follows layers 1 and 2 to reduce overfitting.
Layer 5 (Output): A single node whose Sigmoid activation yields a churn probability between 0 and 1.
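As a rough size check, the number of trainable parameters follows directly from the layer widths used in the PyTorch definition in this document (128, 64, 32, 16, 1). The sketch below assumes a hypothetical input of 20 features; a linear layer contributes `n_in * n_out + n_out` parameters and each `BatchNorm1d` contributes a scale and a shift per unit.

```python
def linear_params(n_in, n_out):
    # Weight matrix plus one bias per output unit.
    return n_in * n_out + n_out


def batchnorm_params(n):
    # One learnable scale and one learnable shift per unit.
    return 2 * n


def count_parameters(input_dim):
    widths = [input_dim, 128, 64, 32, 16, 1]
    total = sum(linear_params(a, b) for a, b in zip(widths, widths[1:]))
    # Batch normalization follows the 128-, 64-, and 32-unit layers.
    total += sum(batchnorm_params(n) for n in (128, 64, 32))
    return total


print(count_parameters(20))  # 14017 for the assumed 20 input features
```

In general the total is `128 * input_dim + 11457`, so the model stays small even for wide feature sets.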

PyTorch Model Implementation

Below is the model definition in PyTorch, showing the layers, activations, and forward pass:

    import torch
    import torch.nn as nn


    class ChurnDeepNetwork(nn.Module):
        def __init__(self, input_dim):
            super().__init__()

            # Five-Layer Architecture
            # Block 1: Linear -> BatchNorm -> LeakyReLU -> Dropout
            self.layer1 = nn.Linear(input_dim, 128)
            self.bn1 = nn.BatchNorm1d(128)
            self.relu1 = nn.LeakyReLU(0.1)
            self.dropout1 = nn.Dropout(0.3)

            # Block 2: Linear -> BatchNorm -> LeakyReLU -> Dropout
            self.layer2 = nn.Linear(128, 64)
            self.bn2 = nn.BatchNorm1d(64)
            self.relu2 = nn.LeakyReLU(0.1)
            self.dropout2 = nn.Dropout(0.3)

            # Block 3: Linear -> BatchNorm -> LeakyReLU
            self.layer3 = nn.Linear(64, 32)
            self.bn3 = nn.BatchNorm1d(32)
            self.relu3 = nn.LeakyReLU(0.1)

            # Block 4: Linear -> LeakyReLU
            self.layer4 = nn.Linear(32, 16)
            self.relu4 = nn.LeakyReLU(0.1)

            # Output: single unit squashed to a probability
            self.layer5 = nn.Linear(16, 1)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):
            x = self.dropout1(self.relu1(self.bn1(self.layer1(x))))
            x = self.dropout2(self.relu2(self.bn2(self.layer2(x))))
            x = self.relu3(self.bn3(self.layer3(x)))
            x = self.relu4(self.layer4(x))
            x = self.sigmoid(self.layer5(x))
            return x
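A short inference sketch may help show how a model like this is called. To stay self-contained it rebuilds the same layer stack as `ChurnDeepNetwork` with `nn.Sequential`; the helper name `build_churn_net`, the 12-feature input, the random batch, and the 0.5 decision threshold are all assumptions made for illustration, not values from the project.

```python
import torch
import torch.nn as nn


def build_churn_net(input_dim: int) -> nn.Sequential:
    # Compact stand-in with the same layer stack as ChurnDeepNetwork.
    return nn.Sequential(
        nn.Linear(input_dim, 128), nn.BatchNorm1d(128), nn.LeakyReLU(0.1), nn.Dropout(0.3),
        nn.Linear(128, 64), nn.BatchNorm1d(64), nn.LeakyReLU(0.1), nn.Dropout(0.3),
        nn.Linear(64, 32), nn.BatchNorm1d(32), nn.LeakyReLU(0.1),
        nn.Linear(32, 16), nn.LeakyReLU(0.1),
        nn.Linear(16, 1), nn.Sigmoid(),
    )


torch.manual_seed(0)
model = build_churn_net(input_dim=12)  # 12 features is an arbitrary choice
model.eval()  # disable dropout and use BatchNorm running statistics

batch = torch.randn(4, 12)  # random stand-in for four preprocessed accounts
with torch.no_grad():
    probs = model(batch).squeeze(1)  # shape (4,), each value in [0, 1]

flags = probs >= 0.5  # hypothetical decision threshold
print(probs.shape, flags.dtype)
```

Calling `model.eval()` matters here: in training mode `BatchNorm1d` needs more than one row per batch to compute batch statistics, while in evaluation mode a single account can be scored on its own.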

Optimization and Loss Curves

We trained the network with the **Adam** optimizer paired with a learning-rate scheduler (ReduceLROnPlateau). The loss was binary cross-entropy, dynamically weighted to offset the class imbalance, and training runs were monitored in real time on **Weights & Biases** dashboards.
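The pieces named above can be wired together roughly as follows. This is a sketch under stated assumptions, not the project's training script: the data is synthetic, the small two-layer model is a stand-in, the learning rate, scheduler settings, and epoch count are arbitrary, and the weighting rule (negatives divided by positives) is one common choice. Because the network ends in a Sigmoid, the example uses `F.binary_cross_entropy` with per-sample weights rather than a logits-based loss. Experiment logging is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Synthetic stand-in data: 200 rows, 8 features, roughly 10% positives.
X = torch.randn(200, 8)
y = (torch.rand(200) < 0.1).float()
y[0], y[1] = 1.0, 0.0  # guarantee both classes are present

# Small stand-in model that, like the real one, outputs probabilities.
model = nn.Sequential(nn.Linear(8, 16), nn.LeakyReLU(0.1), nn.Linear(16, 1), nn.Sigmoid())
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2
)

# Weight each churn row by the negative/positive ratio (an assumed weighting rule).
pos_weight = float((y == 0).sum()) / float((y == 1).sum())
sample_weight = 1.0 + (pos_weight - 1.0) * y  # pos_weight for churn rows, 1.0 otherwise

losses = []
for epoch in range(20):
    model.train()
    optimizer.zero_grad()
    probs = model(X).squeeze(1)
    loss = F.binary_cross_entropy(probs, y, weight=sample_weight)
    loss.backward()
    optimizer.step()
    # A real run would pass the validation loss here, not the training loss.
    scheduler.step(loss.item())
    losses.append(loss.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

`ReduceLROnPlateau` lowers the learning rate only after the monitored value has stopped improving for `patience` epochs, which is why it is stepped with a metric rather than on a fixed schedule.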

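The final model reached a validation accuracy of **91.8%** and an ROC-AUC score of **0.89**. Because roughly 90% of accounts in such datasets are active, accuracy alone sits close to the majority-class baseline, so the ROC-AUC is the more informative of the two figures: it shows the model ranks churning accounts above active ones well, which makes it a reasonable candidate for production deployment.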
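The difference between the two metrics on skewed data can be seen in a small standard-library sketch. The labels and scores below are made up for illustration and are unrelated to the project's validation set; `roc_auc` is a hypothetical helper that computes the probability that a randomly chosen churner is scored above a randomly chosen active account.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels: 18 active accounts (0) and 2 churned accounts (1).
y_true = [0] * 18 + [1] * 2

# A "model" that always predicts active looks accurate but cannot rank anything.
always_active = [0] * 20
constant_scores = [0.1] * 20
print(accuracy(y_true, always_active))   # 0.9
print(roc_auc(y_true, constant_scores))  # 0.5

# Scores that mostly place churners above active accounts.
useful_scores = [0.1] * 17 + [0.4] + [0.8, 0.3]
print(round(roc_auc(y_true, useful_scores), 3))  # 0.972
```

On data like this, precision and recall on the churn class are also worth reporting, since they describe how many flagged accounts are real churners and how many churners are caught.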