Project 02 · Base SAS + SAS EG + SQL Server + Power BI · ⭐ Featured
A logistic regression model built in Base SAS (PROC LOGISTIC) that scores all active policies 60 days before renewal, identifying at-risk accounts so retention teams can intervene proactively improving renewal rate by 9.4%.
The retention team had no early-warning system for policies at risk of lapsing. Outreach was entirely reactive agents only contacted customers after a lapse had already occurred. This resulted in avoidable churn and revenue loss on policies that could have been retained with a timely intervention.
The goal was a scoring model that flagged at-risk policies at least 60 days before renewal, with enough precision to prioritise agent outreach effectively.
Multi-table SQL query pulling 3 years of policy records (180K policies), joined to payment schedules, claims, customer demographics, and product data.
PROC FREQ, PROC UNIVARIATE, PROC MEANS for univariate analysis. Created derived features: missed_payments_6m, claims_ratio, tenure_band, premium_increase_pct.
PROC LOGISTIC with stepwise variable selection, OUTROC / ROC statement for AUC. 5-fold cross-validation via SAS macro loop. Final model: 8 variables, AUC = 0.87.
SAS macro scores all policies with renewal dates 50–70 days out. Output table includes lapse probability, risk band (High/Medium/Low), and top 3 risk drivers per policy.
Agents see their territory's at-risk list sorted by lapse probability. Dashboard filters by risk band, product line, and premium tier. Managers see team-level retention KPIs.
/* Logistic Regression – Policy Lapse Prediction */
/* Step 1: Create model training dataset */
DATA work.model_train;
SET policy.features_2021_2023;
WHERE renewal_date BETWEEN '01JAN2021'd AND '31DEC2023'd
AND is_holdout = 0;
RUN;
/* Step 2: Logistic regression with stepwise selection */
PROC LOGISTIC data=work.model_train
outmodel=work.lapse_model
plots=(roc oddsratio);
CLASS product_line tenure_band / param=ref;
MODEL lapsed(event='1') =
missed_payments_6m
premium_increase_pct
claims_freq_tenure
cust_tenure_yrs
products_held
prior_lapse_flag
product_line
tenure_band
/ selection=stepwise slentry=0.05 slstay=0.05
outroc=work.roc_data
ctable pprob=(0.3 0.4 0.5);
OUTPUT out=work.scored_train
predicted=lapse_prob
xbeta=log_odds;
RUN;
/* Step 3: Validate AUC */
PROC SQL;
SELECT MAX(c_hat) AS AUC
FROM work.roc_data;
QUIT;
/* Weekly batch scoring macro */
/* Runs every Monday 05:00 via SAS Scheduler */
%MACRO score_renewal_risk(run_date=);
/* 1. Pull policies renewing in 50-70 days */
PROC SQL;
CONNECT TO sqlserver
(server="SQLSRV01" database="PolicyDB");
CREATE TABLE work.to_score AS
SELECT * FROM connection to sqlserver (
SELECT * FROM dbo.vw_PolicyFeatures
WHERE renewal_date
BETWEEN DATEADD(day,50,'&run_date')
AND DATEADD(day,70,'&run_date')
);
DISCONNECT FROM sqlserver;
QUIT;
/* 2. Score using saved model */
PROC LOGISTIC inmodel=work.lapse_model;
SCORE data=work.to_score
out=work.scored_output
predicted;
RUN;
/* 3. Assign risk bands */
DATA work.scored_final;
SET work.scored_output;
LENGTH risk_band $6;
IF P_1 >= 0.65 THEN risk_band = 'HIGH';
ELSE IF P_1 >= 0.40 THEN risk_band = 'MEDIUM';
ELSE risk_band = 'LOW';
score_date = today();
RUN;
/* 4. Write to SQL Server for Power BI */
PROC SQL;
CONNECT TO sqlserver
(server="SQLSRV01" database="PolicyDB");
EXECUTE(
TRUNCATE TABLE dbo.lapse_scores
) BY sqlserver;
QUIT;
PROC APPEND base=sqlsrv.lapse_scores
data=work.scored_final force; RUN;
%MEND score_renewal_risk;
/* Invoke macro */
%score_renewal_risk(run_date=%sysfunc(today(),date9.));
-- Feature extraction view for model training
CREATE VIEW dbo.vw_PolicyFeatures AS
SELECT
p.PolicyID,
p.CustomerID,
p.RenewalDate,
p.ProductLine,
-- Missed payments in last 6 months
SUM(CASE WHEN pay.PaymentStatus = 'MISSED'
AND pay.DueDate >= DATEADD(month,-6,GETDATE())
THEN 1 ELSE 0 END) AS missed_payments_6m,
-- Premium change at last renewal
ROUND(
(p.CurrentPremium - p.PriorPremium)
/ NULLIF(p.PriorPremium,0) * 100, 2
) AS premium_increase_pct,
-- Claims frequency
COUNT(c.ClaimID) * 1.0
/ NULLIF(p.TenureYears,0) AS claims_freq_tenure,
p.TenureYears AS cust_tenure_yrs,
pc.ProductsHeld AS products_held,
p.PriorLapseFlag AS prior_lapse_flag,
p.Lapsed AS lapsed -- target variable
FROM dbo.Policies p
LEFT JOIN dbo.Payments pay ON p.PolicyID = pay.PolicyID
LEFT JOIN dbo.Claims c ON p.PolicyID = c.PolicyID
INNER JOIN dbo.ProductCnt pc ON p.CustomerID = pc.CustomerID
GROUP BY
p.PolicyID, p.CustomerID, p.RenewalDate,
p.ProductLine, p.CurrentPremium, p.PriorPremium,
p.TenureYears, pc.ProductsHeld,
p.PriorLapseFlag, p.Lapsed;
| Policy ID | Renewal Date | Lapse Probability | Risk Band | Top Risk Driver | Premium | Agent Action |
|---|---|---|---|---|---|---|
| POL-84921 | 29 Apr 2026 | 82% | HIGH | 3 missed payments | ₹24,800 | Priority Call |
| POL-61033 | 02 May 2026 | 74% | HIGH | +28% premium increase | ₹18,200 | Retention Offer |
| POL-55247 | 05 May 2026 | 61% | HIGH | Prior lapse history | ₹31,500 | Priority Call |
| POL-72819 | 08 May 2026 | 48% | MEDIUM | Single product holder | ₹12,400 | Cross-sell Call |
| POL-39042 | 11 May 2026 | 41% | MEDIUM | 2 missed payments | ₹9,600 | Email Campaign |