Project 06 · SAS EG + PROC GENMOD + Power BI + RLS

Premium Pricing Sophistication Dashboard

Built a Generalised Linear Model (GLM) in SAS Enterprise Guide using PROC GENMOD to address inadequate premium pricing across 12 rating variables. The model quantified individually-rated relativity for each factor; a Power BI executive dashboard with row-level security enabled actuaries and product managers to monitor loss ratios against predicted values and identify repricing opportunities.

SAS EG PROC GENMOD GLM Modelling Power BI DAX RLS Actuarial Analytics T-SQL
↑ 6.2%
Loss Ratio Improvement
12
Rating Variables Modelled
5 Years
Historical Training Data
RLS
Row-Level Security Applied

Business Problem

The pricing team used simple univariate relativities each rating variable set independently using historical averages leading to cross-subsidisation between risk cohorts. High-risk segments were under-priced, contributing to a 98.4% loss ratio in two motor sub-segments, while low-risk segments were over-priced and losing to competitors.

The actuarial team required a multivariate GLM to quantify correlations between rating factors, remove distortions, and produce stable, credible relativities for each of the 12 rating variables simultaneously.

Solution Overview

  • Extracted 5 years of policy and claims data from SQL Server warehouse using a staging view (T-SQL) 1.2M policy-year exposure records
  • Built a Tweedie GLM in SAS EG using PROC GENMOD with log link, offset for exposure, and systematic variable selection (stepwise AIC)
  • Modelled 12 rating variables: age band, vehicle age, cubic capacity, geographic zone, NCB class, sum insured band, occupation, policy tenure, usage type, annual mileage band, add-on cover flag, renewal channel
  • Output: parameter estimates (relativities) exported to SQL Server via SAS/ACCESS; used as reference rates in the pricing engine
  • Built a Power BI dashboard with 4 pages Actual vs Expected loss ratio, Rating factor relativities, Territory heat-map (custom visual), and Repricing opportunity tracker
  • Applied Role-Level Security (RLS): actuaries see all segments; product managers see only their lines of business
  • Post-implementation, the updated rate schedule improved blended loss ratio from 96.4% → 90.2% (6.2 pp improvement) over the subsequent 18-month review period

Process Walkthrough

1

Data Extraction & Preparation (T-SQL)

Built a T-SQL staging view joining policies, claims, and exposure data. Derived variables: earned premium, pure loss cost, earned exposure in policy-years, banded variables, and NCB class. Data quality checks: outlier flagging, missing value imputation rules.

2

Exploratory Data Analysis (SAS EG)

PROC FREQ and PROC MEANS for categorical and continuous variable distributions. Plotted observed loss frequency and severity by each rating factor level to confirm monotonic relationships before modelling.

3

GLM Development (PROC GENMOD)

Frequency model: Poisson with log link, offset=log(exposure). Severity model: Gamma with log link. Pure premium (Tweedie) model as primary pricing output. Variable selection by backward elimination using AIC. Aliased levels set as base reference class (relativity=1.0).

4

Relativity Extraction & Rate Filing

Exponentiated GENMOD parameter estimates to derive multiplicative relativities. Output written to SQL rating table. Cross-validated against lifted Gini plots and actual-to-expected charts. Submitted rate schedule to the actuarial review committee.

5

Power BI Dashboard & RLS

4-page Power BI report built on the SQL rating and experience tables. RLS roles: Actuary (full access), Product Mgr (line of business filter), Regional Head (territory filter). Dashboard published to Power BI Service with scheduled refresh.

Pricing Analytics Dashboard Insights

Actual vs Expected Loss Ratio Pre & Post Repricing

Rating Factor Relativities (GLM Output)

Loss Ratio by Vehicle Age Band Before & After

Premium Adequacy Index by Territory (Post-GLM)

Code Samples

/* ─── GLM Frequency Model PROC GENMOD ─── */
/* Pure Premium: Tweedie family, log link */

PROC GENMOD data=work.pricing_data;
    /* Exposure offset */
    OFFSET log_exposure;
    
    /* Rating variables all treated as CLASS */
    CLASS
        age_band         (REF='30-40')
        vehicle_age_band (REF='0-3yrs')
        cc_band          (REF='1000-1499cc')
        geo_zone         (REF='Zone_C')
        ncb_class        (REF='NCB00')
        usage_type       (REF='SDL')
        si_band          (REF='Band_3');
    
    /* Model statement: Tweedie with log link */
    MODEL pure_premium =
        age_band
        vehicle_age_band
        cc_band
        geo_zone
        ncb_class
        usage_type
        si_band
        / DIST=tweedie LINK=log
          DSCALE
          TYPE3;
    
    /* Export parameter estimates to SAS dataset */
    OUTPUT OUT=work.glm_params
        PREDICTED=fitted_pure_premium
        STDRESCHI=pearson_resid;
RUN;

/* Convert log-scale parameters to multiplicative relativities */
DATA work.relativities;
    SET work.glm_params (keep=Parameter Level1 Estimate StdErr);
    relativity       = EXP(Estimate);
    lower_95ci       = EXP(Estimate - 1.96 * StdErr);
    upper_95ci       = EXP(Estimate + 1.96 * StdErr);
    pct_from_base    = (relativity - 1) * 100;
RUN;

/* Write relativities back to SQL for pricing engine */
PROC SQL;
    CONNECT TO odbc (datasrc="PricingDW");
    EXECUTE (
        TRUNCATE TABLE dbo.GLM_Relativities
    ) BY odbc;
    INSERT INTO odbc.dbo.GLM_Relativities
        SELECT * FROM work.relativities;
    DISCONNECT FROM odbc;
QUIT;
-- Pricing experience staging view
CREATE OR ALTER VIEW dbo.vw_PricingExperience
AS
SELECT
    p.PolicyID,
    p.PolicyYear,
    p.ProductCode,
    p.Territory,
    -- Rating variables
    t.AgeBand          AS age_band,
    t.VehicleAgeBand   AS vehicle_age_band,
    t.CCBand           AS cc_band,
    t.GeoZone          AS geo_zone,
    t.NCBClass         AS ncb_class,
    t.UsageType        AS usage_type,
    t.SIBand           AS si_band,
    -- Exposure and premium
    e.EarnedExposure                              AS exposure_py,
    LOG(e.EarnedExposure)                         AS log_exposure,
    p.EarnedPremium                               AS earned_premium,
    -- Loss metrics
    ISNULL(c.IncurredLoss, 0)                    AS incurred_loss,
    ISNULL(c.ClaimCount, 0)                      AS claim_count,
    ISNULL(c.IncurredLoss, 0) /
        NULLIF(p.EarnedPremium, 0)               AS loss_ratio,
    ISNULL(c.IncurredLoss, 0) /
        NULLIF(e.EarnedExposure, 0)              AS pure_premium,
    -- A/E ratio (requires GLM expected values loaded)
    ISNULL(c.IncurredLoss, 0) /
        NULLIF(gr.ModelExpected, 0)              AS ae_ratio
FROM            dbo.Policies            p
INNER JOIN      dbo.TechnicalFactors    t   ON p.PolicyID = t.PolicyID
INNER JOIN      dbo.EarnedExposure      e   ON p.PolicyID = e.PolicyID
LEFT  JOIN      dbo.ClaimsSummary      c   ON p.PolicyID = c.PolicyID
                                         AND p.PolicyYear = c.ClaimYear
LEFT  JOIN      dbo.GLM_ModelOutput    gr  ON p.PolicyID = gr.PolicyID
WHERE p.PolicyStatus IN ('INFORCE', 'EXPIRED')
  AND p.PolicyYear BETWEEN YEAR(GETDATE())-5 AND YEAR(GETDATE());
-- Actual vs Expected Loss Ratio
Actual Loss Ratio =
DIVIDE(
    SUM( PricingExp[incurred_loss] ),
    SUM( PricingExp[earned_premium] ),
    0
)

Expected Loss Ratio (GLM) =
DIVIDE(
    SUM( PricingExp[model_expected_loss] ),
    SUM( PricingExp[earned_premium] ),
    0
)

A/E Ratio =
DIVIDE( [Actual Loss Ratio], [Expected Loss Ratio (GLM)], 1 )

-- Premium adequacy index (1.0 = perfectly adequate)
Premium Adequacy Index =
DIVIDE(
    SUM( PricingExp[earned_premium] ),
    SUM( PricingExp[risk_cost_estimate] ),
    0
)

-- RLS: Row-Level Security filter table (defined in Model view)
-- Role: Product Manager filters to assigned LOB
-- DAX expression in RLS role filter on PricingExp table:
LOB Filter (RLS) =
CONTAINS(
    VALUES( UserLOBMapping[ProductCode] ),
    UserLOBMapping[ProductCode],
    LOOKUPVALUE(
        UserLOBMapping[ProductCode],
        UserLOBMapping[UserEmail],
        USERPRINCIPALNAME()
    )
)

-- YTD vs Prior Year Loss Ratio for repricing alert
LR Variance vs PY =
VAR CY = CALCULATE( [Actual Loss Ratio],
              DATESYTD( DateTable[Date] ) )
VAR PY = CALCULATE( [Actual Loss Ratio],
              SAMEPERIODLASTYEAR( DATESYTD( DateTable[Date] ) ) )
RETURN CY - PY

GLM Rating Factor Relativities Key Variables

Rating VariableLevelRelativity95% CI Lower95% CI UpperPrior RateType 3 p-value
Geographic ZoneZone A (metro)1.481.391.571.30< 0.0001
Geographic ZoneZone B (urban)1.211.141.281.15< 0.0001
Vehicle Age10+ years1.351.251.461.18< 0.0001
NCB ClassNCB 50%0.620.580.670.65< 0.0001
Age Band18–25 yrs1.821.691.961.60< 0.0001
Engine CC2000+ cc1.291.181.411.220.0004
Usage TypeCommercial1.541.441.651.40< 0.0001
Policy Tenure≥ 5 years0.880.830.940.900.0021