28 Sensitivity Analysis
Yun-Tien Lee
“Sensitivity analysis is the art of understanding how small changes in assumptions can lead to big changes in outcomes, helping us navigate uncertainty with clarity.” — Unknown
28.1 Chapter Overview
Different approaches to understanding the sensitivity of a model to changes in its inputs: derivatives, finite differences, global sensitivity analysis approaches, and statistical approaches.
28.2 Setup
Let’s assume a set of insurance policies that will be used throughout the chapter.
using Dates
@enum Sex Female = 1 Male = 2
@enum Risk Standard = 1 Preferred = 2
mutable struct Policy
    id::Int
    sex::Sex
    benefit_base::Float64
    COLA::Float64
    mode::Int
    prem::Float64
    pp::Int
    issue_date::Date
    issue_age::Int
    risk::Risk
end
28.3 The Data
using MortalityTables
sample_csv_data = IOBuffer(
    raw"id,sex,benefit_base,COLA,mode,prem,pp,issue_date,issue_age,risk
1,M,100000.0,0.03,1,1000.0,3,1999-12-05,30,Std"
)

mort = Dict(
    Male => MortalityTables.table(988).ultimate,
    Female => MortalityTables.table(992).ultimate,
)
Dict{Sex, OffsetArrays.OffsetVector{Float64, Vector{Float64}}} with 2 entries:
Male => [0.022571, 0.022571, 0.022571, 0.022571, 0.022571, 0.022571, 0.0225…
Female => [0.00745, 0.00745, 0.00745, 0.00745, 0.00745, 0.00745, 0.00745, 0.0…
using CSV, DataFrames
policies = let
    # read CSV directly into a dataframe
    # df = CSV.read("sample_inforce.csv", DataFrame) # use local string for notebook
    df = CSV.read(sample_csv_data, DataFrame)

    # map over each row and construct an array of Policy objects
    map(eachrow(df)) do row
        Policy(
            row.id,
            row.sex == "M" ? Male : Female,
            row.benefit_base,
            row.COLA,
            row.mode,
            row.prem,
            row.pp,
            row.issue_date,
            row.issue_age,
            row.risk == "Std" ? Standard : Preferred,
        )
    end
end
1-element Vector{Policy}:
Policy(1, Male, 100000.0, 0.03, 1, 1000.0, 3, Date("1999-12-05"), 30, Standard)
Given a basic insurance product, a pure whole of life (WOL) policy with level benefits and level premiums payable within the first 10 years, the reserve at the end of the \(y^{th}\) policy year is defined by
\[ res(y) = \sum_{t=age+y}^{120} \left( sur_{t-age-y} * mort_t * B_y * \sqrt{1 + r} - P_y * sur_{t-age-y} \right) \]
where
- \(mort_t\) is the mortality at age \(t\)
- \(sur_x\) is the survival probability adjusted with COLA, with values of
- \(sur_0 = 1\),
- \(sur_x = sur_{x-1} * (1 - mort_{age+x}) / (1 + COLA)\) for \(x \geq 1\), and
- 0 for \(x < 0\) or \(age + x \geq 120\) (the ultimate age of the current mortality table)
- \(B_y\) is the level benefit throughout the policy
- \(P_y\) is the level premium within the first 10 policy years which is 0 for policy years after 10
- \(r\) is the level interest rate throughout the policy
function sur(y::Int, pol::Policy)
    if y == 0
        1
    elseif y < 0 || 120 - y <= pol.issue_age
        0
    else
        sur(y - 1, pol) * (1 - mort[pol.sex][pol.issue_age+y]) / (1 + pol.COLA)
    end
end
function res(y::Int, pol::Policy)
    s = 0.0
    if y >= 1 && y <= 120 - pol.issue_age
        for t in (pol.issue_age+y):120
            prem = 0.0
            if y <= pol.pp
                prem = pol.prem
            end
            s += sur(t - pol.issue_age - y, pol) * mort[pol.sex][t] * pol.benefit_base - prem * sur(t - pol.issue_age - y, pol)
        end
    end
    s
end
res (generic function with 1 method)
28.4 Common Sensitivity Analysis Methodologies
28.4.1 Finite Differences
Define a customized finite difference function with respect to the COLA, perturbed by a small difference.
function res_wrt_r_fd(y::Int, pol::Policy, r::Float64, h=1e-3)
    p₊, p₋ = deepcopy(pol), deepcopy(pol)
    p₊.COLA, p₋.COLA = r + h, r - h
    (res(y, p₋) - res(y, p₊)) / (2res(y, pol))
end
res_wrt_r_fd(2, policies[1], 0.03) # change in the reserve at year 2 when the rate is at 3%, with a perturbation of 0.1%
0.021366520936389077
28.4.2 Regression Analyses
using GlobalSensitivity
function r1_wrt_r(r)
    p = deepcopy(policies[1])
    p.COLA = r[2]
    p.prem = r[3]
    res(Int(floor(r[1])), p)
end

# reserve @ year 1/2, interest rate @ 0.03 ± 0.001, prem @ 1000.0 ± 0.1
reg_anal = gsa(r1_wrt_r, RegressionGSA(), [[1, 2], [0.029, 0.031], [999.9, 1000.1]], samples=1000)
@show reg_anal.pearson
reg_anal.pearson = [0.002266826033036182 -0.9999603237826435 -0.003255946696849575]
1×3 Matrix{Float64}:
0.00226683 -0.99996 -0.00325595
The Pearson (and, analogously, Spearman) coefficients show the correlation coefficient matrix between the inputs and the output.
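To see the rank-based (Spearman) view as well, here is a minimal sketch, assuming the StatsBase.jl package; the uniform sampling scheme below is an illustrative assumption and is not necessarily what gsa uses internally.
using StatsBase

# draw 1000 input vectors uniformly from the same ranges used above
n = 1000
X = hcat(1 .+ rand(n), 0.029 .+ 0.002 .* rand(n), 999.9 .+ 0.2 .* rand(n))
out = [r1_wrt_r(X[i, :]) for i in 1:n]

# Spearman rank correlation of each input against the reserve output
[corspearman(X[:, j], out) for j in 1:3]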
28.4.3 Sobol Indices
Sobol is a variance-based method: it decomposes the variance of the output of the model or system into fractions that can be attributed to inputs or sets of inputs. This gives not just the individual parameters’ sensitivities, but also a way to quantify the effect and sensitivity of the interactions between the parameters.
\[ Y = f_0 + \sum_{i=1}^{d}f_i(X_i) + \sum_{i<j}^{d}f_{ij}(X_i, X_j) + ... + f_{1,2,...,d}(X_1, X_2, ..., X_d) \]
\[ Var(Y) = \sum_{i=1}^{d}V_i + \sum_{i<j}^{d}V_{ij} + ... + V_{1,2,...,d} \]
The Sobol indices are “ordered”: the first-order indices, given by \(S_i = \dfrac{V_i}{Var(Y)}\), are the contribution to the output variance of the main effect of \(X_i\). Each therefore measures the effect of varying \(X_i\) alone, but averaged over variations in the other input parameters, and is standardized by the total variance to provide a fractional contribution. Higher-order interaction indices \(S_{ij}, S_{ijk}\), and so on can be formed by dividing other terms in the variance decomposition by \(Var(Y)\).
using QuasiMonteCarlo, GlobalSensitivity
# reserve @ year 1/2, interest rate @ 0.03 ± 0.001, prem @ 1000.0 ± 0.1
L, U = QuasiMonteCarlo.generate_design_matrices(1000, [1, 0.029, 999.9], [2, 0.031, 1000.1], SobolSample())
s = gsa(r1_wrt_r, Sobol(), L, U)
@show s.S1
@show s.ST
┌ Warning: The `generate_design_matrices(n, d, sampler, R = NoRand(), num_mats)` method does not produces true and independent QMC matrices, see [this doc warning](https://docs.sciml.ai/QuasiMonteCarlo/stable/design_matrix/) for more context.
│ Prefer using randomization methods such as `R = Shift()`, `R = MatousekScrambling()`, etc., see [documentation](https://docs.sciml.ai/QuasiMonteCarlo/stable/randomization/)
└ @ QuasiMonteCarlo ~/.julia/packages/QuasiMonteCarlo/KvLfb/src/RandomizedQuasiMonteCarlo/iterators.jl:255
s.S1 = [0.0, 1.2308143537488936, -0.00010730838653271882]
s.ST = [0.0, 1.0014202549551285, 5.83151055643934e-6]
3-element Vector{Float64}:
0.0
1.0014202549551285
5.83151055643934e-6
The output shows the first-order and total-order indices for the different input parameters.
28.4.4 Morris Method
The Morris method, also known as Morris’s OAT method, where OAT stands for One At a Time, can be described in the following steps:
\[ EE_i = \frac{f(x_1, x_2, ...x_i + \Delta, ...x_k) - y}{\Delta} \]
We compute local sensitivity measures known as “elementary effects”, obtained by measuring the perturbation in the output of the model when one parameter is changed.
These are evaluated at various points in the input space, chosen so that a wide “spread” of the parameter space is explored and considered in the analysis, to provide an approximate global importance measure. The mean and variance of these elementary effects are then computed. A high mean implies that a parameter is important; a high variance implies that its effects are non-linear or the result of interactions with other inputs. The method does not separate the contribution of interactions from the individual contribution of each parameter; instead, it gives an overall effect for each parameter that accounts for both its individual contribution and all of its interactions.
using GlobalSensitivity
# reserve @ year 1/2, interest rate @ 0.03 ± 0.001, prem @ 1000.0 ± 0.1
m = gsa(r1_wrt_r, Morris(), [[1, 2], [0.029, 0.031], [999.9, 1000.1]])
@show m.means
@show m.variances
m.means = [0.0 -722655.8489878364 -17.197307422327103]
m.variances = [0.0 1.457292039978048e8 0.009344140590861581]
1×3 Matrix{Float64}:
0.0 1.45729e8 0.00934414
From the means it can be observed which variables are more important, while higher variances imply a higher degree of nonlinearity or more interaction with other variables.
28.4.5 Fourier Amplitude Sensitivity Tests
FAST offers a robust (especially at low sample sizes) and computationally efficient procedure to obtain the first-order and total-order indices discussed for the Sobol method. It utilizes a monodimensional Fourier decomposition along a curve exploring the parameter space. The curve is defined by a set of parametric equations,
\[ x_i(s) = G_i(\sin(w_i s)), \quad \forall i = 1, 2, \ldots, N \]
where \(s\) is a scalar variable varying over the range \(-\infty < s < +\infty\), \(G_i\) are transformation functions, and \(w_i, \forall i = 1, 2, \ldots, N\) is a set of different (angular) frequencies, to be properly selected, associated with each factor for all \(N\) parameters.
using GlobalSensitivity
# reserve @ year 1/2, interest rate @ 0.03 ± 0.001, prem @ 1000.0 ± 0.1
fast = gsa(r1_wrt_r, eFAST(), [[1, 2], [0.029, 0.031], [999.9, 1000.1]], samples=1000)
@show fast.S1
@show fast.ST
fast.S1 = [5.41782112263861e-12 0.9976904881022093 5.793960601815569e-6]
fast.ST = [7.287395510369166e-7 0.9999937214978624 0.0023218885194085104]
1×3 Matrix{Float64}:
7.2874e-7 0.999994 0.00232189
The output shows the first-order and total-order indices for the different input parameters.
28.4.6 Automatic Differentiation
By repeatedly applying the chain rule to the elementary operations of a computation, automatic differentiation can be used to measure the impact of small changes in inputs. More details are in Chapter 16 on automatic differentiation.
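As a minimal sketch of this idea with the ForwardDiff.jl package: the chapter’s res reads the COLA from a Float64 field of Policy, so the hypothetical sur_generic and res_generic helpers below pass the COLA as an ordinary argument, letting dual numbers flow through the calculation.
using ForwardDiff

# Hypothetical COLA-generic variants of `sur` and `res` (not the chapter's versions):
# the COLA is threaded through as an argument so ForwardDiff's dual numbers can propagate.
function sur_generic(x::Int, pol::Policy, cola)
    if x == 0
        one(cola)
    elseif x < 0 || 120 - x <= pol.issue_age
        zero(cola)
    else
        sur_generic(x - 1, pol, cola) * (1 - mort[pol.sex][pol.issue_age+x]) / (1 + cola)
    end
end

function res_generic(y::Int, pol::Policy, cola)
    s = zero(cola)
    if 1 <= y <= 120 - pol.issue_age
        for t in (pol.issue_age+y):120
            prem = y <= pol.pp ? pol.prem : 0.0
            surv = sur_generic(t - pol.issue_age - y, pol, cola)
            s += surv * mort[pol.sex][t] * pol.benefit_base - prem * surv
        end
    end
    s
end

# derivative of the year-2 reserve with respect to the COLA at 3%
ForwardDiff.derivative(c -> res_generic(2, policies[1], c), 0.03)
The result is the sensitivity of the year-2 reserve to the COLA, analogous to the finite-difference figure earlier but obtained as the exact derivative of the coded model.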
28.4.7 Scenario Analyses
Scenarios can be generated following scenario generation methodologies to evaluate impacts. More details are in Chapter 26 on scenario generation.
When scenarios are generated to evaluate sensitivities, one may need to take the following into consideration.
- Reverse stress testing. Reverse stress testing in scenario analysis involves identifying extreme scenarios that could potentially lead to catastrophic outcomes for a financial institution or a system. Unlike traditional sensitivity testing, which simulates the impact of adverse events on the system, reverse stress testing starts with a catastrophic outcome and works backwards to determine the combination of events or circumstances that could lead to it.
One typically follows these steps to do reverse stress testing (a small sketch follows at the end of this section):
– Define a critical failure point (e.g., bankruptcy, system outage, regulatory breach).
– Analyze the combinations of events or variables that could cause the failure.
– Model the path from normal conditions to the adverse outcome.
Potential benefits that reverse stress testing could bring include:
– Focusing on Vulnerabilities: Highlights specific scenarios to avoid at all costs.
– Enhancing Resilience: Strengthens systems against extreme risks.
– Regulatory Compliance: Often required in highly regulated industries like banking and energy.
- Stylistic scenarios. Developing stylistic scenarios in scenario analysis involves creating narratives or storylines that describe plausible future states or situations. These scenarios are crafted to capture key uncertainties, trends, and factors that could significantly impact the organization, industry, or environment under study.
- Backtesting against historical data. Backtesting in scenario analysis involves an iterative process of using past data to validate the effectiveness and accuracy of scenarios developed for forecasting future outcomes. Scenarios are first defined and applied to selected historical data, and then refined once any discrepancies between scenario outcomes and historical results are identified.
One typically follows these steps to do backtesting against historical data:
– Define Scenarios: Establish hypothetical scenarios (e.g., market crashes, changes in interest rates, or operational disruptions), and ensure scenarios cover a range of possibilities, such as best-case, base-case, and worst-case scenarios.
– Collect Historical Data: Gather relevant historical data for key variables (e.g., stock prices, interest rates, production metrics), and ensure data spans periods where similar events occurred in the past.
– Model Scenario Impacts: Use historical data to simulate the impacts of the scenarios on key metrics or performance indicators.
– Compare Results: Compare the modeled results of the scenarios with the actual historical outcomes, and assess how well the scenarios predict or explain the observed data.
– Adjust and Refine: If the scenarios do not align with historical outcomes, refine the assumptions or parameters in the scenario models, and incorporate lessons learned from historical patterns to improve future scenario analyses.
Some considerations in incorporating historical data:
– Data Quality: Ensure historical data is accurate, complete, and relevant to the scenarios being tested.
– Model Limitations: Scenario models are based on assumptions that might not fully capture real-world complexities.
– Overfitting: Avoid fine-tuning scenarios to perfectly match historical outcomes, as this reduces their applicability to future events.
– Changing Dynamics: Historical events may not fully represent future possibilities due to changes in market conditions, regulations, or technology.
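As a small sketch of the reverse stress testing idea applied to the chapter’s sample policy, one can start from an adverse outcome and search backwards for the input combinations that produce it; the critical reserve level and the search grid below are illustrative assumptions, not values taken from the chapter.
# hypothetical critical outcome: the year-2 reserve exceeding this level
critical_reserve = 60_000.0

# work backwards: search combinations of COLA and premium that reach the outcome
findings = Tuple{Float64,Float64}[]
for cola in 0.0:0.005:0.05, prem in 500.0:100.0:1500.0
    p = deepcopy(policies[1])
    p.COLA = cola
    p.prem = prem
    if res(2, p) >= critical_reserve
        push!(findings, (cola, prem))
    end
end

findings # the (COLA, prem) combinations, if any, that lead to the critical outcome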