Measuring Changes in Breach Rates

security differently
Author

John Benninghoff

Published

June 5, 2024

Modified

June 6, 2024

A critical review of using breach frequency as a measure of security success, inspired by Incident Metrics in SRE: Critically Evaluating MTTR and Friends.

library(poilog) # rpoilog() for Poisson log-normal random draws
library(tibble)
library(dplyr)

Background

Incident Metrics in SRE: Critically Evaluating MTTR and Friends demonstrated the difficulty of detecting changes in MTTR over time. The paper tested a 10% reduction in MTTR using Monte Carlo simulation and found:

“Yikes! Even though in the simulation the improvement always worked, 38% of the simulations had the MTTR difference fall below zero for Company A, 40% for Company B, and 20% for Company C. Looking at the absolute change in MTTR, the probability of seeing at least a 15-minute improvement is only 49%, 50%, and 64%, respectively. Even though the product in the scenario worked and shortened incidents, the odds of detecting any improvement at all are well outside the tolerance of 10% random flukes.”

Safety Differently argues that we shouldn’t measure success based on the absence of negative events. We can use real data on data breaches and Monte Carlo simulation to explore the question, “Is an organization’s breach rate a useful metric?”

Poisson Log-Normal Model

The 2022 Information Risk Insights Study (IRIS) found that a Poisson log-normal distribution fits the data and provides realistic forecasts for years with multiple breaches. The study suggests values for the mean (\(\mu\)) and standard deviation (\(\sigma\)) in its Table 1, and reports the likelihood of a breach from models adjusted for firm size in Table 2, since larger firms are more likely to experience a breach.

Using the parameters provided in Table 1, we get:

# Simulate annual breach counts from a Poisson log-normal distribution and
# estimate the probability of at least 1, 2, or 3 breaches in a year
breach_poilog <- function(mu, sig, runs = 1e6) {
  breaches <- rpoilog(runs, mu, sig)
  tibble(
    "One or more" = sum(breaches >= 1) / runs,
    "Two or more" = sum(breaches >= 2) / runs,
    "Three or more" = sum(breaches >= 3) / runs
  )
}

breach_poilog(mu = -2.284585, sig = 0.8690759)
# A tibble: 1 × 3
  `One or more` `Two or more` `Three or more`
          <dbl>         <dbl>           <dbl>
1         0.129        0.0163         0.00256

These values are close to the upper bound for a firm in the $100M to $1B revenue range in Table 2.

Poisson Model

If we were to use the number of breaches a firm has experienced as a metric, the best-case scenario for detecting change would be a high rate of breaches with a significant reduction after implementing a new control. Unfortunately, the IRIS paper doesn’t provide the model parameters for the group with the highest likelihood of a breach, firms with more than $100B in revenue. Additionally, because it makes improvements easier to understand and calculate, we’d like to use a plain Poisson distribution, which has a single parameter, the average arrival rate \(\lambda\).
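For reference, a Poisson-distributed annual breach count \(X\) with rate \(\lambda\) has

\[
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad
P(X \ge 1) = 1 - e^{-\lambda}.
\]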

We can find a reasonable value for \(\lambda\) using trial and error:

# Simulate annual breach counts from a plain Poisson distribution and
# estimate the probability of at least 1, 2, or 3 breaches in a year
breach_pois <- function(lambda, runs = 1e6) {
  breaches <- rpois(runs, lambda)
  tibble(
    "One or more" = sum(breaches >= 1) / runs,
    "Two or more" = sum(breaches >= 2) / runs,
    "Three or more" = sum(breaches >= 3) / runs
  )
}

base_lambda <- 0.347
breach_pois(base_lambda)
# A tibble: 1 × 3
  `One or more` `Two or more` `Three or more`
          <dbl>         <dbl>           <dbl>
1         0.293        0.0481         0.00543

A Poisson model with a \(\lambda\) of 0.347 (a breach a little more often than once every three years) gives results similar to the Poisson log-normal model for the largest firms on the likelihood of one or more breaches, but underestimates the likelihood of 2+ and 3+ breaches.
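Since \(P(X \ge 1) = 1 - e^{-\lambda}\), the chosen rate can also be sanity-checked in closed form rather than purely by trial and error. This is a quick illustrative check; the helper names (`p_one_or_more`, `lambda_for`) are not part of the IRIS analysis:

p_one_or_more <- function(lambda) 1 - exp(-lambda) # chance of 1+ breaches in a year
lambda_for <- function(p) -log(1 - p)              # rate implied by that chance

p_one_or_more(0.347) # ~0.293, matching the simulated "One or more" column above
lambda_for(0.293)    # ~0.347, recovering base_lambda from that probability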

Simulate Improvement

Using the Poisson model, it is easy to simulate improvements.

Let’s assume that a firm implements a new security control that reduces the likelihood of a breach by 25%, and compares the number of breaches in the 5 years before and after implementing that control. We then simulate this scenario 100,000 times:

# Simulate `runs` before/after comparisons: total breaches over `years` years
# at the base rate vs. the same number of years at the reduced rate
simulation <- function(years, reduction, runs = 1e5) {
  tibble(
    base = replicate(runs, sum(rpois(years, base_lambda))),
    new = replicate(runs, sum(rpois(years, base_lambda * (1 - reduction))))
  ) |>
    mutate(
      better = .data$new < .data$base, worse = .data$new > .data$base,
      same = .data$new == .data$base
    )
}

results <- simulation(5, 0.25)
head(results, 10)
# A tibble: 10 × 5
    base   new better worse same 
   <int> <int> <lgl>  <lgl> <lgl>
 1     2     1 TRUE   FALSE FALSE
 2     2     3 FALSE  TRUE  FALSE
 3     2     2 FALSE  FALSE TRUE 
 4     1     2 FALSE  TRUE  FALSE
 5     2     2 FALSE  FALSE TRUE 
 6     1     0 TRUE   FALSE FALSE
 7     1     1 FALSE  FALSE TRUE 
 8     1     3 FALSE  TRUE  FALSE
 9     2     3 FALSE  TRUE  FALSE
10     2     0 TRUE   FALSE FALSE

OK, what are the results?

# Proportion of runs where the post-control period had fewer, more, or
# the same number of breaches as the pre-control period
summarize_results <- function(data) {
  data |>
    summarize(
      better = sum(.data$better) / n(), worse = sum(.data$worse) / n(),
      same = sum(.data$same) / n()
    )
}

summarize_results(results)
# A tibble: 1 × 3
  better worse  same
   <dbl> <dbl> <dbl>
1  0.477 0.288 0.235

For this scenario, even though the true rate of breaches has been reduced by 25%, the number of breaches in the 5 years after implementing the control is lower less than 50% of the time, and is actually higher nearly 30% of the time!
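Because independent Poisson counts add, the 5-year totals are themselves Poisson (with means \(5\lambda\) and \(5\lambda \times 0.75\)), so these proportions can also be computed exactly rather than simulated. The sketch below sums over the joint distribution of the two totals; the `exact_compare` helper is illustrative, and its results should land close to the simulated values above:

# Exact probabilities that the post-control total is lower, higher, or the same,
# assuming independent Poisson totals over `years` years before and after
exact_compare <- function(years, reduction, max_k = 50) {
  # max_k should comfortably exceed both expected totals
  k <- 0:max_k
  p_base <- dpois(k, years * base_lambda)                  # pre-control total
  p_new <- dpois(k, years * base_lambda * (1 - reduction)) # post-control total
  joint <- outer(p_new, p_base) # rows index the new count, columns the base count
  tibble(
    better = sum(joint[upper.tri(joint)]), # new < base
    worse = sum(joint[lower.tri(joint)]),  # new > base
    same = sum(diag(joint))                # new == base
  )
}

exact_compare(5, 0.25)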

Larger Sample

To increase the chance of detecting the improvement, we can adjust the scenario to sample more years before and after implementing the new control. How many years would we need to get a confident measurement of improvement?

simulation(10, 0.25) |>
  summarize_results()
# A tibble: 1 × 3
  better worse  same
   <dbl> <dbl> <dbl>
1  0.559 0.285 0.156
simulation(50, 0.25) |>
  summarize_results()
# A tibble: 1 × 3
  better worse   same
   <dbl> <dbl>  <dbl>
1  0.758 0.190 0.0528
simulation(250, 0.25) |>
  summarize_results()
# A tibble: 1 × 3
  better  worse    same
   <dbl>  <dbl>   <dbl>
1  0.957 0.0363 0.00659

Only the simulation that tests 250 years before and after (a total of 500 years!) gives a reasonably reliable indication of improvement, and keep in mind that “better” means a reduction of any amount. Running metrics for a period this long at a single firm is impractical, which leads to our conclusion…

Conclusion

Even in a “best case” scenario of a very large firm that measures the number of breaches for 5 years before and 5 years after a change (a 10-year period in total), a 25% reduction in the breach rate results in fewer actual breaches less than 50% of the time. A sample period of 500 years or more (250 before and 250 after) is needed to have confidence that the breach rate has been reduced.
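We can make “confidence” more concrete by asking how often a standard significance test would detect the change. This is a rough sketch rather than a full power analysis; the `power_check` helper is illustrative, and it uses the exact two-sample Poisson rate comparison from `stats::poisson.test()`:

# Fraction of simulated before/after comparisons where an exact two-sample
# Poisson rate test reports a significant difference at the given alpha
power_check <- function(years, reduction, runs = 1e4, alpha = 0.05) {
  p_values <- replicate(runs, {
    before <- sum(rpois(years, base_lambda))
    after <- sum(rpois(years, base_lambda * (1 - reduction)))
    # with no events in either period there is nothing to test; count as no detection
    if (before + after == 0) 1 else {
      poisson.test(c(before, after), T = c(years, years))$p.value
    }
  })
  mean(p_values < alpha)
}

power_check(250, 0.25)

Since a significance test demands more evidence than a reduction of any amount, this detection rate will generally be lower than the “better” proportions above.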

Realistically, this means that the only way to test the effectiveness of security controls is by sampling across large numbers of firms. A single organization simply won’t have enough data to draw reasonable conclusions based on events that happen so infrequently.
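As a final sketch, pooling across firms buys the same statistical power as pooling across years, because independent Poisson counts add. The `simulate_firms` helper below is illustrative and makes the (strong) simplifying assumptions that the firms share the same base rate and that the control is deployed at all of them:

# Pool breach counts across comparable firms: n_firms firms observed for `years`
# years contribute n_firms * years firm-years before and after the change
simulate_firms <- function(n_firms, years, reduction, runs = 1e5) {
  firm_years <- n_firms * years
  tibble(
    # the sum of independent Poisson counts is Poisson, so one draw per run suffices
    base = rpois(runs, firm_years * base_lambda),
    new = rpois(runs, firm_years * base_lambda * (1 - reduction))
  ) |>
    mutate(
      better = .data$new < .data$base, worse = .data$new > .data$base,
      same = .data$new == .data$base
    )
}

# 50 comparable firms over 5 years before and after ~ the 250-year single-firm case
simulate_firms(50, 5, 0.25) |>
  summarize_results()

With enough comparable firms, the before-and-after comparison becomes feasible on a practical timescale.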