Security teams are flooded with vulnerability data, yet most of it leads nowhere. Thousands of CVEs are labeled critical every year, but only a small fraction are ever exploited. Traditional scoring systems like CVSS measure theoretical severity, while EPSS predicts short-term risk. Neither answers the question that matters most in day-to-day prioritization: Has this vulnerability likely been exploited already?
To help close that gap, NIST introduced the Likely Exploited Vulnerabilities (LEV) metric, a probabilistic model that estimates the historical likelihood of exploitation using past EPSS data.
In this blog, we explain how LEV works, what it adds to existing frameworks, and why combining it with continuous validation techniques like breach and attack simulation and automated pentesting is key to translating probability into action.
The Likely Exploited Vulnerabilities (LEV) metric is a proposed probabilistic score developed by NIST to estimate the likelihood that a vulnerability has already been exploited in the wild. Unlike the Exploit Prediction Scoring System (EPSS), which forecasts future exploitation within a 30-day window, LEV quantifies past exploitation probability using historical EPSS scores.
In plain terms: the longer a vulnerability keeps receiving non-trivial EPSS scores, the more likely it is to have already been exploited.
TL;DR
Too Many CVEs, Too Little Time: In 2024 alone, over 41,000 CVEs were published, with more than 60% rated high or critical. Patching them all is unrealistic [1].
5% or Fewer Are Exploited: Studies show that only a small fraction of vulnerabilities, around 5%, are ever exploited in the wild [2].
LEV Adds Evidence-Based Focus: LEV helps security teams zero in on vulnerabilities that show signs of past exploitation, reducing noise and improving remediation efficiency.
Vulnerability prioritization remains one of the biggest challenges for cybersecurity teams. Each year, tens of thousands of new CVEs are disclosed, with the majority labeled as high or critical severity. As a result, teams are often overwhelmed by large patch backlogs.
Yet in practice, only a small fraction of these vulnerabilities are ever exploited. This disconnect between volume and actual risk highlights a major limitation in current prioritization approaches.
Traditional methods focus on theoretical severity scores or short-term exploit predictions. But they often overlook a critical factor: Has this vulnerability already been exploited in the wild?
To help answer that, NIST introduced the LEV metric in May 2025 through Cybersecurity White Paper CSWP 41.
LEV estimates the probability that a vulnerability has been exploited in the past by analyzing historical EPSS trends. This approach offers a new perspective. Instead of relying solely on potential impact, LEV prioritizes vulnerabilities based on patterns of real-world attacker behavior.
Before diving into the mechanics of NIST’s Likely Exploited Vulnerabilities (LEV) metric, it's important to understand the strengths and limitations of existing vulnerability scoring and prioritization methods such as CVSS, EPSS, and KEV.
| Metric | Purpose | Limitations |
| --- | --- | --- |
| CVSS | Quantifies the severity of vulnerabilities based on technical impact, exploitability, and scope | Does not account for threat context or real-world exploitation; may over- or under-prioritize vulnerabilities |
| CISA’s KEV | Confirms past exploitation via public sources | Not comprehensive (covers ~0.5% of CVEs) |
| EPSS | Predicts future exploitation within 30 days | May under-score previously exploited CVEs when imminent exploitation is not predicted |
| NIST’s LEV | Estimates probability of past exploitation | Dependent on EPSS quality; probabilistic, not definitive |
CVSS assigns a static score between 0.0 and 10.0 to represent the technical severity of a vulnerability. It evaluates factors such as attack vector, exploit complexity, and impact on confidentiality, integrity, and availability.
However, CVSS is inherently theoretical. It models the maximum potential impact of a vulnerability, not its real-world behavior. As a result, CVSS can overemphasize vulnerabilities that are unlikely to be exploited and underrepresent low-score CVEs that attackers frequently target.
The EPSS takes a more predictive approach. It uses machine learning and threat intelligence to estimate the probability that a vulnerability will be exploited in the next 30 days.
This short-term outlook makes EPSS incredibly useful for prioritization, but it has a blind spot. By design, EPSS excludes past exploitation as a feature to maintain its forward-looking accuracy. That means even if a CVE has been exploited in the wild many times before, EPSS might assign it a low score if the data doesn’t suggest imminent exploitation.
The Known Exploited Vulnerabilities (KEV) list, maintained by organizations like CISA, provides binary confirmation that a vulnerability has been exploited. While highly actionable, KEV lists are manually curated, reactive, and non-exhaustive.
Thousands of exploited vulnerabilities never make it into KEV catalogs due to limited visibility, lack of confirmation, or reporting lag.
The LEV formula works by estimating the probability that a vulnerability has been exploited at least once over a given time period. It does this by first calculating the probability that the vulnerability was not exploited in each individual 30-day (or partial) window, based on the EPSS score and the length of that window.
These non-exploitation probabilities are then multiplied together under the assumption that each window is independent.
In other words, the multiplication gives us the overall probability of never being exploited, across all windows.
P(no exploitation ever) = P_1 × P_2 × ... × P_n

LEV = 1 − P(no exploitation ever)

Subtracting the product from 1 flips the probability of never being exploited into the probability that the vulnerability was exploited at least once.
This final result, the LEV score, represents the cumulative likelihood of exploitation across all windows and reflects how risk compounds over time.
The LEV metric estimates the probability that a vulnerability has been exploited in the wild by aggregating EPSS scores over time. The formula is:
Figure 1. NIST’s LEV Equation
The table below explains each variable.
| Variable | Description |
| --- | --- |
| v | a vulnerability (e.g., a CVE) |
| d | a date without a time component (e.g., 2025-04-10) |
| d_0 | the first date on which an EPSS score is available for the associated v |
| d_n | the date on which the calculation should be performed (usually the present day) |
| epss(v, d) | the EPSS score for vulnerability v on date d |
| dates(d_0, d_n, w) | the set of dates beginning at d_0 and not exceeding d_n, formed by adding non-negative multiples of w to d_0 |
| datediff(d_i, d_j) | the number of days between d_i and d_j, inclusive |
| winsize(d_i, d_n, w) | w if datediff(d_i, d_n) ≥ w; otherwise datediff(d_i, d_n) |
| weight(d_i, d_n, w) | winsize(d_i, d_n, w) / w |
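Putting these definitions together, the equation in Figure 1 can be written out as follows. This is a reconstruction from the variable definitions above; the figure in CSWP 41 remains the authoritative form:

```latex
\mathrm{LEV}(v, d_0, d_n) \;\geq\; 1 \;-\; \prod_{d_i \,\in\, \mathrm{dates}(d_0,\, d_n,\, 30)} \Big( 1 - \mathrm{epss}(v, d_i) \cdot \mathrm{weight}(d_i, d_n, 30) \Big)
```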
Here is the step-by-step LEV calculation for CVE-2025-3102, based on the formal NIST LEV formula and the real EPSS score of 84.4% as of May 28, 2025.
The vulnerability CVE-2025-3102 was published on April 10, 2025.
For our analysis, we evaluate its likelihood of exploitation using the LEV formula as of May 28, 2025 (the date of this writing). This gives us a total observation period of 49 days between the publication and evaluation dates.
Publish Date = April 10, 2025
Evaluation Date = May 28, 2025
Total Duration = datediff(April 10, May 28) = 49 days (inclusive of both endpoints)
The LEV metric evaluates risk over rolling 30-day windows. Given a 49-day observation period between April 10, 2025 and May 28, 2025, we identify the start of each 30-day window as follows:
dates(d_0, d_n, 30) = {April 10, May 10}
So, we have two windows.
For each date d_i, calculate the following.
Window 1 (April 10 – May 9) is a full 30-day window:
winsize(April 10, May 28, 30) = 30, so weight = 30/30 = 1.0
Window 2 (May 10 – May 28) is a partial window:
datediff(May 10, May 28) = 19 days, so weight = 19/30 ≈ 0.633
On May 28, 2025, we checked the latest available EPSS score for CVE-2025-3102. The score was reported as 0.844, or 84.4%, indicating a high probability of exploitation within the next 30 days.
Figure 2. EPSS Scores for the Past 30 Days
epss(v, d_i) = 0.844
We apply this same EPSS score across both windows.
Window 1 (April 10–May 9): full window, weight = 1.0
P(not exploited in window 1) = 1 − 0.844 × 1.0 = 0.156
Window 2 (May 10–May 28): partial window, weight = 19/30 ≈ 0.633
P(not exploited in window 2) = 1 − 0.844 × 0.633 ≈ 0.466
After calculating the individual probabilities that CVE-2025-3102 was not exploited in each window, we combine them to find the overall probability that the vulnerability remained unexploited throughout the entire 49-day observation period.
This is done by multiplying the two values:
P(not exploited in either window) = 0.156 × 0.466 ≈ 0.0726
This result tells us there's roughly a 7.26% chance that the vulnerability was never exploited during this period.
To determine the likelihood that at least one exploitation occurred, we take the complement of the non-exploitation probability. In other words, we subtract the result from 1:
LEV = 1 − 0.0726 = 0.9274, or 92.74%
This means that, as of May 28, 2025, the LEV for CVE-2025-3102 is approximately 92.74%. Given its high EPSS score and ongoing exposure, this indicates a strong likelihood that the vulnerability has already been exploited in the wild.
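As a rough illustration, the calculation above can be expressed as a minimal Python sketch. It assumes the same EPSS score of 0.844 at both window starts, as in this example; the per-date score dictionary and function names are illustrative, not part of NIST’s specification.

```python
from datetime import date, timedelta

W = 30  # an EPSS score expresses exploitation probability over a 30-day window

def datediff(d_i: date, d_j: date) -> int:
    """Number of days between d_i and d_j, inclusive, per the definition above."""
    return (d_j - d_i).days + 1

def lev(epss_scores: dict, d0: date, dn: date, w: int = W) -> float:
    """LEV: one minus the product of per-window non-exploitation probabilities."""
    p_never_exploited = 1.0
    d_i = d0
    while d_i <= dn:
        winsize = min(datediff(d_i, dn), w)  # full window: w days; partial: fewer
        weight = winsize / w                 # scales the 30-day EPSS score down
        p_never_exploited *= 1.0 - epss_scores[d_i] * weight
        d_i += timedelta(days=w)             # jump to the next window start
    return 1.0 - p_never_exploited

# CVE-2025-3102: EPSS score of 0.844 applied at both window starts
scores = {date(2025, 4, 10): 0.844, date(2025, 5, 10): 0.844}
print(round(lev(scores, date(2025, 4, 10), date(2025, 5, 28)), 4))  # 0.9274
```

The printed value matches the 92.74% derived by hand above.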
There are two versions of the LEV model: LEV and LEV2. Both are built on the same logic but differ in time granularity.
LEV divides the timeline from the first available EPSS score (d_0) to the current date (d_n) into 30-day windows.
For each window, it takes the EPSS score from the first day, multiplies it by a weight (to account for partial windows), and then computes the product of the non-exploitation probabilities for each window. The final LEV score is the complement of this product.
LEV2 applies the same model but uses daily EPSS scores, dividing each by 30 to simulate daily probability since EPSS scores represent 30-day likelihoods. While this offers higher resolution, especially useful for fast-evolving vulnerabilities, LEV2 is computationally expensive.
NIST reports that calculating LEV2 for the entire CVE dataset may take several hours on commodity hardware.
Figure 3. Equation for LEV2 Metric from NIST
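Based on the description above (daily EPSS scores, each divided by 30 to approximate a per-day probability), LEV2 takes roughly the following form. This is a sketch inferred from that description, not a transcription of Figure 3:

```latex
\mathrm{LEV2}(v, d_0, d_n) \;\geq\; 1 \;-\; \prod_{d_i \,\in\, \mathrm{dates}(d_0,\, d_n,\, 1)} \left( 1 - \frac{\mathrm{epss}(v, d_i)}{30} \right)
```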
NIST proposes that LEV can be most effectively used in combination with EPSS and KEV in a composite scoring approach:
Figure 4. Composite Exploitation Probability by NIST
NIST recommends using a composite scoring approach to better prioritize vulnerabilities by combining three key indicators of exploitation risk. The composite score is calculated as the maximum of EPSS, KEV, and LEV for a given CVE, ensuring that the strongest available signal is used.
EPSS reflects the probability of future exploitation within 30 days, KEV indicates confirmed past exploitation, and LEV estimates the historical likelihood that a vulnerability has already been exploited based on EPSS trends over time.
By taking the maximum of these values, this method helps avoid blind spots that could result from relying on a single, potentially incomplete data source. It enables vulnerability management teams to confidently focus on threats that are either proven, imminent, or statistically likely to have been exploited.
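As a minimal sketch of this composite logic (the function and parameter names are illustrative, and treating KEV membership as probability 1.0 is an assumption consistent with the description above):

```python
def composite_probability(epss: float, on_kev: bool, lev: float) -> float:
    """Composite exploitation probability: the strongest available signal wins.

    epss   -- predicted probability of exploitation in the next 30 days
    on_kev -- whether the CVE appears in a KEV catalog (confirmed past
              exploitation, treated here as probability 1.0)
    lev    -- estimated probability of past exploitation
    """
    kev = 1.0 if on_kev else 0.0
    return max(epss, kev, lev)

# Example: CVE-2025-3102 with EPSS 0.844, not on KEV, LEV 0.9274
print(composite_probability(0.844, False, 0.9274))  # 0.9274
```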
However, it is important to note that LEV is a probabilistic estimate, not forensic evidence. It does not confirm exploitation; it only models likelihood based on one input: the history of EPSS scores.
NIST’s LEV metric is a probabilistic model built entirely on historical EPSS scores. As a result, it inherits all the data limitations, biases, and blind spots present in the EPSS system. These may include underrepresentation of certain sectors, geographic regions, or less-monitored attack surfaces. If a threat is not visible to EPSS data providers, LEV cannot account for it, regardless of actual exploitation activity.
Another methodological caveat lies in LEV’s assumption that exploitation across time windows is statistically independent. In reality, attackers often behave persistently, repeatedly probing the same targets, chaining vulnerabilities, or timing campaigns to coincide with patch cycles. Assuming clean statistical separation between time periods risks underestimating how threats evolve and persist.
Like any numeric risk model, LEV can be misused if treated as a definitive answer. A high LEV score does not mean a vulnerability will be exploited in every environment, and a low score does not guarantee safety. NIST emphasizes that LEV is meant to augment, not replace, other data sources and security validation processes.
Perhaps the most critical limitation is that LEV offers no proof that a vulnerability is exploitable in your organization’s unique environment. Even if a CVE has a high LEV score, layered defenses, network segmentation, and compensating controls may already mitigate it. Conversely, a low-scoring CVE might pose significant risk in a poorly secured or misconfigured system. Static scores cannot see your real attack paths.
To bridge the gap between theoretical probability and actual risk, organizations need continuous exposure validation. This means testing whether vulnerabilities can be exploited in practice, considering the real-world behavior of attackers and the current state of your security controls.
Tools like Breach and Attack Simulation (BAS) and automated penetration testing enable ongoing validation, tracking how defensive measures respond across time, configuration changes, and product updates. Without this continuous feedback loop, LEV remains just that: a probability, not a guarantee.
Industry reactions to LEV have been mixed. Some see it as a valuable supplement to traditional vulnerability scoring, especially for organizations lacking real-time threat intelligence. Others question whether LEV’s mathematical elegance might obscure the limitations of its input data.
Critics also point out that LEV cannot assess intent or target selection logic. A CVE may be widely exploitable but see no real-world use due to environmental irrelevance or lack of attacker interest. These nuances are beyond the reach of purely statistical models.
LEV introduces a novel and much-needed approach to estimating historical exploitation risk by analyzing EPSS score behavior over time. It is mathematically transparent, operationally feasible, and provides a valuable new perspective in the prioritization process.
But it’s important to remember: LEV is still a global score.
It reflects statistical trends across the internet, not the reality of your specific network, configurations, or compensating controls.
A high LEV score might mean nothing if your defenses already neutralize the threat. Conversely, a low score might lull you into complacency if an attacker finds a unique way to exploit that same CVE in your environment. That’s why LEV must be paired with continuous validation. Security teams need to simulate attacks, test controls, and verify outcomes in their own environment, not rely solely on external probability models.
Breach and Attack Simulation (BAS) and Automated Penetration Testing make this possible by providing ongoing, context-aware feedback on what can actually be exploited today, not just what might have been exploited somewhere else.
Whether LEV becomes an industry standard remains to be seen. But its arrival underscores a broader shift: as vulnerability volumes increase and patch fatigue deepens, organizations need smarter ways to focus. The real value of LEV isn’t just in the number, it’s in how it sharpens the questions we ask about risk.