Security teams are flooded with vulnerability data, yet most of it leads nowhere. Thousands of CVEs are labeled critical every year, but only a small fraction are ever exploited. Traditional scoring systems like CVSS measure theoretical severity, while EPSS predicts short-term risk. Neither answers the question that matters most in day-to-day prioritization: Has this vulnerability likely been exploited already?
To help close that gap, NIST introduced the Likely Exploited Vulnerabilities (LEV) metric, a probabilistic model that estimates the historical likelihood of exploitation using past EPSS data.
In this blog, we explain how LEV works, what it adds to existing frameworks, and why combining it with continuous validation techniques like breach and attack simulation and automated pentesting is key to translating probability into action.
The Likely Exploited Vulnerabilities (LEV) metric is a proposed probabilistic score developed by NIST to estimate the likelihood that a vulnerability has already been exploited in the wild. Unlike the Exploit Prediction Scoring System (EPSS), which forecasts future exploitation within a 30-day window, LEV quantifies past exploitation probability using historical EPSS scores.
In plain terms: the longer a vulnerability keeps receiving non-trivial EPSS scores, the more likely it is to have already been exploited.
TL;DR
Too Many CVEs, Too Little Time: In 2024 alone, over 41,000 CVEs were published, with more than 60% rated high or critical. Patching them all is unrealistic [1].
5% or Fewer Are Exploited: Studies show that only a small fraction of vulnerabilities, around 5%, are ever exploited in the wild [2].
LEV Adds Evidence-Based Focus: LEV helps security teams zero in on vulnerabilities that show signs of past exploitation, reducing noise and improving remediation efficiency.
Vulnerability prioritization remains one of the biggest challenges for cybersecurity teams. Each year, tens of thousands of new CVEs are disclosed, with the majority labeled as high or critical severity. As a result, teams are often overwhelmed by large patch backlogs.
Yet in practice, only a small fraction of these vulnerabilities are ever exploited. This disconnect between volume and actual risk highlights a major limitation in current prioritization approaches.
Traditional methods focus on theoretical severity scores or short-term exploit predictions. But they often overlook a critical factor: Has this vulnerability already been exploited in the wild?
To help answer that, NIST introduced the LEV metric in May 2025 through Cybersecurity White Paper CSWP 41.
LEV estimates the probability that a vulnerability has been exploited in the past by analyzing historical EPSS trends. This approach offers a new perspective. Instead of relying solely on potential impact, LEV prioritizes vulnerabilities based on patterns of real-world attacker behavior.
Before diving into the mechanics of NIST’s Likely Exploited Vulnerabilities (LEV) metric, it's important to understand the strengths and limitations of existing vulnerability scoring and prioritization methods such as CVSS, EPSS, and KEV.
| Metric | Purpose | Limitations |
| --- | --- | --- |
| CVSS | Quantifies the severity of vulnerabilities based on technical impact, exploitability, and scope | Does not account for threat context or real-world exploitation; may over- or under-prioritize vulnerabilities |
| CISA’s KEV | Confirms past exploitation via public sources | Not comprehensive (covers ~0.5% of CVEs) |
| EPSS | Predicts future exploitation within 30 days | May under-score previously exploited CVEs when imminent exploitation is not predicted |
| NIST’s LEV | Estimates probability of past exploitation | Dependent on EPSS quality; probabilistic, not definitive |
CVSS assigns a static score between 0.0 and 10.0 to represent the technical severity of a vulnerability. It evaluates factors such as attack vector, exploit complexity, and impact on confidentiality, integrity, and availability.
However, CVSS is inherently theoretical. It models the maximum potential impact of a vulnerability, not its real-world behavior. As a result, CVSS can overemphasize vulnerabilities that are unlikely to be exploited and underrepresent low-score CVEs that attackers frequently target.
The EPSS takes a more predictive approach. It uses machine learning and threat intelligence to estimate the probability that a vulnerability will be exploited in the next 30 days.
This short-term outlook makes EPSS incredibly useful for prioritization, but it has a blind spot. By design, EPSS excludes past exploitation as a feature to maintain its forward-looking accuracy. That means even if a CVE has been exploited in the wild many times before, EPSS might assign it a low score if the data doesn’t suggest imminent exploitation.
The Known Exploited Vulnerabilities (KEV) list, maintained by organizations like CISA, provides binary confirmation that a vulnerability has been exploited. While highly actionable, KEV lists are manually curated, reactive, and non-exhaustive.
Thousands of exploited vulnerabilities never make it into KEV catalogs due to limited visibility, lack of confirmation, or reporting lag.
The LEV formula works by estimating the probability that a vulnerability has been exploited at least once over a given time period. It does this by first calculating the probability that the vulnerability was not exploited in each individual 30-day (or partial) window, based on the EPSS score and the length of that window.
These non-exploitation probabilities are then multiplied together under the assumption that each window is independent.
In other words, the multiplication gives us the overall probability of never being exploited, across all windows.
P(no exploitation ever) = P_1 × P_2 × ... × P_n

LEV = 1 − P(no exploitation ever)

Subtracting the product from 1 flips the probability of never being exploited into the probability that the vulnerability was exploited at least once.
This final result, the LEV score, represents the cumulative likelihood of exploitation across all windows and reflects how risk compounds over time.
The LEV metric estimates the probability that a vulnerability has been exploited in the wild by aggregating EPSS scores over time. The formula is:
Figure 1. NIST’s LEV Equation
The table below explains each variable.
| Variable | Description |
| --- | --- |
| v | a vulnerability (e.g., a CVE) |
| d | a date without a time component (e.g., 2025-04-10) |
| d_0 | the first date on which an EPSS score is available for the associated v |
| d_n | the date on which the calculation should be performed (usually the present day) |
| epss(v, d) | the EPSS score for vulnerability v on date d |
| dates(d_0, d_n, w) | the set of dates beginning at d_0 and not exceeding d_n, formed by adding non-negative multiples of w to d_0 |
| datediff(d_i, d_j) | the number of days between d_i and d_j, inclusive |
| winsize(d_i, d_n, w) | w if datediff(d_i, d_n) ≥ w; otherwise datediff(d_i, d_n) |
| weight(d_i, d_n, w) | winsize(d_i, d_n, w) / w |
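Putting these definitions together, the equation in Figure 1 can be written out as follows. This is a reconstruction from the variable definitions above; the figure in CSWP 41 remains the authoritative form:

```latex
\mathrm{LEV}(v, d_0, d_n) \;\geq\; 1 \;-\; \prod_{d_i \,\in\, \mathrm{dates}(d_0,\, d_n,\, 30)} \Big( 1 - \mathrm{epss}(v, d_i) \cdot \mathrm{weight}(d_i, d_n, 30) \Big)
```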
Here is the step-by-step LEV calculation for CVE-2025-3102, based on the formal NIST LEV formula and the real EPSS score of 84.4% as of May 28, 2025.
The vulnerability CVE-2025-3102 was published on April 10, 2025.
For our analysis, we evaluate its likelihood of exploitation using the LEV formula as of May 28, 2025 (the date of this writing). This gives us a total observation period of 49 days between the publication and evaluation dates.
Publish Date = April 10, 2025
Evaluation Date = May 28, 2025
Total Duration = datediff(April 10, May 28) = 49 days (inclusive of both endpoints)
The LEV metric evaluates risk over rolling 30-day windows. Given a 49-day observation period between April 10, 2025 and May 28, 2025, we identify the start of each 30-day window as follows:
dates(d_0, d_n, 30) = {April 10, May 10}
So, we have two windows.
For each date d_i, calculate the following.
Window 1 (April 10 – May 9) is a full 30-day window:
winsize(April 10, May 28, 30) = 30, so weight = 30/30 = 1.0
Window 2 (May 10 – May 28) is a partial window:
datediff(May 10, May 28) = 19 days, so weight = 19/30 ≈ 0.633
On May 28, 2025, we checked the latest available EPSS score for CVE-2025-3102. The score was reported as 0.844, or 84.4%, indicating a high probability of exploitation within the next 30 days.
Figure 2. EPSS Scores for the Past 30 Days
epss(v, d_i) = 0.844
We apply this same EPSS score across both windows.
Window 1 (April 10–May 9): full window, weight = 1.0
P(not exploited in window 1) = 1 − 0.844 × 1.0 = 0.156
Window 2 (May 10–May 28): partial window, weight = 19/30 ≈ 0.633
P(not exploited in window 2) = 1 − 0.844 × 0.633 ≈ 0.466
After calculating the individual probabilities that CVE-2025-3102 was not exploited in each window, we combine them to find the overall probability that the vulnerability remained unexploited throughout the entire 49-day observation period.
This is done by multiplying the two values:
P(not exploited in either window) = 0.156 × 0.466 ≈ 0.0726
This result tells us there's roughly a 7.26% chance that the vulnerability was never exploited during this period.
To determine the likelihood that at least one exploitation occurred, we take the complement of the non-exploitation probability. In other words, we subtract the result from 1:
LEV = 1 − 0.0726 = 0.9274, or 92.74%
This means that, as of May 28, 2025, the LEV for CVE-2025-3102 is approximately 92.74%. Given its high EPSS score and ongoing exposure, this indicates a strong likelihood that the vulnerability has already been exploited in the wild.
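As a rough illustration, the calculation above can be expressed as a minimal Python sketch. It assumes the same EPSS score of 0.844 at both window starts, as in this example; the per-date score dictionary and function names are illustrative, not part of NIST’s specification.

```python
from datetime import date, timedelta

W = 30  # an EPSS score expresses exploitation probability over a 30-day window

def datediff(d_i: date, d_j: date) -> int:
    """Number of days between d_i and d_j, inclusive, per the definition above."""
    return (d_j - d_i).days + 1

def lev(epss_scores: dict, d0: date, dn: date, w: int = W) -> float:
    """LEV: one minus the product of per-window non-exploitation probabilities."""
    p_never_exploited = 1.0
    d_i = d0
    while d_i <= dn:
        winsize = min(datediff(d_i, dn), w)  # full window: w days; partial: fewer
        weight = winsize / w                 # scales the 30-day EPSS score down
        p_never_exploited *= 1.0 - epss_scores[d_i] * weight
        d_i += timedelta(days=w)             # jump to the next window start
    return 1.0 - p_never_exploited

# CVE-2025-3102: EPSS score of 0.844 applied at both window starts
scores = {date(2025, 4, 10): 0.844, date(2025, 5, 10): 0.844}
print(round(lev(scores, date(2025, 4, 10), date(2025, 5, 28)), 4))  # 0.9274
```

The printed value matches the 92.74% derived by hand above.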
There are two versions of the LEV model: LEV and LEV2. Both are built on the same logic but differ in time granularity.
LEV divides the timeline from the first available EPSS score (d_0) to the current date (d_n) into 30-day windows.
For each window, it takes the EPSS score from the first day, multiplies it by a weight (to account for partial windows), and then computes the product of the non-exploitation probabilities for each window. The final LEV score is the complement of this product.
LEV2 applies the same model but uses daily EPSS scores, dividing each by 30 to simulate daily probability since EPSS scores represent 30-day likelihoods. While this offers higher resolution, especially useful for fast-evolving vulnerabilities, LEV2 is computationally expensive.
NIST reports that calculating LEV2 for the entire CVE dataset may take several hours on commodity hardware.
Figure 3. Equation for LEV2 Metric from NIST
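Based on the description above (daily EPSS scores, each divided by 30 to approximate a per-day probability), LEV2 takes roughly the following form. This is a sketch inferred from that description, not a transcription of Figure 3:

```latex
\mathrm{LEV2}(v, d_0, d_n) \;\geq\; 1 \;-\; \prod_{d_i \,\in\, \mathrm{dates}(d_0,\, d_n,\, 1)} \left( 1 - \frac{\mathrm{epss}(v, d_i)}{30} \right)
```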
NIST proposes that LEV can be most effectively used in combination with EPSS and KEV in a composite scoring approach:
Figure 4. Composite Exploitation Probability by NIST
NIST recommends using a composite scoring approach to better prioritize vulnerabilities by combining three key indicators of exploitation risk. The composite score is calculated as the maximum of EPSS, KEV, and LEV for a given CVE, ensuring that the strongest available signal is used.
EPSS reflects the probability of future exploitation within 30 days, KEV indicates confirmed past exploitation, and LEV estimates the historical likelihood that a vulnerability has already been exploited based on EPSS trends over time.
By taking the maximum of these values, this method helps avoid blind spots that could result from relying on a single, potentially incomplete data source. It enables vulnerability management teams to confidently focus on threats that are either proven, imminent, or statistically likely to have been exploited.
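As a minimal sketch of this composite logic (the function and parameter names are illustrative, and treating KEV membership as probability 1.0 is an assumption consistent with the description above):

```python
def composite_probability(epss: float, on_kev: bool, lev: float) -> float:
    """Composite exploitation probability: the strongest available signal wins.

    epss   -- predicted probability of exploitation in the next 30 days
    on_kev -- whether the CVE appears in a KEV catalog (confirmed past
              exploitation, treated here as probability 1.0)
    lev    -- estimated probability of past exploitation
    """
    kev = 1.0 if on_kev else 0.0
    return max(epss, kev, lev)

# Example: CVE-2025-3102 with EPSS 0.844, not on KEV, LEV 0.9274
print(composite_probability(0.844, False, 0.9274))  # 0.9274
```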
However, it is important to note that LEV is a probabilistic estimate, not forensic evidence. It does not confirm exploitation; it only models likelihood based on one input: the history of EPSS scores.
NIST’s LEV metric is a probabilistic model built entirely on historical EPSS scores. As a result, it inherits all the data limitations, biases, and blind spots present in the EPSS system. These may include underrepresentation of certain sectors, geographic regions, or less-monitored attack surfaces. If a threat is not visible to EPSS data providers, LEV cannot account for it, regardless of actual exploitation activity.
Another methodological caveat lies in LEV’s assumption that exploitation across time windows is statistically independent. In reality, attackers often behave persistently, repeatedly probing the same targets, chaining vulnerabilities, or timing campaigns to coincide with patch cycles. Assuming clean statistical separation between time periods risks underestimating how threats evolve and persist.
Like any numeric risk model, LEV can be misused if treated as a definitive answer. A high LEV score does not mean a vulnerability will be exploited in every environment, and a low score does not guarantee safety. NIST emphasizes that LEV is meant to augment, not replace, other data sources and security validation processes.
Perhaps the most critical limitation is that LEV offers no proof that a vulnerability is exploitable in your organization’s unique environment. Even if a CVE has a high LEV score, layered defenses, network segmentation, and compensating controls may already mitigate it. Conversely, a low-scoring CVE might pose significant risk in a poorly secured or misconfigured system. Static scores cannot see your real attack paths.
To bridge the gap between theoretical probability and actual risk, organizations need continuous exposure validation. This means testing whether vulnerabilities can be exploited in practice, considering the real-world behavior of attackers and the current state of your security controls.
Tools like Breach and Attack Simulation (BAS) and automated penetration testing enable ongoing validation, tracking how defensive measures respond across time, configuration changes, and product updates. Without this continuous feedback loop, LEV remains just that: a probability, not a guarantee.
Industry reactions to LEV have been mixed. Some see it as a valuable supplement to traditional vulnerability scoring, especially for organizations lacking real-time threat intelligence. Others question whether LEV’s mathematical elegance might obscure the limitations of its input data.
Critics also point out that LEV cannot assess intent or target selection logic. A CVE may be widely exploitable but see no real-world use due to environmental irrelevance or lack of attacker interest. These nuances are beyond the reach of purely statistical models.
LEV introduces a novel and much-needed approach to estimating historical exploitation risk by analyzing EPSS score behavior over time. It is mathematically transparent, operationally feasible, and provides a valuable new perspective in the prioritization process.
But it’s important to remember: LEV is still a global score.
It reflects statistical trends across the internet, not the reality of your specific network, configurations, or compensating controls.
A high LEV score might mean nothing if your defenses already neutralize the threat. Conversely, a low score might lull you into complacency if an attacker finds a unique way to exploit that same CVE in your environment. That’s why LEV must be paired with continuous validation. Security teams need to simulate attacks, test controls, and verify outcomes in their own environment, not rely solely on external probability models.
Breach and Attack Simulation (BAS) and Automated Penetration Testing make this possible by providing ongoing, context-aware feedback on what can actually be exploited today, not just what might have been exploited somewhere else.
Whether LEV becomes an industry standard remains to be seen. But its arrival underscores a broader shift: as vulnerability volumes increase and patch fatigue deepens, organizations need smarter ways to focus. The real value of LEV isn’t just in the number, it’s in how it sharpens the questions we ask about risk.