What Is a YARA Rule?

LAST UPDATED: December 27, 2023

By Picus Labs • October 23, 2023 • 12 min read

YARA rules are a crucial tool for cybersecurity experts in detecting and analyzing malware. To harness their power, one must understand their core components and how they operate.

In this article, we delve into the technical details of YARA rules. We break down their five key elements (imports, metadata, strings, conditions, and the rule name), explain how YARA rules work, examine a real-world rule step by step, and list best practices for writing effective rules. The article is intended for defensive security professionals who want to deepen their understanding and application of YARA.

What Is a YARA Rule?

YARA, whose name is humorously expanded as "Yet Another Ridiculous Acronym" (or, alternatively, "Yet Another Recursive Acronym"), is a framework for large-scale pattern matching, with rules as its cornerstone. YARA rules classify and identify malware samples by describing malware families in terms of textual or binary patterns.

Beyond merely detecting malware, they're adept at pinpointing a diverse array of digital artifacts, from commonplace files to intricate hex sequences within binary datasets.

Each YARA rule is structured around sets of strings, regular expressions, and condition logic. These components are meticulously designed to detect specific byte sequences or behaviors in the analyzed samples. Due to its precision and adaptability, the YARA framework is a favorite among cybersecurity professionals, malware researchers, and threat analysts, aiding in both digital forensic investigations and real-time threat detection.

Five YARA Elements to Know Before Writing a YARA Rule

Understanding the main components of a YARA rule is crucial for anyone looking to leverage its capabilities. 

Below are the five primary elements of a YARA rule:

  • Import
  • Metadata
  • Strings
  • Condition
  • Rule Name

Below, you will see a detailed description for each element with a corresponding example.

Import 

In YARA, the import section allows a rule to utilize external modules that augment its capabilities. These modules provide additional functions or methods, enabling the rule to access and evaluate data types or structures that are not natively supported by YARA's standard rule language.

For instance, the PE module lets YARA rules analyze the internal structures of Portable Executable files, which are commonly used in Windows environments. By importing this module, one can craft rules that search for specific characteristics within a PE file's headers or sections.

The syntax for importing a module is straightforward:

import "pe"

Once imported, you can access the functions and data types specific to that module within your rule's conditions.

Metadata of a YARA Rule

A YARA rule can have a metadata section, which pertains to the additional information provided by the author of the rule, such as:

  • Author
  • Source
  • Description
  • Date

Below is an example YARA rule from the joint Cybersecurity Advisory (CSA) on AvosLocker ransomware published by CISA [1]. The rule detects NetMonitor.exe, and its meta section shows several author-defined fields.

rule NetMonitor
{
  meta:

    author = "FBI"
    source = "FBI"
    sharing = "TLP:CLEAR"
    status = "RELEASED"
    description = "Yara rule to detect NetMonitor.exe"
    category = "MALWARE"
    creation_date = "2023-05-05"
  strings:
     $rc4key = {11 4b 8c dd 65 74 22 c3}
     $op0 = {c6 [3] 00 00 05 c6 [3] 00 00 07 83 [3] 00 00 05 0f 85 [4] 83 [3] 00 00 01 75 ?? 8b [2] 4c 8d [2] 4c 8d [3] 00 00 48 8d [3] 00 00 48 8d [3] 00 00 48 89 [3] 48 89 ?? e8}
  condition:
    uint16(0) == 0x5A4D
    and filesize < 50000
    and any of them

}
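
To make the condition of the rule above more concrete, the following Python sketch mirrors what it checks. In YARA, uint16(0) reads the first two bytes of the file as a little-endian 16-bit integer, and 0x5A4D corresponds to the bytes "MZ" that begin a Windows executable. This is an illustration only, not how YARA is implemented: the function name is hypothetical, and "any of them" is simplified to the fixed $rc4key pattern because $op0 contains wildcards.

```python
import struct

# Fixed byte pattern taken from the $rc4key string in the rule above.
RC4_KEY = bytes.fromhex("11 4b 8c dd 65 74 22 c3")


def matches_netmonitor_sketch(data: bytes) -> bool:
    """Approximate the rule's condition for an in-memory file (illustrative only)."""
    if len(data) < 2:
        return False
    # uint16(0): first two bytes read as little-endian, so b"MZ" becomes 0x5A4D.
    magic = struct.unpack_from("<H", data, 0)[0]
    if magic != 0x5A4D:
        return False
    # filesize < 50000
    if len(data) >= 50000:
        return False
    # "any of them", simplified to the single pattern without wildcards.
    return RC4_KEY in data


# Synthetic, harmless test data: an MZ header followed by the key bytes.
sample = b"MZ" + b"\x00" * 64 + RC4_KEY
print(matches_netmonitor_sketch(sample))
```

Placing the magic-header and size checks before the pattern search reflects the order of the original condition: cheap checks come first, so most files are rejected early.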

Strings

When analyzing the behavior and attributes of a malware sample, one crucial step involves the extraction and examination of embedded strings. These strings can provide insights into the malware's capabilities, such as whether it has the potential to query or modify the system's registry.

Figure 1. Strings Listed from an Arbitrary Malware Example

The figure above displays a set of strings extracted from a malware example, named "ABC" for our discussion. These strings, such as "RegCreateKeyA" or "CreatePipe", are indicative of the potential actions the malware might perform. For instance:

  • Strings related to "Reg" often pertain to registry operations.
  • "CreatePipe" might be an indicator of inter-process communication.
  • Calls to "KERNEL32.dll" suggest interactions with core Windows functions.

The Extraction Process:

When embarking on the extraction process, one of the handy tools available is strings.exe from the Sysinternals Suite, which is renowned for its efficacy in extracting human-readable strings from binary files.

To use it, navigate to the directory containing strings.exe:

C:\Users\Desktop\Tools\SysinternalSuites>

Then, execute the tool on your target malware sample, in this case, "ABC":

C:\Users\Desktop\Tools\SysinternalSuites>strings.exe "C:\Users\Desktop\Analysis\Examples\ABC"

Upon executing this command, the tool sifts through the binary data of the file, hunting for sequences that resemble human-readable strings. The outcome can be an extensive list, but as demonstrated in Figure 1, not all strings are of equal relevance. Some, like the ones related to registry operations, stand out as particularly interesting for further analysis.

By understanding the significance of these strings and recognizing patterns among them, analysts can make informed decisions about the potential behavior and intent of the malware sample in question.
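
As a rough illustration of what a string-extraction tool does, the Python sketch below pulls runs of printable ASCII characters out of binary data. It is not the implementation of strings.exe (which also handles Unicode strings); the function name and the minimum length of 4 are assumptions chosen for the example.

```python
import re


def extract_ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII characters found in data."""
    # \x20-\x7e covers the printable ASCII range, from space through tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Synthetic blob standing in for a binary; not taken from a real sample.
blob = b"\x00\x01RegCreateKeyA\x00\xffab\x00CreatePipe\x00KERNEL32.dll\x02"
for text in extract_ascii_strings(blob):
    print(text)
```

Short runs such as "ab" are discarded by the length threshold, which is how such tools cut down on noise from random byte sequences that happen to be printable.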

Strings in a YARA Rule

YARA rules classify files by looking for patterns or distinctive features that indicate whether a file is harmless or malicious.

Strings, sequences of characters embedded in a program, often play a crucial role in this determination. They can be thought of as the textual DNA of a program: they hint at the software's intent and functionality.

These strings can range from security codes embedded in compressed files to cryptocurrency addresses found in ransomware. In YARA, strings can be defined in three formats: text strings, hexadecimal byte sequences, and regular expressions.

Here's a basic overview of how the strings section is structured in a YARA rule:

rule ExampleRule {
    strings:
        $a = /RegCreateKeyA|CreatePipe|KERNEL32\.dll/
       
    condition:
        $a
}

In this example:

  • $a is a regular expression that searches for any of the terms "RegCreateKeyA", "CreatePipe", or "KERNEL32.dll".

The rule will be considered a match if the string pattern $a (as defined above) is found within the analyzed file or data. In simpler terms, if you run a file through YARA with this rule, YARA will flag the file as a match to "ExampleRule" if it contains any of the strings "RegCreateKeyA", "CreatePipe", or "KERNEL32.dll".
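
The matching behavior described above can be approximated in Python with the same alternation. This is only a sketch of the logic (the function name is hypothetical, and YARA uses its own regular expression engine), but it shows two details worth noting: the escaped dot matches a literal ".", and the match is case-sensitive unless a modifier such as nocase is added in YARA.

```python
import re

# Same alternation as $a in ExampleRule; "\." matches a literal dot.
PATTERN_A = re.compile(rb"RegCreateKeyA|CreatePipe|KERNEL32\.dll")


def example_rule_matches(data: bytes) -> bool:
    """Return True if any of the three alternatives occurs anywhere in data."""
    return PATTERN_A.search(data) is not None


print(example_rule_matches(b"\x00\x00CreatePipe\x00"))  # one alternative present
print(example_rule_matches(b"kernel32.dll"))            # differs only in case
```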

The importance of the strings section cannot be overstated. The precision and accuracy of these strings determine the rule's efficacy. Well-defined strings can drastically reduce false positives and false negatives, making the rule a potent tool in malware detection and threat hunting.

Conditions

This section provides the logic that needs to be satisfied for the rule to match.

CISA's advisory AA23-129A, titled "Hunting Russian Intelligence 'Snake' Malware," contains a relatively basic YARA rule that we can examine to understand the condition element. Because the condition calls math.entropy, the rule relies on YARA's math module, which must be imported with import "math".

import "math"

rule HighEntropy
{
    meta:
        description = "entropy rule"

    condition:
        math.entropy(0, filesize) >= 7.0
}

Let’s break down the condition part.

math.entropy(0, filesize)

This calculates the entropy of the entire file. Entropy is a measure of randomness or unpredictability in a dataset. In the context of files or data streams, a high entropy value often indicates compressed, encrypted, or packed data because such data appears random.

>= 7.0 

This condition checks whether the calculated entropy is greater than or equal to 7.0. Because the entropy is measured in bits per byte, values range from 0 to 8. A value close to 8 indicates high randomness (e.g., encrypted data), whereas a value close to 0 indicates low randomness (e.g., large blocks of repeated or null bytes).
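
For readers who want to see where the 0-to-8 range comes from, here is a minimal Python sketch of Shannon entropy over bytes, H = -Σ p(b) · log2 p(b), where p(b) is the fraction of bytes with value b. It is assumed here to correspond to what the math module computes; the function name is hypothetical and the code is not YARA's implementation.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum -p * log2(p) over every byte value that actually occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


low = shannon_entropy(b"\x00" * 1024)     # a single repeated byte value
high = shannon_entropy(bytes(range(256)))  # all 256 byte values equally often
print(low, high, high >= 7.0)
```

A block of identical bytes gives 0 bits per byte, while data in which all 256 byte values occur equally often gives the maximum of 8, which is why a threshold of 7.0 singles out data that looks nearly random.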

Rule Name

The rule name is the identifier that follows the rule keyword; YARA reports this name when the rule matches.

In the example we discussed under the "Conditions" subsection, we examined a rule that checks the entropy of a file to determine whether it is likely encrypted, compressed, or packed. In this instance, the rule's name is "HighEntropy."

import "math"

rule HighEntropy
{
    meta:
        description = "entropy rule"

    condition:
        math.entropy(0, filesize) >= 7.0
}

The Structure of a YARA Rule

YARA rules give researchers a way to identify patterns within files, which makes YARA a powerful tool for malware detection. YARA's own rule-writing language is intuitive, yet writing good rules demands a deep understanding of the patterns being targeted.

Every YARA command takes two essential arguments:

  • the rule file, and
  • the target (file, directory, or process ID).

For instance, suppose we aim to apply the rule "threatDetect.yar" on a directory named "suspiciousFiles". The appropriate command would be:

yara threatDetect.yar suspiciousFiles

Each rule must have a unique name and a defining condition.

For instance, in the rule below named "malwarePattern", the condition checks if a certain string pattern, "maliciousCode", is present:

rule malwarePattern {
    strings:
        $codePattern = "maliciousCode"

    condition:
        $codePattern
}

Here, "malwarePattern" is the unique name, and the presence of the string "maliciousCode" serves as its defining condition.

What Is an Example of a YARA Rule?

In this instance, we will explore a YARA rule aimed at identifying malicious actions associated with the MOVEit Transfer Zero Day Vulnerability. 

This rule was shared by CISA in its CL0P Ransomware Gang Cybersecurity Advisory (alert AA23-158A), which also contains other YARA rules [2].

rule CISA_10450442_01 : LEMURLOOT webshell communicates_with_c2 remote_access
{
  meta:
      Author = "CISA Code & Media Analysis"
      Incident = "10450442"
      Date = "2023-06-07"
      Last_Modified = "20230609_1200"
      Actor = "n/a"
      Family = "LEMURLOOT"
      Capabilities = "communicates-with-c2"
      Malware_Type = "webshell"
      Tool_Type = "remote-access"
      Description = "Detects ASPX webshell samples"
      SHA256_1 = "3a977446ed70b02864ef8cfa3135d8b134c93ef868a4cc0aa5d3c2a74545725b"
  strings:
      $s1 = { 4d 4f 56 45 69 74 2e 44 4d 5a }
      $s2 = { 25 40 20 50 61 67 65 20 4c 61 6e 67 75 61 67 65 3d }
      $s3 = { 4d 79 53 51 4c }
      $s4 = { 41 7a 75 72 65 }
      $s5 = { 58 2d 73 69 4c 6f 63 6b 2d }
  condition:
      all of them
}

The provided YARA rule is designed to detect the presence of the LEMURLOOT webshell, which was used by the TA505 group in the described attack [2]. 

Let's break down the rule:

meta:

  • Author: The creator or maintainer of the YARA rule, in this case, "CISA Code & Media Analysis".

  • Incident: Associated with incident number "10450442".

  • Date & Last_Modified: Show when the rule was created and the last time it was modified.

  • Actor: Currently not specified (n/a), could be updated if the specific actor is identified.

  • Family: Malware family this rule targets, which is "LEMURLOOT".

  • Capabilities: What the malware or tool is capable of. In this case, it's communicating with its command & control (C2) server.

  • Malware_Type: The type of malware being detected; here it's a "webshell".

  • Tool_Type: It's a "remote-access" tool, indicating it can be used to remotely access compromised systems.

  • Description: Provides a brief about the rule. This one detects "ASPX webshell samples".

  • SHA256_1: A SHA-256 hash of a known sample of the malware or tool. This could be used for further verification.

strings: 

The strings section contains byte sequences, text, or regular expressions to match against files:

  • $s1: Looks for the byte pattern corresponding to "MOVEit.DMZ", likely part of the code or a library that the webshell interacts with.

The string "4d 4f 56 45 69 74 2e 44 4d 5a" is a sequence of hexadecimal values. Each pair of characters represents a byte, and each byte can be converted to its ASCII representation to decode the string.

Here's the decoding of our string:

Hexadecimal    ASCII
4d             M
4f             O
56             V
45             E
69             i
74             t
2e             .
44             D
4d             M
5a             Z

So, combining all the ASCII characters together gives you the string "MOVEit.DMZ".

  • $s2: Looks for the byte pattern corresponding to "%@ Page Language=", a typical directive in ASPX files that sets the page language, indicating an ASPX page.
  • $s3: Searches for the byte pattern corresponding to "MySQL", hinting that the webshell interacts with MySQL databases.
  • $s4: Searches for the byte pattern corresponding to "Azure", suggesting that the webshell may have functionality related to Azure.
  • $s5: Looks for the byte pattern corresponding to "X-siLock-", which could be related to a specific HTTP header or parameter the webshell uses for authentication or command execution.
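
The decoding of the five hexadecimal strings in the rule above can be reproduced with a few lines of Python. This is a small sketch for checking the values by hand; the dictionary and function names are chosen for illustration.

```python
# Hexadecimal strings copied from the strings section of the rule above.
HEX_STRINGS = {
    "$s1": "4d 4f 56 45 69 74 2e 44 4d 5a",
    "$s2": "25 40 20 50 61 67 65 20 4c 61 6e 67 75 61 67 65 3d",
    "$s3": "4d 79 53 51 4c",
    "$s4": "41 7a 75 72 65",
    "$s5": "58 2d 73 69 4c 6f 63 6b 2d",
}


def decode_hex_string(hex_text: str) -> str:
    """Convert space-separated hex byte pairs to their ASCII text."""
    # bytes.fromhex skips the spaces between byte pairs.
    return bytes.fromhex(hex_text).decode("ascii")


for name, hex_text in HEX_STRINGS.items():
    print(name, repr(decode_hex_string(hex_text)))
```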

condition:

Specifies under which circumstances the rule should generate an alert:

  • all of them: This means that for the rule to generate a match, all of the above strings ($s1 to $s5) must be present in the analyzed file.

In summary, this YARA rule will trigger if it detects a file (likely an ASPX file given the indicators) that contains all the mentioned strings, suggesting it's an instance of the LEMURLOOT webshell.
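
The difference between "all of them" and the "any of them" condition seen earlier can be sketched in Python as follows. The function names are hypothetical, the patterns are the ASCII equivalents of $s1 to $s5, and the test data is synthetic text rather than a real sample.

```python
# ASCII equivalents of the hexadecimal strings $s1 to $s5.
PATTERNS = [b"MOVEit.DMZ", b"%@ Page Language=", b"MySQL", b"Azure", b"X-siLock-"]


def all_of_them(data: bytes, patterns=PATTERNS) -> bool:
    """True only if every pattern occurs somewhere in data."""
    return all(p in data for p in patterns)


def any_of_them(data: bytes, patterns=PATTERNS) -> bool:
    """True if at least one pattern occurs somewhere in data."""
    return any(p in data for p in patterns)


# Synthetic text that mentions only two of the five patterns.
partial = b"notes about MySQL and Azure"
print(all_of_them(partial), any_of_them(partial))
```

Requiring all five strings makes the rule stricter than any single indicator: words such as "MySQL" or "Azure" appear in many benign files, but the full combination is far less common.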

9 Best Practices for Writing a YARA Rule

In this section, we list nine best practices for writing a YARA rule.

1. Aiming for Great Signatures:

  • Not just matching specific samples, but similar ones by the same author or group.

  • E.g., Instead of just catching a typo like “Micorsoft Corportation,” aim to detect patterns associated with the malware author [3].

2. Rule Complexity:

  • Combine very specific strings with common strings for broader but still precise detection.

3. Using Magic Headers & File Size:

  • Make the magic header the first condition for performance.

  • Restrict file sizes to avoid false positives.

    • E.g., filesize > 200KB and filesize < 600KB [4].

4. Location-Specific Detection:

  • If malware code is in an unexpected location, use it as a signature [4].

  • E.g., Detecting strings relative to the end of the file like $s1 in (filesize-500..filesize)

5. String Specificity:

  • Use strings less likely to be changed by attackers in future versions.

  • E.g., Function combinations in webshells that are less likely to change than variable names.

6. Performance Optimization:

  • Follow guidelines to write YARA rules that use fewer CPU cycles.

  • Avoid CPU-intensive checks, and place cheap checks first so that condition evaluation can stop early.

7. Caution with New Modules:

  • Avoid using newer, untested modules in production environments.

  • E.g., Being cautious about using the PE module until it's tested thoroughly.

8. Use Tools for Rule Generation and Analysis:

  • yarGen for rule creation with specific parameters.

  • yarAnalyzer to check rule coverage and detect overlapping rules.

9. String Extraction and Colorization:

  • Use tools or scripts to review and colorize strings in a sample.

  • E.g., Using a shell one-liner for string extraction and prisma for colorization [4].

Applying these best practices together will help ensure a more comprehensive and effective approach to YARA rule writing.
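
Practices 3 and 4 above (magic header first, a file-size window, and a location-specific string) can be sketched together in Python. This only illustrates the logic of a condition such as uint16(0) == 0x5A4D and filesize > 200KB and filesize < 600KB and $s1 in (filesize-500..filesize); the function name and the marker bytes are hypothetical, and KB is taken as 1024 bytes, as in YARA.

```python
KB = 1024  # YARA's KB suffix multiplies the number by 1024


def tail_signature_sketch(data: bytes, marker: bytes) -> bool:
    """Check magic header, file-size window, and a marker near the end of the file."""
    size = len(data)
    # Cheapest check first: b"MZ" is what uint16(0) == 0x5A4D tests for.
    if data[:2] != b"MZ":
        return False
    # filesize > 200KB and filesize < 600KB
    if not (200 * KB < size < 600 * KB):
        return False
    # $s1 in (filesize-500..filesize): the match must start in the last 500 bytes.
    return data.find(marker, size - 500) != -1


# Synthetic data: MZ header, 300 KB of padding, then a made-up marker near the end.
marker = b"EVIL_MARKER"
sample = b"MZ" + b"\x00" * (300 * KB) + marker + b"\x00" * 100
print(tail_signature_sketch(sample, marker))
```

Ordering the checks from cheapest to most expensive means most files are rejected after reading two bytes or comparing a size, before any search over the file contents takes place.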

Frequently Asked Questions (FAQs)

Here are the most frequently asked questions about YARA rules.

What Does YARA Stand for?

YARA is humorously said to stand for "Yet Another Recursive Acronym" or "Yet Another Ridiculous Acronym." Initially designed by Victor Manuel Alvarez to help malware researchers identify and classify malware samples, YARA has evolved into a versatile tool employed by cybersecurity experts worldwide. With its unique syntax, YARA provides an efficient method for pattern matching, making it a crucial tool in threat detection and digital forensics.

What Is YARA in Cybersecurity?

In cybersecurity, YARA is a pivotal tool for creating and applying patterns to detect malware or other malicious files in a digital environment. It helps analysts create custom rules for unique threats, improving the efficiency and accuracy of malware detection. Through YARA, analysts can pinpoint specific malware families, variants, or behaviors, bolstering defense mechanisms and expediting response times.

What Are YARA Rules in Ransomware?

In the ransomware context, YARA rules are patterns designed to detect and classify ransomware threats. Given the evolving nature of ransomware, YARA rules provide a means to identify even the newest strains by looking for specific behaviors, signatures, or strings. For instance, a rule might target unique ransom notes or specific encryption methods employed by a ransomware family. Using these rules, organizations can swiftly detect and mitigate ransomware threats.

References

[1] “#StopRansomware: AvosLocker Ransomware (Update),” Cybersecurity and Infrastructure Security Agency (CISA). Available: https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-284a. [Accessed: Oct. 23, 2023]

[2] “#StopRansomware: CL0P Ransomware Gang Exploits CVE-2023-34362 MOVEit Vulnerability,” Cybersecurity and Infrastructure Security Agency (CISA). Available: https://www.cisa.gov/news-events/cybersecurity-advisories/aa23-158a. [Accessed: Oct. 23, 2023]

[3] F. Roth, “How to Write Simple but Sound Yara Rules,” Nextron Systems, Feb. 16, 2015. Available: https://www.nextron-systems.com/2015/02/16/write-simple-sound-yara-rules/. [Accessed: Oct. 23, 2023]

[4] F. Roth, “How to Write Simple but Sound Yara Rules - Part 2,” Nextron Systems, Oct. 17, 2015. Available: https://www.nextron-systems.com/2015/10/17/how-to-write-simple-but-sound-yara-rules-part-2/. [Accessed: Oct. 23, 2023]
