VirusTotal Intelligence: A goldmine of Closed Source Intelligence (CSINT)

The purpose of this post is to document hunting procedures leveraging VirusTotal Intelligence [1]. At a high level, this post is aligned to the MITRE ATT&CK technique T1597 - Search Closed Sources documented in [2].

The procedures discussed may offer alternative routes for acquiring intelligence that can be leveraged during Red Team Assessments (RTA) and that go deeper than their documented Open Source Intelligence counterparts.

The post is divided into the following sections:

Pivoting on VirusTotal Intelligence
Brief Review of Referenced Literature
Risk Mitigation
Search Modifiers Summary Table
References

Pivoting on VirusTotal Intelligence

VirusTotal has the potential to become a useful tool in the arsenal of a Red Team seeking to identify sensitive information about a target network. Public literature [3], [13], [14] indicates that sensitive information does find its way to VirusTotal, where it is accessible primarily to threat intelligence analysts. This post aims to raise awareness among defensive teams through high-level examples of searches that can be made on VirusTotal.

A quick and easy way to search on VirusTotal, and one that does not require knowledge of YARA, is VTGrep [6]. VirusTotal publishes useful posts from time to time, one of which is the 'VT Intelligence Cheat Sheet' [5] that shows how to use the available search modifiers. A curated list of official articles that discuss the supported functionality is available at [9].

What makes VirusTotal the goldmine it is is the combination of the resources it hosts (files, domains, URLs) and the connections it offers between these resources. Starting from a single resource, a user can easily pivot to many others.

The sections that follow provide generic examples of searches and the results these may return. All the searches can be performed using either the web interface or the VT command line interface (CLI) available at [12].

Email Addresses

Email addresses and their structure constitute an important piece of information that attackers are after. This is because an email address may be used for phishing activities at a later stage in the attack lifecycle. Additionally, the username part of an email address is often the piece of information used to log in to internal systems.

Assuming the domain of the target organization we are after is nxfictionaldomainnx.com, we can use VirusTotal to search for email addresses of users. The VirusTotal search modifier that can be used for this purpose is content. The following search returns files that contain email addresses at that domain:

content:"@nxfictionaldomainnx.com"

In my experience, a search like the above may return matches within:

password dumps
emails (.eml or .msg)
HTML pages
portable executable (PE) files
other types of files

Of particular interest for further pivoting are the password dumps as well as the email files. It is very common for users to use their work email address to sign up for services they use privately (such as forums). If they additionally sign up re-using the password of their work account, then a leak of that password through a third-party compromise could allow an attacker to access the work account as well. There are many marketplaces on the internet that trade compromised accounts.
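Once files matching the content search have been retrieved for review, the addresses at the target domain can be pulled out of their text with a few lines of Python. This is a minimal sketch only: the function name and the sample text are illustrative, and the pattern for the local part is deliberately simplified rather than RFC 5322 compliant.

```python
import re


def extract_addresses(text: str, domain: str) -> list[str]:
    """Return unique, lower-cased addresses at `domain` found in `text`."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@" + re.escape(domain)
        # Reject look-alike hosts such as "<domain>.evil.example" while
        # still accepting an address that ends a sentence with a period.
        + r"(?![A-Za-z0-9-])(?!\.[A-Za-z0-9])",
        re.IGNORECASE,
    )
    return sorted({match.group(0).lower() for match in pattern.finditer(text)})


# Made-up sample resembling a mix of email headers and a credential dump.
sample = (
    "From: Jane.Doe@nxfictionaldomainnx.com\n"
    "jdoe@nxfictionaldomainnx.com:Summer2020!\n"
    "Contact jdoe@nxfictionaldomainnx.com.\n"
    "noise@nxfictionaldomainnx.com.evil.example\n"
)
print(extract_addresses(sample, "nxfictionaldomainnx.com"))
```

The de-duplicated list also exposes the naming convention in use (for example first.last versus an initial plus surname).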

One of the very first steps of reconnaissance is the acquisition of bounced emails to investigate whether information from internal systems is returned within the email headers. This step may not be required if email files exist on VirusTotal, which also avoids communicating with the target infrastructure directly. By analyzing an email file it is possible to identify which systems have forwarded the email (hops) and the structure of the internal domain name (for example nxfictionaldomainnx.local).
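The hop analysis described above can be sketched with Python's standard email module. The function name and the sample message are hypothetical, and the regular expression only covers the common "from <host> ... by <host>" shape of Received headers, so real messages may need a more tolerant parser.

```python
import re
from email import message_from_string

HOST_RE = re.compile(r"\b(?:from|by)\s+([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def relay_hosts(raw_email: str) -> list[str]:
    """List host names found in Received headers, oldest hop first."""
    message = message_from_string(raw_email)
    hosts = []
    # Each relay prepends its own header, so reverse for chronological order.
    for header in reversed(message.get_all("Received", [])):
        hosts.extend(host.lower() for host in HOST_RE.findall(header))
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(hosts))


# Made-up message with two hops; addresses come from documentation ranges.
raw_email = (
    "Received: from edge01.nxfictionaldomainnx.com (203.0.113.10)\n"
    " by mx.example.net with ESMTPS; Mon, 1 Jan 2024 10:00:02 +0000\n"
    "Received: from EXCH02.nxfictionaldomainnx.local (10.1.2.3)\n"
    " by edge01.nxfictionaldomainnx.com with ESMTP; Mon, 1 Jan 2024 10:00:01 +0000\n"
    "From: jane.doe@nxfictionaldomainnx.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
print(relay_hosts(raw_email))
```

Hosts ending in a non-public suffix such as .local are the ones that hint at the internal naming scheme.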

The information acquired in this step can drive further searches on VirusTotal or on other search engines to identify additional hosts, software in use or even credentials posted publicly. It is quite common for developers to post log output from internal systems in public portals. Pivoting on an internal domain name, the author identified a PowerShell console log that disclosed the password of a domain user account consistent with an administrator.

Email files obviously contain email addresses and as such can also be used to acquire valid email addresses of employees. Furthermore, the body of an email (the actual content) may reveal sensitive information such as the operating system of the mail server (e.g. included within meeting invites - look for MIME content type 'text/calendar'), sensitive attachments or even plaintext credentials. The content of an email can also provide information about processes within the targeted organization.
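As an illustration of the meeting-invite hint, the sketch below walks the MIME parts of a message, keeps the text/calendar ones and returns their PRODID lines, the iCalendar property that names the product which generated the invite. The function name and the sample message (including the PRODID value) are made up for this example.

```python
from email import message_from_string


def calendar_product_ids(raw_email: str) -> list[str]:
    """Return PRODID values from every text/calendar part of a message."""
    message = message_from_string(raw_email)
    product_ids = []
    for part in message.walk():
        if part.get_content_type() != "text/calendar":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        for line in payload.decode(charset, errors="replace").splitlines():
            if line.upper().startswith("PRODID:"):
                product_ids.append(line.split(":", 1)[1].strip())
    return product_ids


# Made-up multipart message carrying a meeting invite.
raw_invite = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "See invite.\n"
    "--XX\n"
    "Content-Type: text/calendar; charset=utf-8; method=REQUEST\n"
    "\n"
    "BEGIN:VCALENDAR\n"
    "PRODID:Microsoft Exchange Server 2010\n"
    "VERSION:2.0\n"
    "END:VCALENDAR\n"
    "--XX--\n"
)
print(calendar_product_ids(raw_invite))
```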

Certificates

When looking for portable executable (PE) files, results may include files that are signed with a code signing certificate. VirusTotal parses the certificate and calculates its fingerprint/thumbprint, which makes it easy to identify other samples that may have been signed with the same certificate. The search modifier for this search is:

signature:"<certificate_thumbprint>"

The signature modifier is not limited to the thumbprint; it can also match information such as the signer's name. If the company 'NxFictionalDomainNx LLC' produces software that it uses internally and signs it with a certificate, the following search could be used to identify which other samples may have been signed by this company (not necessarily with the same code signing certificate):

signature:"NxFictionalDomainNx LLC"
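If a certificate is available locally (for example exported from a signed binary), its thumbprint can be computed before searching. The sketch below assumes that the thumbprint VirusTotal displays is the conventional SHA-1 digest of the DER-encoded certificate; that assumption, and both function names, are not taken from the referenced documentation.

```python
import hashlib
import ssl


def certificate_thumbprint(pem_certificate: str) -> str:
    """SHA-1 digest (hex) of the DER encoding of a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_certificate)
    return hashlib.sha1(der).hexdigest()


def signature_query(value: str) -> str:
    """Wrap a thumbprint or signer name in the signature modifier."""
    return f'signature:"{value}"'


# Usage (the file name is hypothetical):
#   with open("signer.pem") as handle:
#       print(signature_query(certificate_thumbprint(handle.read())))
print(signature_query("NxFictionalDomainNx LLC"))
```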

File Metadata

Sensitive Documents

Documents such as Microsoft Excel, Word, PowerPoint, etc. may include the company name - or other information - in the metadata section. Using the company name and the relevant search modifiers it may be possible to identify sensitive company files on VirusTotal. An example search could be the following:

metadata:"<company name>" (type:xls or type:xlsx or type:doc or type:docx)

In the era of cloud environments a relatively large number of companies use Microsoft cloud products. When a company creates a Microsoft tenant, this tenant is assigned a unique identifier generally known as the tenant ID. When a document is labeled with a sensitivity label, this label is added to the metadata section of the document along with the tenant ID and a label GUID [10]. Since the tenant ID is public information, anyone can associate a company with a tenant ID (if it has one) and look up the tenant ID on VirusTotal. This search takes the following form:

metadata:"<tenant ID>" (type:xls or type:xlsx or type:doc or type:docx)

With the above searches, someone could find the following information:

sensitive intellectual property of the company (client information, internal processes, sensitive images within the files)
email addresses or names of employees (can later be confirmed on social media and be used for phishing)
internal systems and therefore internal domain names (Excel documents may disclose printer names)

The information returned may be used in further searches to identify even more related files.
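To check which tenant ID and label GUID a given document carries, the custom properties of an Office Open XML file (.docx, .xlsx) can be read directly, since such files are ZIP archives. The sketch below assumes the label metadata is stored in docProps/custom.xml under property names of the form MSIP_Label_<label GUID>_SiteId, with the tenant ID as the value; the function name is hypothetical and legacy .doc/.xls files are not covered.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

CUSTOM_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}"
# Assumed naming scheme: MSIP_Label_<label GUID>_SiteId -> tenant ID.
SITE_ID_RE = re.compile(r"^MSIP_Label_([0-9a-fA-F-]{36})_SiteId$")


def label_tenant_ids(ooxml_bytes: bytes) -> dict[str, str]:
    """Map each sensitivity label GUID to the tenant ID stored next to it."""
    with zipfile.ZipFile(io.BytesIO(ooxml_bytes)) as archive:
        if "docProps/custom.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/custom.xml"))
    found = {}
    for prop in root.iter(CUSTOM_NS + "property"):
        match = SITE_ID_RE.match(prop.get("name", ""))
        if match:
            # The value is the text of the typed child element (e.g. vt:lpwstr).
            found[match.group(1)] = "".join(prop.itertext()).strip()
    return found


# Usage (the file name is hypothetical):
#   with open("report.docx", "rb") as handle:
#       print(label_tenant_ids(handle.read()))
```

The same function can be run against an organization's own documents to learn which identifiers are worth monitoring.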

PDB

Files uploaded to VirusTotal may include debug artifacts such as PDB paths and GUIDs. These artifacts can become extremely valuable, and pivoting on them may reveal additional useful resources. Internal tools may include PDB paths that disclose internal system names, usernames, software, etc. Parts of the path that look unique are good candidates for subsequent searches. The search modifier for file metadata is metadata.

Presuming the fictional PDB path C:\NXFICTIONALDOMAINNX\DEVOPSCICD\Projects\Deployment\Remote\AppServices.pdb is located in a file, the following search could be a good lead for identifying related files:

metadata:"C:\\NXFICTIONALDOMAINNX\\DEVOPSCICD"

For more information about PDB and its use as a threat hunting pivot see reference [7]. An example of the metadata modifier in use to identify documents related to Emotet campaigns is demonstrated in [8].
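Choosing which part of a PDB path to pivot on is a trade-off between specificity and recall. The sketch below derives one candidate metadata query per directory prefix, most specific first, doubling the backslashes as in the example search above. The function name is illustrative.

```python
from pathlib import PureWindowsPath


def pdb_pivots(pdb_path: str) -> list[str]:
    """Build one metadata query per directory prefix of a PDB path."""
    queries = []
    for parent in PureWindowsPath(pdb_path).parents:
        # Skip the bare drive root (e.g. "C:\"), which is far too generic.
        if len(parent.parts) <= 1:
            continue
        escaped = str(parent).replace("\\", "\\\\")
        queries.append(f'metadata:"{escaped}"')
    return queries


path = r"C:\NXFICTIONALDOMAINNX\DEVOPSCICD\Projects\Deployment\Remote\AppServices.pdb"
for query in pdb_pivots(path):
    print(query)
```

Starting with the longest prefix and shortening it until results appear keeps the number of unrelated matches low.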

Relations

This section describes how relations can be leveraged to identify additional related files and provide additional context. As a starting point we have a file that was found using one of the methods that have already been discussed.

VirusTotal parses a file and lists any relations between that file and additional resources. For example, if the submitted file was in a compressed directory, in the relations tab of that file, VirusTotal will list the compressed parent as well as all the other files that might be included in the compressed parent. In another example, if a file makes a request to a domain, a URL or communicates with an IP address, these will be listed in the relations tab. Therefore, the users of VT Intelligence can easily pivot to these domains, URLs or IPs to identify additional files that may have these resources in common.
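The same relations can be retrieved programmatically. The sketch below only builds the request URLs and assumes the VirusTotal API v3 layout of /files/{id}/{relationship}; the selection of relationship names is a subset chosen for this example, and sending the requests (with an x-apikey header) is left out.

```python
API_ROOT = "https://www.virustotal.com/api/v3"

# Subset of file relationships that map to the examples discussed above.
RELATIONSHIPS = (
    "compressed_parents",
    "bundled_files",
    "contacted_domains",
    "contacted_urls",
    "contacted_ips",
)


def relationship_urls(file_hash: str) -> dict[str, str]:
    """Return one API URL per relationship for the given file hash."""
    return {name: f"{API_ROOT}/files/{file_hash}/{name}" for name in RELATIONSHIPS}


# "<sha256>" is a placeholder for the hash of the starting file.
for name, url in relationship_urls("<sha256>").items():
    print(name, url)
```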

Submitter ID

When a submission is made to VirusTotal, the submitter is assigned an ID - the submitter ID. It is possible to monitor submissions on VirusTotal using the API [4] and identify additional samples that have potentially been submitted by the same submitter. However, it must be noted that this functionality was not designed for this purpose and therefore it may not be a reliable basis for drawing conclusions.

Learning from others

Although this section does not describe a specific search with search modifiers, it stands on its own to underscore the knowledge that teams can acquire by finding various data on VirusTotal. More specifically, individuals can use the findings from VirusTotal to improve their understanding of which techniques or payloads other teams are using and get inspiration to add these techniques to their workflow. Going a step further, the observed techniques can be improved to evade detections, act as an example or be combined with additional techniques to create new attack paths which can be used to assess the maturity of implemented defenses.

Configuration and other files

Payloads are not the only thing people can mine for useful information on VirusTotal. All sorts of data are uploaded to VirusTotal. Bearing this in mind, VirusTotal can be used to search for files that have specific extensions or content. Consider for example an application that reads its configuration from a specific file. Although the vendor may have sufficient documentation in the public domain, someone with no prior experience with the product may benefit from a configuration that is found on VirusTotal.

There is so much software available in the world that listing all the potential configuration files is impossible. That does not prevent us from naming some example configuration files that can be identified on VirusTotal - likely coming from production environments - and can assist in furthering one's understanding of the technologies. Two of them are web.config (configuration for applications hosted on IIS) and configuration.aamp (Ivanti Application Control configuration files, discussed elsewhere on this website [11]).

To summarize this section, if you are wondering "what does this look like in production?", VirusTotal may have the answer!

Azure Blob Storage accounts

It is possible to identify Azure Blob Storage accounts on VirusTotal by applying the following filter:

entity:domain domain_regex:.*placeholder.*\.blob\.core\.windows\.net

Just replace the word placeholder with a string descriptive of the entity you are researching.
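To preview which host names such a pattern would accept, it can be tried locally first. The sketch below uses Python's re module, so it assumes that VirusTotal's domain_regex behaves like a full match with comparable syntax; the helper names and sample hosts are made up.

```python
import re


def blob_domain_regex(keyword: str) -> str:
    """Return the regular expression used after domain_regex:."""
    return rf".*{re.escape(keyword)}.*\.blob\.core\.windows\.net"


def blob_query(keyword: str) -> str:
    """Return the complete VirusTotal search string."""
    return f"entity:domain domain_regex:{blob_domain_regex(keyword)}"


candidates = [
    "nxfictionalbackups.blob.core.windows.net",
    "prodnxfictional01.blob.core.windows.net",
    "unrelated.blob.core.windows.net",
    "nxfictional.blob.core.windows.net.evil.example",
]
pattern = re.compile(blob_domain_regex("nxfictional"))
print(blob_query("nxfictional"))
print([host for host in candidates if pattern.fullmatch(host)])
```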

Subdomain Enumeration

This is a rather short note that VirusTotal can be used for subdomain enumeration purposes. It is useful as it can reveal subdomains along with the IP addresses these subdomains resolve to.

Gateway IP

It is possible to identify IP addresses of gateways in files uploaded to VirusTotal. A gateway IP is the IP address from which the users of a network access the Internet. This information may be useful to Red Teams that want to implement guardrails to ensure only intended targets establish a connection with the command and control server.

The search of a gateway IP can start from (at least) two different points:

A range of IP addresses
A domain

By using the search modifier entity:ip ip:<IP RANGE>, VirusTotal may return a list of IP addresses. Each of these IP addresses may be referred to (see Relations tab) by files that are in the VirusTotal database. Similarly, the search modifier content:"@<DOMAIN>" (the second part of an email address) may return files that include an email address of the target organization, along with IP addresses.

The files returned by both searches may include various logs (for example, legitimate files or even logs from information stealers) that include gateway IP addresses. Further analysis is required to verify what purpose each IP address serves. One indicator that an IP is used as a gateway could be the lack of any open ports.
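When reviewing such logs, a quick first pass is to extract every IPv4 address and keep only those inside the range attributed to the organization. The following sketch uses Python's ipaddress module; the function name, sample log and range (taken from the documentation address blocks) are illustrative.

```python
import ipaddress
import re

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def addresses_in_range(text: str, cidr: str) -> list[str]:
    """Return the unique IPv4 addresses in `text` that fall inside `cidr`."""
    network = ipaddress.ip_network(cidr)
    hits = set()
    for candidate in IPV4_RE.findall(text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # looks like an address but is not one, e.g. 999.1.1.1
        if address in network:
            hits.add(address)
    return [str(address) for address in sorted(hits)]


# Made-up excerpt resembling a log that records the victim's public IP.
log = (
    "IP: 198.51.100.23\n"
    "User: jane.doe@nxfictionaldomainnx.com\n"
    "DNS: 8.8.8.8\n"
    "Build: 999.1.1.1\n"
)
print(addresses_in_range(log, "198.51.100.0/24"))
```

The same membership test is what a guardrail on the command and control side would apply to incoming connections.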

Phone Numbers

VirusTotal can be a source of phone numbers that may later be used for vishing (voice phishing) and smishing (SMS phishing) operations. This is possible because files that include data such as first names, last names, email addresses as well as phone numbers may be found in text format within the vast pool of data of VirusTotal. These files are primarily found in underground marketplaces and for some reason end up on VirusTotal.

As a practical example, let's consider an employee who uses their first name, last name, corporate email address and corporate mobile number to register in a web portal for some service. If the underlying database that supports this web portal is compromised and the collected data are put up for sale in underground marketplaces, these data can be used for further targeting of that employee. So by simply searching for the corporate domain on VirusTotal, it may be possible to identify data in CSV format that may include information extracted from the compromised database.

There are additional scenarios though. For example, any employee may have been the victim of phishing and entered their personal information such as their corporate number to a phishing portal. The cybercriminals running the portal collect the submitted data and offer it in underground forums for a set price or in auctions. For some reason, this file again is uploaded to VirusTotal and then becomes available to users who can search within its content.
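For an organization reviewing its own exposure, a CSV file of this kind can be narrowed down to the records that mention the corporate domain. The sketch below assumes the file has a header row; the function name, column names and sample values are made up for illustration.

```python
import csv
import io


def rows_for_domain(csv_text: str, domain: str) -> list[dict[str, str]]:
    """Return the rows in which any field is an address at `domain`."""
    suffix = "@" + domain.lower()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if any(
            isinstance(value, str) and value.strip().lower().endswith(suffix)
            for value in row.values()
        )
    ]


# Made-up records; the phone numbers use a fictional 555-01xx range.
leak = (
    "first_name,last_name,email,phone\n"
    "Jane,Doe,jane.doe@nxfictionaldomainnx.com,+1-202-555-0143\n"
    "Sam,Roe,sam.roe@example.org,+1-202-555-0177\n"
)
for row in rows_for_domain(leak, "nxfictionaldomainnx.com"):
    print(row["first_name"], row["last_name"], row["phone"])
```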

Brief Review of Referenced Literature

This section discusses published research on sensitive information that researchers have discovered on VirusTotal. At the time of writing, various online sources (see [3], [13], [14]) have addressed the topic:

Otorio in [3] discovered sensitive business information and the intellectual property of dozens of companies.
Safebreach in [13] describes the process that led them to discover thousands of credentials and other sensitive information, primarily harvested by information-stealing malware.
Researcher @YumiSec in [14] shares various examples of sensitive information included in submitted URLs.

Risk Mitigation

Organizations should monitor VirusTotal to identify any sensitive internal information that may have been exposed - intentionally or unintentionally. Sensitive information uploaded to VirusTotal may present a risk to business operations, affecting people, processes and infrastructure. If it is not possible to build and maintain a monitoring capability internally, organizations should seek assistance from companies that offer Threat Intelligence services.
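A starting point for an internal monitoring capability could be a scheduled job that runs the organization's own identifiers through the searches discussed in this post. The sketch below only prepares the requests and does not send them. It assumes the VirusTotal API v3 search endpoint /intelligence/search with the API key passed in the x-apikey header; the function names and sample values are illustrative.

```python
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_ENDPOINT = "https://www.virustotal.com/api/v3/intelligence/search"


def monitoring_queries(domain: str, company: str, tenant_id: str) -> list[str]:
    """Searches an organization may want to repeat on a schedule."""
    return [
        f'content:"@{domain}"',
        f'metadata:"{company}"',
        f'metadata:"{tenant_id}"',
        f'signature:"{company}"',
    ]


def search_request(query: str, api_key: str) -> Request:
    """Prepare (but do not send) a GET request for one search query."""
    params = urlencode({"query": query})
    return Request(f"{SEARCH_ENDPOINT}?{params}", headers={"x-apikey": api_key})


queries = monitoring_queries(
    "nxfictionaldomainnx.com", "NxFictionalDomainNx LLC", "<tenant ID>"
)
for query in queries:
    print(search_request(query, api_key="<API key>").full_url)
```

Passing a prepared request to urllib.request.urlopen and comparing the returned results with those of the previous run would turn this into a basic alerting loop.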

Search Modifiers Summary Table

Search modifier	Search result
content:"@<fqdn>"; content:"@<fqdn>" type:pdf	employee email addresses; email headers; internal systems; sensitive emails around internal processes
content:"<fqdn>"	potentially relevant binaries/internal tools; command line history; logs with credentials
metadata:"\\<directory>\\"; metadata:"<tenant ID>"; metadata:"<company name>"	files that include this metadata
signature:"<company name>"; signature:"<certificate_thumbprint>"	signed binaries
entity:domain domain_regex:.*placeholder.*\.blob\.core\.windows\.net	Azure Blob Storage accounts

References

[1] https://www.virustotal.com/gui/intelligence-overview

[2] https://attack.mitre.org/techniques/T1597/

[3] https://www.otorio.com/blog/manufacturers-unknowingly-leak-classified-project-files/

[4] https://developers.virustotal.com/reference/feeds-file-hourly

[5] https://blog.virustotal.com/2022/12/vt-intelligence-cheat-sheet.html

[6] https://support.virustotal.com/hc/en-us/articles/360001386897-Content-search-VTGrep-

[7] https://www.mandiant.com/resources/blog/definitive-dossier-of-devilish-debug-details-part-one-pdb-paths-malware

[8] https://0xdf.gitlab.io/2019/05/22/emotet-pivot.html

[9] https://support.virustotal.com/hc/en-us/sections/360000340597-VT-Enterprise

[10] https://learn.microsoft.com/en-us/purview/sensitivity-labels-office-apps

[11] https://stmxcsr.com/micro/application-control.html

[12] https://github.com/VirusTotal/vt-cli

[13] https://www.safebreach.com/blog/the-perfect-cyber-crime/

[14] https://medium.com/@YumiSec/virus-total-the-best-way-to-disclose-your-company-secrets-92988396f36a

tags:#reconnaissance