Written by: Idan Ron
Background
My story started a few months ago, when I performed a red team assessment for a major retail company. During the open source intelligence (OSINT) phase, I reviewed the SSL certificates that included the client name. In these certificates I identified that the client owned its own top-level domain (TLD). A TLD is the last part of a domain name, the letters that come after the final dot. For example, in the domain name “google.com,” the TLD is “.com”. Companies such as Google own many TLDs, including .goog, .go, and .google. Google manages all of those TLDs and the domains underneath them (for example, “site.goog”).
Owning TLDs is not uncommon. The following table lists various companies and the TLDs they own (for a full list of TLDs, see TLD-List or IANA List):
| Company | TLD |
| --- | --- |
| Google | .google, .go |
| Marriott | .marriott |
| American Express | .americanexpress, .amex |
| Barcelona | .barcelona |
| Gucci | .gucci |
| Jaguar | .jaguar |
| Kia | .kia |
| Netflix | .netflix |
The client’s TLD piqued my interest, so I used Google and GitHub to try to find tools that could provide me with a list of TLDs owned by an organization, or to search for domains under a specific TLD. To my surprise, I couldn’t find any.
You’re probably asking yourself, why should I care about TLDs? Well, one of my favorite bug bounty hunters, Jason Haddix, a person known within the hacker community for his OSINT abilities, once said during a Bug Hunter’s Methodology Live Course, “For every apex domain you find you 4x your chance of hacking the target.” Basically, the more TLDs you discover that belong to a company, the more likely you are to find those juicy vulnerabilities.
And so, I sought to compile a list of TLDs owned by the client, which could lead to the identification of new domains that might not have been known and scanned.
The Issue
As I mentioned earlier, I couldn’t find a tool or a site that would provide me with a list of TLDs and the specific domains for each TLD.
Currently, enumeration tools focus on subdomains. Subdomain enumeration is a big topic in the OSINT/Bug Bounty community, but domain enumeration based on TLD is not. There are projects such as Zip by Trickest (the repository was disabled due to a DMCA takedown notice) that focus on brute-forcing domains under a specific TLD (in this case, the “.zip” TLD). That tool actively brute-forces domains from a wordlist, and while it’s a good idea, I wondered whether there were passive ways to do it that could lead to better results, similar to how passive subdomain enumeration is done.
Other services, such as the Internet Corporation for Assigned Names and Numbers (ICANN) or the Centralized Zone Data Service (CZDS), allow you to request those domain names from the TLD owner. However, this would alert the customer to my actions, and thus is not a viable solution. The nature of Mandiant Red Team work requires remaining undetected.
So, I had to create something new. And this is how the new tool idea came to fruition.
Introducing tldfinder!
Download tldfinder: https://github.com/projectdiscovery/tldfinder
The Solution
After giving it some thought, I split the actions that needed to be taken to achieve the goal into two steps:
1. Find the TLD by performing a regex search from a list of TLDs for the client name and variations of their name. This could be achieved using sites such as the IANA list and TLD-List.
2. Determine how to search the domains inside the TLDs. Is it possible to use some of those subdomain enumeration techniques to search for “*.tld” instead of “*.domain.tld”?
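The first step can be sketched in a few lines of Python. This is only an illustration, not how tldfinder itself works: the function names are hypothetical, and the URL is an assumption on my part (the publicly available IANA text list of TLDs, which the article refers to only as "the IANA list"). The sketch parses the list and keeps every TLD that contains one of the client name variations.

```python
import re
import urllib.request

# Assumption: the public IANA list of TLDs, one uppercase TLD per line,
# with comment lines starting with "#". The article does not give this URL.
IANA_TLD_LIST_URL = "https://data.iana.org/TLD/tlds-alpha-by-domain.txt"


def fetch_tld_list(url=IANA_TLD_LIST_URL):
    """Download the raw TLD list (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("ascii")


def parse_tld_list(text):
    """Return lowercase TLDs, skipping blank lines and '#' comments."""
    tlds = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tlds.append(line.lower())
    return tlds


def find_candidate_tlds(tlds, name_variations):
    """Keep the TLDs that contain any of the given name variations."""
    variations = [v.strip().lower() for v in name_variations if v.strip()]
    if not variations:
        return []
    # Escape each variation so it is matched literally, then OR them together.
    pattern = re.compile("|".join(re.escape(v) for v in variations))
    return [tld for tld in tlds if pattern.search(tld)]


if __name__ == "__main__":
    # Offline sample in the same shape as the IANA list.
    sample = "# Version (sample)\nAMERICANEXPRESS\nAMEX\nCOM\nGOOG\nGOOGLE\n"
    print(find_candidate_tlds(parse_tld_list(sample), ["amex", "american"]))
```

For a real engagement, the sample text would be replaced by the result of fetch_tld_list(), and the variations would include abbreviations and brand names of the client.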
There are many tools that perform subdomain enumeration passively and actively. Passive enumeration queries a variety of data sources for the subdomains. Active enumeration brute-forces names using a wordlist. What if I used the same passive data sources, but searched for domains instead of subdomains? Would the data sources support this?
One of the most common (and probably the most widely used) subdomain enumeration tools is Subfinder by ProjectDiscovery. I extracted the current data sources from Subfinder (this can be achieved by looking at the source code and/or using the “-list” flag). Then I manually checked each data source to confirm whether it would allow me to search for a domain instead of a subdomain.
It took a few days of experimenting with all the APIs to determine whether each one accepted the new search pattern. Some did, and some did not.
For example, crtsh (crt.sh), the SSL certificate search application, did allow me to search for domains using the PostgreSQL server it exposes publicly.
With crtsh, this can be done by connecting to the PostgreSQL database with the following command: psql -t -h crt.sh -p 5432 -U guest certwatch. Installing the psql utility is out of scope for this blog post, but instructions are easy to find with a quick search. Executing the following query searches for names that end in the TLD you’re trying to enumerate (replace TLD with the target TLD):
SELECT ci.NAME_VALUE FROM
certificate_identity ci WHERE ci.NAME_TYPE = 'dNSName' AND
reverse(lower(ci.NAME_VALUE)) LIKE reverse(lower('%.TLD'));
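To make the crtsh query reusable for different TLDs, it can be generated and its results cleaned up with a small helper. The following is a rough sketch under stated assumptions: the table and column names are copied from the query above, the SELECT DISTINCT clause is my own choice, and both function names are hypothetical. The TLD is validated before being placed in the SQL string so that arbitrary input cannot alter the query.

```python
import re

# A TLD label: letters, digits, and hyphens only (covers punycode "xn--" TLDs).
_TLD_RE = re.compile(r"^[a-z0-9-]{2,63}$")


def build_crtsh_tld_query(tld):
    """Build the crt.sh SQL query for every dNSName ending in the given TLD."""
    tld = tld.strip().lower().lstrip(".")
    if not _TLD_RE.match(tld):
        raise ValueError(f"not a valid TLD label: {tld!r}")
    # Table and column names are taken from the query in the article.
    return (
        "SELECT DISTINCT ci.NAME_VALUE FROM certificate_identity ci "
        "WHERE ci.NAME_TYPE = 'dNSName' "
        f"AND reverse(lower(ci.NAME_VALUE)) LIKE reverse(lower('%.{tld}'));"
    )


def clean_names(rows, tld):
    """Normalize raw result lines: trim, lowercase, drop wildcards, dedupe."""
    suffix = "." + tld.strip().lower().lstrip(".")
    seen = set()
    names = []
    for row in rows:
        name = row.strip().lower()
        if name.startswith("*."):
            name = name[2:]  # certificates often list wildcard names
        if not name.endswith(suffix) or name in seen:
            continue
        seen.add(name)
        names.append(name)
    return names


if __name__ == "__main__":
    print(build_crtsh_tld_query("google"))
    print(clean_names([" *.cloud.google ", "cloud.google", "example.com"], "google"))
```

The generated string can be passed to the psql command shown earlier through its -c option, and the lines it prints can then be fed to clean_names.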
A similar approach can be used with Netlas, an OSINT company that lets users register an account and query domain information through an API. Using Netlas and the following query, I was able to retrieve a list of domains based on a specific TLD:
Figure 1: Enumeration of TLDs using Netlas
The following list shows which data sources allowed me to search for domains, which did not, and which I couldn’t check because I did not have access to a paid API key:
Allowed domain enumeration: crtsh, netlas, bufferover, censys, github, dnsrepo, wayback machine
Did not allow domain enumeration: alienvault, anubis, bevigil, binaryedge, chaos, commoncrawl, digitorus, dnsdumpster, fullhunt, hackertarget, intelx, leakix, rapiddns, redhuntlabs, securitytrails, shodan, sitedossier, virustotal, zoomeyeapi
Unknown: c99, certspotter, chinaz, dnsdb, fofa, hunter, passivetotal, quake, riddler, robtex, threatbook, whoisxmlapi, facebook, builtwith
This leads me to the part you all have been waiting for: the new tldfinder tool!
Tool Release
In collaboration with ProjectDiscovery, I would like to announce tldfinder, a streamlined tool for discovering TLDs, associated subdomains, and related domain names.
Display the tldfinder help to see all supported options.
The tool has three discovery modes:
DNS (default): Searches for the input string entered in the TLD section
TLD: Searches for the input string in the domain field and performs fuzzing using dnsx
Domain: Pulls out the related domain names of the given input domain using reverse Whois (requires API for one of the following: dnsrepo, whoxy, or whoisxmlapi)
It’s worth noting that the DNS and TLD modes are active, meaning they perform DNS brute force. The Domain mode is passive, meaning it never touches the domains and only queries the data sources.
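The difference between the first two modes comes down to where the input string sits in the hostname. The sketch below is my reading of the mode descriptions above, not code from tldfinder, and the function names are hypothetical: in DNS mode the input is the TLD and everything to its left varies, while in TLD mode the input is the domain label and the TLD varies.

```python
def _labels(hostname):
    """Split a hostname into lowercase labels, ignoring a trailing dot."""
    return hostname.strip().lower().rstrip(".").split(".")


def fits_dns_mode(hostname, term):
    """True when the input string is the TLD, e.g. cloud.google for 'google'."""
    labels = _labels(hostname)
    return len(labels) >= 2 and labels[-1] == term.lower()


def fits_tld_mode(hostname, term):
    """True when the input string is the domain label, e.g. google.app."""
    labels = _labels(hostname)
    # Simplification: only a single-label TLD is considered here.
    return len(labels) == 2 and labels[0] == term.lower()


if __name__ == "__main__":
    for name in ["cloud.google", "google.app", "example.com"]:
        print(name, fits_dns_mode(name, "google"), fits_tld_mode(name, "google"))
```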
Sources
At the time of release, tldfinder queries the following sources for information.
__ __ ______ __
/ /_/ /__/ / _(_)__ ___/ /__ ____
/ __/ / _ / _/ / _ / _ / -_) __/
__/_/_,_/_//_/_//_/_,_/__/_/
projectdiscovery.io
[INF] Current list of available sources. [9]
[INF] Sources marked with an * need key(s) or token(s) to work.
[INF] You can modify /home/idanron/.config/tldfinder/provider-config.yaml
to configure your keys/tokens.
whoisxmlapi *
bufferover *
censys *
dnsrepo *
dnsx
whoxy *
netlas *
waybackarchive
crtsh
The DNS and TLD discovery modes query all the sources (and perform fuzzing in TLD mode), while the Domain mode only queries dnsrepo, whoxy, or whoisxmlapi (as only those support reverse Whois).
Using DNS Mode
The following example uses tldfinder with the default DNS discovery mode option to search for domains that are part of the “.google” TLD.
__ __ ______ __
/ /_/ /__/ / _(_)__ ___/ /__ ____
/ __/ / _ / _/ / _ / _ / -_) __/
__/_/_,_/_//_/_//_/_,_/__/_/
projectdiscovery.io
[INF] Loading provider config from
/home/idanron/.config/tldfinder/provider-config.yaml
[INF] Enumerating domains for google
partners.cloudskillsboost.google
community.grow.google
epp.registry-sandbox.google
epp.registry.google
www.registry.google
registry-qa.google
support.registry-qa.google
citystpaul.google
jira-testing.gss.google
cloudskillsboost.google
stg.cloudvmwareengine.google
domains.google
environment.google
cloud.google
design.google
netbotzcircunvalacion.google
app.google
channel-app.google
xc-8d1db3.dns.google
whois.registry-sandbox.google
epp.nic.google
service.cloudvmwareengine.google
dev.cloudvmwareengine.google
travel.google
lxc-wordpress.dns.google
registry-sandbox.google
www.registry-sandbox.google
www.registry-qa.google
lxc-gftcloud.dns.google
lers.google
test.cloud.gss.google
www.cloudskillsboost.google
grow.google
registry.google
8888.google
europe-west2-test.prodtest.cloudvmwareengine.google
mail.ai.google
dns64.dns.google
earlydays.google
privx.google
whois.registry.google
support.registry.google
fx06b9f1.google
whois.nic.google
dns.google
cloud.gss.google
partner.cloudskillsboost.google
support.registry-sandbox.google
[INF] Found 48 domains for google in 637 milliseconds 994 microseconds
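DNS mode output like the listing above mixes registered domains (such as registry.google) with their subdomains (such as epp.registry.google). For scoping, it can help to group the results by the name registered directly under the TLD. The following is a minimal sketch of that post-processing step. It is not part of tldfinder, and the function name is hypothetical.

```python
from collections import defaultdict


def group_by_registered_name(hostnames, tld):
    """Group hostnames by the label that sits directly under the TLD."""
    suffix = "." + tld.strip().lower().lstrip(".")
    groups = defaultdict(set)
    for name in hostnames:
        host = name.strip().lower().rstrip(".")
        if not host.endswith(suffix):
            continue  # not under the TLD of interest
        # The last label before the TLD is the registered name.
        registered_label = host[: -len(suffix)].split(".")[-1]
        if not registered_label:
            continue
        groups[registered_label + suffix].add(host)
    return {name: sorted(hosts) for name, hosts in sorted(groups.items())}


if __name__ == "__main__":
    results = [
        "epp.registry.google",
        "www.registry.google",
        "registry.google",
        "dns64.dns.google",
        "dns.google",
    ]
    for registered, hosts in group_by_registered_name(results, "google").items():
        print(registered, hosts)
```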
Using TLD Mode
In the following example, I use tldfinder with the TLD discovery mode option to search for domains that have the string “google” in them; then I use dnsx (the aforementioned ProjectDiscovery tool) to perform fuzzing against the TLD part.
__ __ ______ __
/ /_/ /__/ / _(_)__ ___/ /__ ____
/ __/ / _ / _/ / _ / _ / -_) __/
__/_/_,_/_//_/_//_/_,_/__/_/
projectdiscovery.io
[INF] Loading provider config from
/home/idanron/.config/tldfinder/provider-config.yaml
[INF] Enumerating domains for google
google.meme
google.xn--fiq228c5hs
google.ltda
google.vote
google.amsterdam
google.hiphop
google.sv
google.boo
google.cymru
google.llc
google.us
google.xn--kprw13d
google.christmas
google.dad
google.app
google.monster
google.rs
[--snip--]
[INF] Found 468 domains for google in 3 minutes 3 seconds
* Please note that the output was truncated due to its length.
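Some of the TLDs in the output above, such as xn--fiq228c5hs, are internationalized TLDs shown in their ASCII (punycode) form. When reviewing results, it can be useful to display them in Unicode. The following small sketch uses Python's built-in idna codec. The helper name is hypothetical, and this is a convenience for reading results, not something tldfinder does.

```python
def to_unicode_hostname(hostname):
    """Decode punycode ("xn--") labels to Unicode; return input on failure."""
    try:
        return hostname.strip().lower().encode("ascii").decode("idna")
    except UnicodeError:
        # Leave anything that cannot be decoded exactly as it was.
        return hostname


if __name__ == "__main__":
    for name in ["google.xn--fiq228c5hs", "google.xn--kprw13d", "google.app"]:
        print(name, "->", to_unicode_hostname(name))
```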
Using Domain Mode
The following example uses tldfinder with the Domain discovery mode to search for domains that may be related to the input string (“projectdiscovery.io,” in this case). This works by performing reverse Whois using one (or more) of the APIs allowed for this mode (see Sources section for more information).
__ __ ______ __
/ /_/ /__/ / _(_)__ ___/ /__ ____
/ __/ / _ / _/ / _ / _ / -_) __/
__/_/_,_/_//_/_//_/_,_/__/_/
projectdiscovery.io
[INF] Enumerating domains for projectdiscovery.io
projectdiscovery.org
projectdiscovery.in
projectdiscoveryinc.org
projectdiscoveryu.com
projectdiscoverycaribbean.org
nuclei.projectdiscovery.io
projectdiscovery.tech
projectdiscovery.org.uk
projectdiscoveryofva.com
projectdiscoveryou.com
projectdiscovery.dev
projectdiscovery.co.uk
projectdiscoveryprograms.com
projectdiscoverymovie.com
dns.projectdiscovery.io
projectdiscovery.com
projectdiscovery.net
projectdiscovery.io
projectdiscoveryprograms.org
docs.projectdiscovery.io
[INF] Found 20 domains for projectdiscovery.io in 1 second 571 milliseconds
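Because Domain mode returns names that may only be related to the input, the results usually need triage. A simple first pass, sketched below with a hypothetical helper name, separates the subdomains of the input domain from the other domains, which still need to be verified as belonging to the target before they are treated as in scope.

```python
def partition_results(results, input_domain):
    """Split results into subdomains of the input and other domains."""
    base = input_domain.strip().lower().rstrip(".")
    own, other = [], []
    for name in results:
        host = name.strip().lower().rstrip(".")
        if not host:
            continue
        # A match is the domain itself or any name ending in ".<domain>".
        if host == base or host.endswith("." + base):
            own.append(host)
        else:
            other.append(host)
    return own, other


if __name__ == "__main__":
    sample = [
        "projectdiscovery.org",
        "nuclei.projectdiscovery.io",
        "projectdiscoverymovie.com",
        "projectdiscovery.io",
    ]
    own, other = partition_results(sample, "projectdiscovery.io")
    print("subdomains of input:", own)
    print("needs verification:", other)
```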
How to Improve the Results
As always, it is recommended that you add your API keys to the provider-config.yaml file. The file is created automatically during the first run of the tool, or you can create it yourself, at /home/{USER}/.config/tldfinder/provider-config.yaml on Linux or C:\Users\{USER}\.config\tldfinder\provider-config.yaml on Windows (replace {USER} with your username on the operating system).
The file format is as follows (replace the REDACTED with your API keys).
censys: []
dnsrepo: ["REDACTED"]
netlas: ["REDACTED"]
whoisxmlapi: ["REDACTED"]
whoxy: []
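Before a run, it can be handy to check which sources still have no key configured. The sketch below is a hypothetical helper, not part of tldfinder. It assumes the flat one-line-per-source layout shown above and uses a simple regular expression rather than a full YAML parser, so a real YAML library is the better choice for anything more complex. The config path follows the Linux and Windows locations given in the article.

```python
import re
from pathlib import Path

# Matches lines of the form: name: ["key1", "key2"]  or  name: []
_LINE_RE = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*:\s*\[(.*)\]\s*$")


def provider_config_path(home=None):
    """Return the expected provider-config.yaml path under the home directory."""
    base = Path(home) if home is not None else Path.home()
    return base / ".config" / "tldfinder" / "provider-config.yaml"


def sources_without_keys(config_text):
    """List the sources whose key list is empty in the flat config layout."""
    missing = []
    for line in config_text.splitlines():
        match = _LINE_RE.match(line)
        if not match:
            continue  # skip anything that is not a simple inline list
        name, body = match.groups()
        keys = [item.strip().strip("\"'") for item in body.split(",")]
        if not any(keys):
            missing.append(name)
    return missing


if __name__ == "__main__":
    path = provider_config_path()
    if path.exists():
        print(sources_without_keys(path.read_text(encoding="utf-8")))
    else:
        print(f"no config found at {path}")
```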
Acknowledgment
I would like to thank ProjectDiscovery for the help and support in creating this tool!