Black hat SEO and You Won
By our experts
In this blog we want to inform you about a malicious campaign that we have been observing for a while. We would like to make you aware of this campaign to encourage you to join the fight against cybercrime. If you fell victim to this kind of scam, please report it to the police so they can investigate your case. By giving insight into this type of malicious activity, we also want to encourage you to do your own research.
Introduction
In this study we investigated how manipulation of search results within search engines (Black hat SEO) ultimately led to cybercrime activities. From a technical perspective, we analyzed the modus operandi of the actors behind this form of cybercrime.
For many websites, including Governmental websites, search results from search engines such as Google are manipulated to mislead and attract visitors. This allows the actor to "steal" visitor traffic from legitimate websites. Visitors who click on the suspicious search results are tempted with offers that lead them to hand over personal information such as credit card details. The organization behind this appears to be taking a professional approach and makes significant investments to build up the infrastructure and keep it operational.
We identified the methods used to avoid detection, investigated how the search results are manipulated and mapped the underlying network infrastructure of the actor(s). This blog tries to give a view of the techniques being used in this type of cybercrime. The mapped infrastructure also gives you an idea of the scale on which these activities are taking place. Please feel free to share your thoughts in the comments section.
Victim of fraud or Black Hat SEO?
If you are a victim of fraud, please report it to the police so they may start an investigation. If your organization is a victim of spoofed search results, you may report abuse to:
- Bing
- Cloudflare
- The registry of the TLDs .tk, .ga, .ml, .cf, .gq etc.
- Most of the redirect domains in our investigation were registered at NameSilo
What did the NCSC do with the results?
We shared our findings with law enforcement and worked on this investigation with partners. We also share our findings publicly through this blog. The information about IPs and domains mentioned in this blog is only a fraction of what we found during our research. The domain names mentioned in this blog serve as an illustration to give an idea of how this campaign was designed.
Legitimate websites are a popular target for Black hat SEO (Search Engine Optimization) practices. Black hat SEO is a way of maliciously influencing internet search results. The goal of the actor is to take traffic from legitimate websites and to lure the unsuspecting website visitor with advertising and malicious content, also known as click-baiting. The search index terms and the description of the clickbait domain strongly resemble those of an authentic website. As a result, a potential victim is easily inclined to click on the link, after which the victim is bombarded with aggressive advertising about products, prizes to be won and surveys. Below you can see screenshots (image 1-4) of government sites that were spoofed for Black hat SEO practices.
Note: there is only one authentic site for the Dutch tax service which is belastingdienst.nl.
The malicious websites use techniques that hinder the visitor from going back to the previous page. The links are easy to find through Google. Below (image 5 and 6) are some examples of click-baiting that imitates Government websites.
Image 5: .GA domain example of Government websites: Search result page of malicious search results related to the Dutch Government
Image 6: .GA domain examples of DigiD: Search result page of malicious search results related to the Dutch Government
We see this phenomenon of manipulating search results on a large scale. The domain names that are used are mostly registered under free top-level domains (TLDs). In this blog, we want to take you along in the exploration of how this form of fraud is constructed and how it works behind the scenes.
What is Black hat SEO?
To understand how Black hat SEO is applied, it is important to understand what Black hat SEO means. SEO stands for Search Engine Optimization. You can use certain techniques to ensure that a website is positioned higher in Google's search results. We call the position of a website within the search results its page rank (PR). The algorithms that Google uses to assign a page rank to websites are not public, but there are various known techniques to positively influence the position within the search results. Here are some brief examples:
- Link reputation: A page has a certain link reputation. If there are links from a page with an important reputation to ncsc.nl, then ncsc.nl will get a higher page rank.
- Keywords: By including specific keywords on a website you will increase your page rank when a user searches for those keywords.
- Link building: The more pages refer to ncsc.nl, the higher ncsc.nl will be in the search results. If more websites refer to a certain site, that site is likely to be more credible.
- Cloaking: Serving different content depending on the visitor. For example ncsc.nl displays a different page when it is visited from a Desktop PC with Chrome than when it is visited by a mobile phone with Chrome. Customized webpage content based on your geolocation, browser or language is not unusual. By serving the Google Indexing Bot different content than a regular visitor, the search results can be influenced.
You can use these techniques in a legitimate way. They help a webmaster to make a website easier to find. However, it is also possible to use them in a non-legitimate manner:
- Link reputation: If there are domains with a high page rank that will expire, a malicious actor can purchase the domain name and create a website with links to malicious websites. Because the initial domain has a higher credibility ranking, the pages that are referred to will have a higher page rank as well.
- Keywords: It is possible to include keywords or pieces of text and give them the same color as the background color. As a result, the visitor does not see the text and the Google index bot is misled by indexing text that has nothing to do with the relevant page and is invisible to the visitor.
- Link building: A malicious actor can automatically register many domains and create websites that link to a specific malicious website. Because many domains refer to the malicious website, it is more likely to end up higher on a search engine results page (SERP). This is known as a link farm.
- Cloaking: A malicious actor can present different content to regular visitors than to the search engine indexing bot.
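The hidden-keyword trick in the list above can be spotted heuristically. The following is a minimal sketch, assuming that the text color and the background color are both set in an inline style attribute on the same element; pages that use external stylesheets are not covered. The names HiddenTextFinder and find_hidden_text_styles are hypothetical and chosen for illustration only.

```python
from html.parser import HTMLParser


def _style_to_dict(style):
    """Turn 'color: #fff; background-color: #fff' into a dict of lowercase properties."""
    props = {}
    for part in style.split(";"):
        if ":" in part:
            name, value = part.split(":", 1)
            props[name.strip().lower()] = value.strip().lower()
    return props


class HiddenTextFinder(HTMLParser):
    """Collects tags whose inline text color equals their inline background color."""

    def __init__(self):
        super().__init__()
        self.suspicious = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None.
        style = dict(attrs).get("style") or ""
        props = _style_to_dict(style)
        color = props.get("color")
        background = props.get("background-color") or props.get("background")
        if color and background and color == background:
            self.suspicious.append((tag, color))


def find_hidden_text_styles(html):
    """Return (tag, color) pairs for elements that hide text against the background."""
    finder = HiddenTextFinder()
    finder.feed(html)
    return finder.suspicious
```

The comparison is a plain string match, so equivalent notations such as "#fff" and "white" are not recognized as the same color.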
During the investigation, we found out how an actor used Black hat SEO. For example, we found that the search engine index bot will see the page in image 7.
As shown in the screenshot (image 7), blocks of text are used with specific keywords. There are also many links to websites with topics that are used for Black hat SEO practices as well. Interestingly, if you load this page in a browser, you will immediately see another (redirected) page instead of the page in the screenshot.
This occurs because an I-frame is used which loads another page. The I-frame creates an overlay so that the original page content is no longer visible to the visitor. Only after removing this I-frame code, as shown in image 8, does the original page become visible to a normal visitor. If the I-frame code is not removed, the visitor is visually directed away to another page, while the search index bot still indexes the original page. You can see example code from the page in image 8.
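The overlay described above can be looked for in a page's HTML. This is a minimal sketch, assuming that the overlay is an iframe tag whose width and height are both set to 100% via attributes or an inline style; the real pages may size the frame differently or insert it with a script. The names IframeCollector, looks_like_overlay and find_overlay_iframes are hypothetical.

```python
from html.parser import HTMLParser


class IframeCollector(HTMLParser):
    """Records the attributes of every <iframe> tag in a page."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.iframes.append({name: (value or "") for name, value in attrs})


def looks_like_overlay(iframe):
    """Heuristic (an assumption): the iframe is sized to cover the whole viewport."""
    style = iframe.get("style", "").replace(" ", "").lower()
    full_width = iframe.get("width") == "100%" or "width:100%" in style
    full_height = iframe.get("height") == "100%" or "height:100%" in style
    return full_width and full_height


def find_overlay_iframes(html):
    """Return the src of every iframe that looks like a full-page overlay."""
    collector = IframeCollector()
    collector.feed(html)
    return [frame.get("src", "") for frame in collector.iframes if looks_like_overlay(frame)]
```

A page that combines keyword-heavy text with such a full-page frame pointing at another domain would be a candidate for closer inspection.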
Image 7 and 8:
What’s behind the Black Hat SEO
When the visitor clicks on the Black hat SEO search result, the visitor will be redirected to different domains. Before we explain how this redirect works, we first want to show you what kind of scam activity is behind these redirects.
Malicious activity
Behind the suspicious domain names, there seems to be a fraud scheme that, depending on variables such as your geographical location and browser type, tries to trick you into accepting a tempting offer. For example, you are offered the chance to win a new iPhone or Galaxy mobile phone when you visit the page from a Dutch IP address. See the examples in images 9 and 10.
The personal data of victims are collected and, according to the small print, users also sign up for a recurring subscription. It is unclear what happens with the information that a victim submits on the website. It is also unclear how to terminate the subscription. If you are a victim of this kind of activity, please report it to the police.
Scope and scale
During our investigation, it became clear that this kind of Black hat SEO practice does not only occur for governmental websites but also seems to be internationally oriented. In the next chapter, we will take a closer look at the redirect and cloaking mechanism.
Image 9 and 10:
Redirect / Cloaking
During our investigation, we were able to determine that the webserver presents different content depending on the HTTP request headers that are sent. In SEO jargon we call this technique cloaking. We tested this on a random sample of these domains. In the next example, we observed this for blacktrackx[.]ga.
Request headers
The site seems to be checking the request headers, among other criteria:
- Referrer site: The HTTP Referer header (spelled "Referer" in the HTTP specification) is a request header that contains the address of the page the visitor came from before reaching the current website. If you set this header to the URL of a search engine (such as Google, Bing or Yahoo), the webserver responds with different content than when you use a referrer from any other domain. With a referrer that is not a search engine, the content is mostly junk in the Greek language. We determined that it is written in Greek based on the automatic language detection of Google Translate. Image 11 illustrates the output of a curl command where ncsc.nl is specified as referrer site:
Image 11:
- Public source IP / country: Depending on your source public IP and the associated geolocation, you will be served different content. Before serving the custom content, you get redirected multiple times to various domains. To illustrate this technique you can see screenshots in images 12 - 17. Each screenshot is made with a different source public IP.
- Number of page views from the source public IP: During the various observations, it appeared that when a page was visited too often, the page showed nonsense (scraped) Greek language content.
- Browser type: Depending on the browser type (Chrome, Firefox etc.), you will receive customized surveys or offers. The malicious party uses browser type detection to adjust the survey questions.
- HTTP Host request header: When you change the HTTP Host header to a value that does not match the destination domain, a page with Greek characters is displayed.
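The header checks listed above can be probed by sending the same URL with different header sets and comparing the responses. The sketch below uses only the Python standard library. The header profiles, their values and all function names are illustrative assumptions and not the exact requests used in this investigation. Comparing exact hashes is also a crude measure, since dynamic pages can differ between two identical requests.

```python
import hashlib
import urllib.request

# Hypothetical header profiles; the values are illustrative assumptions.
PROFILES = {
    "search_referer": {"Referer": "https://www.google.com/", "User-Agent": "Mozilla/5.0"},
    "other_referer": {"Referer": "https://www.ncsc.nl/", "User-Agent": "Mozilla/5.0"},
    "no_referer": {"User-Agent": "Mozilla/5.0"},
}


def build_requests(url, profiles=PROFILES):
    """Create one unsent Request object per header profile."""
    return {name: urllib.request.Request(url, headers=headers)
            for name, headers in profiles.items()}


def fingerprint(body):
    """Hash a response body (bytes) so that responses can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def group_by_content(bodies):
    """Group profile names by identical body; more than one group hints at cloaking."""
    groups = {}
    for name, body in bodies.items():
        groups.setdefault(fingerprint(body), []).append(name)
    return sorted(groups.values())


def fetch_all(url, timeout=10):
    """Send the request of every profile (needs network access; not called here)."""
    bodies = {}
    for name, request in build_requests(url).items():
        with urllib.request.urlopen(request, timeout=timeout) as response:
            bodies[name] = response.read()
    return bodies
```

Calling group_by_content(fetch_all(url)) on a cloaked page would be expected to put the search engine referrer in a different group than the other profiles.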
Image 12, 13, 14, 15, 16 and 17:
Screenshots of junk pages
When deviating from the "correct" HTTP request headers, pages with junk content are often displayed. The content appears to be scraped from other websites. Below are some screenshots (image 18 - 20) of such pages with junk content. Notably, these are all written in Greek.
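When testing many requests, junk pages like these can be recognized automatically. The sketch below is a simple heuristic that measures which share of the letters on a page falls in the Greek Unicode blocks. The function names and the threshold of 0.5 are assumptions for illustration; this is not the language detection we used.

```python
def greek_ratio(text):
    """Return the fraction of alphabetic characters that are in the Greek Unicode blocks."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    greek = [ch for ch in letters
             if "\u0370" <= ch <= "\u03ff" or "\u1f00" <= ch <= "\u1fff"]
    return len(greek) / len(letters)


def looks_like_greek_junk(text, threshold=0.5):
    """Flag a page when at least `threshold` of its letters are Greek (assumed cut-off)."""
    return greek_ratio(text) >= threshold
```

The heuristic only says that a page is mostly Greek, not that it is junk, so it is best combined with knowledge of what the page should contain.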
Image 18, 19 and 20:
HTTP Redirect and obfuscation
If the HTTP request headers and other parameters (such as the public source IP) are correct (as in, they do not result in a page in Greek), the content in image 21 is served as the first redirect script. In image 22 you can see the script written in a more readable form.
The code shows that the variables a, b, c and d are concatenated and that the resulting URL is loaded into an I-frame: http://fazebook[.]party/?u=4xfkaeg&o=8mrpkza&t=slayer. During our observations, it became clear that all initial pages redirect to fazebook[.]party. Before fazebook[.]party appeared, the domain name luckylife2019[.]live was used. We saw that this domain changes periodically (images 23 and 24). Thousands of domains redirected to luckylife2019[.]online / fazebook[.]party.
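The string concatenation described above can be undone without running the script. The sketch below works on a hypothetical sample that only imitates the structure (four string variables joined into an I-frame URL); it is not the actor's script, and the domain example.invalid and the function name reassemble_url are placeholders.

```python
import re

# Hypothetical sample in the style described above; not the actor's actual script.
SAMPLE_SCRIPT = '''
var a = "http://exa";
var b = "mple.invalid/";
var c = "?u=abc";
var d = "&o=def";
document.write('<iframe src="' + a + b + c + d + '"></iframe>');
'''

# Matches simple assignments of the form: var name = "value";
ASSIGNMENT = re.compile(r'var\s+(\w+)\s*=\s*"([^"]*)"\s*;')


def reassemble_url(script, order=("a", "b", "c", "d")):
    """Collect simple string assignments and join them in the given order."""
    values = dict(ASSIGNMENT.findall(script))
    return "".join(values[name] for name in order)
```

This only handles plain string literals. Scripts that build the parts with encoding functions or arithmetic need a real JavaScript parser or a sandboxed interpreter.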
Image 21, 23, 24 and 25:
“Hide-and-seek” behind Cloudflare
The observed (.tk / .ga) domains that are used for Black hat SEO all contain DNS records that refer to an IP behind Cloudflare. This makes it harder to find the real IP of the webserver that serves these pages. Cloudflare is a widely used platform to protect websites but is also often used by phishing sites to hide the back-end infrastructure. The redirect domains luckylife2019[.]online / fazebook[.]party are also located behind Cloudflare IPs. However, luckylife2019[.]online and fazebook[.]party both also redirect to other websites, the majority of which are not behind Cloudflare. This made it possible to partially map their underlying infrastructure. As an example, see the WHOIS and dig results of blacktrackx[.]ga in images 25-28.
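Whether a resolved IP address belongs to Cloudflare can be checked against the ranges that Cloudflare publishes. The sketch below hard-codes only a partial list of IPv4 ranges, which is an assumption that may be out of date; the current list should be retrieved from Cloudflare before relying on the result. The function name is_behind_cloudflare is hypothetical.

```python
import ipaddress

# Partial list of Cloudflare IPv4 ranges (assumption: incomplete and possibly outdated).
CLOUDFLARE_RANGES = [ipaddress.ip_network(cidr) for cidr in (
    "103.21.244.0/22",
    "104.16.0.0/13",
    "172.64.0.0/13",
    "173.245.48.0/20",
    "188.114.96.0/20",
)]


def is_behind_cloudflare(ip):
    """Return True when the address falls inside one of the listed ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in CLOUDFLARE_RANGES)
```

Domains whose A records fall outside these ranges are the ones that can reveal back-end infrastructure directly.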
Image 25, 26, 27 and 28:
Back-end infrastructure
As described earlier, if the correct conditions are met, you get redirected to fazebook[.]party (the central domain, which changes frequently). fazebook[.]party in turn redirects to "other" domains. These "other" domain names have a notable structure (image 29). In the observed samples, they had similar names followed by a number. For example, one of the referred domains was simplerdr168[.]life (image 30).
Image 29 and 30:
Because we knew that these domains were being registered with a similar name and different numbers, we first tried to find the domains with adjacent higher and lower numbers. Using NSLOOKUP, we did not see any active domains in the upper and lower number ranges. We therefore used WHOIS to check whether other domains within the number range were registered, by requesting WHOIS data for "domain name + lower and higher numbers.TLD". This turned out to be the case. The domains within the same range had (almost) the same creation date and time and, if NS / SOA records were present for the domains, they referred to the same name server. The update date of many of these domains was also similar within the registered domain ranges. With the help of WHOIS, reverse nameserver WHOIS, passive DNS and other sources we were able to map thousands of domain names that have the same characteristics. New domain ranges are created almost daily.
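The enumeration step described above starts with generating the neighbouring names of a known numbered domain. The sketch below does only that; the WHOIS lookups themselves are left out. It expects a plain domain name (not the defanged [.] notation), and the function names and the spread parameter are hypothetical.

```python
import re

# A base name of letters and hyphens, followed by a number and a single-label TLD.
NUMBERED = re.compile(r"^(?P<base>[a-z-]*[a-z])(?P<number>\d+)\.(?P<tld>[a-z]+)$")


def split_numbered_domain(domain):
    """Split 'name123.tld' into (base, number, tld); return None if it does not fit."""
    match = NUMBERED.match(domain.lower())
    if not match:
        return None
    return match["base"], int(match["number"]), match["tld"]


def neighbouring_candidates(domain, spread=2):
    """Return domains with the same base and TLD and a number within `spread`."""
    parts = split_numbered_domain(domain)
    if parts is None:
        return []
    base, number, tld = parts
    low = max(1, number - spread)
    return [f"{base}{n}.{tld}" for n in range(low, number + spread + 1) if n != number]
```

Each candidate can then be checked with WHOIS, and candidates with (almost) the same creation date and name server can be grouped into one range.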
Frequently changing A records
What is curious about the mapped domain names is that only a limited number have a DNS A record with an IP address. The A records often disappear after a while and then appear at other domains. This technique is also used for snowshoe spam, to avoid detection and bypass filters. See Techopedia for more information about snowshoe spamming.
To substantiate this, please see the screenshot below of the passive DNS results of *.checkingyourbrowser* domains that have been seen on IP 79.110.23[.]93. This shows that the FIRST-SEEN date of the checkingyourbrowser domains is 23-8-19. The LAST-SEEN column shows that the IP was observed only briefly, with only a few counts in the COUNT-SEEN column (image 31).
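Short-lived A records like these can be filtered out of passive DNS data automatically. The sketch below assumes records in the form of dictionaries with the keys domain, first_seen, last_seen and count, and dates written as day-month-year as in the screenshot. The record layout, the thresholds and the function name are assumptions; passive DNS providers each have their own export format.

```python
from datetime import datetime

DATE_FORMAT = "%d-%m-%y"  # dates written like 23-8-19 (assumed day-month-year)


def short_lived_records(records, max_days=2, max_count=5):
    """Return the domains whose seen period and hit count are both small."""
    flagged = []
    for record in records:
        first = datetime.strptime(record["first_seen"], DATE_FORMAT)
        last = datetime.strptime(record["last_seen"], DATE_FORMAT)
        if (last - first).days <= max_days and record["count"] <= max_count:
            flagged.append(record["domain"])
    return flagged
```

Domains that keep appearing in this list on changing IP addresses match the rotation pattern described above.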
Image 31:
NS record configuration
What is also typical of the DNS configuration is that some domain names don’t have any registered SOA / NS records. As noted before, finding registered domain names without SOA / NS records is hard when you don’t use WHOIS.
Subdomain characteristics
The sub-domains have similar characteristics. The naming convention of the sub-domains is very similar to that of other domains within our research. To illustrate this, please see the screenshots (images 32-36), which list (sub-)domains that are used.
Image 32, 33, 34, 35 and 36:
Ranges of different domains
Of all the related domains gathered, we noticed that the actor(s) do not always use the same number range. In many cases domain-name1-100[.]live is used, but the range can also be wider or narrower. For example, the punksgotoserver1-150[.]live domains go beyond 100. Another example is ceapass[.]life, which starts at 200 and ends at 300.
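Given a list of collected domain names, the number range per series can be summarized automatically. The sketch below groups plain (not defanged) domain names by base name and TLD and reports the lowest and highest number seen. The function name and the input list are hypothetical; a gap between the lowest and highest number does not prove that every number in between is registered.

```python
import re
from collections import defaultdict

# A base name of letters and hyphens, followed by a number and a single-label TLD.
PATTERN = re.compile(r"^([a-z-]*[a-z])(\d+)\.([a-z]+)$")


def observed_ranges(domains):
    """Map (base name, TLD) to the lowest and highest number seen for it."""
    numbers = defaultdict(list)
    for domain in domains:
        match = PATTERN.match(domain.lower())
        if match:
            base, number, tld = match.groups()
            numbers[(base, tld)].append(int(number))
    return {key: (min(values), max(values)) for key, values in numbers.items()}
```

Because the TLD is part of the key, the same series registered under two TLDs shows up as two separate ranges.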
Various TLD’s for same domain ranges
We noticed that various TLDs are used for ranges of domain names. What do we mean by this?
For example, the domain series temporaryserverhere1-100[.]life and temporaryserverhere1-100[.]live are both registered, so two TLDs (.life and .live) are used for the same series.
Commonly used TLDs:
- .live
- .life
- .agency
This is the end of this blog. We hope you liked it. Please leave a comment to give us feedback!
Comments
- There are no preventive measures to take against Black Hat SEO. The only way to combat this is by detecting / monitoring and submitting an abuse report.
- Think things are very well explained here. Except the part that you can talk to the police about this. They simply dont understand this and ask you to fill out a form which is irrelevant and case get dismissed. Just try follow the link provided in this article and you will see.
- Your blog describes what an organisation can do if they are the victim of Black hat SEO. Do you also have recommendations on how to prevent it?