The Internet Society hosted the 2015 Network and Distributed System Security (NDSS) Symposium from 8 to 11 February in San Diego, California, USA. The conference featured a broad set of security topics, ranging from fundamental compiler and run-time vulnerabilities in code execution to observations about security issues at Internet scale. We asked some of the NDSS participants to write guest blog posts about their papers. This is one of those contributions.
Domain name abuse exists in many forms, including soundsquatting, typosquatting and bitsquatting, but in all cases consists of purposefully registering a domain name that is in some way or another confusingly similar to another domain name. In the specific case of typosquatting, a miscreant will register a domain name that is a mistype of a popular domain name or trademark. For instance, a typosquatter might register internetsoceity.org, in the hope of getting hits from visitors intending to visit internetsociety.org. Typosquatting is certainly not a new phenomenon; it has been known and studied for over 15 years. Over these years, many interesting results about typosquatting have been revealed, such as the fact that most typosquatting domains are monetized using ads and that shorter domains are targeted more frequently than longer domains.
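To make the notion of a "mistype" concrete, here is a minimal Python sketch that enumerates one-edit typos of a domain name. The three edit models used (dropping a character, swapping two adjacent characters, doubling a character) and the function name `typo_candidates` are illustrative assumptions, not necessarily the exact typo models used in our study.

```python
def typo_candidates(domain):
    """Return simple one-edit mistypes of the label before the first dot.

    Illustrative only: the edit models below are assumptions for this sketch.
    """
    label, dot, rest = domain.partition(".")
    suffix = dot + rest
    variants = set()

    # Character omission: "internetsociety" -> "internetsocity"
    for i in range(len(label)):
        variants.add(label[:i] + label[i + 1:])

    # Adjacent character swap: "internetsociety" -> "internetsoceity"
    for i in range(len(label) - 1):
        variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])

    # Character duplication: "internetsociety" -> "internnetsociety"
    for i in range(len(label)):
        variants.add(label[:i + 1] + label[i] + label[i + 1:])

    # Swapping two identical letters reproduces the original; drop it.
    variants.discard(label)
    variants.discard("")
    return {v + suffix for v in variants}


if __name__ == "__main__":
    candidates = typo_candidates("internetsociety.org")
    print("internetsoceity.org" in candidates)
```

The example from the text, internetsoceity.org, is produced by the adjacent-swap rule. A real study would typically add further models, such as substituting neighbouring keys on the keyboard.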
What previous studies did not investigate, however, is how the content of typosquatting domains evolves over time. For instance, it would be interesting to know whether typosquatting domains are changing hands from abusive to legitimate owners or vice versa. These kinds of questions inspired us to perform a longitudinal study of the typosquatting landscape, in which we collected and analyzed the contents of the typosquatting domains of the Alexa top 500 websites over a period of 7 months. The main effort of the analysis consisted of classifying those typosquatting domains into content-based categories. Some types of content can be considered legitimate use of a typosquatting domain, while others are abusive. For instance, we consider two kinds of typosquatting domains legitimate: those that simply redirect their visitors back to the authoritative domain (so-called defensive registrations), and those with original content that just happen to reside on a typosquatting domain of a popular website. On the other hand, domains hosting ads, scams, adult content or malware were deemed abusive.
One result that was immediately clear from our analysis was that the vast majority of popular domains are victims of typosquatting: 95% of the 500 authoritative domains we investigated have at least one abusive typosquatting domain. On the other hand, only 31% of the domains have a defensive registration. Hence it seems that legitimate domain owners are either not aware of the problem or do not consider the risks great enough to proactively defend against it.
As for longitudinal results, we found that typosquatting domains are indeed changing hands between abusive and legitimate owners, albeit not in great numbers: of the 28,179 domains we investigated, only 63 changed from legitimate to malicious usage and 91 changed from malicious to legitimate usage during our data gathering period. Over roughly 30 weeks, this averages out to about 2 and 3 such transitions per week, respectively. On the other hand, if we look at the number of category transitions within the set of malicious categories, there are over 1,000 transitions per week. These are domains that appear to be diversifying their monetization strategy by switching, for example, from showing ads to hosting scams or vice versa.
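The following Python sketch illustrates how such transitions could be counted from a series of per-domain category observations, and checks the per-week averages. The category labels, the `history` data layout and the helper names are hypothetical, and the 7-month period is approximated as about 30 weeks; this is not the tooling used in the study.

```python
from collections import Counter

# Hypothetical category labels, grouped as described in the text.
LEGITIMATE = {"defensive", "original_content"}
ABUSIVE = {"ads", "scam", "adult", "malware"}


def kind(category):
    """Map a content category to 'legitimate', 'abusive' or 'other'."""
    if category in LEGITIMATE:
        return "legitimate"
    if category in ABUSIVE:
        return "abusive"
    return "other"


def count_transitions(history):
    """Count category changes between consecutive observations.

    history maps a domain name to its categories in chronological order.
    Returns a Counter keyed by (kind_before, kind_after).
    """
    counts = Counter()
    for categories in history.values():
        for before, after in zip(categories, categories[1:]):
            if before != after:
                counts[(kind(before), kind(after))] += 1
    return counts


# Made-up observations for three typosquatting domains.
history = {
    "exmaple.com": ["ads", "ads", "scam"],            # abusive -> abusive
    "examlpe.com": ["ads", "defensive"],              # abusive -> legitimate
    "exampel.com": ["original_content", "malware"],   # legitimate -> abusive
}
transitions = count_transitions(history)

# Per-week averages for the figures in the text (7 months is about 30 weeks).
WEEKS = 7 * 30.4 / 7
legit_to_abusive_per_week = 63 / WEEKS
abusive_to_legit_per_week = 91 / WEEKS
```

With these assumptions, the two averages come out at roughly 2 and 3 transitions per week, whereas changes that stay within the abusive categories are counted under the `("abusive", "abusive")` key.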
Another surprising result is that about 50% of all abusive typosquatting domains are hosted by only 4 companies. Three of these companies are, in some way or another, involved in the domain parking business. A typical parked domain contains no other content than automatically computed advertising banners and links, in an attempt to generate revenue for its owner.
In a second study, we explored the ecosystem around such parked domains and concluded that it is driven by so-called domainers: domain investors who hold large portfolios of domain names. These investors rely on separate domain parking services, which facilitate the monetization of domain names by hosting parked webpages. For this business to be lucrative, parked domains need to receive a lot of visitors.
In the past, before search engines were strongly incorporated into web browsers, surfing the web involved significantly more typing of domain names in a browser’s URL bar than it does today. For instance, users would try to “guess” the domain names of websites relevant to their needs. Thus, a user who is interested in finding “cheap gas” could either visit a search engine and search for that phrase or, alternatively, concatenate the two words, append the most popular TLD and attempt to visit the cheapgas.com website. It is also worth noting that, at the time, browsers were trying to “help” users by appending TLDs before giving up on a domain. That is, if the user typed in cheapgas and the domain did not resolve, the browser would automatically append the .com TLD and try again. In both scenarios, a user could land on a parked page by attempting to type the address of a website. The traffic resulting from this action was appropriately named “type-in” traffic and is, historically, the reason why domain parking services appeared. Most likely, the recent extensive search engine integration into browsers heavily reduced this regular type-in traffic for parking pages, resulting in a huge decline in profit for the domain parking industry.
The industry appears to be compensating for this decline by exploiting abusive domain names, such as typosquatting domains, as a new source of monetizable visitors. In our research, we found that currently 16% of parked domains are abusing existing trademarks. Moreover, 37% of these domains are displaying advertisements of a competitor. This means that when a user lands on such a trademark-abusing domain, not only does the trademark holder “lose” that user’s visit, but the user could also end up on the website of a competing company.
Besides displaying advertisements, parked domains are, in 7% of the cases we observed, monetized by redirecting visitors to entirely different websites, often hosting scams, malware and adult content. This indicates that the domain parking industry is at least partly responsible for the malicious monetization strategies that were also observed in our typosquatting study.
One way to protect end users as well as trademark owners against these practices is to block access to parked pages, or to alert users when they end up on a parked page during a browsing session. To this end, we developed a classifier that can detect parked domains based on page features such as the number of frames and iframes, the amount of third-party HTML content, and the number and length of links to third-party content.
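As an illustration of what such page features might look like in practice, the sketch below extracts a few of them from raw HTML using only the Python standard library. It covers feature extraction only, not the classifier itself. The class and function names are hypothetical, and treating any URL with a different hostname as "third-party" is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ParkedPageFeatures(HTMLParser):
    """Collect frame counts and third-party URLs from one HTML page."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host.lower()
        self.frames = 0
        self.third_party_links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("frame", "iframe"):
            self.frames += 1
        url = attributes.get("href") or attributes.get("src")
        if url:
            # Relative URLs have no hostname and count as first-party.
            host = urlparse(url).hostname
            if host and host != self.page_host:
                self.third_party_links.append(url)


def extract_features(html, page_host):
    """Return a small feature dictionary for the given page."""
    parser = ParkedPageFeatures(page_host)
    parser.feed(html)
    links = parser.third_party_links
    average_length = sum(len(u) for u in links) / len(links) if links else 0.0
    return {
        "frame_count": parser.frames,
        "third_party_link_count": len(links),
        "avg_third_party_link_length": average_length,
    }


if __name__ == "__main__":
    sample = (
        '<html><body>'
        '<iframe src="http://ads.example.net/banner"></iframe>'
        '<a href="/about">About</a>'
        '<a href="http://other.example.org/offer">Offer</a>'
        '</body></html>'
    )
    print(extract_features(sample, "cheapgas.example"))
```

Feature dictionaries of this kind could then be fed to any standard supervised learning algorithm, trained on pages labeled as parked or not parked.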
From these two studies, we learned that both typosquatting and domain parking are still actively practiced today, despite having been well known for over a decade. Although they can be considered separate phenomena, they are clearly heavily intertwined.