Ad Laundering: How Websites Deceive Advertisers into Rendering Ads Next to Illicit Content

Providing online content monetized via ads to users is a lucrative business. But what if the content is pirated or illicit, thus harming the brand safety of the advertiser? In this paper, we are the first to investigate Ad Laundering: a technique with which bad actors deceive advertisers by hiding illicit content within evidently lawful websites to monetize the generated traffic. We develop a client-side detection methodology to detect and analyze websites performing ad laundering. We describe in detail the techniques these websites use to cloak content, and provide estimations for the ad revenues they are able to collect on a monthly basis. Finally, we attribute the generated revenue to different traffic channels and establish that even popular brands have their ads rendered next to undesirable content.


INTRODUCTION
Digital advertising is what keeps the modern Web free of charge for browsing users. Publishers provide content that is monetized via ad impressions rendered within slots that advertisers buy on the website, next to the content the user came for. But what if the content is illicit or pirated?
There have been many such incidents in the past where, in programmatic advertising, ad impressions of premium brands were found to be placed on websites with illicit or even terrorist content [18]. To eliminate such incidents and preserve brand safety, advertisers and ad networks have deployed various tactics to vet and filter out websites whose content is not aligned with what the brand represents [2,5].
In this work, we study Ad Laundering [4]: an ad fraud technique that bad actors use in programmatic advertising to funnel ad inventory through various intermediaries and conceal its origin from advertisers, thus making them pay for low-quality or fraudulent placements. Specifically, we empirically study popular websites with pirated anime and manga content and investigate how publishers are able to launder their content through unrelated, legitimate-looking websites. We present the various techniques used to conceal the copyright-infringing content in such a way that it cannot be discovered by ad networks or search engine crawlers, which would otherwise block the website from being served ads. We investigate the traffic channels for these websites and estimate their potential revenues from ad laundering.
The contributions of this work include: (1) We develop a methodology to detect Ad Laundering in the wild, and visit and analyze 2,800 websites. We explore the mechanisms websites use to make content accessible only to selected visitors, as well as the domains they utilize. (2) We find that Ad Laundering is able to mislead even the most prominent ad networks (e.g., Google and Amazon) and obtain ads from respectable and highly popular brands. (3) We provide estimations and show that bad actors are able to extract thousands of dollars from the advertising ecosystem. Surprisingly, we discover that a portion of the generated revenue derives from unsuspecting visitors who stumble upon a seemingly benign website. (4) We make the tools and data used in this study publicly available to foster further research [1].

AD LAUNDERING OVERVIEW
Ad Laundering, as defined by IAB [4], is a technique that allows publishers to conceal illicit content within seemingly legitimate and benign websites while also monetizing it and generating ad revenue.
To achieve this, a publisher needs to control two distinct websites: an "index" website that announces the content, and a seemingly benign website serving as a "front". This "front" website has the sole purpose of hiding the true illicit content. One can discover the illicit content only if they are properly directed from the index website. Figure 1 illustrates the details of this technique. First, the user visits the "index" website in order to access illicit content (step 1). Such websites become popular through forums or other similar platforms and are often blocked in corporate networks or by national ISPs. When the user selects the content they want to access (step 2), they are redirected to a different domain (step 3). This domain acts as the front for the whole laundering operation. It is a website that serves actual content, completely unrelated to the illicit content. This website is able to monetize its content and receive ads from popular networks. However, when the user is redirected, the website modifies the current URL (step 4), i.e., the URL that appears in the address bar of the user's browser. Even though the user was initially redirected to tech-news.io/watch/movice-ABC, the website modifies the URL to appear as tech-news.io/news, while keeping the content of the page exactly the same. That is, the URL indicates that the user is exploring tech news while the content of the page is the pirated movie the user was promised.
Interestingly, if a third party (e.g., a verification service or an advertiser) directly visits the URL in the address bar (i.e., tech-news.io/news), they will observe legitimate and legal content and not the illicit content. Similarly, users who want to share the URL of the illicit content will be disappointed to find that the link they shared leads to completely different content. As a result, the publishers are able to hide the illegal content, and the only way to access it is to know the correct initial URL of step 3 (i.e., tech-news.io/watch/movice-ABC). When unsuspecting users, commercial licensing entities, search engine crawlers or government agencies visit the website directly and navigate through it, they will not be able to access or find the illicit content in any way.
The essence of the ad laundering technique is that the illicit content can be accessed only if a user visits the website the "correct way". There is anecdotal evidence [17] that the ad laundering practice takes place on the Web, and even ordinary users have taken notice of this technique [23]. Indeed, a cautious user will observe that even though they initially visited piratedmovies.com to access illicit content, they have been redirected to tech-news.io, an evidently unrelated website. In this scenario, both the "front" and the "index" websites are controlled by the bad actor. We anticipate that in most cases, bad actors own, control and operate both websites. In fact, in Section 3 we find that some "index" websites alternate the "front" websites they use so as not to attract attention. However, there is still the possibility that the "front" website is indeed benign and that it has been compromised. In fact, similar cases have already been reported, where benign WordPress websites were infected with pharmaceutical products [16].

AD LAUNDERING TECHNIQUES
Investigating this technique suggests that Ad Laundering is common among anime or manga websites, with a publisher even publicly admitting they use this technique to host and monetize pirated content [22]. We use this publisher as a case study and scrutinize their website to better understand its behavior. To that end, we form a list of anime, manga and piracy websites by merging SimilarWeb's top "Animation and Comics" websites [13], NextDNS' piracy blocklists [19], as well as some open-source anime- or manga-related projects. We accumulate a list of 2,819 websites and make it publicly available [1]. Using an open-source crawling tool [21], we crawl these websites in November 2022 and perform a random walk on a sample of subpages. That is, we emulate the behavior of an ordinary user and navigate through each website by clicking on 20 random internal URLs and 20 random external URLs. We then revisit each URL directly, with a clean browser state, to discover whether there are any major differences in the content served by the website. The tool automatically captures the HTML code, network traffic, and JavaScript code execution of websites. This allows us to better understand how they are able to circumvent the policies and restrictions of popular ad networks.
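The comparison between the page reached through the random walk and the page obtained by a direct, clean-state revisit can be sketched as follows. This is only a minimal illustration under stated assumptions, not the logic of the crawling tool [21]: the function names, the word-level similarity measure and the 0.5 threshold are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_words(html):
    """Return the lower-cased words a visitor would see on the page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks).lower().split()


def content_similarity(html_a, html_b):
    """Word-level similarity in [0, 1]; 1.0 means identical visible text."""
    matcher = SequenceMatcher(None, visible_words(html_a),
                              visible_words(html_b), autojunk=False)
    return matcher.ratio()


def looks_laundered(html_via_index, html_direct, threshold=0.5):
    """Flag a URL whose referred and direct visits show very different text.

    The threshold is a hypothetical cut-off, not a value from this study.
    """
    return content_similarity(html_via_index, html_direct) < threshold
```

A flagged URL is only a candidate: pages with rotating or personalized content can also differ between two visits, so manual inspection would still be needed.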
JavaScript-based Laundering: Analysing the captured data reveals that multiple websites make use of the history.replaceState native JavaScript method to "hide" the URL that hosts the illegal content (i.e., step 4 of Figure 1). This JavaScript method updates the current URL that the browser displays. Listing 1 contains a snippet of JavaScript code that fronts use in order to randomly modify the current URL to point to benign and legal content. It is interesting to notice that if the user refreshes the page, they will no longer have access to the illicit content. This is because the history.replaceState method has modified the address bar URL and the browser will issue a new request towards the benign content. This technique is trivial to implement, compared to cloaking software that would cost thousands of dollars [11]. We detect 6 "index" websites and 8 "front" websites involved in ad laundering. In particular, the websites animeow.me, animeowl.net, animeowl.us and lightnovels.me deploy this technique in order to serve copyrighted anime and manga content. In fact, when users visit these websites, they are presented with a disclaimer that the video player will open on a different domain. All of these websites seem to be operated by the same entity and use either pandapama.com or portablegaming.co as a front that hides the illicit content. We also observe websites that use different fronts and regularly change the domain they redirect to. For example, mangago.me uses 3 distinct fronts (lady-first.cc, lady-first.me and you-him-me.com) to hide illicit content and might redirect the user to a different one every time they try to access it. Similarly, vyvymanga.net uses aovheroes.com, summonersky.com and aov-news.com as fronts.
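One way a crawler could flag this behavior from its captured data is sketched below. It is a heuristic written for illustration, not the detection logic used in this study; both function names and the shape of the inputs (script source as a string, a list of requested document URLs) are assumptions. The sketch relies on the fact that history.replaceState can only switch to a same-origin URL and does not trigger a new document request.

```python
import re
from urllib.parse import urlsplit

# Matches calls such as history.replaceState(...) or window.history.pushState(...).
_HISTORY_CALL = re.compile(r"\bhistory\s*\.\s*(replaceState|pushState)\s*\(")


def find_history_rewrites(script_source):
    """Return the History API methods that a captured script calls."""
    return [match.group(1) for match in _HISTORY_CALL.finditer(script_source)]


def address_bar_rewritten(requested_url, displayed_url, document_requests):
    """Check whether the address bar changed without a real navigation.

    requested_url     -- URL the browser was sent to (step 3)
    displayed_url     -- URL shown in the address bar after load (step 4)
    document_requests -- URLs the browser actually requested as documents
    """
    requested, displayed = urlsplit(requested_url), urlsplit(displayed_url)
    same_origin = ((requested.scheme, requested.netloc)
                   == (displayed.scheme, displayed.netloc))
    path_changed = ((requested.path, requested.query)
                    != (displayed.path, displayed.query))
    navigated = displayed_url in document_requests
    return same_origin and path_changed and not navigated


# Example with the running example from Section 2.
sample_script = 'history.replaceState(null, "", "/news");'
print(find_history_rewrites(sample_script))
print(address_bar_rewritten(
    "https://tech-news.io/watch/movice-ABC",
    "https://tech-news.io/news",
    ["https://tech-news.io/watch/movice-ABC"],
))
```

Neither signal is conclusive on its own, since benign single-page applications use the History API routinely. The signals are more useful when combined with a content comparison between the referred and the direct visit.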
HTTP-based Laundering: Furthermore, we discover that some websites prefer to implement the same technique on the server side using HTTP mechanisms. Specifically, when the user chooses what illicit content they want to access (i.e., step 2 of Figure 1), an HTTP request is issued towards the front website with the referer header set accordingly. When the server of the laundering website detects a valid referer field, it returns a different response. We discover that the server returns different content based on the referer field, and the user can access the copyrighted content only if the original "index" website is the referer (i.e., step 3 of Figure 1). As a result, if any user or third party (e.g., a verification service) attempts to directly visit the same URL, they will observe a completely different page simply because they do not have the correct referer field.
We discover a cluster of websites that employ this technique and seem to be part of the same network. Specifically, the websites tenmanga.com, nineanime.com, es.ninemanga.com, novelcool.com, taadd.com and niadd.com all serve anime- or manga-related content and use HTTP-based Ad Laundering to monetize their content. These websites use glanceoflife.com, yyzzbaby.com, technologpython.com and greenhomestyle.com as fronts, which hide the copyrighted or adult content. In fact, at first glance, all of these "front" websites are seemingly benign and contain interesting articles.
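The server-side decision described above can be modeled in a few lines. This is an illustrative sketch only: the server code of these websites was not observed, and the function name, the return labels and the use of piratedmovies.com (the hypothetical index domain from the running example in Section 2) are assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical index domain, taken from the running example in Section 2.
INDEX_HOSTS = {"piratedmovies.com"}


def select_page(headers):
    """Model of a front server choosing a response from the Referer header.

    Returns "concealed" when the request was referred by a known index
    website (or one of its subdomains) and "benign" otherwise.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    referer = normalized.get("referer", "")
    host = urlsplit(referer).hostname or ""
    for index_host in INDEX_HOSTS:
        if host == index_host or host.endswith("." + index_host):
            return "concealed"
    return "benign"
```

Under this model, a crawler or verification service that requests the same URL without the header, or with a search engine as the referer, always receives the benign page. This suggests that a vetting crawler would need to replay the request with the index website as the referer to observe the concealed content.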

TRAFFIC CHANNELS
Using SimilarWeb's analytics portal [13], we extract traffic and performance data for all websites discussed in Section 3. First, we study the popularity rank of these websites and find they are extremely popular. For instance, we find that mangago.me, a website that serves manga comic books, is ranked 390th across all websites globally. Surprisingly, we observe that in some cases the "front" website has a higher popularity rank than the "index" website, suggesting that laundering websites also have unsuspecting visitors who actually consume their benign content.
An interesting question is how visitors come across these websites. One would assume that since they are very popular, users would visit them directly. This is, indeed, the case for most of the "index" websites that announce the illicit content. For instance, 91.17% of taadd.com and 84.67% of animeow.me visitors are attributed to direct traffic. However, the behavior of laundering websites is markedly different. The majority of traffic towards laundering websites comes from referrals. For instance, 84.26% of lady-first.cc's traffic comes from referring websites. This is in line with the behavior of ad laundering, since the only way to access the illicit content is by knowing the correct URL (Section 2). This strengthens our finding that they mainly serve as "fronts". They hide illicit content from unsuspecting visitors, while the great majority of visitors are aware of their true operations.
Finally, we evaluate whether the industry is able to detect these cases of ad abuse and handle them. To that end, we feed the detected domains (both the domains that announce the illicit content and the domains that act as benign "fronts") to popular URL safety checking services, including VirusTotal, Norton Safe Web, Google Safe Browsing, SSLTrust and Trend Micro Site Safety Center. We find that half of the "index" websites that announce illicit content are flagged by at least one service. As expected, security experts have identified that these websites are involved in Web piracy operations and advise caution when visiting them. On the other hand, we find that most of these services do not flag the "front" websites that actually serve the illicit content, and only Norton Safe Web advises caution for three of them. These websites are doing an excellent job of staying underground and appearing benign not only to ordinary users, but also to security services.

REVENUE GENERATION
Manually visiting laundering websites reveals that the "fronts" are indeed displaying ads. We identify ads being served by popular ad networks such as Google, Outbrain, Amazon Ads and Criteo. These advertisements came from very popular and acclaimed brands, including Qatar Airways, Philips, Dell, LG, Toyota, T-Mobile, and Adidas. To add insult to injury, we even find ads from streaming services, such as Disney Plus, Amazon Prime and Paramount Plus, appearing next to illicit content. We provide some example screenshots in [1]. These websites are not only able to monetize pirated content, but also attract prestigious advertisers.
Most importantly, we investigate how much financial damage ad laundering inflicts on the advertising ecosystem, and whether these websites are able to drain ad budgets. First, we study the number of visits from either desktop or mobile clients and find that many of these websites have millions of visits on a monthly basis. For example, during February 2023, mangago.me had 59.7M visits, and novelcool.com had 11.1M visits. This is in line with the findings about the rank of websites and simply indicates that users search for pirated content. More importantly, we also find laundering websites with millions of monthly visits, with yyzzbaby.com having 10.9M visits in February 2023 and lady-first.me 9.3M. Such websites manage to monetize illicit content and generate substantial revenue from it. We provide some popularity data for the most popular laundering websites in Table 1. It is important to note that the revenue these laundering websites generate is partly due to unsuspecting visitors, unaware that the website is actually a front.
Previous work has demonstrated that a website with 500K daily visitors and an average of 2 pages per visit can generate almost $10,000 per month [14]. Considering that multiple websites in our dataset have over 10 pages per visit and millions of visits per month (e.g., lady-first.me with 9.3M visits and 25 pages per visit), one can readily conclude that these websites are generating tens of thousands of USD on a monthly basis. Taking the geographic value of the audience into consideration, and given that for most laundering websites visitors come from the United States, there are estimates that ad revenue can reach hundreds of thousands of dollars [10].
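The back-of-the-envelope reasoning above can be made explicit with a short calculation. The sketch below is a rough linear extrapolation, not the estimation model of [14] or [10]: it assumes that revenue scales proportionally with page views, that a month has 30 days, and that "daily visitors" in the baseline can be treated as daily visits. The function and constant names are hypothetical.

```python
# Baseline reported by previous work [14].
BASELINE_DAILY_VISITS = 500_000
BASELINE_PAGES_PER_VISIT = 2
BASELINE_MONTHLY_REVENUE_USD = 10_000

DAYS_PER_MONTH = 30  # assumption made for this sketch


def estimate_monthly_revenue(monthly_visits, pages_per_visit):
    """Scale the baseline revenue linearly with monthly page views."""
    baseline_pageviews = (BASELINE_DAILY_VISITS * DAYS_PER_MONTH
                          * BASELINE_PAGES_PER_VISIT)
    site_pageviews = monthly_visits * pages_per_visit
    return site_pageviews * BASELINE_MONTHLY_REVENUE_USD / baseline_pageviews


# lady-first.me: 9.3M visits in February 2023, 25 pages per visit.
print(estimate_monthly_revenue(9_300_000, 25))
```

Under these assumptions, lady-first.me comes out at roughly $77,500 per month, which agrees with the order of magnitude stated above. The figure ignores audience geography, ad density and ad-blocker usage, all of which can move the real value substantially in either direction.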

RELATED WORK
Online advertising is a profitable ecosystem, thus attracting bad actors that attempt to abuse it and generate ephemeral revenue [3]. Techniques like Domain Spoofing [6], Click-Jacking [28], Ad Stacking [25] and Ad Injection [26] are often used by bad actors to divert money from the ad ecosystem. Additionally, the value of a user can differ based on their characteristics [20], and publishers can increase their revenue by loading counterfeit content [15]. In [27], the authors discuss how Cloaking can disguise scams from search engines and build a Cloaking detection framework. In [11], the authors study how Web Cloaking can hide malicious content from entities such as vetting crawlers or ad scanners. Similarly, in [29], the authors study how phishing campaigns can use Web Cloaking to effectively evade detection. Finally, various Cloaking detection mechanisms have been proposed over the years (e.g., [8,12]), as well as countermeasures [9,24] for detecting compromised websites [7]. Similar to Ad Laundering, Web Cloaking attempts to conceal the true content of a website [27]. In Web Cloaking, organic visitors see the harmful content while search engine or security crawlers are served benign content. In this work, we study content cloaking from a different angle that requires the control of two distinct websites. Vetting entities and unsuspecting visitors are served benign content, and only properly transferred users can access the concealed content.

CONCLUSION
In this work, we study Ad Laundering, an ad fraud technique that bad actors use to funnel ad inventory through a network of intermediaries and obscure its true origin. This way, they deceive advertisers into paying for low-quality ad slots. We explore two different implementation mechanisms and disclose cases of websites that employ this technique to monetize anime- or manga-related content. We study the traffic towards "front" websites and show that they are extremely popular, with the majority of visitors coming to consume concealed content. Finally, we demonstrate that this operation can empower bad actors to generate substantial revenue from ad impressions, and that ads come from respectable brands and are served by legitimate ad networks such as Google and Amazon. Evidently, bad actors have found methods to overcome the restrictions of ad networks and drain the budget of advertisers. To make matters worse, these bad actors pose an important threat to the reputation of advertisers, who pay substantial amounts of money to reach specific audiences. Using ad laundering, bad actors mislead advertisers into displaying ads next to unrelated or undesirable content.

Table 1: Performance metrics of ad laundering websites in our dataset, extracted from SimilarWeb. The column "PPV" contains the average number of pages per visit.