Date: 2022-02-06

Check out the shiny new web UI: https://www.thirdpartyweb.today/

Data on third party entities and their impact on the web.

This document is a summary of which third party scripts are most responsible for excessive JavaScript execution on the web today.

Goals

1. Quantify the impact of third party scripts on the web.
2. Identify the third party scripts on the web that have the greatest performance cost.
3. Give developers the information they need to make informed decisions about which third parties to include on their sites.
4. Incentivize responsible third party script behavior.
5. Make this information accessible and useful.

Methodology

HTTP Archive is an initiative that tracks how the web is built. Every month, ~4 million sites are crawled with Lighthouse on mobile. Lighthouse breaks down the total script execution time of each page and attributes the execution to a URL. Using BigQuery, this project aggregates the script execution to the origin level and assigns each origin to the responsible entity.
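As a rough illustration of the aggregation step described above, the sketch below groups per-URL execution times by origin and then by entity. The input rows, the `originToEntity` lookup, the `.test` hostnames, and all numbers are made up for this example; the real pipeline runs in BigQuery over Lighthouse results and may differ in detail.

```javascript
// Hypothetical per-script rows: one entry per script URL per crawled page.
const rows = [
  {page: 'https://example.com/', url: 'https://cdn.widgets.test/a.js', ms: 120},
  {page: 'https://example.com/', url: 'https://cdn.widgets.test/b.js', ms: 80},
  {page: 'https://example.org/', url: 'https://cdn.widgets.test/a.js', ms: 100},
  {page: 'https://example.org/', url: 'https://stats.tracker.test/t.js', ms: 50},
]

// Hypothetical origin -> entity lookup (the project keeps its real mapping in data/entities.js).
const originToEntity = new Map([
  ['https://cdn.widgets.test', 'Widgets Inc'],
  ['https://stats.tracker.test', 'Tracker Co'],
])

function aggregateByEntity(rows, originToEntity) {
  const totals = new Map()
  for (const {page, url, ms} of rows) {
    // Collapse the script URL to its origin, then look up the owning entity.
    const origin = new URL(url).origin
    const entity = originToEntity.get(origin) || 'Unknown'
    const entry = totals.get(entity) || {totalMs: 0, pages: new Set()}
    entry.totalMs += ms
    entry.pages.add(page)
    totals.set(entity, entry)
  }
  return totals
}

const totals = aggregateByEntity(rows, originToEntity)
for (const [entity, {totalMs, pages}] of totals) {
  console.log(entity, totalMs, 'ms across', pages.size, 'pages')
}
```

Tracking pages in a `Set` keeps the page count independent of how many scripts an entity ships per page.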

NPM Module

The entity classification data is available as an NPM module.

    const {getEntity} = require('third-party-web')
    const entity = getEntity('https://d36mpcpuzc4ztk.cloudfront.net/js/visitor.js')
    console.log(entity)

2021-01-01 dataset

Due to a change in HTTP Archive measurement that temporarily disabled site isolation (out-of-process iframes), all of the third parties whose work previously took place off the main thread are now counted on the main thread (and thus appear in our stats). This is most evident in the change to Google-owned properties such as YouTube and Doubleclick, whose complete cost is now captured.

2019-05-13 dataset

A shortcoming of the attribution approach has been fixed. Total usage is now reported based on the number of pages in the dataset that use the third-party, not the number of scripts. Correspondingly, all average impact times are now reported per page rather than per script. Previously, a third party could appear to have a lower impact or be more popular simply by splitting their work across multiple files.

Third-parties that performed most of their work from a single script should see little to no impact from this change, but some entities have seen significant ranking movement. Hosting providers that host entire pages are, understandably, the most affected.

Some notable changes below:

| Third-Party | Previously (per-script) | Now (per-page) |
| --- | --- | --- |
| Beeketing | 137 ms | 465 ms |
| Sumo | 263 ms | 798 ms |
| Tumblr | 324 ms | 1499 ms |
| Yandex APIs | 393 ms | 1231 ms |
| Google Ads | 402 ms | 1285 ms |
| Wix | 972 ms | 5393 ms |
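To see why a per-script average could understate impact, here is a minimal sketch with made-up numbers: one third party does 300 ms of work on each of two pages, split across three files per page.

```javascript
// Hypothetical execution times (ms) for one third party on two pages.
// On each page the work is split across three script files.
const pages = [
  {scriptsMs: [100, 100, 100]},
  {scriptsMs: [150, 100, 50]},
]

const allScripts = pages.flatMap((page) => page.scriptsMs)
const totalMs = allScripts.reduce((sum, ms) => sum + ms, 0)

// Old method: divide by the number of scripts.
const perScriptAverage = totalMs / allScripts.length
// Current method: divide by the number of pages that use the third party.
const perPageAverage = totalMs / pages.length

console.log({perScriptAverage, perPageAverage})
```

With these inputs the per-script figure is a third of the per-page figure, even though the cost to each page is the same.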

2019-05-06 dataset

Google Ads clarified that www.googletagservices.com serves more ad scripts than generic tag management, and it has been reclassified accordingly. This has dropped the overall Tag Management share considerably back down to its earlier position.

2019-03-01 dataset

Almost 2,000 entities tracked now across ~3,000+ domains! Huge props to @simonhearne for making this massive increase possible. Tag Managers have now been split out into their own category since they represented such a large percentage of the "Mixed / Other" category.

2019-02-01 dataset

Huge props to WordAds for reducing their impact from ~2.5s to ~200ms on average! A few entities are showing considerably less data this cycle (Media Math, Crazy Egg, DoubleVerify, Bootstrap CDN). Perhaps they've added new CDNs/hostnames that we haven't identified or the basket of sites in HTTPArchive has shifted away from their usage.

Data

Summary

Across the top ~4 million sites, ~2700 origins account for ~57% of all script execution time, with the top 50 entities alone accounting for ~47%. Third party script execution makes up the majority of script execution on the web today, and it's important to make informed choices.

How to Interpret

Each entity has a number of data points available.

- Usage (Total Number of Occurrences) - how many pages included scripts from their origins
- Total Impact (Total Execution Time) - how many seconds were spent executing their scripts across the web
- Average Impact (Average Execution Time) - on average, how many milliseconds were spent executing their scripts per page
- Category - what type of script is this

Third Parties by Category

This section breaks down third parties by category. The third parties in each category are ranked from first to last based on the average impact of their scripts. Perhaps the most important comparisons lie here. You always need to pick an analytics provider, but at least you can pick the most well-behaved analytics provider.

Overall Breakdown

Unsurprisingly, ads account for the largest identifiable chunk of third party script execution.

Advertising

These scripts are part of advertising networks, either serving or measuring.

Analytics

These scripts measure or track users and their actions. There's a wide range in impact here depending on what's being tracked.

Social

These scripts enable social features.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | AddToAny | 42,529 | 80 ms |
| 2 | Pinterest | 124,652 | 103 ms |
| 3 | Shareaholic | 1,021 | 108 ms |
| 4 | reddit | 1,166 | 147 ms |
| 5 | LinkedIn | 14,592 | 221 ms |
| 6 | Facebook | 2,084,243 | 255 ms |
| 7 | TikTok | 66,563 | 308 ms |
| 8 | AddShoppers | 1,547 | 313 ms |
| 9 | ShareThis | 104,092 | 315 ms |
| 10 | Twitter | 286,904 | 343 ms |
| 11 | Kakao | 28,534 | 396 ms |
| 12 | Instagram | 6,010 | 828 ms |
| 13 | AddThis | 119,408 | 974 ms |
| 14 | SocialShopWave | 3,403 | 1451 ms |
| 15 | VK | 40,210 | 1473 ms |
| 16 | PIXNET | 15,332 | 2110 ms |
| 17 | Tumblr | 14,801 | 2418 ms |
| 18 | LiveJournal | 4,814 | 2939 ms |

Video

These scripts enable video player and streaming functionality.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Twitch | 1,019 | 56 ms |
| 2 | Vimeo | 55,804 | 356 ms |
| 3 | Brightcove | 12,697 | 1261 ms |
| 4 | Wistia | 15,065 | 2276 ms |
| 5 | YouTube | 559,091 | 3195 ms |

Developer Utilities

These scripts are developer utilities (API clients, site monitoring, fraud detection, etc).

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Accessibe | 4,829 | 73 ms |
| 2 | Siteimprove | 5,272 | 80 ms |
| 3 | Seznam | 14,432 | 82 ms |
| 4 | iovation | 1,869 | 100 ms |
| 5 | Cloudflare | 78,437 | 107 ms |
| 6 | New Relic | 97,453 | 114 ms |
| 7 | iubenda | 34,029 | 120 ms |
| 8 | Key CDN | 3,497 | 125 ms |
| 9 | Klevu Search | 1,402 | 137 ms |
| 10 | Highcharts | 1,716 | 159 ms |
| 11 | Foxentry | 1,063 | 160 ms |
| 12 | TrustArc | 3,845 | 175 ms |
| 13 | Hexton | 22,100 | 181 ms |
| 14 | LightWidget | 7,864 | 198 ms |
| 15 | OneSignal | 62,786 | 225 ms |
| 16 | Riskified | 1,092 | 236 ms |
| 17 | Cookiebot | 55,639 | 240 ms |
| 18 | GitHub | 3,439 | 241 ms |
| 19 | Bold Commerce | 16,235 | 263 ms |
| 20 | Swiftype | 1,166 | 264 ms |
| 21 | Cookie-Script.com | 5,083 | 266 ms |
| 22 | Trusted Shops | 13,944 | 274 ms |
| 23 | Other Google APIs/SDKs | 1,297,162 | 285 ms |
| 24 | Affirm | 4,681 | 293 ms |
| 25 | Google reCAPTCHA | 8,854 | 329 ms |
| 26 | GetSiteControl | 3,069 | 391 ms |
| 27 | WisePops | 2,851 | 404 ms |
| 28 | Fastly | 24,865 | 425 ms |
| 29 | Amazon Pay | 3,928 | 426 ms |
| 30 | Forter | 1,563 | 470 ms |
| 31 | AppDynamics | 2,029 | 470 ms |
| 32 | PayPal | 28,366 | 489 ms |
| 33 | Mapbox | 9,079 | 510 ms |
| 34 | GoDaddy | 22,215 | 568 ms |
| 35 | Google Maps | 657,418 | 576 ms |
| 36 | Bugsnag | 6,014 | 594 ms |
| 37 | Sentry | 21,964 | 613 ms |
| 38 | Luigi’s Box | 1,270 | 638 ms |
| 39 | Stripe | 46,463 | 738 ms |
| 40 | MaxCDN Enterprise | 7,027 | 980 ms |
| 41 | Vidyard | 1,331 | 1204 ms |
| 42 | Secomapp | 4,078 | 1415 ms |
| 43 | Yandex APIs | 18,529 | 2056 ms |
| 44 | Freshchat | 5,647 | 2340 ms |
| 45 | Rambler | 11,257 | 3045 ms |
| 46 | Esri ArcGIS | 1,848 | 3958 ms |
| 47 | POWr | 23,595 | 4094 ms |

Hosting Platforms

These scripts are from web hosting platforms (WordPress, Wix, Squarespace, etc). Note that in this category, these scripts can sometimes be the entirety of the script on the page, and so the "impact" rank might be misleading. In the case of WordPress, this just indicates the libraries hosted and served by WordPress, not all sites using self-hosted WordPress.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Blogger | 88,978 | 177 ms |
| 2 | Civic | 2,905 | 224 ms |
| 3 | WordPress | 175,204 | 537 ms |
| 4 | Ecwid | 3,126 | 888 ms |
| 5 | Dealer | 1,449 | 1033 ms |
| 6 | Shopify | 224,160 | 1831 ms |
| 7 | Tilda | 22,245 | 2052 ms |
| 8 | Squarespace | 69,369 | 2083 ms |
| 9 | Weebly | 21,559 | 2214 ms |
| 10 | Salesforce Commerce Cloud | 3,278 | 2441 ms |
| 11 | Hatena Blog | 21,310 | 2805 ms |
| 12 | Wix | 139,882 | 3086 ms |
| 13 | WebsiteBuilder.com | 1,408 | 4106 ms |

Marketing

These scripts are from marketing tools that add popups/newsletters/etc.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Madison Logic | 1,111 | 69 ms |
| 2 | DemandBase | 1,849 | 89 ms |
| 3 | Beeketing | 2,738 | 142 ms |
| 4 | Albacross | 2,025 | 149 ms |
| 5 | iZooto | 1,724 | 152 ms |
| 6 | Pardot | 1,516 | 160 ms |
| 7 | Sojern | 1,060 | 231 ms |
| 8 | Listrak | 1,207 | 277 ms |
| 9 | Judge.me | 21,552 | 323 ms |
| 10 | Mailchimp | 34,723 | 324 ms |
| 11 | Hubspot | 75,834 | 397 ms |
| 12 | RD Station | 15,819 | 407 ms |
| 13 | Yotpo | 18,100 | 501 ms |
| 14 | OptinMonster | 4,681 | 525 ms |
| 15 | Wishpond Technologies | 1,066 | 566 ms |
| 16 | PureCars | 2,680 | 1052 ms |
| 17 | Sumo | 14,134 | 1251 ms |
| 18 | Bigcommerce | 12,867 | 1808 ms |
| 19 | Drift | 6,275 | 3348 ms |
| 20 | Tray Commerce | 7,409 | 8661 ms |

Customer Success

These scripts are from customer support/marketing providers that offer chat and contact solutions. These scripts are generally heavier in weight.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | SnapEngage | 1,313 | 69 ms |
| 2 | Foursixty | 1,777 | 158 ms |
| 3 | BoldChat | 1,544 | 165 ms |
| 4 | Tidio Live Chat | 24,408 | 218 ms |
| 5 | Pure Chat | 4,593 | 287 ms |
| 6 | LiveTex | 1,748 | 329 ms |
| 7 | LivePerson | 3,974 | 423 ms |
| 8 | Comm100 | 1,146 | 644 ms |
| 9 | Intercom | 18,411 | 672 ms |
| 10 | Smartsupp | 19,185 | 729 ms |
| 11 | iPerceptions | 3,842 | 729 ms |
| 12 | LiveChat | 22,979 | 786 ms |
| 13 | Help Scout | 2,980 | 806 ms |
| 14 | Tawk.to | 79,685 | 865 ms |
| 15 | Jivochat | 57,192 | 986 ms |
| 16 | ContactAtOnce | 1,454 | 1005 ms |
| 17 | Olark | 6,986 | 1137 ms |
| 18 | ZenDesk | 69,695 | 1166 ms |
| 19 | Dynamic Yield | 1,420 | 2263 ms |

Content & Publishing

These scripts are from content providers or publishing-specific affiliate tracking.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Accuweather | 1,067 | 127 ms |
| 2 | CPEx | 1,271 | 154 ms |
| 3 | SnapWidget | 9,852 | 223 ms |
| 4 | OpenTable | 3,672 | 263 ms |
| 5 | Booking.com | 2,002 | 267 ms |
| 6 | Covert Pics | 2,063 | 326 ms |
| 7 | Tencent | 5,409 | 331 ms |
| 8 | Revcontent | 1,027 | 450 ms |
| 9 | AMP | 72,557 | 861 ms |
| 10 | Embedly | 3,969 | 1200 ms |
| 11 | Hotmart | 1,355 | 1298 ms |
| 12 | Spotify | 4,933 | 1606 ms |
| 13 | SoundCloud | 4,288 | 1999 ms |
| 14 | issuu | 2,112 | 2285 ms |
| 15 | Dailymotion | 3,301 | 6711 ms |
| 16 | Medium | 5,866 | 15098 ms |

CDNs

These are a mixture of publicly hosted open source libraries (e.g. jQuery) served over different public CDNs and private CDN usage. This category is unique in that the origin may have no responsibility for the performance of what's being served. Note that rank here does not imply one CDN is better than the other. It simply indicates that the scripts being served from that origin are lighter/heavier than the ones served by another.

Tag Management

These scripts tend to load lots of other scripts and initiate many tasks.

Consent Management Provider

IAB Consent Management Providers are the 'Cookie Consent' popups used by many publishers. They're invoked for every page and sit on the critical path between a page loading and adverts being displayed.

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | Consent Manager CMP | 3,984 | 260 ms |
| 2 | Optanon | 55,110 | 301 ms |
| 3 | Quantcast Choice | 26,290 | 434 ms |

Mixed / Other

These are miscellaneous scripts delivered via a shared origin with no precise category or attribution. Help us out by identifying more origins!

| Rank | Name | Usage | Average Impact |
| --- | --- | --- | --- |
| 1 | ResponsiveVoice | 2,753 | 67 ms |
| 2 | ReadSpeaker | 2,591 | 101 ms |
| 3 | Skype | 1,105 | 213 ms |
| 4 | Browsealoud | 1,449 | 263 ms |
| 5 | Amazon Web Services | 67,304 | 268 ms |
| 6 | Parking Crew | 2,761 | 444 ms |
| 7 | Calendly | 2,707 | 711 ms |
| 8 | Polyfill service | 2,106 | 920 ms |
| 9 | Heroku | 10,912 | 2638 ms |
| 10 | uLogin | 1,834 | 2704 ms |

Third Parties by Total Impact

This section highlights the entities responsible for the most script execution across the web. This helps inform which improvements would have the largest total impact.

Future Work

1. Introduce URL-level data for more fine-grained analysis, i.e. which libraries from Cloudflare/Google CDNs are most expensive.
2. Expand the scope, i.e. include more third parties and have greater entity/category coverage.

FAQs

I don't see entity X in the list. What's up with that?

This can be for one of several reasons:

- The entity does not have references to their origin on at least 50 pages in the dataset.
- The entity's origins have not yet been identified. See How can I contribute?

What is "Total Occurrences"?

Total Occurrences is the number of pages on which the entity is included.

How is the "Average Impact" determined?

The HTTP Archive dataset includes Lighthouse reports for each URL on mobile. Lighthouse has an audit called "bootup-time" that summarizes the amount of time that each script spent on the main thread. The "Average Impact" for an entity is the total execution time of scripts whose domain matches one of the entity's domains divided by the total number of pages that included the entity.

Average Impact = Total Execution Time / Total Occurrences
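As a worked instance of the formula above, the sketch below uses a hypothetical entity; the helper name `averageImpactMs` and the figures are assumptions for illustration, not values from the dataset. Note the unit conversion: Total Impact is reported in seconds, Average Impact in milliseconds.

```javascript
// Average Impact = Total Execution Time / Total Occurrences
function averageImpactMs(totalExecutionTimeMs, totalOccurrences) {
  if (totalOccurrences <= 0) {
    throw new RangeError('totalOccurrences must be positive')
  }
  return totalExecutionTimeMs / totalOccurrences
}

// Hypothetical entity: 1,250 s of total execution time across 5,000 pages.
const totalExecutionTimeSeconds = 1250
const totalOccurrences = 5000
const averageMs = averageImpactMs(totalExecutionTimeSeconds * 1000, totalOccurrences)

console.log(averageMs) // 250
```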

How does Lighthouse determine the execution time of each script?

Lighthouse's bootup-time audit attempts to attribute all top-level main-thread tasks to a URL. A main-thread task is attributed to the first script URL found in the stack. If you're interested in helping us improve this logic, see Contributing for details.
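The "first script URL found in the stack" rule can be sketched as follows. The task shape (`durationMs`, `stackUrls`) and the `'Unattributable'` placeholder label are assumptions for this illustration and are not Lighthouse's actual data model or code.

```javascript
// Hypothetical top-level tasks. `stackUrls` lists the URLs seen in the
// task's stack in the order they were found; '' stands for a frame with no URL.
const tasks = [
  {durationMs: 40, stackUrls: ['https://cdn.widgets.test/a.js', 'https://example.com/app.js']},
  {durationMs: 25, stackUrls: ['', 'https://example.com/app.js']},
  {durationMs: 10, stackUrls: []},
]

function attributeTasks(tasks) {
  const msByUrl = new Map()
  for (const {durationMs, stackUrls} of tasks) {
    // Use the first non-empty script URL; tasks without one get a placeholder.
    const url = stackUrls.find((candidate) => candidate !== '') || 'Unattributable'
    msByUrl.set(url, (msByUrl.get(url) || 0) + durationMs)
  }
  return msByUrl
}

const attributed = attributeTasks(tasks)
console.log(attributed)
```

In this sketch the whole task duration goes to a single URL, so a first-party script that is called from a third-party script would be counted against the third party.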

The data for entity X seems wrong. How can it be corrected?

1. Verify that the origins in data/entities.js are correct. Most issues will simply be the result of mislabelling of shared origins. If everything checks out, there is likely no further action and the data is valid.
2. If you still believe there are errors, file an issue to discuss further.

How can I contribute?

Only about 90% of the third party script execution has been assigned to an entity. We could use your help identifying the rest! See Contributing for details.

Contributing

Thanks

A huge thanks to @simonhearne and @soulgalore for their assistance in classifying additional domains!

Updating the Entities

The domain->entity mapping can be found in data/entities.js. Adding a new entity is as simple as adding a new array item with the following form.

    {
      "name": "Facebook",
      "homepage": "https://www.facebook.com",
      "categories": ["social"],
      "domains": ["*.facebook.com", "*.fbcdn.net"],
      "examples": [
        "www.facebook.com",
        "connect.facebook.net",
        "staticxx.facebook.com",
        "static.xx.fbcdn.net",
        "m.facebook.com"
      ]
    }
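To give a feel for how wildcard entries in "domains" might be matched against a script URL, here is a minimal sketch. The helpers `matchesDomain` and `findEntity` are hypothetical, and the wildcard semantics are an assumption made for this example; the module's own matching logic may behave differently.

```javascript
// Entity entry based on the example above (only the fields used here).
const facebook = {
  name: 'Facebook',
  domains: ['*.facebook.com', '*.fbcdn.net'],
}

// Assumption for this sketch: '*.example.com' matches any subdomain of
// example.com, and a pattern without '*.' must equal the hostname exactly.
function matchesDomain(hostname, pattern) {
  if (pattern.startsWith('*.')) {
    return hostname.endsWith(pattern.slice(1))
  }
  return hostname === pattern
}

// Returns the first entity with a matching domain pattern, or undefined.
function findEntity(url, entities) {
  const hostname = new URL(url).hostname
  return entities.find((entity) =>
    entity.domains.some((pattern) => matchesDomain(hostname, pattern))
  )
}

const match = findEntity('https://static.xx.fbcdn.net/rsrc.php/v3/y.js', [facebook])
console.log(match ? match.name : 'no match')
```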

Updating Attribution Logic

The logic for attribution to individual script URLs can be found in the Lighthouse repo. File an issue over there to discuss further.

Updating the Data

This is now automated! Run yarn start:update-ha-data with a gcp-credentials.json file in the root directory of this project (look at bin/automated-update.js for the steps involved).

Updating this README

This README is auto-generated from the templates in lib/ and the computed data. In order to update the charts, you'll need to make sure you have cairo installed locally in addition to running yarn install.

    brew install pkg-config cairo pango libpng jpeg giflib
    yarn build
    yarn start

Updating the website