Revisiting this topic with 2021 data. Back when I wrote this query, the pages summary table was in the runs dataset. Now it is stored in summary_pages. I’ve updated this query with the following changes:
- I’ve pointed the query at the correct summary tables.
- I’ve simplified how the date is extracted from the table suffix so that it no longer requires JavaScript.
- I’ve increased the precision of the approximate aggregate function, since I noticed that the results from my previous query did not match what is reported in the curated stats.
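For anyone curious what the suffix handling amounts to, here is a minimal Python sketch of the same idea. It assumes suffixes shaped like 2021_07_01_desktop (a 10-character date, an underscore, then the client name); the function name parse_table_suffix is hypothetical and only for illustration.

```python
from typing import Tuple


def parse_table_suffix(suffix: str) -> Tuple[str, str]:
    """Split a table suffix into (date, client).

    Assumes the suffix looks like 'YYYY_MM_DD_client'.
    """
    # The first 10 characters hold the date, with underscores as separators.
    date_part = suffix[:10].replace("_", "-")
    # Character 11 is the separating underscore; the client name follows it.
    client = suffix[11:]
    return date_part, client


if __name__ == "__main__":
    print(parse_table_suffix("2021_07_01_desktop"))
```

Plain string slicing is all that is needed here, which is why the JavaScript helper could be dropped.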
Here’s the updated SQL:
SELECT
  -- The table suffix looks like YYYY_MM_DD_client
  SUBSTR(_TABLE_SUFFIX, 12) AS client,
  REPLACE(SUBSTR(_TABLE_SUFFIX, 1, 10), '_', '-') AS yyyymmdd,
  COUNT(*) AS sites,
  -- Page weight percentiles, converted from bytes to KB
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(101)] / 1024, 2) AS p10,
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(251)] / 1024, 2) AS p25,
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(501)] / 1024, 2) AS p50,
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(751)] / 1024, 2) AS p75,
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(851)] / 1024, 2) AS p85,
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(901)] / 1024, 2) AS p90,
  ROUND(APPROX_QUANTILES(bytesTotal, 1001)[OFFSET(951)] / 1024, 2) AS p95,
FROM
  `httparchive.summary_pages.*`
WHERE
  bytesTotal > 0
GROUP BY
  client,
  yyyymmdd
ORDER BY
  client,
  yyyymmdd
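As a quick sanity check on the offsets, here is a small Python sketch that maps each OFFSET in the query to the percentile it approximates. It assumes APPROX_QUANTILES(x, n) returns n + 1 boundaries, with offset 0 being the minimum and offset n the maximum; the helper name offset_to_percentile is hypothetical.

```python
def offset_to_percentile(offset: int, buckets: int) -> float:
    """Return the percentile that a given quantile offset corresponds to."""
    return 100.0 * offset / buckets


# Offsets and bucket count copied from the query above.
BUCKETS = 1001
OFFSETS = {
    "p10": 101,
    "p25": 251,
    "p50": 501,
    "p75": 751,
    "p85": 851,
    "p90": 901,
    "p95": 951,
}

if __name__ == "__main__":
    for label, offset in OFFSETS.items():
        percentile = offset_to_percentile(offset, BUCKETS)
        print(f"{label}: offset {offset} is roughly percentile {percentile:.2f}")
```

Under that assumption, each offset lands within about 0.1 of its nominal percentile.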
One important thing to note is that the dataset size has changed a few times over the years. Originally, the dataset used the Alexa top sites list. Back in 2010, there were ~16k desktop sites measured. That grew as testing capacity increased, to upwards of 300,000 sites in 2012 and 500,000 sites in 2014. In 2018, the dataset switched from the Alexa list to the Chrome User Experience Report (CrUX), which brought us to over 1 million sites. From 2019 onwards, the dataset continued to grow, to the point where it now covers over 7 million sites. I’ve written about the growth of the web from a CrUX perspective here as well.
In the 4 years since this post was published, page weight has continued to increase linearly at each percentile. Here’s a breakdown of desktop page weight by month:
And here is the same breakdown for mobile:
More than 15% of mobile homepages (i.e., roughly 1 million of the 7 million sites tracked) are larger than 5 MB!
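The "roughly 1 million" figure is simple arithmetic on the rounded numbers quoted here. A minimal Python sketch, assuming exactly 7 million tracked sites and a 15% share (both are rounded figures from this post, and the function name is hypothetical):

```python
def estimate_site_count(total_sites: int, share_percent: int) -> int:
    """Estimate how many sites fall into a given percentage share."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_sites * share_percent // 100


if __name__ == "__main__":
    heavy_pages = estimate_site_count(7_000_000, 15)
    print(f"Estimated mobile homepages over 5 MB: {heavy_pages:,}")
```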
Here are the data and graphs in case you want to explore this some more: HTTP Archive Page Weight Percentiles - Google Sheets