
Frontend Cafe

A community for discussing frontend engineering news, best practices, and new technologies.


Why you should be focusing on the 99th percentile when tracking request performance

January 11, 2018 at 12:28pm
Even though it isn't the point of the post, this is a great explanation of why you should focus on improving the 1% slowest requests in your app: https://phabricator.wikimedia.org/phame/live/7/post/83/measuringwikipedia
Intuitively, this did not make a lot of sense to me: bringing the average request time down seemed much more beneficial than focusing on the 1% of users with slow network connections. But that's not how those percentiles work at all:
When working on a service used by millions, we focus on the 99th percentile and the highest value (100th percentile). Using medians, or percentiles lower than 99%, would exclude many users. A problem with 1% of requests is a serious problem. To understand why, it is important to understand that 1% of requests does not mean 1% of page views, or even 1% of users.
A typical Wikipedia pageview makes 20 requests to the server (1 document, 3 stylesheets, 4 scripts, 12 images). A typical user views 3 pages during their session (on average).
This means our problem with 1% of requests, could affect 20% of pageviews (20 requests x 1% = 20% = ⅕). And 60% of users (3 pages x 20 objects x 1% = 60% ≈ ⅔). Even worse, over a long period of time, it is most likely that every user will experience the problem at least once. This is like rolling dice in a game. With a 16% (⅙) chance of rolling a six, if everyone keeps rolling, everyone should get a six eventually.
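The quoted 20% and 60% figures come from straight multiplication, which is an upper-bound estimate. Assuming each request independently has a 1% chance of being slow, the exact probability of hitting at least one slow request out of n is 1 − (1 − p)^n. A quick Python sketch of that formula (the function name is my own, not from the article):

```python
def p_at_least_one_slow(n: int, p: float = 0.01) -> float:
    """Chance that at least one of n independent requests is slow,
    when each individual request has probability p of being slow."""
    return 1 - (1 - p) ** n

# One typical pageview: 20 requests.
print(f"per pageview: {p_at_least_one_slow(20):.1%}")  # roughly 18%, close to the 20% estimate

# One typical session: 3 pages x 20 requests = 60 requests.
print(f"per session:  {p_at_least_one_slow(60):.1%}")  # roughly 45%
```

So the back-of-envelope 60% overstates it a bit, but the conclusion stands: a problem affecting 1% of requests touches nearly half of all sessions.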

January 11, 2018 at 12:29pm
/cc this is pretty vital knowledge for understanding our API performance stats
That's some really crazy statistical math going on there that doesn't seem obvious. :P
It's like the opposite of what I assume "average" means.
Yeah, I think a fair TL;DR would be "1% of the requests is not 1% of the users, but much more"
Because requests:users is a many:1 relationship, not a 1:1 relationship
Aaaah
And that number goes up depending on how many requests you have on your particular page. Mind blown.
One user will make many requests, so the 1% slowest requests are going to hit every user every once in a while, basically
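The article's dice analogy can be checked with a small Monte Carlo simulation (a hypothetical sketch; the user and request counts are illustrative, not Spectrum's real numbers):

```python
import random

def fraction_of_users_hit(users: int, requests_per_user: int,
                          p_slow: float = 0.01, seed: int = 42) -> float:
    """Simulate users each making many requests, where each request
    independently has a p_slow chance of being slow. Returns the fraction
    of users who experienced at least one slow request."""
    rng = random.Random(seed)
    hit = 0
    for _ in range(users):
        if any(rng.random() < p_slow for _ in range(requests_per_user)):
            hit += 1
    return hit / users

# 60 requests is about one session; 600 is about ten sessions.
print(fraction_of_users_hit(10_000, 60))   # roughly 0.45
print(fraction_of_users_hit(10_000, 600))  # close to 1.0
```

With enough sessions, essentially every user rolls the "slow request" face of the die at least once.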
Yeah exactly
Another reason to love static generators ;)
Haha I wanna see you build Spectrum as a static site! 😜
Also I mean that still applies to static sites, you're probably still loading at least a handful of requests on every page (one html + one css + one js + a couple of images)
Fair enough, I'm just saying that it would reduce the possibility of 60+ requests.
trigger a static build per user comment. lol