The main idea is that some users see version A and others see version B; we collect analytics on both versions to gauge user behaviour and see which version performs better.

We combine A/B testing with statistical Hypothesis Testing.

We can use randomized trials to assign features per user, typically with a random number generator. Note that this does not work if there is a systematic bias in the assignment. Also note that you need a large enough sample per condition to reach statistical significance, though the conditions do not need to have the same number of users to run a statistical test. If you are testing multiple hypotheses, you may want to apply a correction.
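One common way to do per-user random assignment without systematic bias is to hash the user ID together with an experiment name: the assignment is uniform, stable for a given user, and independent across experiments. A minimal sketch (the function and experiment names here are hypothetical, not from the source):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically bucket a user into a variant.

    Hashing (experiment, user_id) gives a uniform assignment that is
    stable per user, so the same user always sees the same variant,
    and buckets for different experiments are independent.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]
```

Because the bucketing is deterministic, no assignment table has to be stored: any service can recompute a user's variant from the ID alone.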

Ramping Up

  • Start with 1% of users seeing the new feature
  • Once you’ve collected enough samples for the new feature, run a statistical test to measure its effect.
    • We can measure things like clicks, sales, etc to gauge performance
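For metrics like click-through or conversion rates, the comparison between the two variants can be done with a two-proportion z-test; when several hypotheses are tested at once, a Bonferroni correction tightens the significance threshold. A stdlib-only sketch (function names are my own, not from the source):

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: reject only p-values below alpha / m."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]
```

Note that the two sample sizes `n_a` and `n_b` need not be equal, which is what lets the 1% ramp-up group be compared against the much larger control group.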

Things to Measure

  • Scroll prediction (predict the finger’s future position during scrolls, allowing time to render the frame before the finger gets there)
  • Single-click autofill (suggest autofill values; the idea is that users will fill out forms in less time)
  • Enable PiP (With PiP enabled, 2% of users will use it)
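Figures like “2% of users will use PiP” come from aggregating the event log per variant: count the fraction of a variant’s users who triggered the feature at least once. A small sketch over a hypothetical `(user_id, variant, event)` log format:

```python
def usage_rate(events, variant, event_name):
    """Fraction of a variant's users who triggered the event at least once.

    `events` is an iterable of (user_id, variant, event) tuples,
    a hypothetical log format used only for illustration.
    """
    users = {u for u, v, _ in events if v == variant}
    active = {u for u, v, e in events if v == variant and e == event_name}
    return len(active) / len(users) if users else 0.0

# Hypothetical log: two users in variant B, one of whom opened PiP.
events = [
    ("u1", "B", "pip_opened"),
    ("u2", "B", "page_view"),
    ("u3", "A", "page_view"),
    ("u1", "B", "pip_opened"),
]
```

The per-variant rates produced this way are the inputs to the statistical test above: raw counts, not averages, since the test needs both successes and sample sizes.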