I wrote about Emily Robinson and her A/B testing work at Etsy before, but now she’s back with a great new blog post full of practical advice. Emily provides 12 guidelines for A/B testing that help to set up effective experiments and to avoid data-driven but erroneous conclusions:
- Have one key metric for your experiment.
- Use that key metric to do a power calculation.
- Run your experiment for the length you’ve planned on.
- Pay more attention to confidence intervals than p-values.
- Don’t run tons of variants.
- Don’t try to look for differences for every possible segment.
- Check that there’s no bucketing skew.
- Don’t overcomplicate your methods.
- Be careful of launching things because they “don’t hurt”.
- Have a data scientist/analyst involved in the whole process.
- Only include people in your analysis who could have been affected by the change.
- Focus on smaller, incremental tests that change one thing at a time.
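As a sketch of the second guideline, a power calculation for a conversion-rate experiment takes only a few lines of Python. The baseline rate, lift, and error rates below are made-up values for illustration; this is the standard two-proportion z-test approximation, not Etsy's actual procedure:

```python
import math
from scipy.stats import norm

def sample_size_per_group(p_baseline, p_variant, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-proportion z-test."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = norm.ppf(power)            # desired power
    variance = p_baseline * (1 - p_baseline) + p_variant * (1 - p_variant)
    effect = p_variant - p_baseline
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# e.g., detecting a lift from 10% to 11% conversion:
n = sample_size_per_group(0.10, 0.11)
print(n)  # roughly 15,000 visitors per group
```

Note how quickly the required sample size grows as the effect you want to detect shrinks, which is exactly why you commit to a run length up front (guideline 3) rather than stopping when the numbers look good.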
You can read more details on each guideline in Emily’s original blog post.
In her blog, Emily also refers to a great article by Stephen Holiday discussing five online experiments that had (almost) gone wrong and a presentation by Dan McKinley on continuous experimentation.
Generating numbers is easy; generating numbers you should trust is hard!
Emily Robinson is a data scientist at Etsy, an e-commerce website for handmade and vintage products. In the #rstats community, Emily is nearly as famous as her brother David Robinson, whom we know from the
Like any large tech company, Etsy relies heavily on statistics to improve its way of doing business. In its case, data from real-life experiments provide the business intelligence that allows effective decision-making. For instance, Etsy experiments with the layout of its buttons, with the text shown near products, or with the suggestions made after a search query. To detect whether such changes have (ever so) small effects on Etsy’s KPIs (e.g., conversion), data scientists such as Emily rely on traditional A/B testing.
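To get a feel for how small such effects are, here is a minimal sketch of comparing two conversion rates with a confidence interval (in line with the guideline to prefer intervals over p-values). The counts are invented, and the normal approximation is just the textbook approach:

```python
from scipy.stats import norm

def conversion_diff_ci(conv_a, n_a, conv_b, n_b, level=0.95):
    """Normal-approximation CI for the difference in two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z = norm.ppf(1 - (1 - level) / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

# made-up counts: 1,000 of 20,000 visitors convert in A, 1,100 of 20,000 in B
low, high = conversion_diff_ci(1000, 20000, 1100, 20000)
print(f"lift: [{low:.4f}, {high:.4f}]")
```

Even with 40,000 visitors, the interval around a half-percentage-point lift only just excludes zero here, which illustrates why a single key metric and a proper power calculation matter.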
In a 40-minute presentation, Emily explains how statistical issues such as skewed distributions, outliers, and power are dealt with at Etsy, among other things using bootstrapping and simulations. Moreover, about 30 minutes in, Emily shares her lessons on working with (less stats-savvy) business stakeholders: for instance, how to help translate business questions into data questions and data answers back into business solutions, or how to deal with the desire to peek at the results of experiments early.
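Bootstrapping is handy precisely for skewed metrics like revenue per visitor, where most visitors spend nothing. A minimal sketch with made-up data (not Etsy's implementation):

```python
import numpy as np

rng = np.random.default_rng(42)

# made-up, heavily skewed revenue per visitor: 90% of visitors spend nothing
revenue = np.concatenate([np.zeros(900),
                          rng.lognormal(mean=3, sigma=1, size=100)])

# bootstrap the mean: resample the data with replacement many times
boot_means = np.array([
    rng.choice(revenue, size=revenue.size, replace=True).mean()
    for _ in range(5000)
])

# percentile-based 95% confidence interval for mean revenue per visitor
low, high = np.percentile(boot_means, [2.5, 97.5])
print(f"mean: {revenue.mean():.2f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

Because no normality assumption is made, the same recipe works for outlier-heavy metrics where a t-test interval would be misleading.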
Overall, I can recommend the presentation below, the slides of which you can find on Emily’s GitHub.