I was recently tasked with acquiring a certain number sense with regard to A/B testing. After Googling around a bit, I noticed that much of the information on A/B testing was fairly rudimentary introductions that simply skimmed the concepts. There were myriad products and services offering A/B testing, but these jumped straight to the results of the sample tests. What I wanted was the nitty, gritty, gory, superfluous details of the math. Understanding the Wikipedia definition is one thing; actually being able to wield the math is another. What I thought would be a simple 30-minute Wikipedia read quickly spiraled into a furious mad dash to derive the underlying principles of what is, essentially, everything I forgot from AP Statistics back in high school. Fifteen hours and a bottle of Two Buck Chuck later, I had an elementary grasp of the glorified math behind the aptly named A/B testing.
What Is It? Well, the best way I can describe it is as follows: it's a process used within marketing and business intelligence to measure the effectiveness of certain changes to a product (in most cases, a website). Unfortunately for me, the folks who developed the industry standard around A/B testing managed to spin up their own jargon and parlance where they could, in my opinion, have kept the original statistics-based terminology. Needless to say, much of the time I spent practicing A/B testing was actually spent researching what the marketing and BI terminology meant. I'll try to elaborate on this later. In its basic statistical form, an A/B test is a hypothesis test, one- or two-tailed, and (hopefully) randomized, that compares the difference between two or more sampled variants. In hindsight, I can see why they decided to shorten it to A/B testing.
In layman's terms, the individuals who manage a website may want to optimize the product and increase the occurrence of a certain action that end-users perform, such as "Add to Cart" or "Sign Up." To do this, they may change certain factors of the end-user action, such as changing the color of the "Add to Cart" button or increasing the size of the "Sign Up" button. The terms they use for this "change" are "recipe," "treatment," or "variation," while the rate at which end-users perform the action is called the "conversion rate." In the world of manufacturing, these are referred to as key performance indicators, or KPIs for short. KPIs can vary greatly, and are composed of measures and dimensions. For the rest of this post, I'll only use the word conversion.
That's enough of a basic definition of A/B testing for now; it's time for a fake yet somewhat real-world example!
Let's assume a certain company wants to increase the conversion rate for users pressing the "Sign Up" button. They speculate that changing the wording on the button will increase conversion rates, so they decide to change the wording to an obnoxiously large and capitalized "SIGN UP NOW." Great, now that that's all settled, let's set up the data.
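To make the setup concrete, here's a minimal sketch of the underlying two-proportion z-test in Python. The visitor and conversion counts below are hypothetical placeholders I've invented for illustration, not the actual data from this example:

```python
import math

# Hypothetical counts (not the post's actual data): 5,000 visitors per variant.
control_n, control_conv = 5000, 400   # original "Sign Up" button: 8% conversion
variant_n, variant_conv = 5000, 500   # "SIGN UP NOW" button:     10% conversion

p1 = control_conv / control_n
p2 = variant_conv / variant_n

# Pooled conversion rate under the null hypothesis (the two variants convert equally).
p_pool = (control_conv + variant_conv) / (control_n + variant_n)

# Standard error of the difference between the two sample proportions.
se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / variant_n))

# z-score: how many standard errors the observed lift sits above zero.
z = (p2 - p1) / se

# One-tailed p-value via the standard normal survival function.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"z = {z:.2f}, p = {p_value:.5f}")
```

With these made-up counts the lift is large enough to clear the usual .05 bar; the point is the shape of the calculation, not the specific numbers.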
So what's it all mean? Well, with a z-score of 3.02, our one-tailed p-value comes out to roughly .0013. This is substantially lower than our significance level of .05, which means we can safely reject our null hypothesis and conclude that the change we made to the button did in fact cause a rise in our conversion rates.
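If you want to sanity-check that z-to-p conversion yourself rather than squint at a z-table, the upper-tail probability of a standard normal can be computed directly with the complementary error function from Python's standard library:

```python
import math

def one_tailed_p(z: float) -> float:
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

print(one_tailed_p(3.02))  # roughly 0.00126
```

For a two-tailed test you would double this value, since extreme results in either direction count as evidence against the null.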
Attention to detail? Nah, attention to the whole picture.