Learning about Multilevel Models
statistics


The concept of a multilevel model, also called a mixed effects model or a hierarchical model, is reasonably new to me. It's not the kind of thing typically taught in physics (where models are very explicit) or in machine learning, but it is quite common in social science. I first came across it through Lauren Kennedy on the Learning Bayesian Statistics podcast, through a conversation between a trained neuroscientist and a trained statistician about fixed and random effects as I went cross-eyed, and through the excellent Regression and Other Stories textbook, which alludes to it often (it is to be expounded on in their upcoming sequel Applied Regression and Multilevel Models).
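To make the idea concrete, here's a small sketch of the core multilevel trick, partial pooling, on an entirely hypothetical example (test scores grouped by school, with made-up numbers). Each group's intercept is treated as a draw from a shared distribution, so small groups get pulled toward the overall mean:

```python
import random
from statistics import mean

# Hypothetical illustration: student test scores grouped by school.
# A multilevel model treats each school's intercept as drawn from a shared
# distribution, which "partially pools" noisy small groups toward the
# overall mean instead of trusting their raw averages.

random.seed(0)
grand_mean, school_sd, noise_sd = 70.0, 5.0, 10.0

schools = {}
for school in range(8):
    intercept = random.gauss(grand_mean, school_sd)  # varying intercept per school
    n = random.choice([3, 30])                       # some schools are tiny
    schools[school] = [random.gauss(intercept, noise_sd) for _ in range(n)]

overall = mean(s for scores in schools.values() for s in scores)

for school, scores in schools.items():
    n = len(scores)
    # Partial-pooling estimate with known variances: a precision-weighted
    # average of the school's own mean and the overall mean. Small n => small
    # weight on the school's own data => more shrinkage toward the overall mean.
    w = (n / noise_sd**2) / (n / noise_sd**2 + 1 / school_sd**2)
    pooled = w * mean(scores) + (1 - w) * overall
    print(f"school {school}: n={n:2d} raw mean={mean(scores):5.1f} partially pooled={pooled:5.1f}")
```

In a real analysis the group-level and residual variances would themselves be estimated (e.g. with lme4 in R or a Bayesian model), but the shrinkage formula above is the essence of what those fits do.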

How big a sample to measure conversion?
data


A common question with conversions and other rates is: how big a sample do you need to measure the conversion accurately? To get an estimate with standard error \(\sigma\) you need at most \( \frac{1}{4 \sigma^2} \) samples. In general, if the true conversion rate is \(p\), it is \(\frac{p(1-p)}{\sigma^2}\). So let's say we want to measure the conversion rate to within about 5%. To be conservative we'd want the standard error to be a bit less than that, say 3%. Then at most \( \frac{1}{4 (0.03)^2} \approx 278 \) samples would be needed, whatever the true rate.
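The two formulas above are a one-liner to compute; here's a quick sketch, using the worst case \(p = 0.5\) as the default:

```python
# Samples needed to estimate a conversion rate p with standard error sigma:
# n = p * (1 - p) / sigma**2, maximised at p = 0.5, giving the worst case
# n = 1 / (4 * sigma**2).

def sample_size(sigma, p=0.5):
    """Samples needed for standard error `sigma` at true rate `p`."""
    return p * (1 - p) / sigma**2

# Worst case for a 3% standard error:
print(round(sample_size(0.03)))          # 278
# If the true rate is nearer 10%, far fewer samples are needed:
print(round(sample_size(0.03, p=0.1)))   # 100
```

Note how much the answer drops when the rate is far from 50% — worth remembering when conversion rates are in the single digits.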

Statistical Testing: 2.8 Standard Deviations
data


What sample size do you need to capture an effect with 95% confidence, 80% of the time? For a normal/binomial distribution the answer is roughly \( \left(2.8 \frac{\sigma}{\epsilon}\right)^2 \), where \( \sigma \) is the standard deviation of the data and \( \epsilon \) is the size of the effect. The 2.8 is approximately \( 1.96 + 0.84 \), the z-values for 95% confidence and 80% power. The ratio \( \frac{\sigma}{\epsilon} \) says that the smaller the effect is relative to the variability in the data, the larger the sample you will need.
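As a sketch, here's the rule of thumb as code, applied to a made-up scenario (detecting a 2-point lift on a 10% baseline conversion rate, so \( \sigma \approx \sqrt{0.1 \times 0.9} = 0.3 \)):

```python
# Rule of thumb: n ≈ (2.8 * sigma / epsilon)**2 per group, where
# 2.8 ≈ 1.96 + 0.84 (z-values for 95% confidence and 80% power).

def rule_of_thumb_n(sigma, epsilon):
    """Approximate sample size to detect effect `epsilon` in data with sd `sigma`."""
    return (2.8 * sigma / epsilon) ** 2

# Hypothetical example: 2-point lift on a 10% baseline conversion,
# so sigma = sqrt(0.1 * 0.9) = 0.3 and epsilon = 0.02.
print(round(rule_of_thumb_n(0.3, 0.02)))  # 1764
```

Because the ratio is squared, halving the detectable effect quadruples the required sample — the main reason A/B tests on small effects take so long.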

Experimental Generalisability
statistics


Experiments reveal the relationship between inputs and outcomes. With statistical methods you can often, given enough observations, tell whether there's a strong relationship or whether it's just noise. It's much harder to know how generally the relationship holds, yet that's essential for making decisions. Suppose you're testing two alternate designs for a website: one has a red and green button with a Santa hat and bauble, and the other has a blue button.
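The "is it just noise?" part is the easy bit. A sketch of a standard two-proportion z-test on entirely hypothetical conversion counts for the two designs:

```python
from math import sqrt, erf

# Hypothetical illustration: did the festive design convert better than the
# plain blue button? A two-proportion z-test asks whether the observed
# difference is bigger than we'd expect from noise alone.

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)            # pooled rate under the null
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))   # standard error of the difference
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
    return z, p_value

z, p_value = two_proportion_z(120, 1000, 90, 1000)  # 12% vs 9% conversion (made up)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

But the test only tells you the difference is real in this experiment; it says nothing about whether the festive design would still win outside December, which is the generalisability question.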