Random Name Picker · 6 min read

Random Sampling in Research: Why Scientists Pick Names From a Hat

Without random sampling, scientific findings cannot be generalised. Without random assignment, cause and effect cannot be established. Here is why randomness is the foundation of valid research.

The Two Types of Randomness in Research

Scientific research uses randomness in two distinct ways, with different purposes:

  • Random sampling: Selecting participants from the population of interest randomly, so the sample represents the population
  • Random assignment: Randomly assigning participants to different conditions (experimental vs. control) in an experiment, so the groups are equivalent before the treatment is applied

Both matter, but for different reasons. Random sampling determines whether findings can be generalised: whether what you found in your sample is likely true of the broader population. Random assignment determines whether findings are causal: whether the treatment caused the outcome, or whether some pre-existing difference between the groups explains the result.

Why Random Sampling Matters: The Generalisation Problem

Any research finding is only as good as the sample it came from. If you study 100 people to draw conclusions about "humans," you need those 100 people to be a representative sample of humans, not just the people who were most convenient to study.

The classic failure mode is convenience sampling: studying whoever is easiest to recruit. In academia, this most commonly means university students, who are physically present, available in large numbers, and often required to participate in studies for course credit. A famous critique by anthropologist Joseph Henrich and colleagues, published in 2010 as "The weirdest people in the world?", documented that the vast majority of psychological research was conducted on populations that are WEIRD (Western, Educated, Industrialised, Rich, Democratic), and argued that findings from these populations often fail to generalise to most of the world's population.

Random sampling from the target population is the solution. If you want to draw conclusions about UK adults, you need to randomly select from UK adults: not just students, not just internet users, not just respondents to self-selected surveys.
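The gap between a convenience sample and a random sample can be illustrated with a small simulation. This is only a sketch: the population, the 10% student share, and the age distributions are invented for illustration and are not data from any study.

```python
import random
from statistics import mean

rng = random.Random(3)  # fixed seed so the sketch is reproducible

# Hypothetical population (assumption): 10% students, who are younger on average
population = (
    [{"student": True, "age": rng.gauss(21, 2)} for _ in range(1000)]
    + [{"student": False, "age": rng.gauss(48, 15)} for _ in range(9000)]
)
true_mean = mean(p["age"] for p in population)

# Convenience sample: the first 200 students, i.e. whoever is easiest to recruit
convenience = [p for p in population if p["student"]][:200]

# Simple random sample of the same size, drawn from the whole population
random_sample = rng.sample(population, 200)

convenience_error = abs(mean(p["age"] for p in convenience) - true_mean)
random_error = abs(mean(p["age"] for p in random_sample) - true_mean)

print(f"true mean age:         {true_mean:.1f}")
print(f"convenience error:     {convenience_error:.1f}")
print(f"random sampling error: {random_error:.1f}")
```

Under these assumptions the student-only estimate misses the true mean by a wide margin, while the random sample of the same size lands close to it.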

Types of Random Sampling

Simple random sampling

Every member of the population has an equal probability of being selected. This is the equivalent of putting all names in a hat and drawing without replacement. The method is simple and unbiased, but it requires a complete list of the population (a sampling frame), which is often difficult to obtain for large populations.
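Drawing names from a hat maps directly onto sampling without replacement from a list. A minimal sketch using Python's standard library follows; the sampling frame and the function name simple_random_sample are hypothetical.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Draw n distinct members; every member is equally likely to be chosen."""
    if n > len(sampling_frame):
        raise ValueError("sample size cannot exceed the size of the sampling frame")
    rng = random.Random(seed)  # the seed is only there to make the draw reproducible
    return rng.sample(sampling_frame, n)  # sampling without replacement

# Hypothetical sampling frame of 500 names (assumption for illustration)
frame = [f"name_{i:03d}" for i in range(500)]
sample = simple_random_sample(frame, 25, seed=42)
print(sample[:5])
```

Recording the seed alongside the results lets someone else reproduce exactly the same draw.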

Stratified random sampling

The population is divided into subgroups (strata), such as age bands, geographic regions, or income levels, and random samples are drawn from each stratum. This ensures that important subgroups are adequately represented even when they are small. For example, if a country is 3% Asian, a simple random sample of 100 people might include anywhere from 0 to 6 Asian respondents, because the random variation is large relative to the expected count of 3. A stratified sample with proportional allocation would include exactly 3, ensuring representation.
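The arithmetic behind the 3% example can be checked with a short sketch. It assumes the population is large enough for the binomial distribution to approximate a simple random sample, and it uses proportional allocation for the stratified draw. The stratum labels and sizes are hypothetical.

```python
import math
import random

def binomial_pmf(k, n, p):
    """Probability of exactly k members of a subgroup in a sample of n."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def stratified_sample(strata, total_n, seed=None):
    """Proportional allocation: each stratum contributes in proportion to its size.

    Rounding means the allocations may not sum exactly to total_n in general.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Simple random sample of 100 when the subgroup is 3% of the population
p_zero = binomial_pmf(0, 100, 0.03)
p_zero_to_six = sum(binomial_pmf(k, 100, 0.03) for k in range(7))
print(f"P(no subgroup members) = {p_zero:.3f}")
print(f"P(between 0 and 6)     = {p_zero_to_six:.3f}")

# Hypothetical strata (assumption): 300 subgroup members in a population of 10,000
strata = {
    "subgroup": [f"s_{i}" for i in range(300)],
    "rest": [f"r_{i}" for i in range(9700)],
}
sample = stratified_sample(strata, total_n=100, seed=1)
print({name: len(members) for name, members in sample.items()})
```

Researchers who need precise estimates for a small subgroup often oversample it and reweight afterwards; that variation is not shown here.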

Cluster sampling

When a complete sampling frame is unavailable, clusters (schools, hospitals, cities) are randomly selected, and then individuals within the selected clusters are sampled. This is more practical for geographically dispersed populations, but it introduces clustering effects that must be accounted for in analysis.
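The two stages described above can be sketched as follows. The schools, pupil identifiers, and the function name cluster_sample are hypothetical, and the sketch covers only the selection step, not the adjustment for clustering effects in the analysis.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Two-stage sampling: pick clusters at random, then individuals within them."""
    rng = random.Random(seed)
    # Stage 1: randomly select which clusters take part
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for cluster_id in chosen:
        members = clusters[cluster_id]
        # Stage 2: randomly select individuals inside each chosen cluster
        k = min(n_per_cluster, len(members))
        sample[cluster_id] = rng.sample(members, k)
    return sample

# Hypothetical frame (assumption): 20 schools with 30 pupils each
schools = {
    f"school_{s:02d}": [f"school_{s:02d}_pupil_{p:02d}" for p in range(30)]
    for s in range(20)
}
sample = cluster_sample(schools, n_clusters=4, n_per_cluster=10, seed=7)
for school, pupils in sample.items():
    print(school, len(pupils))
```

Only a list of clusters is needed up front; a list of individuals is required only for the clusters that end up selected.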

Why Random Assignment Establishes Causation

The difference between correlation and causation is one of the most frequently misunderstood concepts in science. Ice cream sales correlate with drowning rates, but ice cream does not cause drowning. Both increase in summer, driven by a common underlying factor (hot weather). Without random assignment, observed correlations are always potentially explained by unmeasured common causes.
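A common cause producing a correlation between two otherwise unrelated quantities can be shown with simulated data. Every number below is invented for illustration: temperature drives both series, and neither series affects the other. The sketch compares the raw correlation with the correlation left after the linear effect of temperature is removed from both.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """Remove the linear effect of xs from ys using ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(0)
# Simulated days (assumption): temperature is the only shared driver
temperature = [rng.uniform(5, 35) for _ in range(2000)]
ice_cream = [10 * t + rng.gauss(0, 40) for t in temperature]
drownings = [0.2 * t + rng.gauss(0, 1.5) for t in temperature]

raw_r = pearson(ice_cream, drownings)
adjusted_r = pearson(
    residuals(ice_cream, temperature), residuals(drownings, temperature)
)
print(f"raw correlation:             {raw_r:.2f}")
print(f"after removing temperature:  {adjusted_r:.2f}")
```

The adjustment works here only because the common cause was measured. In real observational data the confounder is often unknown, which is the problem random assignment addresses.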

Random assignment, as used in the randomised controlled trial (RCT), is the gold standard for establishing causation. When participants are randomly assigned to receive treatment or not, the two groups are equivalent in expectation on all variables, measured and unmeasured, before the treatment begins. Any difference in outcomes after treatment that is larger than chance alone would produce can therefore be attributed to the treatment itself.
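A short simulation shows how shuffling balances a variable the researcher never measures. The participants and the hidden_trait field are hypothetical, and the sketch covers only the assignment step, not a full trial analysis.

```python
import random
from statistics import mean

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(1)
# Hypothetical participants (assumption), each with a trait nobody records
participants = [{"id": i, "hidden_trait": rng.gauss(50, 10)} for i in range(2000)]

treatment, control = randomly_assign(participants, seed=2)
gap = mean(p["hidden_trait"] for p in treatment) - mean(
    p["hidden_trait"] for p in control
)
print(f"group sizes: {len(treatment)} and {len(control)}")
print(f"difference in hidden trait between groups: {gap:.2f}")
```

The remaining gap is chance variation and shrinks as the groups grow, which is why small trials can still end up unbalanced on important variables.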

Ronald Fisher, the statistician who developed the mathematical foundation of experimental design in the 1920s and 1930s, identified randomisation as the key innovation. Before Fisher, experiments compared a treated group with a control group, but the two groups were often systematically different before treatment began. Randomisation solved this by making the groups comparable by construction, so that they differ only by chance.

When Randomness Is Skipped: The Replication Crisis

The "replication crisis" in psychology and social science (the finding, starting around 2011, that many classic results failed to replicate in independent studies) was partly caused by inadequate randomisation and inadequate attention to sampling validity.

John Ioannidis's influential 2005 paper "Why Most Published Research Findings Are False" anticipated this crisis by arguing that with small samples, many variables tested, and publication bias toward positive results, most published findings would be false positives even with technically valid methodology. Proposed remedies include larger samples, pre-registration of hypotheses, and better randomisation.
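The core of that argument is a positive predictive value (PPV) calculation: the share of statistically significant findings that reflect a true relationship. The sketch below uses the bias-free form of the formula from the paper, PPV = (1 - beta)R / (R - beta R + alpha), where R is the pre-study odds that a tested relationship is true. The two scenarios and their numbers are illustrative assumptions, not figures from the paper.

```python
def positive_predictive_value(prior_odds, power, alpha=0.05):
    """Share of significant results that are true, ignoring bias.

    prior_odds: R, the ratio of true to false relationships among those tested
    power:      1 - beta, the chance of detecting a true relationship
    alpha:      the significance threshold (false positive rate)
    """
    true_positives = power * prior_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)

# Illustrative scenarios (assumptions)
confirmatory = positive_predictive_value(prior_odds=1.0, power=0.8)
exploratory = positive_predictive_value(prior_odds=0.1, power=0.2)

print(f"well-powered, plausible hypothesis: PPV = {confirmatory:.2f}")
print(f"underpowered, long-shot hypothesis: PPV = {exploratory:.2f}")
```

When the PPV falls below 0.5, a significant result is more likely false than true. Adding bias or multiple competing teams, as the paper does, lowers it further.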

The practical implication for anyone reading research: a study without random sampling cannot be generalised to the broader population. A study without random assignment cannot establish causation. These are not technicalities; they are the difference between evidence and coincidence.


References

  1. Fisher, R.A. (1935). The Design of Experiments. Oliver & Boyd.
  2. Cochran, W.G. (1977). Sampling Techniques (3rd ed.). Wiley.
  3. Ioannidis, J.P.A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.
  4. Shadish, W.R., Cook, T.D., & Campbell, D.T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Houghton Mifflin.
  5. Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.