When it comes to surveys, people often ask me: How big of a sample do I need?
It's a hard question to answer, and there are several key things to keep in mind. For one, the larger the sample, the more likely it is to be representative of the target population, and how well your results generalize to that population is called "external validity." However, it's not that easy.
Just because a sample is large doesn't make it representative. Therefore, size alone will not answer the question of whether your 20 or 20,000 respondents are nationally representative. There's nothing magical about 20 or 20,000 except that they're prettier numbers than 19,787 (at least to most people).
The only way a sample is truly representative is if it contains the same proportions and combinations of all the millions of measured and unmeasured variables in the target population. The best way to ensure that is to have a large, random probability sample, which MTurk and other crowd-sourced platforms do not directly provide.
The issue is that these platforms give you a sample of convenience. The people on crowdsourcing platforms will tend not to include people who don't like to do research, people with lots of money, people with poor tech skills, people who don't use Amazon products, etc.
However, that doesn't mean that crowdsourced data cannot be nationally representative. You have to compare the sample characteristics to the national population parameters to see if the sample appears representative, and if not, weight the sample accordingly. To get technical, this is generating a "sampling model." This can't guarantee representativeness (because of the convenience sampling), but it gets you close enough that, hopefully, the remaining differences are negligible.
Now the trick is to determine which sample characteristics to check against the population, which is the idea behind a "proximal similarity model." The true population values are unfortunately unknown, but the U.S. census provides good enough data (for most research questions, though certainly not for questions about undocumented residents) to make comparisons. So, get the race, gender, and age distributions from the census, compare them to your 20 or 20,000 respondents, and then weight your sample to match the census data.
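To make the weighting step concrete, here is a minimal sketch in Python of one simple approach (often called post-stratification): each respondent gets a weight equal to their group's population share divided by that group's share of the sample. The function name, the age brackets, and all of the numbers are made up for illustration and are not real census figures. In practice you would likely weight on several variables at once, which this sketch does not attempt.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per group: population share / sample share.

    sample_groups: list with one group label per respondent.
    population_shares: dict mapping group label -> share of the population.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    weights = {}
    for group, pop_share in population_shares.items():
        if counts[group] == 0:
            # No weight can fix a group that is missing entirely.
            raise ValueError(f"no respondents in group {group!r}")
        sample_share = counts[group] / n
        weights[group] = pop_share / sample_share
    return weights


# Hypothetical sample of 100 respondents that skews young.
sample = ["18-34"] * 60 + ["35-64"] * 30 + ["65+"] * 10

# Hypothetical population shares (placeholders, NOT real census data).
population = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

weights = poststratification_weights(sample, population)
for group, weight in weights.items():
    print(f"{group}: weight {weight:.2f}")
```

Groups that are over-represented in the sample get weights below 1 and under-represented groups get weights above 1. Note that a group with zero respondents cannot be weighted at all, which is one reason to recruit more people from sparse groups.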
In the end, I recommend that you get as many respondents as you can, then try to supplement by oversampling people who are underrepresented in your current sample. So, if you have no one from West Virginia, restrict some of your recruitment to that state. Or, if you have few Asian respondents, try to recruit more. That way, you modify the "gradient of similarity" to improve your external validity.
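As a rough illustration of the oversampling idea, the sketch below estimates how many additional respondents you would need from each group so that the sample shares match the population shares without discarding anyone. The function name and all numbers are hypothetical, and this is only one simple way to set recruitment targets.

```python
import math


def extra_respondents_needed(sample_counts, population_shares):
    """Estimate extra recruits per group so sample shares match the population.

    Nobody is dropped, so the final sample size is set by the most
    over-represented group: the largest value of count / population share.
    """
    target_total = max(
        sample_counts.get(group, 0) / share
        for group, share in population_shares.items()
    )
    extra = {}
    for group, share in population_shares.items():
        shortfall = target_total * share - sample_counts.get(group, 0)
        # Small tolerance so floating-point noise does not add a respondent.
        extra[group] = max(0, math.ceil(shortfall - 1e-9))
    return extra


# Hypothetical current sample counts and population shares (not real data).
current_counts = {"18-34": 60, "35-64": 30, "65+": 10}
population = {"18-34": 0.30, "35-64": 0.50, "65+": 0.20}

extra = extra_respondents_needed(current_counts, population)
for group, n_more in extra.items():
    print(f"{group}: recruit {n_more} more")
```

With these made-up numbers the youngest group is already over-represented, so all of the additional recruiting goes to the two older groups. If the required totals are out of reach, recruiting what you can and then weighting the remainder is a reasonable compromise.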
As with so many things, whether it's beer or life experiences, quality matters more than quantity.
By Joe McFall