How do you work out what sample size to use for your survey? It is actually a complex calculation, and consequently, in my experience, people fall back on rules of thumb such as '10% of the population'. Such rules of thumb cannot hope to give an adequate estimate of the needed sample size, so people either under-sample or over-sample. Often the sample is far too small, leading to unsound decisions, or far too large, wasting effort and expense.
What sample size should you take? The answer is a balance between your intolerance for 'false positives' and 'false negatives'.
The diagram below answers the following commonly asked question:
I wish to sample my customers to see if they are satisfied with the service I provide. I think that at least 85% of them are at least 'very satisfied'. I can tolerate a 'false positive' (ie saying they were satisfied when they were not) no more than 5% of the time. On the other hand, if more than 85% really are at least 'very satisfied', I want to be at least 90% certain that I will find out.

Use the sliders to set up your own sampling plans. The calculator shows four graphs, arranged anti-clockwise from the left.
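For readers who want to see the arithmetic behind a plan like this, here is a minimal sketch (in Python) of the standard normal-approximation sample-size formula for a one-sided test of a proportion. The calculator's exact method may differ, and the alternative proportion of 0.90 below is my own illustrative assumption: 'more than 85%' must be pinned to a specific value before power can be computed.

```python
# Sketch of the sample-size calculation behind the question above, using the
# normal approximation to the binomial for a one-sided test of a proportion.
# Assumption (mine, not the calculator's): the alternative proportion is 0.90.
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_one_sided(p0, p1, alpha, power):
    """Smallest n (normal approximation) to test H0: pi <= p0 vs H1: pi > p0
    with Type I error rate alpha and the stated power when pi = p1."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # critical z for the test
    z_beta = NormalDist().inv_cdf(power)        # z matching the target power
    n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1)))
         / (p1 - p0)) ** 2
    return ceil(n)

# The worked example: pi0 = 0.85, alpha = 0.05, power = 0.90 at pi = 0.90.
print(sample_size_one_sided(0.85, 0.90, alpha=0.05, power=0.90))  # about 378
```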
To view the calculator, click the button below.
Questionnaire design and analysis

When people think of doing a survey like the one described above, they usually make up some type of satisfaction scale (say, from 1 to 6) and make one end of the scale (eg the 6) 'extremely satisfied' and the other end (eg the 1) 'extremely dissatisfied'; 5 and 2 are 'very satisfied' and 'very dissatisfied', and so on, as shown below. The intent is to provide a few split points in the continuum of opinion. I prefer to force people to make a choice one way or the other, so I don't give a 'fence-sitting' middle point.
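As a concrete illustration of the scoring, this small sketch dichotomises responses on such a forced six-point scale at 'very satisfied' (5) or above, which is the split the sampling question earlier is really asking about. The response data are invented for illustration.

```python
# Score a forced six-point satisfaction scale (no middle 'fence-sitting' point)
# by splitting at 'very satisfied' (5) or above. Responses here are made up.
responses = [6, 5, 3, 5, 6, 4, 5, 2, 6, 5]      # one rating per respondent

labels = {1: 'extremely dissatisfied', 2: 'very dissatisfied',
          3: 'dissatisfied', 4: 'satisfied',
          5: 'very satisfied', 6: 'extremely satisfied'}

satisfied = sum(1 for r in responses if r >= 5)  # 'very satisfied' or more
p = satisfied / len(responses)
print(f"{satisfied}/{len(responses)} rated {labels[5]!r} or higher (p = {p:.2f})")
```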
Sample Size Calculation in Experimental Design

Sampling Theory. In most situations in statistical analysis, we do not have access to an entire statistical population of interest, either because the population is too large, is not willing to be measured, or the measurement process is too expensive or time-consuming to allow more than a small segment of the population to be observed. As a result, we often make important decisions about a statistical population on the basis of a relatively small amount of sample data.

Typically, we take a sample and compute a quantity called a statistic in order to estimate some characteristic of a population called a parameter. For example, suppose a services manager at Telstra is interested in the proportion of Telstra customers who are currently 'very satisfied' or more with Telstra's level of service on a particular issue. Telstra's customer base is 1,500,000 in that state. In this case, the parameter of interest, which we might call π, is the proportion of customers in the entire population of Telstra customers in that state who are 'very satisfied' or more. The services manager is going to commission an opinion poll, in which a (hopefully) random sample of people will be asked whether or not they are satisfied with Telstra's service. The number (call it N) of people to be polled will be quite small relative to the size of the population. Once these people have been polled, the proportion of them who rate Telstra's service as 'very satisfied' or higher will be computed. This proportion, which is a statistic, can be called p.

One thing is virtually certain before the study is ever performed: p will not be equal to π! Because p involves "the luck of the draw," it will deviate from π. The amount by which p is wrong, i.e., the amount by which it deviates from π, is called sampling error. In any one sample it is virtually certain there will be some sampling error (except in some highly unusual circumstances), and we will never be certain exactly how large this error is. If we knew the amount of the sampling error, this would imply that we also knew the exact value of the parameter, in which case we would not need to be doing the opinion poll in the first place.

In general, the larger the sample size N, the smaller sampling error tends to be. (One can never be sure what will happen in a particular experiment, of course.) If we are to make accurate decisions about a parameter like π, we need an N large enough that sampling error will tend to be "reasonably small." If N is too small, there is not much point in gathering the data, because the results will tend to be too imprecise to be of much use. On the other hand, there is also a point of diminishing returns beyond which increasing N provides little benefit. Once N is "large enough" to produce a reasonable level of accuracy, making it larger simply wastes time and money. So some key decisions in planning any experiment are, "How precise will my parameter estimates tend to be if I select a particular sample size?" and "How big a sample do I need to attain a desirable level of precision?" The Sample Size Calculator above provides you with the statistical methods to answer these questions quickly, easily, and accurately.
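The effect of N on sampling error is easy to see by simulation. The sketch below (an illustration of the idea, not part of the calculator) draws repeated polls from a population whose true proportion π is fixed at 0.85, a value assumed here for continuity with the example, and shows p clustering more tightly around π as N grows, with clearly diminishing returns.

```python
# Simulate sampling error: draw repeated polls of increasing size N from a
# population with a known proportion PI, and watch |p - PI| shrink with N.
# PI and the sample sizes are illustrative assumptions.
import random

random.seed(1)
PI = 0.85                                   # the (normally unknown) parameter

for n in (50, 500, 5000):
    # Average sampling error |p - PI| over 500 simulated polls of size n.
    errors = [abs(sum(random.random() < PI for _ in range(n)) / n - PI)
              for _ in range(500)]
    print(f"N = {n:5d}: average sampling error = {sum(errors)/len(errors):.4f}")
```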
Hypothesis Testing. Suppose the services manager wants to show that more than 85% of customers are 'very satisfied' or more. Her question, in statistical terms, is: "Is π > .85?" In statistics, the following strategy is quite common. State as a "statistical null hypothesis" something that is the logical opposite of what you believe. Call this hypothesis H0. Gather data. Then, using statistical theory, show from the data that it is likely H0 is false, and should be rejected. By rejecting H0, you support what you actually believe. This kind of situation, which is typical in many fields of research, is called "Reject-Support testing" (RS testing), because rejecting the null hypothesis supports the experimenter's theory.

The null hypothesis is either true or false, and the statistical decision process is set up so that there are no "ties": the null hypothesis is either rejected or not rejected. Consequently, before undertaking the experiment, we can be certain that only four possible things can happen. These are summarized in the table below.

                        H0 is true           H0 is false
  Do not reject H0      Correct decision     Type II error (β)
  Reject H0             Type I error (α)     Correct decision
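To make the decision process concrete, the sketch below derives an exact-binomial rejection rule for the Telstra example. The sample size N = 378 is carried over from the earlier sketch and is an assumption, not a value taken from the calculator.

```python
# Reject-Support decision rule for H0: pi <= 0.85 vs H1: pi > 0.85.
# Find the smallest count c with P(X >= c | pi = 0.85) <= 0.05; rejecting H0
# whenever the observed count reaches c keeps the Type I error rate at or
# below .05. N = 378 is assumed from the earlier sample-size sketch.
from math import comb

def upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

N, PI0, ALPHA = 378, 0.85, 0.05
c = next(c for c in range(N + 1) if upper_tail(N, PI0, c) <= ALPHA)
print(f"Reject H0 if at least {c} of {N} respondents are 'very satisfied'")
print(f"Actual Type I error rate: {upper_tail(N, PI0, c):.4f}")
```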
Note that there are two kinds of errors represented in the table. A Type I error represents, in a sense, a "false positive" for the researcher's theory. From society's standpoint, such false positives are particularly undesirable. They result in much wasted effort, especially when the false positive is interesting from a theoretical or political standpoint (or both) and as a result stimulates a substantial amount of research. Such follow-up research will usually not replicate the (incorrect) original work, and much confusion and frustration will result.

A Type II error is a tragedy from the researcher's standpoint, because a theory that is true is, by mistake, not confirmed. So, for example, if a drug designed to improve a medical condition is found (incorrectly) not to produce an improvement relative to a control group, a worthwhile therapy will be lost, at least temporarily, and an experimenter's worthwhile idea will be discounted. (In our example, the research might fail to identify that more than 85% of Telstra's customers were satisfied even though they were.)

Many statistics textbooks present a point of view that is common in the social sciences: that α, the Type I error rate, must be kept at or below .05, and that, if at all possible, β, the Type II error rate, must be kept low as well. "Statistical power," which is equal to 1 - β, must be kept correspondingly high. Ideally, power should be at least .80 to detect a reasonable departure from the null hypothesis. The conventions are, of course, much more rigid with respect to α than with respect to β. For example, in the social sciences α is seldom, if ever, allowed to stray above the magical .05 mark, and the statistically well-informed researcher makes it a top priority to keep α low. Ultimately, of course, everyone benefits if both error probabilities are kept low, but unfortunately there is often, in practice, a trade-off between the two types of error, as the sketch below illustrates.
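The trade-off is easy to demonstrate with the same exact-binomial machinery: holding N fixed and sliding the rejection threshold c shows α and β moving in opposite directions. As before, N = 378 and the alternative π = 0.90 are illustrative assumptions of mine, not outputs of the calculator.

```python
# The alpha-beta trade-off for the Telstra example: with N fixed, lowering the
# rejection threshold c shrinks beta (missed true effects) but inflates alpha
# (false positives), and vice versa. N and the alternative PI1 are assumptions.
from math import comb

def upper_tail(n, p, c):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

N, PI0, PI1 = 378, 0.85, 0.90
print(" c    alpha    beta   power")
for c in range(328, 339, 2):
    alpha = upper_tail(N, PI0, c)      # Type I error rate under H0 (pi = 0.85)
    power = upper_tail(N, PI1, c)      # power at the alternative pi = 0.90
    print(f"{c}  {alpha:.4f}  {1 - power:.4f}  {power:.4f}")
```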
References

StatSoft Electronic Textbook (for power analysis and sample size calculation).
Cooke, Craven & Clarke (1981), Basic Statistical Computing (for the binomial and normal probability and reverse normal algorithms).
Abramowitz & Stegun (1972), Handbook of Mathematical Functions.
Conover (1971), Practical Nonparametric Statistics.

Copyright © 2000- netgm pty ltd. All rights reserved.