Source of n: The Sample Size
The sample size is designated as n and the total population size as N. In this context, size means number of elements or units of analysis. Elements may be persons but they could be employees, households, farms, schools, or any other unit. For example, the total population for a study of the value of a city's houses may be the total number of existing houses N. A sample would be a smaller number n of houses, a subset of the total.
A frequent source cited as authoritative for computing the required sample size for SRS in which a sample mean is used as an estimate of the population mean is William G. Cochran, Sampling Techniques 78 (3d ed. 1977). The formula in the textbook simplifies to n = t²S²/d², where t is the value 1.96 of the t-distribution at the 95% level, S² is the population variance, and d is the error tolerance.
An example of d in a wage and hour case could be plus or minus one hour in estimating average overtime for a class of employees. As was the case in Bell v. Farmers Insurance Exchange, 9 Cal. Rptr. 3d 544 (Ct. App. 2004), the judge may order the experts to create a sample sufficiently large to enable the estimate of average overtime to be within plus or minus one hour. S², the population variance, is usually unknown. Obviously if S² were known, sampling would not be required.
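Purely to illustrate how the formula behaves, the following Python sketch computes n for a hypothetical wage and hour matter with d set to one hour. The population variance S² used here is an assumed value (for instance, one drawn from a pilot sample), since, as just noted, the true variance is unknown in practice.

```python
# Illustrative sketch only: S2 below is an assumed value (e.g., taken from a
# pilot sample); the true population variance is unknown in practice.
t = 1.96    # t-value (normal approximation) at the 95% confidence level
S2 = 25.0   # assumed population variance of weekly overtime hours (hypothetical)
d = 1.0     # error tolerance: plus or minus one hour

n = t**2 * S2 / d**2   # Cochran's simplified formula: n = t^2 * S^2 / d^2
print(round(n))        # about 96 class members under these assumptions
```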
This complication aside, it is instructive to examine the assumptions behind the formula and the example Cochran presents in order to evaluate whether SRS and the sample size formula are being appropriately applied in a specific legal matter. Cochran's example on page 78 is reproduced here verbatim:
Example. In nurseries that produce young trees for sale it is advisable to estimate, in late winter or early spring, how many healthy trees are likely to be on hand, since this determines policy toward the solicitation and acceptance of orders. A study of sampling methods for the estimation of the total numbers of seedlings was undertaken by Johnson (1943). The data that follow were obtained from a bed of silver maple seedlings 1 ft wide and 430 ft long. The sampling unit was 1 ft of the length of the bed, so that N = 430. By complete enumeration of the bed it was found that Ῡ = 19, S² = 85.6, these being the true population values.
With simple random sampling, how many units must be taken to estimate Ῡ within 10%, apart from a chance of 1 in 20? From (4.5) we obtain
n₀ = t²S² / (r²Ῡ²) = (4 × 85.6) / (1.9)² = 95.
Since n₀/N is not negligible, we take
n = 95 / (1 + (95/430)) = 78
Almost 20% of the bed has to be counted in order to attain the precision desired.
The formulas for n given here apply only to simple random sampling in which the sample mean is used as the estimate of Ῡ.
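Cochran's arithmetic can be reproduced step by step. The sketch below simply restates his numbers (t² rounded to 4, S² = 85.6, Ῡ = 19, a 10% relative tolerance, and N = 430); it adds nothing beyond the calculation shown above.

```python
# Reproducing Cochran's seedling calculation with his own values.
t2 = 4        # t-squared at the 95% level, rounded to 4 as in Cochran
S2 = 85.6     # true population variance from the complete enumeration
Y_bar = 19    # true population mean seedlings per 1-ft sampling unit
r = 0.10      # desired relative precision: within 10% of Y_bar
N = 430       # number of 1-ft sampling units in the bed

n0 = t2 * S2 / (r * Y_bar)**2   # first approximation: 4 * 85.6 / 1.9^2 ≈ 95
n = n0 / (1 + n0 / N)           # finite population correction: ≈ 78
print(round(n0), round(n))      # 95 78
```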
Keep in mind the purpose behind random sampling. We have a large collection of data (the population), and we would like to answer quantitative questions about that population with a high degree of precision without studying each and every element. However, despite the name "simple random sampling," Cochran's example is highly theoretical. By complete enumeration, that is, by examining every element of the population, he found the true population average and variance. He then worked backward from that information to find a sample size he could have used to collect sample averages and variances that would, one hopes, closely bracket the corresponding true population values.
Cochran's Ideal Setup for SRS
In a number of ways, Cochran's example represents the ideal setting in which to apply SRS techniques. First, the formula for the sample size is applied only after information about the population, namely Ῡ and S², has been obtained. That information comes from a complete enumeration, or census. Notice that Cochran's derivation of the appropriate sample size is therefore circular: he already knows the population mean and variance and uses that knowledge to derive the sample size. As a practical matter, we would not have this information before choosing a sample size.
Second, the experimental setup is seedlings in a single soil bed measuring 430 ft. by 1 ft. In effect, the soil bed sits in an open field with uniform conditions both within the bed and in the surrounding area. The element, or sampling unit, is 1 sq. ft. of seedling bed, giving 430 sampling units planted at the same time. Experimental conditions are carefully controlled. All sampling units are therefore homogeneous and subject to known and uniform environmental conditions.
Third, the question posed is straightforward: of the thousands of seedlings planted before the winter, how many survived and are expected to be available for sale by the summer? The count of surviving seedlings is done in spring. If the live seedlings in each 1 sq. ft. sampling unit were counted, how many sampling units must be examined and averaged to obtain a close approximation to Ῡ = 19? Of course, multiplying an estimate based on the 78 sampled units, say an average of 18.5 seedlings per unit, by the 430 units in the bed arrives at the expected total number of seedlings on hand for the summer market.
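Expanding the sample mean to a population total is a one-line calculation; the sketch below uses the hypothetical sample average of 18.5 mentioned above.

```python
# Expanding a (hypothetical) sample mean to an estimated population total.
N = 430              # total number of 1-ft sampling units in the bed
sample_mean = 18.5   # hypothetical average count from the 78 sampled units

estimated_total = N * sample_mean
print(estimated_total)   # 7955.0 seedlings expected on hand for the summer market
```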
Reference Point
The salient features of this example should serve as a reference for judging the validity of a proposed statistical setup using random sampling: the entire population under study is a single 1 ft. by 430 ft. bed, with uniform soil, silver maple seedlings only (not a collection of various tree specimens), controlled environmental conditions, and a simple dead-or-alive question. Proposed statistical results and conclusions should be evaluated based on how closely the underlying experimental conditions adhere to Cochran's example. Vagaries in experimental conditions that affect specific sampling units will result in biased and unreliable estimates.
Of course, the number of surviving seedlings will vary naturally from one sampling unit to the next. Having homogeneous sampling units does not mean that measurements will not vary. However, suppose a portion of Cochran's seedling bed were located adjacent to a riverbed. A subset of the sampling units would then be relatively warmer than the remaining sampling units, biasing any measure of survivorship.
Experimental statisticians originally developed their sampling work with populations, sampling frames, sampling units, and controlled conditions adhering as closely as practically possible to the theoretical assumptions underlying classical mathematical statistics. Even under the most favorable circumstances, the developers of experimental statistics acknowledged that their experiments were not perfect and would condition their results on how far experimental conditions departed from the ideal.
Let's make this plain. The mathematical statistics and results that allow us to use random sampling, compute various sample statistics, and project the results to the greater population are premised on data complying with strict technical assumptions, such as randomness, symmetric distributions, and normality. Neither the central limit theorem nor the law of large numbers will rescue a statistical setup that diverges from the assumptions in Cochran's example. Plain and simple, even under the most cleverly crafted experimental design and carefully controlled conditions, SRS stretches the limits of mathematical statistical reliability and validity. Applying SRS under loosely constructed designs, without controlling relevant conditions, raises serious reliability problems. To address these problems, statisticians turn to more complex sampling designs: stratified, proportionate, and cluster sampling, to name a few.
Unpaid Overtime: Questionable Application of SRS?
SRS has been used in several large wage and hour class actions to estimate the total number of overtime hours worked by alleged class members. Essentially, it is proposed that from examinations under oath by attorneys of 250–300 alleged class members, a precise estimate of average overtime hours can be derived for, say, 3,000, 5,000, or even 25,000 alleged class members over a three-, four-, or even six-year class period.
To evaluate the reliability of this overtime setup, we can measure it against Cochran's 1 ft. by 430 ft. soil bed and its attendant assumptions and conditions. Suppose the alleged class comprises employees across 20 job descriptions, 50 states, and six years. The question is: on average, how many hours of unpaid overtime did the alleged class members work?
Does this problem sound even vaguely like determining the number of seedlings that survived the winter in a 430 sq. ft. soil bed? In early spring, we count the number of seedlings alive in each 1 sq. ft. plot of soil or sampling unit. The count of unpaid overtime hours is much more complex. Alleged class members in the overtime matter involving insurance adjusters in Bell must recall hours worked in California up to six years ago.
My experience in reading depositions is that many deponents cannot recall how many vacation days they currently receive each year, let alone give an accurate accounting of the length of prior work weeks or how that time was spent. Do all of the insurance adjusters in the alleged class face identical employment conditions relevant to estimating unpaid overtime hours over the class period? A number of conditions would appear to create heterogeneity among alleged class members that is relevant to estimating unpaid overtime: job titles, location, and tenure, to name a few.
In the Bell case, a preliminary sample of 50 class members showed average overtime of 7.33 hours, plus or minus 2.75 hours. This means that the true class average of unpaid overtime hours could be as low as 4.58 hours or as high as 10.08. Believing this range far too wide for calculating damages, the judge ordered the statisticians in Bell to increase the sample size. A larger sample typically yields a tighter range around the sample mean. However, rather than simply narrowing the interval around the mean, the larger sample shifted the mean itself upward by nearly 30%, to 9.42 hours, for reasons left unexplored.
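For a rough sense of why a larger sample is expected to narrow the interval, recall that, holding the underlying variability fixed, the margin of error shrinks roughly in proportion to one over the square root of the sample size. The sketch below uses the preliminary Bell figures; the larger sample size of 300 is an assumption for illustration only, consistent with the 250–300 range mentioned earlier.

```python
import math

# The margin of error scales roughly with 1/sqrt(n), holding variability fixed.
mean_prelim = 7.33     # preliminary sample mean (hours of weekly overtime)
margin_prelim = 2.75   # preliminary margin of error (hours)
n_prelim = 50          # preliminary sample size
n_larger = 300         # hypothetical larger sample (assumption for illustration)

margin_larger = margin_prelim * math.sqrt(n_prelim / n_larger)
print(round(margin_larger, 2))   # about 1.12 hours, if the mean and spread held steady
```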
Returning again to Cochran's example, if a preliminary sample of 50 sampling units were selected, it is unlikely that adding data points would produce a comparable upward shift in the sample mean. The data collected in Bell relied on human memories reaching up to six years into the past, and it is entirely possible that attorneys were unwittingly (or otherwise) generating interviewer bias. Bell was the first case to use SRS for the purpose of estimating class-wide unpaid overtime hours and damages, with attorneys "surveying" class members through questions in depositions. Plaintiff attorneys may have experienced a form of on-the-job training, enabling them to elicit ever greater declarations of overtime hours worked as the depositions progressed, as we found through regression analysis in a similar, since-settled case involving insurance adjusters. Obviously, the seedlings experiment has none of the problems that seriously call into question the reliability of SRS in wage and hour cases.
Conclusion
A critical element of random sampling is the selection of a sample size. Cochran's widely cited silver maple seedlings example shows the process of estimating the optimal size of a sample to be selected and tested in order to draw conclusions about the overall characteristics of the population, such as how many seedlings on average survive the winter. Cochran's random sampling method, and importantly the assumptions and conditions required to implement it, may serve as a benchmark for the use of random sampling in practice.
Keywords: litigation, expert witnesses, simple random sampling, wage and hour cases, class actions
Charles A. Diamond is a managing director at Alvarez & Marsal’s Global Forensic and Dispute Services practice in New York, New York.