SAMPLING.

Author: Todd Fisher

1 SAMPLING

2 Population and Universe A population is any complete group of entities, e.g. people, companies, stores, hospitals, etc. The distinction between a population and a universe rests on whether the group is finite (population) or infinite (universe). The term population element refers to an individual member of the population. A census is an investigation of all the individual elements that make up the population.

3 Why sampling? A. Pragmatic Reasons Budget and time constraints. Accessibility: it is impossible to contact all people within a short time. Sampling cuts cost and labour requirements.

4 B. Accurate and Reliable Results Samples, if properly selected, are sufficiently accurate in most cases. When the population elements are highly homogeneous, samples are highly representative of the population. Even when populations have considerable heterogeneity, large samples provide data of sufficient precision to make most decisions.

5 Samples are accurate only when researchers have taken care to draw representative samples properly. A sample may even be more accurate than a census: in a census of a large population there is a greater likelihood of non-sampling errors, whereas in a sample the coding and tabulation of data can be supervised more closely.

6 Practical Sampling Sampling decisions can be viewed as a series of sequential stages, even though the order of decisions does not always follow this particular sequence. These decisions are highly interrelated. The stages are listed below:

7 Stages in the selection of a sample:
1. Define the target population.
2. Select a sampling frame.
3. Determine whether a probability or non-probability sampling method will be chosen.
4. Plan the procedure for selecting sampling units.
5. Determine the sample size.
6. Select the actual sampling units.
7. Conduct fieldwork.

8 Defining the Target Population The target population is the specific, complete group relevant to the research project. What is the relevant population? In many cases it is easy to identify, e.g. if you are concerned about motivation in ZESA, your target population is all employees of ZESA. In other cases it is more difficult, e.g. an industrial buyer behaviour survey that samples only purchasing agents who deal regularly with sales reps might be incorrect, because engineers or production managers have a substantial impact on buying decisions.

9 Answering questions about the critical characteristics of the population is the usual technique for defining the target population. The question “to whom do we want to talk?” must be answered. It may be users, nonusers, employees, etc.

10 The Sampling Frame A sampling frame is a list of elements from which the sample may be drawn. It is generally not feasible to compile a list that does not exclude some members of the population, e.g. if the student telephone directory is used as the sampling frame for your university’s student population, it may exclude students who registered late, students without phones, etc.

11 The sampling frame is also referred to as the working population because it provides the list that can be operationally worked with. If a complete list of population elements is not accessible, materials such as maps or aerial photographs may be utilized as a sampling frame. Sampling frame error occurs when certain sample elements are excluded or when the entire population is not accurately represented in the sampling frame.

12 E.g. in preparation for a new bond issue, one manager used randomly generated telephone numbers as the basis for a sample survey dealing with attitudes toward capital improvements. When the bond issue failed, consultants pointed out that the appropriate sampling frame would have been a list of potential buyers. Thus, by including respondents who should not have been listed as members of the population, a sampling frame error occurred.

13 Population elements can also be overrepresented in a sampling frame. E.g. a savings and loan association defined its population as all individuals who had savings accounts. However, when it drew a sample from a list of accounts rather than from a list of names of individuals, individuals who had multiple accounts were overrepresented in the sample.

14 Sampling Methods Determine whether probability or non-probability sampling can be applied for the type of research in progress. Forms of these are shown in the diagram below.

15 Methods of Sampling
Probability sampling: simple random sampling, stratified sampling, systematic sampling, cluster sampling, multistage sampling.
Non-probability sampling: convenience sampling, judgemental/purposive sampling, quota sampling, snowball sampling.

16 Sampling Units A sampling unit is a single element or group of elements subject to selection in the sample. E.g. if an airline wishes to sample passengers, every twenty-fifth name on a complete list of passengers may be taken. In this case the sampling unit is the same as the element.

17 Alternatively, the airline could first select flights as the sampling unit, then select certain passengers on the previously selected flights. In this case the sampling unit contains many elements. If the target population has first been divided into units, such as airline flights, additional terminology must be used, i.e. primary sampling units (PSU), secondary sampling units, and tertiary sampling units (if three stages are necessary).

18 Primary sampling unit (PSU) – a unit selected in the first stage of sampling. Secondary sampling unit – a unit selected in the second stage of sampling.

19 Determining Sample Size Sample size can be determined by different methods, e.g. taking at least 5% of the population or applying statistical approaches. One of the statistical approaches is to use the Cochran (1977) formula. This approach rests on a normal-distribution assumption.

20 If the population size (N) is known, the sample size can be computed as follows:

21 Where: n0 = first approximation of n; the remaining terms are the area under the normal distribution (for the chosen confidence level), the relative error, the standard error (S), and the sample mean.
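The equation itself does not survive in this transcript, so the following Python sketch is an assumption rather than the slide's own formula. It uses one common textbook form of Cochran's calculation for a mean, n0 = (t*s / (r*ybar))^2 followed by the finite-population adjustment n = n0 / (1 + n0/N). The names t, s, ybar and r are hypothetical labels for the terms defined above, and the input figures are invented for illustration.

```python
import math

def cochran_sample_size(t, s, ybar, r, N):
    """Return (n0, n) under the ASSUMED form of Cochran's formula.

    t    : normal-distribution value for the chosen confidence level
    s    : variability term (called "standard error" on the slide)
    ybar : sample mean
    r    : relative error (e.g. 0.10 for 10%)
    N    : known population size
    """
    n0 = (t * s / (r * ybar)) ** 2      # first approximation of n
    n = n0 / (1 + n0 / N)               # adjustment for a finite population
    return n0, math.ceil(n)             # round up to a whole sampling unit

# Made-up inputs: 90% confidence (t about 1.645) and 10% relative error.
n0, n = cochran_sample_size(t=1.645, s=20.0, ybar=100.0, r=0.10, N=1000)
print(round(n0, 2), n)
```

The adjustment matters mainly when n0 is a noticeable fraction of N; for very large populations n is close to n0.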

22 In order to populate these formulas, the relevant mean and standard deviation statistics were specified. The assumption was made that the mean cost for the two electricity supply tariff classifications would deviate from the mean by up to 10%, with a confidence level of 90%.

23 Select Actual Sampling Units Probability vs Non-probability Sampling In probability sampling every element in the population has a known, nonzero probability of selection. The simple random sample is the best-known probability sample, in which each member of the population has an equal probability of being selected.

24 In non-probability sampling the probability of any particular member of the population being chosen is unknown. The selection of sampling units is quite arbitrary, as researchers rely heavily on personal judgement. Note that there are no appropriate statistical techniques for measuring random sampling error from a non-probability sample. Thus projecting the data beyond the sample is statistically inappropriate.

25 Non-probability Sampling A. Convenience Sampling Also called haphazard or accidental sampling. Refers to the procedure of obtaining units or people who are most conveniently available. For example, it may be convenient and economical to sample employees in companies in a nearby area.

26 Researchers generally use convenience samples to obtain a large number of completed questionnaires quickly and economically. Users of research that is based on a convenience sample should remember that projecting the results beyond the specific sample is inappropriate. Convenience samples are best utilised for exploratory research when additional research will subsequently be conducted with a probability sample.

27 b. Judgemental/Purposive Sampling A non-probability sampling technique in which an experienced researcher selects the sample based upon some appropriate characteristic of the sample members. The researcher selects a sample to serve a specific purpose, even if this makes the sample less than fully representative. E.g. the consumer price index (CPI) is based on a judgement sample of market-basket items, housing costs and other goods and services selected to reflect a representative sample of items consumed by most Zimbabweans.

28 Judgemental sampling is often used in attempts to forecast election results. People often wonder how, say, a TV network can predict the results of an election with only 2% of the votes reported.

29 c. Quota Sampling A non-probability sampling procedure that ensures that certain characteristics of a population sample will be represented to the exact extent that the investigator desires. E.g. to promote women's emancipation, a quota of 25% women may be set. A company recruiting 20 employees must then make sure that at least 5 of them are women.

30 However, quota samples have a tendency to include people who are easily found and willing to be interviewed. Field workers are given considerable leeway to exercise their judgement concerning the selection of respondents. Interviewers often concentrate their interviewing in heavy pedestrian areas, such as shopping malls, employee lunchrooms and college campuses.

31 d. Snowball sampling Refers to a sampling procedure in which initial respondents are selected by probability methods and then additional respondents are obtained from information provided by the initial respondents. This technique is used to locate members of rare populations by referrals.

32 For example, if you want certain information about holding companies and how they relate to their subsidiaries, you can first select the head offices, and the head offices then refer you to the subsidiaries to contact or visit for the desired information.

33 Reduced sample size and costs are a clear advantage of snowball sampling. Bias is likely to enter into the study, because a person who is known to someone (also in the sample) has a higher probability of being similar to the first person. If there are major differences between those who are widely known by others and those who are not, there may be serious problems with snowball sampling.

34 Probability Sampling A. Simple Random Sampling A sampling procedure that assures each element in the population an equal chance of being included in the sample. Examples include drawing names from a hat or box, selecting the winning raffle ticket from a large drum, and using random numbers.

35 If any of the processes are done thoroughly each person should have an equal chance of being selected. The sampling process is simple because it requires only one stage of sample selection.

36 Drawing names or numbers out of a fish bowl, using a spinner, rolling dice or turning a roulette wheel may be used to draw a sample from a small population. When a population consists of a large number of elements, however, tables of random numbers or computer-generated random numbers are utilized for sample selection.
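Computer-generated selection can be sketched in a few lines of Python using the standard library's random module. The population list and the function name simple_random_sample are hypothetical, made up for illustration; the seed is only there to make the draw repeatable.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements, each with an equal chance of selection."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population list of 500 names.
names = [f"person_{i}" for i in range(1, 501)]
sample = simple_random_sample(names, 10, seed=42)
print(sample)
```

Because the draw is without replacement, no person can appear twice in the sample.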

37 b. Systematic Sampling A sampling procedure in which an initial starting point is selected by a random process and then every kth name on the list is selected. E.g. suppose you wish to take a sample of 100 from a list consisting of 2000 names of companies. Using systematic sampling, every 20th name from the list would be drawn.

38 An initial starting point is selected by a random process, and the sampling interval k is then determined by dividing the population size by the desired sample size (k = N/n, e.g. 2000/100 = 20). Although this procedure is not actually a random selection procedure, it yields random results if the arrangement of the items in the list is random in character.
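The procedure can be sketched in Python as follows, assuming the sampling interval is the list length divided by the sample size and the starting point is drawn at random from the first interval. The function name and the company list are hypothetical.

```python
import random

def systematic_sample(items, n, seed=None):
    """Select every k-th item after a random start, with k = len(items) // n."""
    k = len(items) // n                  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds list length")
    rng = random.Random(seed)
    start = rng.randrange(k)             # random start within the first interval
    return [items[start + i * k] for i in range(n)]

# Hypothetical list of 2000 company names, sample of 100, so k = 20.
companies = [f"company_{i}" for i in range(1, 2001)]
sample = systematic_sample(companies, 100, seed=1)
print(sample[:3])
```

Only the starting point is random; every later selection is fixed by it, which is why the ordering of the list matters.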

39 The problem of periodicity occurs if a list has a systematic pattern, i.e. if the list is not random in character. An example of periodicity bias might be a list of contributors to a charity in which the first 50 are extremely large donors; if the sampling interval is every 100th name, this would cause a problem.

40 Stratified Sampling A probability sampling procedure in which subsamples are drawn from within different strata that are more or less equal on some characteristic. A subsample is drawn using a simple random sample within each stratum. The reason for taking a stratified sample is to have a more efficient sample than could be taken on the basis of simple random sampling.

41 E.g. Suppose urban and rural groups differ widely on attitudes towards energy conservation, yet members in each group hold very similar attitudes. Random sampling error is reduced because the groups are internally homogeneous but comparatively different. More technically, a smaller standard error may be the result of this stratified sample because the groups are adequately represented when strata are combined.

42 Another reason for taking a stratified sample is that the sample will accurately reflect the population on the basis of the criterion or criteria used for stratification. This is a concern because occasionally a simple random sample yields a disproportionate number of one group or another, and the representativeness of the sample could be improved.

43 A researcher selecting a stratified sample will proceed as follows: 1. First, a variable is identified as an efficient basis for stratification (the strata must be exhaustive and mutually exclusive). The criterion for a stratification variable is that it is a characteristic of the population elements known to be related to the dependent variable or other variables of interest. The variable chosen should increase homogeneity within each stratum and increase heterogeneity between strata.

44 2. For each stratum, a list of population elements must be obtained. If a complete listing is not available, a true stratified probability sample cannot be selected. 3. Using a table of random numbers or some other device, a separate simple random sample is taken within each stratum. The researcher must determine how large a sample must be drawn for each stratum.

45 Proportional vs Disproportional Strata A proportional stratified sample is a stratified sample in which the number of sampling units drawn from each stratum is in proportion to the relative population size of that stratum. Sometimes a disproportionate sample can be selected to ensure an adequate number of sampling units in every stratum. Sampling more heavily in a given stratum than its relative population size warrants is not a problem if the primary purpose of the research is to estimate some characteristic separately for each stratum and if the researchers are concerned about assessing the differences among strata.
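A minimal Python sketch of a proportional stratified sample: each stratum's share of the sample matches its share of the population, and a simple random sample is then drawn inside each stratum. The stratum names, sizes and function names are hypothetical. Rounding is used for the allocation, so with awkward proportions the allocated sizes may not sum exactly to n.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Allocate n sampling units to strata in proportion to stratum size."""
    total = sum(stratum_sizes.values())
    # Rounded shares; they may differ from n by a unit or two in general.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample of the allocated size within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    alloc = proportional_allocation(sizes, n)
    return {name: rng.sample(strata[name], alloc[name]) for name in strata}

# Hypothetical population: 600 urban and 400 rural respondents.
strata = {
    "urban": [f"u{i}" for i in range(600)],
    "rural": [f"r{i}" for i in range(400)],
}
sample = stratified_sample(strata, 50, seed=7)
print({name: len(units) for name, units in sample.items()})
```

A disproportionate design would replace proportional_allocation with sizes chosen on analytical grounds while keeping the within-stratum random draw unchanged.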

46 A disproportional stratified sample is a stratified sample in which the sample size for each stratum is allocated according to analytical considerations. The logic for this relates to the general argument for sample size: as variability increases, sample sizes must increase to provide accurate estimates. Thus the strata that exhibit the greatest variability are sampled more heavily in order to increase sample efficiency, that is, to obtain a smaller random sampling error.

47 An optimal allocation stratified sample is a sampling procedure in which both the size and the variability of each stratum are considered when determining the sample size for each stratum. Thus the optimal sample size for each stratum may be determined. Complex formulas have been developed to determine the sample sizes for each stratum.
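The slide does not name a specific formula. One widely used rule that takes both size and variability into account is Neyman allocation, where a stratum's share of the sample is proportional to its size multiplied by its standard deviation. The sketch below assumes that rule; the function name and the figures are hypothetical.

```python
def neyman_allocation(sizes, std_devs, n):
    """Allocate n across strata in proportion to N_h * S_h (assumed rule)."""
    weights = {h: sizes[h] * std_devs[h] for h in sizes}
    total = sum(weights.values())
    # Unrounded allocations; round as needed before drawing the sample.
    return {h: n * w / total for h, w in weights.items()}

# Hypothetical strata: rural is smaller but three times as variable.
alloc = neyman_allocation(
    sizes={"urban": 600, "rural": 400},
    std_devs={"urban": 10.0, "rural": 30.0},
    n=100,
)
print(alloc)
```

In this made-up case the smaller rural stratum receives about two thirds of the sample because its variability outweighs its smaller size.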

48 A simplified rule of thumb for understanding the concept of optimal allocation is that stratum sample size increases for strata of larger sizes with the greatest relative variability. When disproportional stratified sampling is utilized, the estimated means for each stratum have to be weighted according to the number of elements in each stratum in order to calculate the total population mean.
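The weighting step can be illustrated with a short Python sketch: each stratum mean is weighted by the number of population elements in that stratum. The stratum names and values are invented for illustration.

```python
def weighted_population_mean(stratum_means, stratum_sizes):
    """Combine stratum means, weighting each by its population size."""
    total = sum(stratum_sizes.values())
    return sum(stratum_means[h] * stratum_sizes[h] for h in stratum_means) / total

# Hypothetical estimates from a disproportional stratified sample.
means = {"urban": 50.0, "rural": 80.0}
sizes = {"urban": 600, "rural": 400}
print(weighted_population_mean(means, sizes))
```

Simply averaging the two stratum means would give 65, which overstates the influence of the smaller rural stratum; the weighted figure corrects for that.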

49 Cluster Sampling An economically efficient sampling technique in which the primary sampling unit is not the individual element in the population (e.g. a household) but a large cluster of elements (e.g. a village). The area sample is the most popular type of cluster sampling.

50 An area sample is a cluster sample in which the primary sampling unit is a geographic area. A grocery researcher, e.g., may randomly choose several geographic areas as the primary sampling units and then interview all (or a sample) within each geographic cluster. Interviews are confined to these clusters; no interviews occur in other clusters.
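A minimal Python sketch of an area sample: whole areas are chosen at random, and then either every household in a chosen area is taken or a random subsample of them. The area names, household labels and the function name cluster_sample are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly pick clusters, then take all members or a random subsample."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # primary sampling units
    result = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None or per_cluster >= len(members):
            result[name] = list(members)                # interview everyone
        else:
            result[name] = rng.sample(members, per_cluster)
    return result

# Hypothetical frame: 10 geographic areas with 30 households each.
areas = {
    f"area_{a}": [f"area_{a}_household_{h}" for h in range(30)]
    for a in range(10)
}
sample = cluster_sample(areas, 3, seed=3)
print(list(sample))
```

No household outside the three chosen areas can enter the sample, which is what keeps fieldwork geographically concentrated.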

51 Cluster sampling is classified as a probability sampling technique either because of the random selection of clusters or the random selection of elements within each cluster. Ideally a cluster should be as heterogeneous as the population itself; a mirror image of the population. Therefore a problem may arise with cluster sampling if the characteristics and attitudes of the elements within the cluster are too similar.

52 E.g. geographic neighbourhoods tend to have residents of the same socioeconomic status, and students at a university tend to share similar beliefs.

53 Multistage Sampling Sampling that involves using a combination of other probability sampling techniques. Several stages are used to achieve a representative sample. E.g. an NGO may aim to provide food handouts in the Guruve area. The starting point might be dividing the area into wards, then villages, then households, and then simple random sampling can be done on these households.
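The wards, villages and households example can be sketched in Python as three nested random draws. The structure of the data, the counts at each stage and the function name are assumptions made for illustration, not details from the source.

```python
import random

def multistage_sample(wards, n_wards, n_villages, n_households, seed=None):
    """Sample wards, then villages within them, then households within those."""
    rng = random.Random(seed)
    selected = []
    for ward in rng.sample(sorted(wards), n_wards):                # stage 1
        villages = wards[ward]
        for village in rng.sample(sorted(villages), n_villages):   # stage 2
            selected.extend(rng.sample(villages[village], n_households))  # stage 3
    return selected

# Hypothetical area: 8 wards, 5 villages per ward, 20 households per village.
wards = {
    f"ward_{w}": {
        f"w{w}_village_{v}": [f"w{w}_v{v}_hh_{h}" for h in range(20)]
        for v in range(5)
    }
    for w in range(8)
}
sample = multistage_sample(wards, n_wards=2, n_villages=2, n_households=5, seed=11)
print(len(sample))
```

Only the selected villages need a household list, which is the practical appeal of sampling in stages.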

54 What is the Appropriate Sample Design? A researcher who must make a decision concerning the most appropriate sample design for a specific project will identify a number of sampling criteria and evaluate the relative importance of each criterion before selecting a sample design. The most common criteria are:

55 a. Degree of accuracy – selecting a representative sample is, of course, important to all researchers. However, the degree of accuracy required or the researcher’s tolerance for sampling and non-sampling error may vary from project to project, especially when cost savings or another consideration may be traded off against a reduction in accuracy.

56 b. Resources The costs associated with different sampling techniques vary tremendously. If the researcher’s financial and human resources are restricted, this will eliminate certain methods of sampling.

57 c. Time A researcher who needs to meet a deadline or complete a project quickly will be more likely to select a simple, less time-consuming sample design.

58 d. Advance knowledge of the population Advance knowledge of population characteristics, such as the availability of lists of population members, is an important criterion. Lack of an adequate listing of the population will rule out systematic sampling.

59 e. National versus Local project Geographic proximity of population elements will influence sample design. When population elements are unequally distributed geographically, cluster sampling may become much more attractive.

60 f. Need for Statistical Analysis The need for statistical projections based on the sample is often a criterion. Non-probability sampling techniques do not allow the researcher to utilize statistical analysis to project the data beyond the sample.

61 Random Sampling Error and Non-sampling Error Random sampling error is the difference between the result of a sample and the result of a census conducted using identical procedures; it is a statistical fluctuation that occurs because of chance variation in the elements selected for the sample. Sampling units, even when properly selected according to sampling theory, may not perfectly represent the population, but they generally yield reliable estimates.

62 When there is a slight difference between the true population value and the sample value, there is a small random sampling error. Random sampling error is a function of sample size. As sample size increases, sampling error decreases.
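The relationship between sample size and random sampling error can be illustrated with the usual standard error of a sample mean, sigma divided by the square root of n. This formula is an assumption brought in for illustration (simple random sampling, no finite-population correction); it is not stated in the source, and the standard deviation of 20 is made up.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean under simple random sampling."""
    return sigma / math.sqrt(n)

# Hypothetical population standard deviation of 20.
for n in (25, 100, 400):
    print(n, standard_error(20.0, n))
```

Each fourfold increase in sample size halves the standard error, so the gain in precision from additional sampling units shrinks as the sample grows.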

63 Systematic (non-sampling) error results from non-sampling factors, primarily the nature of the study’s design and the correctness of its execution. It comes from such sources as sample bias, mistakes in recording responses, and non-response from persons not contacted or refusing to participate. These errors are not due to chance fluctuations. E.g. in a mail survey, highly educated respondents are more likely to cooperate than poorly educated ones, for whom filling out forms is a more difficult and intimidating task.

64 Sampling frame error – emanates from including some respondents who should not be listed as members of the population.
Non-response error – the statistical difference between a survey that includes only those who responded and a survey that also includes those who failed to respond.