Sampling

Sampling is a commonly used method in inferential statistics. It refers to drawing a subset of individuals from a target population (also called the parent population) as a sample, observing one or more attributes of the sample, and using the resulting data to estimate the quantitative characteristics of the population with a stated degree of reliability, so as to gain an understanding of the population as a whole.
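
As a rough illustration of estimating a population characteristic "with a certain reliability", the sketch below draws a sample and reports the sample mean with an approximate 95% interval. The population (simulated heights), the helper name `estimate_mean`, and the normal-approximation multiplier `z=1.96` are all assumptions made for this example, and the finite population correction is ignored for simplicity.

```python
import math
import random
import statistics


def estimate_mean(sample, z=1.96):
    """Return the sample mean and an approximate 95% interval (normal approximation)."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return mean, (mean - z * std_err, mean + z * std_err)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical population: 10,000 simulated heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]
sample = rng.sample(population, 200)  # draw 200 units without replacement

mean, (low, high) = estimate_mean(sample)
print(f"estimate: {mean:.1f}, approx. 95% interval: ({low:.1f}, {high:.1f})")
```

The interval width shrinks as the sample size grows, which is one way the "reliability" of the estimate can be quantified.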

The main stages of the sampling process:

  • Define the population (parent population).
  • Determine the sampling frame.
  • Determine the sampling method.
  • Determine the sample size.
  • Implement the sampling plan.
  • Draw the sample and collect data.
  • Review the sampling process.

Common sampling methods

1) Simple random sampling, also called pure random sampling.

Randomly select n units from the population of N units as a sample, so that every possible sample of size n has the same probability of being selected.

The characteristics are: every unit has the same probability of being selected, and units are selected independently of one another, with no association or mutual exclusion between them. Simple random sampling is the basis of the other sampling methods. It is usually used only when the units in the population differ little from one another and the population is small.
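
A minimal sketch of simple random sampling without replacement, using Python's standard `random.sample`. The function name `simple_random_sample` and the population of unit IDs 1 to 100 are hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    if not 0 <= n <= len(units):
        raise ValueError("n must be between 0 and the population size")
    rng = random.Random(seed)
    return rng.sample(units, n)  # sampling without replacement


population = list(range(1, 101))  # hypothetical unit IDs 1..100 (N = 100)
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Passing a seed makes the draw repeatable, which is convenient when the sampling process has to be reviewed later.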

2) Systematic sampling, also known as equidistant sampling.

Arrange all units in the population in a certain order, randomly select one unit within a specified range as the initial unit, and then determine the remaining sample units according to a predefined rule. In the most common form, a number r is chosen at random from 1 to k, where k is the sampling interval (typically N/n), and the units at positions r, r+k, r+2k, and so on are selected in turn. This method is easy to carry out and can improve the accuracy of the estimate.
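
The r, r+k, r+2k rule can be sketched as follows. The function name `systematic_sample` is hypothetical, and the interval is assumed to be k = N // n (integer division), which is one common convention rather than the only one.

```python
import random


def systematic_sample(units, n, seed=None):
    """Pick a random start r in 1..k, then take the units at r, r+k, r+2k, ..."""
    if n <= 0 or n > len(units):
        raise ValueError("n must be between 1 and the population size")
    k = len(units) // n        # sampling interval (assumed to be N // n)
    rng = random.Random(seed)
    r = rng.randint(1, k)      # 1-based position of the initial unit
    # Convert the 1-based positions r, r+k, ... to 0-based list indices.
    return [units[r - 1 + i * k] for i in range(n)]


population = list(range(1, 101))  # hypothetical ordered units 1..100
sample = systematic_sample(population, 10, seed=3)
print(sample)
```

Only the starting point is random; once r is fixed, every other selected unit is determined by the interval.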

3) Stratified sampling.

The sampling units are divided into different strata (layers) according to certain characteristics or rules, and samples are then drawn independently and at random from each stratum. This keeps the structure of the sample close to that of the population and thereby improves the accuracy of the estimate.
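
A minimal sketch of stratified sampling, assuming proportional allocation (the same sampling fraction in every stratum), which is only one of several possible allocation rules. The function name `stratified_sample`, the `stratum_of` callback, and the urban/rural population are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        # Proportional allocation (an assumption); keep at least one unit per stratum.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 60 urban and 40 rural units as (id, region) pairs.
population = [(i, "urban") for i in range(60)] + [(i, "rural") for i in range(60, 100)]
sample = stratified_sample(population, stratum_of=lambda u: u[1], fraction=0.1, seed=1)
print(sample)
```

With a 10% fraction the sample contains 6 urban and 4 rural units, mirroring the 60/40 split of the population.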

4) Cluster sampling.

Combine the units in the population into groups (clusters), select whole clusters at random when sampling, and then survey every unit in the selected clusters. Only a sampling frame of the clusters is needed, which reduces the workload. The disadvantage is that the estimates tend to be less accurate.
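
A minimal sketch of one-stage cluster sampling: whole clusters are drawn at random and every unit inside them is kept. The function name `cluster_sample`, the `cluster_of` callback, and the population of 20 classes with 5 students each are hypothetical.

```python
import random
from collections import defaultdict


def cluster_sample(units, cluster_of, n_clusters, seed=None):
    """Randomly select whole clusters and return every unit in them."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for unit in units:
        clusters[cluster_of(unit)].append(unit)
    # Only the list of cluster labels serves as the sampling frame.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for label in chosen for unit in clusters[label]]


# Hypothetical population: 100 students as (student id, class id), 5 per class.
population = [(student, student // 5) for student in range(100)]
sample = cluster_sample(population, cluster_of=lambda u: u[1], n_clusters=3, seed=7)
print(sample)
```

Because units within one cluster often resemble each other, a sample of this size usually carries less information than a simple random sample of the same size, which is the accuracy trade-off noted above.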

References

【1】https://zh.wikipedia.org/wiki/sampling