Sampling

Random sampling: This is the most common probability sampling technique. Every member of the sample is selected at random from the population data set, so each record has an equal chance (probability) of being chosen. For example, the HR department wants to hold a social event and needs to select 50 people out of 300. To give everyone an equal opportunity, HR draws the names at random from a jar containing the names of all employees.
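The jar draw in the HR example can be sketched in a few lines of Python. This is only an illustration: the employee names, the `simple_random_sample` helper, and the seed are assumptions made for the sketch, not part of the original example.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct items so that every item has the same chance of selection."""
    rng = random.Random(seed)          # a seed makes the draw reproducible
    return rng.sample(population, k)   # sampling without replacement

# Hypothetical population: 300 placeholder employee names.
employees = [f"employee_{i:03d}" for i in range(1, 301)]

# Pick 50 of the 300 employees, as in the HR example.
selected = simple_random_sample(employees, 50, seed=42)
print(len(selected), "employees selected")
```

Sampling without replacement matches the jar analogy: once a name is drawn, it cannot be drawn again.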
Systematic sampling: The systematic sampling method selects records at regular intervals. For example, suppose your analysis requires transactional data and the population is about 10,000 transactions per day. Taking every 10th transaction yields a sample of 1,000 per day; over a period of 30 days, the total sample collected is around 30,000.
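A minimal sketch of the transaction example follows, assuming each day's transactions are available as an ordered list. The `systematic_sample` helper, the random starting offset, and the transaction IDs are hypothetical choices made for illustration.

```python
import random

def systematic_sample(records, sample_size, seed=None):
    """Take every step-th record, starting from a random offset within the first step."""
    if not 0 < sample_size <= len(records):
        raise ValueError("sample_size must be between 1 and len(records)")
    step = len(records) // sample_size            # sampling interval, e.g. 10
    start = random.Random(seed).randrange(step)   # random start in [0, step)
    return records[start::step][:sample_size]

total_sampled = 0
for day in range(1, 31):
    # Hypothetical daily population of 10,000 transaction IDs.
    transactions = [f"day{day:02d}-txn{n:05d}" for n in range(10_000)]
    daily_sample = systematic_sample(transactions, 1_000, seed=day)
    total_sampled += len(daily_sample)

print(total_sampled)  # 30 days x 1,000 records per day
```

The random starting offset is a common refinement; always starting at the first record would also give a regular interval, but the first record of each day would then be guaranteed a place in the sample.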
Cluster sampling: The entire population is divided into clusters, and samples are collected randomly from each cluster. For example, suppose you are creating a predictive model for an organization that operates in 30 countries. Your sample should include data from all 30 clusters for the model to predict accurately. If the data comes from only one geographical location, the model is biased toward that location and performs poorly for the others. Strictly speaking, drawing from every group is usually called stratified sampling, while classical cluster sampling randomly selects a subset of whole clusters.
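The per-country draw described above can be sketched as follows. The record layout, the `country` key, the `sample_from_each_cluster` helper, and the per-cluster size of 5 are all assumptions made for illustration. The sketch follows the text's description and draws from every cluster.

```python
import random
from collections import defaultdict

def sample_from_each_cluster(records, cluster_key, per_cluster, seed=None):
    """Group records by cluster_key, then draw a random sample from every group."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for record in records:
        clusters[record[cluster_key]].append(record)

    sample = []
    for name in sorted(clusters):                 # fixed order keeps runs reproducible
        members = clusters[name]
        k = min(per_cluster, len(members))        # small clusters contribute all they have
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 30 countries with 100 customer records each.
records = [
    {"country": f"country_{c:02d}", "customer_id": c * 1000 + i}
    for c in range(1, 31)
    for i in range(100)
]

sample = sample_from_each_cluster(records, "country", per_cluster=5, seed=0)
countries_covered = {r["country"] for r in sample}
print(len(sample), "records from", len(countries_covered), "countries")
```

Because every country contributes records, a model trained on this sample sees all 30 locations rather than just one.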