▶ What is the difference between probability and non-probability sampling, and when should I use each?
Probability sampling (simple random, stratified, cluster) gives every unit a known, non-zero chance of selection, which allows generalization to the population with quantifiable margins of error. Use it when your goal is to estimate population parameters (e.g., 'What % of voters support candidate X?') and a well-defined sampling frame is available. Non-probability sampling (convenience, purposive, snowball) selects units based on judgement or feasibility; selection probabilities are unknown, so generalization is not statistically justified, though it may be plausible if the sample is diverse and the reasoning is transparent. Use it for exploratory or qualitative research, for rare populations, or when no sampling frame exists. Each has trade-offs: probability sampling is more rigorous but often more costly and time-consuming; non-probability sampling is faster and feasible for hard-to-reach groups but is vulnerable to selection bias.
▶ How do I calculate an appropriate sample size for my field study?
For studies that test a hypothesis, sample size depends on the effect size (the smallest difference or association you want to detect), the desired statistical power (conventionally 80%), the significance level (conventionally α = 0.05), and the study design (individual vs. clustered). For descriptive studies (estimating a proportion), it depends on the expected proportion and the margin of error you can accept. Use a power calculator (G*Power, online tools, software-specific functions) to determine the number. For cluster sampling (e.g., surveying households within villages), account for the design effect: observations within a cluster tend to resemble one another, so you need more of them than under simple random sampling. Oversample by 10–20% to allow for expected non-response or dropout. Underpowered studies waste resources and often fail to detect real effects; adequately powered studies are an ethical obligation to participants.
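As a rough illustration of the descriptive case, the arithmetic can be sketched in a few lines of Python. This is a minimal sketch, not a replacement for a power calculator: it assumes a 95% confidence level (z = 1.96), uses the common approximation 1 + (m − 1) × ICC for the design effect, and the function names and example values (clusters of 20, ICC of 0.05, 15% oversampling) are illustrative assumptions rather than recommendations.

```python
import math


def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters (assumed formula)."""
    return 1 + (cluster_size - 1) * icc


def sample_size_for_proportion(p, margin, z=1.96, deff=1.0, oversample=0.0):
    """Return (completed responses needed, units to approach).

    p          expected proportion (0.5 is the most conservative choice)
    margin     acceptable margin of error, e.g. 0.05 for +/- 5 points
    z          normal quantile for the confidence level (1.96 ~ 95%)
    deff       design effect; 1.0 means simple random sampling
    oversample extra fraction to approach to offset non-response
    """
    # Size needed under simple random sampling.
    n_srs = (z ** 2) * p * (1 - p) / margin ** 2
    # Clustering inflates the requirement by the design effect.
    n_needed = n_srs * deff
    # Approach more units than needed to allow for non-response or dropout.
    n_approach = n_needed * (1 + oversample)
    return math.ceil(n_needed), math.ceil(n_approach)


# Simple random sample, +/- 5 points, no oversampling.
print(sample_size_for_proportion(0.5, 0.05))  # (385, 385)

# Households in villages: 20 per cluster, ICC 0.05, 15% oversampling.
deff = design_effect(cluster_size=20, icc=0.05)
print(sample_size_for_proportion(0.5, 0.05, deff=deff, oversample=0.15))  # (750, 862)
```

The second call shows how quickly clustering raises the requirement: a modest intracluster correlation nearly doubles the number of completed responses needed.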
▶ What is a sampling frame and why do researchers worry about coverage error?
A sampling frame is the list of units from which you draw your sample: a voter registry for an election poll, a school roster for a student survey, a map grid for a biodiversity survey. Coverage error occurs when units in your population are missing from the frame (undercoverage) or the frame includes ineligible units (overcoverage). For example, a phone-based survey misses people without phones, and a study that recruits only from a university misses non-students in the target population. Coverage error biases estimates when the units that are missed differ from those that are covered on the outcome of interest. To minimize it: use the most complete frame available (consider combining multiple sources), verify the frame against your population definition, and acknowledge coverage limitations in your report. For populations without a sampling frame (homeless individuals, undocumented immigrants), adapted strategies such as respondent-driven sampling or time-space sampling may help.
▶ How do I ensure data quality and consistency in field data collection?
Data quality assurance requires: standardized instruments (for surveys, exact question wording and fixed response options), standardized procedures (interviewers trained identically, same setting and time of day where feasible), validation checks (range checks: age can't be 999; logical checks: if 'No children', skip the questions about children), and supervisory oversight (spot-check completed forms and hold regular debriefings with data collectors). Where feasible, use digital data collection (tablets, apps) with built-in validation rather than paper; it reduces entry errors and allows real-time monitoring. Randomly re-interview 5–10% of participants to check reliability. Document all decisions and deviations from protocol. Poor data quality undermines analysis regardless of sample size; invest in training and monitoring.
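The range check, the skip-pattern check, and the random re-interview draw described above could look roughly like this in Python. It is a sketch only: the field names (`age`, `has_children`, `num_children`), the 0–120 age bounds, and both function names are hypothetical, and a real data collection app would normally enforce such rules in the form itself.

```python
import random


def validate_record(record):
    """Return a list of problems found in one survey record (empty if clean)."""
    problems = []

    # Range check: age must be a number within plausible bounds (assumed 0-120).
    age = record.get("age")
    if not isinstance(age, (int, float)) or not 0 <= age <= 120:
        problems.append(f"age out of range: {age!r}")

    # Logical check: no child count should be recorded if there are no children.
    if record.get("has_children") == "no" and record.get("num_children") not in (None, 0):
        problems.append("num_children answered although has_children is 'no'")

    return problems


def pick_reinterviews(participant_ids, fraction=0.1, seed=0):
    """Draw a reproducible random subset of participants to re-interview."""
    rng = random.Random(seed)  # fixed seed so supervisors can audit the draw
    k = max(1, round(len(participant_ids) * fraction))
    return sorted(rng.sample(list(participant_ids), k))


clean = {"age": 34, "has_children": "no", "num_children": None}
dirty = {"age": 999, "has_children": "no", "num_children": 2}
print(validate_record(clean))   # []
print(validate_record(dirty))   # two problems reported
print(len(pick_reinterviews(range(100), fraction=0.1)))  # 10
```

Returning a list of problems, rather than stopping at the first one, lets a supervisor see everything wrong with a form in a single pass.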
▶ What is stratified sampling and when is it better than simple random sampling?
Stratified sampling divides the population into subgroups (strata) based on a characteristic relevant to your research (e.g., income level, age group, geographic region), then randomly samples within each stratum. This guarantees representation of important subgroups and, when units within a stratum resemble one another more than they resemble the rest of the population, reduces sampling variance for overall estimates: with proportional allocation you get tighter confidence intervals than from a simple random sample of the same size. Use stratification when: subgroups differ substantially in outcomes (rural and urban income patterns differ), you want subgroup-specific estimates, or you want to oversample rare subgroups (unequal allocation, which then requires weighting to produce overall estimates). Example: to estimate vaccine uptake by ethnicity, stratify by ethnicity; this ensures each group is adequately represented. Provided the frame records each unit's stratum, stratification costs little more than simple random sampling and usually improves precision, so it is nearly always worthwhile when you have relevant strata.
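A proportionally allocated stratified draw can be sketched with the Python standard library alone. The function name, the `(id, stratum)` tuple layout of the frame, and the urban/rural example are assumptions made for illustration; rounding means the total drawn can differ from the target by a unit or two.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, n, seed=0):
    """Draw about n units, allocated to strata in proportion to their size.

    units       list of sampling units (the frame)
    stratum_of  function mapping a unit to its stratum label
    n           target total sample size
    """
    rng = random.Random(seed)

    # Group the frame by stratum.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = {}
    for label, members in strata.items():
        # Proportional allocation, with at least one unit per stratum.
        k = max(1, round(n * len(members) / len(units)))
        k = min(k, len(members))
        # Simple random sample without replacement inside the stratum.
        sample[label] = rng.sample(members, k)
    return sample


# Hypothetical frame: 800 urban and 200 rural households.
frame = [(i, "urban") for i in range(800)] + [(i, "rural") for i in range(800, 1000)]
drawn = stratified_sample(frame, stratum_of=lambda u: u[1], n=100)
print({label: len(members) for label, members in drawn.items()})
# {'urban': 80, 'rural': 20}
```

To oversample a rare stratum instead, the allocation line would be replaced with fixed per-stratum sizes, and each sampled unit would then carry a weight equal to its stratum's population size divided by its sample size.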
▶ How do I handle non-response in my study, and what counts as an adequate response rate?
Non-response occurs when selected participants refuse, are unreachable, or don't complete the survey. Non-response bias arises when non-responders differ from responders on the outcomes you are measuring. To minimize non-response: make multiple contact attempts (phone, email, in-person), offer incentives, make participation convenient (online, phone, or in-person options), and communicate the study's importance. Document reasons for non-response (refusal, unreachable, language barrier). Response rates vary widely by mode, population, and period, and there is no universal threshold for an adequate rate. Commonly quoted figures (mail surveys 40–50%, phone surveys 60–70%, in-person interviews 80–90%) are rough guides at best; telephone surveys of the general public in particular now often achieve far lower rates. Higher is better, but a lower rate with a detailed non-response analysis is more informative than a higher rate with no documentation. For analysis, use inverse probability weighting or other methods to adjust for non-response if it's informative (i.e., non-responders likely differ from responders).
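One simple form of inverse probability weighting is the weighting-class adjustment: estimate each group's response probability as its observed response rate and weight every responder by the inverse. The sketch below assumes that response is effectively random within each class, which is a strong assumption; the function name, the class labels, and the example data are hypothetical.

```python
from collections import Counter


def nonresponse_weights(sampled, responded, class_of):
    """Weight each responder by 1 / (response rate of its weighting class).

    sampled    list of all selected unit ids
    responded  set of unit ids that completed the survey
    class_of   function mapping a unit id to its weighting class
    """
    n_sampled = Counter(class_of(u) for u in sampled)
    n_responded = Counter(class_of(u) for u in sampled if u in responded)

    weights = {}
    for u in sampled:
        if u in responded:
            c = class_of(u)
            # Inverse of the estimated response probability for this class.
            weights[u] = n_sampled[c] / n_responded[c]
    return weights


# Hypothetical sample: ids 0-5 are urban, ids 6-9 are rural.
sampled = list(range(10))
responded = {0, 1, 2, 6}
weights = nonresponse_weights(sampled, responded, lambda u: "urban" if u < 6 else "rural")
print(weights)  # {0: 2.0, 1: 2.0, 2: 2.0, 6: 4.0}
```

The weights sum to the number of sampled units (10 here), so the lone rural responder stands in for the four rural households selected. A class with no responders cannot be adjusted this way and would need to be merged with a similar class first.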
▶ What are the challenges of field data collection in resource-limited or remote settings?
Common challenges include: poor/no internet (limiting digital data collection), transportation costs/time (slowing fieldwork and straining budgets), language barriers (requiring translators and back-translation of instruments), cultural sensitivity (requiring trained enumerators familiar with local context), security concerns (in conflict-affected areas), and participant availability (e.g., farmers are busy at harvest). Strategies: invest in local staff recruitment and training (they navigate cultural and practical barriers), use offline-capable apps that sync when connectivity returns, pilot extensively to catch unforeseen barriers, build flexibility into timelines, and maintain safety protocols. Partnerships with local organizations often unlock access and credibility. Remote and resource-limited settings require patient, adaptive fieldwork; rushing leads to poor data quality and ethical risks.