Data, Data Types & Methods of Collection – An Expanded Reference
1. What Is Data?
Data are raw symbols, measurements, or observations that capture attributes of people, objects, events, or concepts. When processed, analysed, and placed in context, data yield information and, ultimately, knowledge.
2. Typologies of Data
2.1 By Source
Primary Data – gathered firsthand by the current investigator to answer the research problem at hand.
Secondary Data – pre-existing data originally compiled for another purpose.
2.2 By Measurement Scale
Nominal (categories without order), e.g., blood type.
Qualitative – textual, visual, or audio descriptions capturing richness of context.
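As a small illustration of what the nominal scale permits, the sketch below tallies blood-type labels and reports the mode. The sample values are hypothetical and chosen only for illustration; the point is that counts and the mode are meaningful summaries for nominal data, whereas ordering or averaging the labels is not.

```python
from collections import Counter

# Hypothetical sample of a nominal variable (blood type); illustrative only.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]

# Frequencies and the mode are valid summaries for nominal data.
counts = Counter(blood_types)
mode_label, mode_count = counts.most_common(1)[0]

print(dict(counts))
print(f"Mode: {mode_label} ({mode_count} of {len(blood_types)} observations)")

# Note: sorting or averaging these labels would be meaningless,
# because nominal categories carry no order or magnitude.
```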
3. Meaning, Features & Distinction
3.1 Primary Data
Generated directly via fieldwork or experimentation. Researchers control definitions, timing, sampling, and measurement instruments, ensuring high relevance.
3.2 Secondary Data
Obtained from published or internal records: census tables, hospital registers, transaction logs, remote-sensing images, social-media APIs, etc. Valuable for historical analyses, benchmarking and exploratory work.
3.3 Comparison at a Glance
Aspect | Primary | Secondary
Objective Fit | Tailor-made to study question | May only partially fit
Collection Cost / Time | High | Low to moderate
Currency | Current, real-time possible | May be outdated
Control Over Quality | Full (instrument, sampling, definitions) | Indirect; depends on original collector
Geographic / Temporal Breadth | Usually focused, short-term | Often broad and longitudinal
Examples | Online survey, lab experiment, mobile ethnography | Census 2021, World Bank databank, enterprise ERP logs
4. Selecting a Data-Collection Method – Decision Factors
The optimal method balances the research objectives (depth vs breadth, causal vs descriptive), required precision, budget, timeline, respondent accessibility, ethical constraints, and the analytical strategy (e.g., multivariate statistics require quantifiable variables). A prudent rule is “use secondary data first”; if the available secondary data prove inadequate, design a primary collection. Mixed-method triangulation often yields the most credible insights.
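One way to make this balancing explicit is a simple weighted-scoring matrix. The sketch below is a minimal illustration, not a prescribed procedure: the criteria are a subset of the decision factors listed above, and the weights, the 1-5 ratings, and the three candidate methods are all hypothetical values that a research team would have to set for its own study.

```python
# Hypothetical weights (summing to 1) for a few of the decision factors above.
weights = {"precision": 0.4, "cost": 0.3, "speed": 0.2, "accessibility": 0.1}

# Hypothetical 1-5 ratings of each method on each criterion (5 = most favourable).
ratings = {
    "online survey":  {"precision": 3, "cost": 5, "speed": 5, "accessibility": 3},
    "lab experiment": {"precision": 5, "cost": 2, "speed": 2, "accessibility": 2},
    "secondary data": {"precision": 2, "cost": 5, "speed": 4, "accessibility": 5},
}


def weighted_score(method_ratings, criterion_weights):
    """Return the weighted sum of a method's ratings across all criteria."""
    return sum(criterion_weights[c] * method_ratings[c] for c in criterion_weights)


# Score every method and rank from highest to lowest weighted score.
scores = {method: weighted_score(r, weights) for method, r in ratings.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for method in ranked:
    print(f"{method}: {scores[method]:.2f}")
```

The ranking reflects only the assumed weights and ratings; changing the weight on precision, for example, can reorder the methods, which is why the weights should be agreed before the ratings are filled in.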
Efficiency Spotlight: Why Online Surveys Often Win
Fast deployment, auto-coding, real-time dashboards, and negligible marginal cost make online surveys the most cost-efficient option for large literate populations, provided internet penetration and response incentives are adequate.
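The "negligible marginal cost" argument can be sketched with a linear cost model: total cost equals a fixed setup cost plus a per-response cost times the number of responses. The figures below are hypothetical, in arbitrary currency units, and the interviewer-administered comparison mode is an assumption introduced only for contrast; neither comes from empirical data.

```python
def total_cost(fixed, per_response, n):
    """Linear cost model: one-off setup cost plus a cost for each response."""
    return fixed + per_response * n


# Hypothetical cost parameters (arbitrary currency units); not empirical figures.
online = {"fixed": 2000, "per_response": 0.5}
interviewer = {"fixed": 500, "per_response": 20}

# Compare total cost at a few sample sizes.
for n in (50, 500, 5000):
    c_online = total_cost(n=n, **online)
    c_interviewer = total_cost(n=n, **interviewer)
    print(f"n={n}: online={c_online:,.0f}, interviewer={c_interviewer:,.0f}")

# Sample size at which the two modes cost the same under these assumptions.
break_even = (online["fixed"] - interviewer["fixed"]) / (
    interviewer["per_response"] - online["per_response"]
)
print(f"Break-even sample size: about {break_even:.0f} responses")
```

Under these assumed parameters the online mode is more expensive for very small samples because of its higher setup cost, and becomes cheaper once the sample passes the break-even point, which is consistent with the spotlight's emphasis on large populations.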
6. Secondary Data Sources & Evaluation
6.1 Typical Sources
Government census, vital statistics, economic indicators.
Explanatory Sequential: Quantitative → qualitative follow-up for “why”.
Concurrent: Collect both strands simultaneously; merge in analysis.
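For the concurrent design, "merge in analysis" usually means linking the two strands through a shared key. The sketch below assumes that both strands carry a respondent ID; the IDs, ratings, and coded themes are hypothetical placeholders, and an inner join (keeping only respondents present in both strands) is one possible merging choice among several.

```python
# Hypothetical quantitative strand: satisfaction ratings (1-5) by respondent ID.
quant = {"r1": 4, "r2": 2, "r3": 5}

# Hypothetical qualitative strand: coded interview themes by respondent ID.
qual = {"r1": ["price"], "r2": ["wait time", "staff"], "r4": ["price"]}

# Merge on respondent ID, keeping only respondents present in both strands.
merged = {
    rid: {"rating": quant[rid], "themes": qual[rid]}
    for rid in sorted(quant.keys() & qual.keys())
}

# Respondents that appear in only one strand are worth reporting, not hiding.
unmatched = sorted(quant.keys() ^ qual.keys())

for rid, record in merged.items():
    print(rid, record["rating"], ", ".join(record["themes"]))
print("Unmatched respondents:", unmatched)
```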
10. Key Takeaways
Primary data offer precision and current relevance but demand higher resources; secondary data provide speed and breadth but require critical evaluation.
Method choice is a multi-criteria optimisation of accuracy, cost, time, ethics, and analytic needs.
Online surveys are generally the most cost-efficient option for large literate populations, while passive sensors are hard to match for fine-grained behavioural measurement.
Triangulation mitigates individual method weaknesses and strengthens credibility.