Monday, May 31, 2021

Introduction to Statistics - Types of Data

Hey readers, I hope you are all staying safe and strong during the COVID-19 pandemic. Following my post Introduction to Statistics - Measurement scales and statistical tools, in today's post I will describe and summarize the types of data.

What do we call data in Statistics? The values of different objects collected from a survey, web services, databases, flat files, or other sources such as RSS feeds, or the recorded values of an experiment over a period of time, taken together constitute what we call data in Statistics. Each value in the data is known as an observation.

Statistical data can be classified as follows:

  1. based on the ways of obtaining the data
    1. Primary data
    2. Secondary data
  2. based on the characteristic
    1. Quantitative data
    2. Qualitative data
  3. based on the nature of the characteristic
    1. Discrete data
    2. Continuous data
  4. based on level of measurement
    1. Nominal data
    2. Ordinal data
    3. Interval data
    4. Ratio data
  5. based on time component
    1. Time series data
    2. Cross sectional data

Let's get a brief idea of each type of data.

Primary data

Primary data are data collected directly from the original source by an investigator or agency, for example through a survey or questionnaire, and the collectors are the first to use these data. Primary data example: suppose a class teacher wants to know the mean weight of the students in class eight of a particular school. If he collects the weight of each student of class eight by contacting each student personally, then the data so obtained are an example of primary data for that class teacher.

Secondary data

Secondary data are data that an investigator or agency obtains from a source that already exists. That is, these data were originally collected by some other person or entity, have been used by them at least once, and are now being used for at least the second time. Secondary data example: consider the same example discussed under primary data. If the class teacher takes the weights of the students from the records of that school, then the data thus obtained are an example of secondary data.

Note: In this example the data remain the same in both cases (primary and secondary); only the way of obtaining them differs.

Quantitative data

Data are said to be quantitative if a numerical quantity is associated with each observation. Interval or ratio scales are used as the scale of measurement for quantitative data. Characteristics such as weight, height, age, length, area, volume, money, temperature, humidity, size, etc. generally give quantitative data. Quantitative data example: the weights in kilograms of the students of a class.
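Because every observation is a number, arithmetic such as a mean or a range is meaningful for quantitative data. Here is a minimal sketch in Python; the weights are hypothetical values made up for illustration, not real measurements.

```python
from statistics import mean

# Hypothetical weights (in kg) of five students of a class.
weights_kg = [42.5, 39.0, 45.5, 41.0, 44.0]

# Arithmetic on the observations makes sense because they are quantities.
mean_weight_kg = mean(weights_kg)
weight_range_kg = max(weights_kg) - min(weights_kg)

print(f"Mean weight: {mean_weight_kg:.1f} kg")
print(f"Range: {weight_range_kg:.1f} kg")
```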

Qualitative data

Qualitative data relate to the quality of an object or thing, i.e. if the characteristic or attribute under study cannot be measured numerically and can only be described by the presence or absence of a quality, or by the category an object falls into, then the data thus obtained are known as qualitative data. Nominal and ordinal scales are generally used as the scale of measurement for qualitative data. For example, if a company wants to run a survey on a newly launched product and the characteristic under study is 'satisfaction', then the responses can be divided into five categories: Highly satisfied, Satisfied, Neutral, Dissatisfied, Highly dissatisfied.
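With qualitative data we count and rank categories instead of averaging numbers. The sketch below tallies a handful of hypothetical survey responses (invented for illustration) and finds the middle response by its position in the ordered satisfaction scale.

```python
from collections import Counter

# The five categories, ordered from most to least satisfied.
# The ordering is what makes this an ordinal scale.
LEVELS = [
    "Highly satisfied",
    "Satisfied",
    "Neutral",
    "Dissatisfied",
    "Highly dissatisfied",
]

# Hypothetical responses from five customers.
responses = ["Satisfied", "Neutral", "Highly satisfied", "Satisfied", "Dissatisfied"]

# Frequency table: how many responses fall in each category.
counts = Counter(responses)
frequency_table = {level: counts.get(level, 0) for level in LEVELS}

# Median category: sort responses by their rank on the scale, take the middle one.
ranked = sorted(responses, key=LEVELS.index)
median_response = ranked[len(ranked) // 2]

print(frequency_table)
print("Median response:", median_response)
```

Note that the code never adds or averages the categories; it only counts them and compares their order.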

Discrete data

If the nature of the characteristic under study is such that the values of the observations between any two given limits are at most countable, then the corresponding data are known as discrete data. Discrete data example: the number of employees present in an office on a particular day may be 80 or 150 or 500 and so on, but cannot be 80.34, 150.54, 500.67, etc.

Continuous data

Data are said to be continuous if the measurement of a characteristic under study may take any real value between two given limits. Continuous data example: data obtained by measuring the weights of the students of a class form continuous data, because the weights may be 42.676 kg, 39.585 kg, 45.238 kg, etc.
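As a rough illustration of the difference, the sketch below checks whether every value in a list is a whole number. The helper name `looks_discrete` is my own, and this is only a heuristic: continuous measurements that happen to be rounded to whole numbers would also pass the check, so the real decision depends on the nature of the characteristic, not on the recorded values alone.

```python
def looks_discrete(values):
    """Heuristic check: True when every value is a whole number."""
    return all(float(v).is_integer() for v in values)

# Values taken from the two examples above.
employees_present = [80, 150, 500]
weights_kg = [42.676, 39.585, 45.238]

print(looks_discrete(employees_present))  # counts of people
print(looks_discrete(weights_kg))         # measured weights
```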

Nominal data

Data collected using a nominal scale are called nominal data.

Ordinal data

Similarly, data collected using an ordinal scale are called ordinal data.

Interval data

Similarly, data collected using an interval scale are called interval data.

Ratio data

Similarly, data collected using a ratio scale are called ratio data.

For more details, with examples of nominal data, ordinal data, interval data and ratio data, see my post Measurement scales and statistical tools.

Time series data

If the data are collected over time, they are known as time series data. In time series data, time is one of the main variables, and the data, usually collected at regular intervals of time, show how the characteristic(s) under study change over time. Time series data example: the yearly expenditure of a family on different items for the last three years.
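To make the idea of "change over time" concrete, here is a small sketch that stores a family's yearly expenditure on one item keyed by year and computes the change from each year to the next. The item (food), the years, and the amounts are hypothetical.

```python
# Hypothetical yearly expenditure of a family on food, keyed by year.
food_expenditure = {2018: 120000, 2019: 126000, 2020: 138600}

# Keep the observations in time order; order matters in a time series.
years = sorted(food_expenditure)

# Change in expenditure from each year to the following year.
yearly_change = {
    year: food_expenditure[year] - food_expenditure[prev]
    for prev, year in zip(years, years[1:])
}

print(yearly_change)
```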

If the purpose of the data collection is connected with geographical location, then the data are known as spatial data. For example, the number of goals saved by a goalkeeper in different matches of the Europa League 2021 against different teams.

And if the purpose of the data collection is connected with both time and geographical location, then the data are known as spatio-temporal data. For example, data on the audience at different matches of the Europa League in 2010 and 2018 are spatio-temporal data.

Cross sectional data

Data collected at a single point in time are known as cross sectional data. Cross sectional data example: the income or expenditure of a family, or the salaries of all employees of an organization, recorded at a particular point in time.
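The contrast with time series data can be sketched with one small table of salary records. Fixing the year and looking across employees gives a cross section; fixing the employee and looking across years gives a time series. The names, years, and salaries below are all hypothetical.

```python
# Hypothetical salary records: (employee, year, salary).
records = [
    ("Asha", 2019, 30000), ("Asha", 2020, 32000), ("Asha", 2021, 35000),
    ("Ravi", 2019, 28000), ("Ravi", 2020, 29000), ("Ravi", 2021, 31000),
]

# Cross sectional data: all employees at one point in time (the year 2021).
cross_section_2021 = {name: salary for name, year, salary in records if year == 2021}

# Time series data: one employee followed over time.
time_series_asha = [(year, salary) for name, year, salary in records if name == "Asha"]

print(cross_section_2021)
print(time_series_asha)
```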

Summary

In this article, I reviewed the types of data and gave an example of each. We covered the classification of statistical data based on the way of obtaining the data, the characteristic, the nature of the characteristic, the level of measurement, and the time component. Thanks for reading. I hope this article helped you understand the types of data.

As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.

