Exploring Different Types of Datasets: Numerical, Categorical, Time-Series, and Spatial.

Shashank Gollapalli
Apr 1, 2023

Data is the driving force behind decision-making in today’s world. But as Gregory Fell and Mike Barlow would argue, not all data is created equal. Data comes in different forms, and understanding the different types of datasets is crucial to making sense of them. In this article, I’ll walk you through the major types of datasets (numerical, categorical, time-series, and spatial) and explain how they can be either structured or unstructured.

Overview Image

Before we move ahead with the article, let’s quickly understand what structured and unstructured data are.

Structured data is data that can be easily organized and analyzed, with a clearly defined format and data types. Examples of structured data include data stored in databases or spreadsheets.

Unstructured data, on the other hand, does not have a predefined format and is harder to search and analyze. Examples of unstructured data include social media posts, emails, and images, to name a few.
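To make the contrast concrete, here is a minimal sketch using only Python's standard library. The names, ages, and cities are made up for illustration. The structured part reads a small CSV with named columns, while the unstructured part has to search free text for the same kind of information.

```python
import csv
import io
import re

# Structured: a small CSV with a fixed schema (values are made up for illustration).
structured_text = "name,age,city\nAsha,31,Pune\nBen,45,Leeds\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))
ages = [int(row["age"]) for row in rows]  # columns can be addressed by name
average_age = sum(ages) / len(ages)

# Unstructured: free text with no schema, so we have to search for what we need.
unstructured_text = "Met Asha (31) in Pune last week. Ben, who is 45, joined from Leeds."
numbers_found = [int(n) for n in re.findall(r"\d+", unstructured_text)]

print(average_age)    # 38.0
print(numbers_found)  # [31, 45]
```

Note that the regular expression only finds numbers. It has no way of knowing that they are ages, which is exactly the information a schema would have provided.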

Numerical datasets contain quantitative data that can be measured on a numerical scale. This is one of the most common types of dataset and includes data such as age, height, weight, and income. Numerical data can be further divided into discrete and continuous data. Discrete data are values that can be counted and are typically represented by integers, such as the number of cars in a parking lot. Continuous data can take any value within a range and are typically represented by decimals or fractions, such as temperature or weight.

Image Source: Cleland, J., Scott, N., Harrild, K., & Moffat, M. (2013). Using databases in medical education research: AMEE Guide №77. Medical Teacher, 35(5), e1103-e1122. doi:10.3109/0142159X.2013.785632 https://www.researchgate.net/publication/336864581/figure/tbl1/AS:819152335949830@1572312544176/An-example-of-a-data-set-with-numerical-attributes.png
Numerical dataset (Source: Using databases in medical education research (Cleland et al., 2013))
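The discrete versus continuous distinction can be sketched in a few lines of Python. The sample values and the `is_discrete` helper are hypothetical and exist only for this illustration. The helper is a rough heuristic, since a continuous measurement can happen to land on a whole number.

```python
from statistics import mean, median

# Hypothetical sample values, chosen only for illustration.
cars_per_lot = [12, 7, 30, 7, 19]          # discrete: whole-number counts
temperatures_c = [21.5, 19.8, 23.1, 22.4]  # continuous: any value within a range

def is_discrete(values):
    """Rough check: treat the data as discrete if every value is a whole number."""
    return all(float(v).is_integer() for v in values)

print(is_discrete(cars_per_lot))       # True
print(is_discrete(temperatures_c))     # False
print(median(cars_per_lot))            # 12
print(round(mean(temperatures_c), 2))  # 21.7
```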

Categorical datasets contain qualitative data that describes characteristics and cannot be measured on a numerical scale. This type of dataset includes data such as gender, race, and religion. Categorical data can be further divided into nominal and ordinal data. Nominal data consists of named categories with no inherent order, such as colors or types of fruit. Ordinal data consists of categories with a meaningful order, such as levels of education or ratings on a scale, although the gaps between levels are not necessarily equal.

Image Source: Cleland, J., Scott, N., Harrild, K., & Moffat, M. (2013). Using databases in medical education research: AMEE Guide №77. Medical Teacher, 35(5), e1103-e1122. doi:10.3109/0142159X.2013.785632  https://www.researchgate.net/profile/Mandy-Moffat/publication/236456448/figure/tbl3/AS:667856693911563@1536240852489/Examples-of-quantitative-data-and-categorical-data.png
Categorical Data (Source: Using databases in medical education research (Cleland et al., 2013))
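As a small sketch of how the two kinds of categorical data behave differently, the snippet below counts nominal values and sorts ordinal ones. The survey responses and the `education_rank` mapping are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical survey responses, for illustration only.
fruits = ["apple", "mango", "apple", "banana", "apple"]        # nominal
education = ["bachelor", "high school", "master", "bachelor"]  # ordinal

# Nominal: counting and finding the most frequent category are meaningful,
# but sorting the categories would not tell us anything.
fruit_counts = Counter(fruits)
most_common_fruit = fruit_counts.most_common(1)[0][0]

# Ordinal: an explicit rank per category makes sorting and comparison meaningful.
education_rank = {"high school": 0, "bachelor": 1, "master": 2}
sorted_education = sorted(education, key=education_rank.get)
highest_level = max(education, key=education_rank.get)

print(most_common_fruit)  # apple
print(sorted_education)   # ['high school', 'bachelor', 'bachelor', 'master']
print(highest_level)      # master
```

The ranks only encode order. Averaging them would assume equal spacing between education levels, which ordinal data does not guarantee.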

Time-series datasets are used to track changes over time. This type of dataset includes data such as stock prices, weather patterns, and website traffic. Time-series datasets are typically structured: the observations are ordered chronologically, and each one carries a timestamp or date. This ordering makes it possible to identify trends and patterns over time.

Image Source: Example 1: Weather conditions  https://www.influxdata.com/what-is-time-series-data/
Time Series Data (Source: what-is-time-series-data by influxdata)
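Here is a minimal sketch of why chronological ordering matters. It sorts a handful of readings by date and then smooths them with a simple moving average to expose the trend. The readings and the `moving_average` helper are hypothetical, and a real project would more likely use a dedicated time-series library.

```python
from datetime import date

# Hypothetical daily temperature readings (date, degrees Celsius), deliberately out of order.
readings = [
    (date(2023, 4, 3), 18.0),
    (date(2023, 4, 1), 15.0),
    (date(2023, 4, 4), 21.0),
    (date(2023, 4, 2), 16.0),
]

# Step 1: order the observations chronologically by their timestamp.
readings.sort(key=lambda reading: reading[0])

def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Step 2: smooth the ordered values to make the trend easier to see.
temps = [temp for _, temp in readings]
trend = moving_average(temps, window=2)

print(temps)  # [15.0, 16.0, 18.0, 21.0]
print(trend)  # [15.5, 17.0, 19.5]
```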

Spatial datasets contain location-based data, such as maps or GPS data. This type of dataset includes data such as the location of a store, the population density of a city, or the spread of a disease across different regions. Spatial datasets can be either structured or unstructured, depending on the level of detail and the format of the data. For example, a table of store coordinates is structured, while raw satellite imagery is unstructured.

Image Source: https://devdatalab.medium.com/open-access-geospatial-data-for-india-b9dceb7196bb
Spatial Dataset

In the image above, sourced from Development Data Lab’s article, we can see spatial data for India that the group is constructing from multiple data sources.
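To show what working with structured spatial data can look like, here is a short sketch that stores two locations as latitude and longitude pairs and computes the great-circle distance between them with the haversine formula. The store names and coordinates are hypothetical (the coordinates are only roughly those of Mumbai and New Delhi), and the Earth is treated as a sphere, which is an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical store locations; coordinates are approximate.
stores = [
    {"name": "Store A", "lat": 19.0760, "lon": 72.8777},  # roughly Mumbai
    {"name": "Store B", "lat": 28.6139, "lon": 77.2090},  # roughly New Delhi
]

distance = haversine_km(
    stores[0]["lat"], stores[0]["lon"],
    stores[1]["lat"], stores[1]["lon"],
)
print(f"{distance:.0f} km between {stores[0]['name']} and {stores[1]['name']}")
```

Real spatial work usually also involves map projections and geometry types such as polygons, which this sketch leaves out.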

In conclusion, data is not just a collection of random numbers or words; it comes in different forms and shapes. Being able to recognize and understand the different types of datasets is essential for drawing meaningful conclusions and insights from them. We’ve seen that numerical, categorical, time-series, and spatial datasets are the major types, each with its own characteristics and applications. We also saw that data can be either structured or unstructured, which has a significant impact on how we analyze and interpret it. As we move forward in the era of big data, a good understanding of these concepts will be increasingly important for making informed decisions and driving innovation.

Written by Shashank Gollapalli

MSc Big Data Analytics for Business | On a mission to make data science accessible
