
What is big data?

The basic idea behind the phrase ‘big data’ is that everything we do increasingly leaves a digital footprint (data) that can subsequently be used and analysed.

The sheer scale of the data we produce is captured in a quote from Eric Schmidt of Google: “From the dawn of civilisation until 2003, humankind generated five Exabytes of data. Now we produce five Exabytes every two days… and the pace is accelerating.” In fact, over 90% of all the data in the world was created in the past two years. To put this into perspective, Bernard Marr’s research notes: “Every minute we send 204 million emails, generate 1.8 million Facebook likes, send 278,000 tweets and upload 200,000 photos to Facebook.”
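To get a feel for what those per-minute figures add up to, they can be scaled to a full day. The short Python sketch below does only that arithmetic. It assumes the rates stay constant around the clock, which is a simplification, and the names `PER_MINUTE`, `per_day` and `DAILY` are illustrative rather than taken from any source.

```python
# Per-minute figures as quoted in the article (attributed to Bernard Marr).
PER_MINUTE = {
    "emails sent": 204_000_000,
    "Facebook likes": 1_800_000,
    "tweets sent": 278_000,
    "photos uploaded to Facebook": 200_000,
}

MINUTES_PER_DAY = 60 * 24  # 1,440 minutes in a day


def per_day(per_minute: int) -> int:
    """Scale a per-minute count to a daily count, assuming a constant rate."""
    return per_minute * MINUTES_PER_DAY


# Daily totals for every quoted activity.
DAILY = {name: per_day(count) for name, count in PER_MINUTE.items()}

for name, total in DAILY.items():
    print(f"{name}: {total:,} per day")
```

On these assumptions, the quoted email rate alone works out to roughly 294 billion emails a day.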

Big data is transforming the way we do business and affecting many parts of our personal lives. It can be described by the following characteristics:

Volume: The quantity of generated and stored data. The size of the data determines its value and potential insight, and whether it can actually be considered big data at all.

Variety: The type and nature of the data. Knowing this helps the people who analyse it to make effective use of the resulting insight.

Velocity: The speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development.

Variability: Inconsistency in a data set can hamper the processes used to handle and manage it.

Veracity: The quality of captured data can vary greatly, affecting accurate analysis.

The challenge organisations now have is to make sense of this big data to produce actionable insight. If we can get this right, we can use the data to better target and understand customers, understand and optimise business processes, and improve business performance.

Big data is not just about the size of the data but also about the value within it. Any business that doesn’t seriously consider the implications of big data, and its own ability to make use of the ever-increasing volumes of data that exist, risks being left behind by its peers.