4 V’s of Big Data: Insights on Implementation, Data and Analysis
Posted in Operations & IT Articles, Published on 27 January 2015
Definition: According to the information technology research and advisory firm Gartner, “Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”
Big Data in today’s world
Software and information technology have made it possible to generate extremely large amounts of data in real time. These data sets are typically too large to be processed with traditional database management systems or other conventional data processing software. In the language of big data we do not deal with gigabytes; rather, data sizes range from hundreds of petabytes to exabytes (1 exabyte is 1 million terabytes). The biggest questions now are how an organization can make effective use of this data to predict upcoming scenarios and trends, and how it can use this raw data to optimize its processes in order to increase efficiency, reduce costs and explore new avenues.
Mining terabytes or petabytes of raw, unprocessed data is a very complex task, and drawing meaningful, relevant insights from it is even more complex. The skills needed to mine this data are what matter in today’s fast-changing world. Experts have classified these skills into three key categories, but the writer of this article feels that one more category is relevant.
Modeling, coding and analyzing data are not enough from today’s business perspective. Big data also requires “intuition, the human angle, which is the overlapping area between rationality and analytical thinking”.
Major components of Big Data
According to industry experts, Big Data can be broken down into four main components when it is applied to real-life business operations. These are known as the “Four V’s”.
The 4 V’s of Big Data
Volume refers to the sheer amount of data involved. When we talk of big data, it is implied that we are dealing with enormous volumes of data. This data is typically generated by business processes, automated machines and social networks, so the volume to be analyzed is massive. If we take social networking sites as an example, where posts, Twitter messages, photos, video clips and so on are generated and shared every second, we are dealing with data on the order of exabytes or zettabytes. One study reportedly found that the amount of data generated in a single day after 2010 equals all the data generated in the world between the beginning of time and 2009.
Data sets are thus becoming more and more complex and too large to store easily, so analyzing them with traditional database technology is often not possible.
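To put the storage units used in this article in perspective, the short Python sketch below converts between them. It assumes decimal (SI) units, where each unit is 1,000 times the previous one; the function names are illustrative only.

```python
# Decimal (SI) storage units; each one is 1,000 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def to_bytes(value, unit):
    """Convert a value expressed in `unit` to bytes."""
    return value * 1000 ** UNITS.index(unit)


def convert(value, from_unit, to_unit):
    """Convert a value from one storage unit to another."""
    return to_bytes(value, from_unit) / to_bytes(1, to_unit)


one_exabyte_in_terabytes = convert(1, "EB", "TB")    # one million terabytes
half_exabyte_in_petabytes = convert(0.5, "EB", "PB")
print(one_exabyte_in_terabytes, half_exabyte_in_petabytes)
```

The first conversion reproduces the figure quoted earlier in the article: one exabyte corresponds to one million terabytes.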
Variety, in Big Data jargon, refers to the many different sources from which data is generated and to the types of data, which can be structured or unstructured. Earlier, the majority of data was structured data stored in relational databases, e.g. financial data and sales data.
But as technologies have evolved at a breakthrough pace in recent years, almost 80% of data is now unstructured, in the form of emails, photos, audio, videos, PDFs and social media updates.
With proper implementation of big data technology, we can now make efficient use of unstructured data and bring it together with structured data.
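As a rough illustration of bringing unstructured and structured data together, the sketch below links free-text emails to rows of a sales table. Everything in it is an assumption made for the example: the order records, the email texts, the order ID pattern and the complaint keywords are hypothetical, and a real system would use far more robust text processing.

```python
import re

# Structured data: rows as they might come from a relational sales table (hypothetical).
orders = {
    "A1001": {"customer": "Asha", "amount": 250.0},
    "A1002": {"customer": "Ravi", "amount": 120.0},
}

# Unstructured data: free-text customer emails (hypothetical).
emails = [
    "Hi, my order A1001 arrived damaged, please help.",
    "Thanks, order A1002 was delivered on time!",
    "General question about your store hours.",
]

ORDER_ID = re.compile(r"\b[A-Z]\d{4}\b")          # assumed order ID format
COMPLAINT_WORDS = ("damaged", "late", "refund")   # assumed complaint keywords


def link_emails_to_orders(emails, orders):
    """Attach each email that mentions a known order ID to that order's record."""
    linked = []
    for text in emails:
        match = ORDER_ID.search(text)
        if match is None or match.group() not in orders:
            continue  # no usable key, so this email cannot be joined
        order_id = match.group()
        linked.append({
            "order_id": order_id,
            **orders[order_id],
            "complaint": any(word in text.lower() for word in COMPLAINT_WORDS),
        })
    return linked


for row in link_emails_to_orders(emails, orders):
    print(row)
```

The key idea is that some structure (here, an order ID) is extracted from the unstructured text and then used as a join key against the structured records.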
Velocity refers to the rate at which new data is generated, a rate that advanced technologies have greatly increased. This data is not localized to any particular region; rather, it moves around the whole world at tremendous speed, and the flow is both massive in volume and continuous. For example, a social media post or a YouTube video can go viral in seconds.
Big Data offers the capability to analyze this real-time data and allows businesses and industries to make strategic decisions in real time.
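One simple way to reason about fast-moving data is to count events over a sliding time window, for instance to notice when a post starts to go viral. The sketch below is a minimal, single-machine illustration and not a description of any particular big data product. It assumes timestamps are in seconds, arrive in non-decreasing order, and that the window length is positive; the class name and the sample timestamps are hypothetical.

```python
from collections import deque


class SlidingWindowCounter:
    """Count the events seen during the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def add(self, timestamp):
        """Record one event and return how many events are still in the window."""
        self.timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps)


counter = SlidingWindowCounter(window_seconds=10)
share_times = [0, 1, 2, 11, 12, 13, 14]  # hypothetical share events, in seconds
counts = [counter.add(t) for t in share_times]
print(counts)
```

A decision rule such as "flag the post when the count in the window exceeds a threshold" can then be applied to each new count as it arrives.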
Veracity refers to the trustworthiness of data. The data being generated is often raw and unstructured, and the information gathered from various networking sites, business processes or automated machines differs vastly from one instance to the next, so it becomes difficult to control its quality and accuracy. It is therefore important to decide which data should be mined in order to draw meaningful insights for analyzing a problem.
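A basic veracity check can be sketched as a filter that keeps only records from trusted sources with plausible values. The source names, the record layout and the rule that a valid reading is a non-negative number are all assumptions made for this example.

```python
# Sources we choose to trust for this example (hypothetical names).
TRUSTED_SOURCES = {"billing_system", "sensor_gateway"}


def is_reliable(record):
    """Return True if the record comes from a trusted source and has a plausible value."""
    if record.get("source") not in TRUSTED_SOURCES:
        return False
    value = record.get("value")
    # Reject missing, non-numeric, boolean, NaN or negative readings.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return value >= 0


def split_by_veracity(records):
    """Split records into (kept, rejected) according to is_reliable."""
    kept, rejected = [], []
    for record in records:
        (kept if is_reliable(record) else rejected).append(record)
    return kept, rejected


records = [
    {"source": "billing_system", "value": 120.5},
    {"source": "anonymous_forum", "value": 99.0},   # untrusted source
    {"source": "sensor_gateway", "value": -4},      # implausible reading
    {"source": "sensor_gateway", "value": None},    # missing reading
]
kept, rejected = split_by_veracity(records)
print(len(kept), len(rejected))
```

In practice the rules would be specific to the business problem, but the principle is the same: decide explicitly which data is trustworthy enough to mine before analyzing it.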
Analyzing Big Data
How to benefit from Big Data?
In a digital economy it is very important to capture and analyze data so that businesses can improve their decision-making abilities and performance by focusing on results derived from Big Data analysis. In order to cultivate a process based on Big Data analysis, businesses must make some crucial changes, such as redefining workflows and establishing new guidelines for their employees. Analyzing Big Data requires four key components: the source of the data must be trustworthy; real-time monitoring and feedback should be given to analysts; results derived from analysis must be incorporated before moving on to the next process; and high-quality training must be provided to data scientists and analysts.
The rate at which data is being generated is increasing exponentially, and it is nearly impossible to analyze this data with traditional data analysis. Big Data has made it possible to look at data in terms of flows and processes and then make decisions accordingly.
As new technologies and tools for analyzing data come up in the Big Data field, the structure of information systems is changing, which involves sharing information, communicating results and garnering new insights for businesses. Thus, for an organization to benefit from big data, it is very important to learn how to use data and analysis to support its business processes and decisions.
Importance of Intuition in Big Data Analysis
Facts vs. Intuition: can one exist without the other?
To test the relevance of intuition in big data, let us take two examples from ancient texts that are based on observation and intuition and that parallel ideas later developed through modern scientific theory and experiment.
1: Factual data as proved by experiments: The oscillating universe theory, briefly considered by Albert Einstein in 1930, theorized a universe following an infinite, self-sustaining series of oscillations, each beginning with a big bang and ending with a big crunch; in the interim, the universe would expand for a period of time before the gravitational attraction of matter causes it to collapse back in and undergo a bounce.
Intuition: Turning to the ancient Indian view on this subject, the Mahabharata says (Adi-Parva, 1st Chapter, 40-41): "This beginningless and endless time cycle (Kal-Chakra) moves eternally like a perpetual flow in which beings take birth and die but there is never birth or death for this. The creation of gods is briefly indicated as one complete cycle of universe."
2: Factual data as proved by experiments: When we talk of gravity, Newton comes to mind.
Intuition: In the text Surya Siddhanta, dated to around 400 AD, Bhaskaracharya is said to have stated: "Objects fall on the earth due to one force. The Earth, planets, constellations, moon and sun are held in orbit because of that one force."
These examples suggest that results derived from facts and data, which are based wholly on analysis and scientific enquiry, can also be obtained from intuition, in which observation may also have played some part.
Thus we observe that similar results are obtained both by technological applications of knowledge and by laying special emphasis on intuition.
Summary and Future Prospects
The purpose of this article is to provide insights not only into big data and its technologies. It is also important to understand that big data is made up of a lot of little data; therefore, for a business to succeed, it must utilize this constellation of small data to make better decisions and increase its efficiency on a daily basis. With the proper implementation of big data technologies, a business can tap into a constant stream of innovation.
It is expected that by collecting and analyzing big data we will be able to forecast with greater certainty what will happen in the future based upon past trends, and that the world can be made a better place.
This article has been authored by Vishwadeep Mishra from IIFT Delhi