Hadoop is an open-source, Java-based framework used to process large datasets, which can be both structured and unstructured. It is used to store and process large volumes of continuously flowing data, commonly known as big data, in a distributed computing environment. Storage is handled by HDFS (Hadoop Distributed File System).
The Hadoop ecosystem consists of the Hadoop kernel, HDFS, MapReduce, and other components such as Pig, Hive, and HBase. HDFS is a distributed file system used to store data (generally big data) across a cluster of machines. MapReduce is a programming model that processes large datasets in parallel: a map phase transforms input records into intermediate key-value pairs, and a reduce phase aggregates the values for each key.
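The map-then-reduce pattern can be illustrated with a small in-memory word count in plain Java. This is a sketch of the concept only, not the actual Hadoop API: the class and method names are illustrative, and a real Hadoop job would instead extend the framework's Mapper and Reducer classes and run across the cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory illustration of the MapReduce word-count pattern.
// In real Hadoop, the map and reduce phases run in parallel on
// many nodes, with the framework shuffling pairs between them.
public class WordCountSketch {

    // Map phase: split a line of text and emit a (word, 1) pair per word.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Shuffle + reduce phase: group pairs by key and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = List.of("big data is big",
                                     "hadoop processes big data");
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : input) {
            pairs.addAll(map(line));
        }
        // Prints the word counts aggregated over all input lines.
        System.out.println(reduce(pairs));
    }
}
```

The key property this sketch shows is that each map call depends only on its own input line, which is what lets Hadoop distribute the map phase across many machines before aggregating results in the reduce phase.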
Hadoop has become one of the most popular technologies for analysing big data. With the growing volume and velocity of data in many different varieties, Hadoop has become a key technology for many organizations. Along with its high computing power, Hadoop is also low-cost and scalable, which makes it a suitable solution for big data analysis.