ADDRESSING BIG DATA WITH HADOOP
TWINKLE ANTONY, SHAIJU PAUL
Journal Title: International Journal of Computer Science and Mobile Computing - IJCSMC
Nowadays, a large volume of data is produced by various sources such as social media networks, sensor devices, and other information-serving devices. This large collection of unstructured and semi-structured data is called big data. Conventional databases and data warehouses cannot process this data, so new data processing tools are needed. Hadoop addresses this need. Hadoop is an open source platform that provides distributed computing over big data. It is composed of two components: a storage model called the Hadoop Distributed File System (HDFS) and a computing model called MapReduce. MapReduce is a programming model for handling large, complex tasks in two steps called map and reduce. In the map stage, the master node partitions the problem into sub-problems and distributes them to worker nodes. The worker nodes solve their sub-problems and pass the results back to the master node. In the reduce phase, the master node combines the answers to the sub-problems into a final solution.
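The map and reduce steps described above can be illustrated with a small word-count sketch. This is not Hadoop's API: it is a single-process Python illustration in which a thread pool stands in for the worker nodes, and the function names (split_input, map_words, reduce_counts, word_count) are hypothetical names chosen for this example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_input(lines, num_workers):
    """Master step: partition the input into one sub-problem per worker."""
    return [lines[i::num_workers] for i in range(num_workers)]


def map_words(chunk):
    """Worker step: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def reduce_counts(mapped_results):
    """Reduce step: combine the workers' pairs into one total per word."""
    totals = defaultdict(int)
    for pairs in mapped_results:
        for word, count in pairs:
            totals[word] += count
    return dict(totals)


def word_count(lines, num_workers=2):
    # Assumption: threads simulate worker nodes; a real cluster would
    # run the map step on separate machines.
    chunks = split_input(lines, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        mapped = list(pool.map(map_words, chunks))
    return reduce_counts(mapped)


if __name__ == "__main__":
    print(word_count(["big data big", "data hadoop"]))
    # prints {'big': 2, 'data': 2, 'hadoop': 1}
```

Here the master splits two input lines between two workers, each worker maps its lines to word pairs, and the reduce step sums the pairs into the final counts.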