Privacy in Map Reduce Based Systems: A Review
Rosmy C Jose, Shaiju Paul
Journal Title: International Journal of Computer Science and Mobile Computing - IJCSMC
Today, every organisation generates huge amounts of data and adds it to the cloud. Data at a scale that traditional database and search tools cannot effectively capture, process and analyse is called Big Data. Processing big data is made possible by MapReduce, a programming model and associated implementation introduced by Google. MapReduce processes data that are located at different data nodes: it pushes computation to where the data resides rather than moving the data to the computation. As a result, the MapReduce framework or the source code may leak sensitive data during the computation. In the current implementation (Airavat), the Mapper code is written by the user and the Reducer code is selected from a list provided by the system. Restricting the code to what the system itself provides lowers usability. Therefore, in the proposed system both the Map and Reduce code can be written by the user, so usability is high. A computation system ensures that privacy leaks through storage channels (network connections, files) and through the output of the computation are stopped. SELinux is used to prevent storage channel leaks. Leaks through the output of the computation are checked using differential privacy mechanisms.
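To make the last point concrete, the following is a minimal sketch of how a differential privacy mechanism can limit leaks through the output of a MapReduce-style computation. It is an illustration under stated assumptions, not the system described in this paper: the function names (`clamp`, `laplace_noise`, `private_sum`), the declared output range `lo`/`hi`, and the privacy parameter `epsilon` are all hypothetical. It assumes each record yields exactly one mapper value, that the aggregation is a sum, and it uses the standard Laplace mechanism with noise scaled to sensitivity divided by epsilon.

```python
import random


def clamp(value, lo, hi):
    """Force a mapper output into the declared range [lo, hi]."""
    return max(lo, min(hi, value))


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_sum(records, mapper, lo, hi, epsilon, rng=None):
    """Sum mapper outputs and add Laplace noise (illustrative only).

    Assumption: `mapper` is untrusted code that returns one number per
    record. Its output is clamped to [lo, hi] so that a single record
    can change the sum by at most max(|lo|, |hi|).
    """
    rng = rng or random.Random()
    clamped = [clamp(mapper(record), lo, hi) for record in records]
    sensitivity = max(abs(lo), abs(hi))
    return sum(clamped) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 29, 52]
    # Smaller epsilon means more noise and stronger privacy.
    print(private_sum(ages, lambda age: age, lo=0, hi=100, epsilon=0.5))
```

Clamping is what bounds the influence of any single record: even if the user-written mapper tries to encode a sensitive value as a very large output, the contribution to the sum stays inside the declared range, and the added noise masks it.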