The graph above shows how a MapReduce job runs in Hadoop.
(1) An application starts a master instance and worker instances for the Map phase and, later, worker instances for the Reduce phase.
(2) The master partitions the input data into segments.
(3) Each Map instance reads its input data segment and processes the data. During processing, if any Map instance fails, the master reassigns its work to other workers, resuming from the latest periodic checkpoint.
(4) The results of the processing are stored on the local disks of the servers where the Map instances run. After the Map workers finish, the locations of this data are returned to the master so that the master can distribute the data among the Reduce workers.
(5) When all Map instances have finished processing their data, the Reduce instances read the results of the first phase and merge the partial results.
(6) The final results are written by the Reduce instances to a shared storage server.
(7) The master instance monitors the Reduce instances and, when all of them report task completion, the application terminates.
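
The steps above can be sketched as a small single-process word count in Python. This is only an illustration of the data flow, not the Hadoop API: the function names (partition_input, map_task, shuffle, reduce_task, run_job) are hypothetical, everything runs sequentially in one process, and failure handling and checkpoints are left out.

```python
from collections import defaultdict


def partition_input(records, num_segments):
    """Step 2: the master splits the input into segments."""
    return [records[i::num_segments] for i in range(num_segments)]


def map_task(segment):
    """Step 3: a Map instance emits (word, 1) pairs for its segment."""
    return [(word.lower(), 1) for line in segment for word in line.split()]


def shuffle(map_outputs, num_reducers):
    """Steps 4-5: route all values for a given key to one Reduce instance."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:
        for key, value in output:
            # Summing character codes gives a deterministic bucket choice
            # (an assumption for this sketch, not Hadoop's partitioner).
            index = sum(map(ord, key)) % num_reducers
            buckets[index][key].append(value)
    return buckets


def reduce_task(bucket):
    """Step 5: a Reduce instance merges the partial results per key."""
    return {key: sum(values) for key, values in bucket.items()}


def run_job(records, num_mappers=2, num_reducers=2):
    """Plays the role of the master: partition, map, shuffle, reduce."""
    segments = partition_input(records, num_mappers)
    # In a real cluster these map tasks would run in parallel on workers.
    map_outputs = [map_task(segment) for segment in segments]
    buckets = shuffle(map_outputs, num_reducers)
    final = {}
    for bucket in buckets:
        # Step 6: each reducer contributes its part of the final result.
        final.update(reduce_task(bucket))
    return final


if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog", "the fox"]
    print(run_job(lines))
```

Because every occurrence of a key lands in the same bucket, each reducer can merge its keys independently, which is why the Reduce phase can only start once all Map output is available.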