Explain "Shuffle & Sort" phase and "Reducer Phase" in Map Reduce.
1 Answer


  • In the map phase, the task tracker performs the computation on local data and generates output.

  • This output is called the intermediate result and is stored on temporary local storage.

  • After the map phase is over, all the intermediate values for a given intermediate key are combined into a list.

  • The list is given to a reducer.

  • There may be a single reducer or multiple reducers.

  • All values associated with a particular intermediate key are guaranteed to go to the same reducer.

  • The intermediate keys, and their value lists, are passed to the reducer in sorted key order.

  • This step is known as 'shuffle and sort'.

  • The reducer outputs zero or more final key/value pairs.

  • These are written to HDFS.

  • The reducer usually emits a single key/value pair for each input key.

  • The job tracker starts a reduce task on any one of the nodes and instructs it to fetch the intermediate data from the completed map tasks.

  • The reducer performs the final computation and the output is written to HDFS.

  • The client reads the output from the file and the job completes.
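
The flow above can be sketched in plain Python as a rough illustration. This is not Hadoop code: the word-count job, the function names (`map_phase`, `partition`, `shuffle_and_sort`, `reduce_phase`, `run_job`) and the CRC32-based partitioner are all assumptions made for the example, and everything runs in memory instead of on task trackers and HDFS.

```python
from collections import defaultdict
from zlib import crc32


def map_phase(line):
    """Map: emit one intermediate (word, 1) pair per word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def partition(key, num_reducers):
    """Choose a reducer for a key (hypothetical partitioner).

    A deterministic hash means every value for a given key
    lands on the same reducer.
    """
    return crc32(key.encode("utf-8")) % num_reducers


def shuffle_and_sort(map_outputs, num_reducers):
    """Shuffle and sort: group values by key per reducer, then sort by key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for pairs in map_outputs:
        for key, value in pairs:
            buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer receives its (key, [values]) entries in sorted key order.
    return [sorted(bucket.items()) for bucket in buckets]


def reduce_phase(key, values):
    """Reduce: emit a single final (key, total) pair for each input key."""
    return (key, sum(values))


def run_job(lines, num_reducers=2):
    """Run map, shuffle and sort, then reduce; return one output list per reducer."""
    map_outputs = [map_phase(line) for line in lines]
    reducer_inputs = shuffle_and_sort(map_outputs, num_reducers)
    return [[reduce_phase(k, vs) for k, vs in part] for part in reducer_inputs]


if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog", "the fox"]
    for i, part in enumerate(run_job(lines)):
        print("reducer", i, part)
```

Each inner list returned by `run_job` stands in for the output file one reducer would write. Which reducer a key ends up on depends on the hash, but a given key always appears in exactly one reducer's output, and keys within each output are in sorted order.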
