In a large-scale Hadoop cluster, a good task scheduling strategy is important for improving data locality, reducing network transmission overhead, shortening job execution time, and increasing job throughput. To address the low data locality of reduce tasks in the Hadoop architecture, this paper proposes a reduce task scheduling optimization algorithm based on the delay scheduling policy, which shortens job execution time and increases job throughput by improving the data locality of reduce tasks. In the shuffle early...