Hello guys: I have set up a 6-node MapR cluster, and I want to see whether it is really 3 times faster than Hadoop. But when I run the 1 TB terasort, there is only one reducer, so it takes me a whole day to complete my job. I am using Red Hat 5, and each node has an additional 7200 RPM 900 GB disk. I got the RPMs from http://package.mapr.com/redhat. Have you tested terasort? In which part is MapR 3 times faster than Hadoop?
Answer by MC Srivas · Jul 16, 2011 at 10:10 PM
The point at which reducers start relative to mappers is controlled by mapred.reduce.slowstart.completed.maps, which is a number between 0 and 1. At 0, the reducers start when the mappers start. At 0.75, reducers start when 75% of the mappers have finished. In stock Hadoop, it is set to 0.2, while MapR sets it to 0.95. You can find the setting in mapred-site.xml, and change the behavior via the command line as follows:
hadoop jar ... -Dmapred.reduce.slowstart.completed.maps=0.95 ...
In particular, the command-line switches need to come before the input and output dirs are specified, e.g.,
hadoop jar examples.jar terasort -Dmapred.reduce.tasks=30 /teragen_input_dir /teragen_output_dir
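To make the slowstart threshold described above concrete, here is a rough sketch that estimates how many map tasks must finish before reducers are scheduled. It assumes the threshold is the fraction times the number of maps, rounded up to a whole task; that rounding is an assumption, not something stated in this thread, and the function name is made up for illustration. The map count of 3726 is the one reported by the job later in this thread.

```shell
#!/bin/sh
# Hypothetical helper: number of completed maps needed before reducers start.
# Assumes threshold = ceil(fraction * total_maps); the rounding is an assumption.
slowstart_threshold() {
    awk -v f="$1" -v n="$2" 'BEGIN {
        x = f * n
        t = int(x)          # truncate toward zero
        if (t < x) t++      # round up when there is a fractional part
        print t
    }'
}

# The two settings mentioned above, applied to a job with 3726 map tasks.
slowstart_threshold 0.2 3726
slowstart_threshold 0.95 3726
```

With the higher setting, reducers do not begin copying map output until nearly all maps are done, which matches the behavior the question describes on MapR.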
BTW, the terasort benchmark as run by Yahoo! sets output replication to 1. In order to do that with MapR, you need to make /teragen_output_dir into its own volume, and set the volume's replication factor to 1.
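A possible way to set that up is sketched below as a dry run: the script only prints the command so it can be reviewed first. The volume name is hypothetical, the path is the example directory used above, and the `-name`, `-path` and `-replication` options are what `maprcli volume create` is believed to accept; please check the help output of your MapR release before running it.

```shell
#!/bin/sh
# Dry-run sketch: prints the maprcli command instead of executing it.
VOLUME_NAME="terasort_out"          # hypothetical volume name
MOUNT_PATH="/teragen_output_dir"    # example output dir from this thread

# Builds the volume-creation command with a replication factor of 1.
# The option names are an assumption; verify them against your release.
make_volume_cmd() {
    echo "maprcli volume create -name $1 -path $2 -replication 1"
}

make_volume_cmd "$VOLUME_NAME" "$MOUNT_PATH"
```

The volume would need to exist before terasort writes to that path, so that the output inherits the volume's replication factor.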
As far as I know, the only new thing in MapR's hadoop-0.20.2-dev-examples.jar is the additional file system type called "maprfs" that's recognized. Would be happy to send you the source code, but to where?
Answer by MC Srivas · Jul 16, 2011 at 08:46 PM
The number of reducers is specified on the command line. By default it is set to 1. Take a look at the JobTracker's web page when the system is idle; it lists the total number of reduce slots that are available across the cluster. Then run terasort with the following option on the command line:
bin/hadoop jar ... -Dmapred.reduce.tasks=<num reduce tasks>
replacing <num reduce tasks> with the total number of reduce slots in the cluster.
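As a small illustration of the arithmetic, the sketch below derives the option value from a per-node slot count. The node count of 6 comes from the question; the number of reduce slots per node is a hypothetical value, so read the real total from the JobTracker web page rather than relying on this estimate.

```shell
#!/bin/sh
NODES=6                     # from the question: 6 nodes
REDUCE_SLOTS_PER_NODE=5     # hypothetical; check the JobTracker web page

# Total reduce slots = nodes * reduce slots per node.
total_reduce_slots() {
    echo $(( $1 * $2 ))
}

TOTAL=$(total_reduce_slots "$NODES" "$REDUCE_SLOTS_PER_NODE")

# Print the option to paste into the terasort command line.
echo "-Dmapred.reduce.tasks=${TOTAL}"
```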
Answer by Ted Dunning · Jul 16, 2011 at 08:51 PM
Yes. Terasort is part of our standard testing. You can see some results in Srivas' slides from the Hadoop Summit.
Do you have a list of the commands that you used to run teragen and terasort?
Try using the -Dmapred.reduce.tasks= option to use more reducers. The best number to use depends on the size of the data you are sorting.
The basic issue here is that Hadoop's default is to use only one reducer and we preserve that behavior to stay compatible.
Answer by xxqonline · Jul 16, 2011 at 08:55 PM
I am using -Dmapred.reduce.tasks=30 for testing now. I have another question: in standard Hadoop, the reducers start as soon as the mappers have produced some output, but in MapR, the reducers start only after all the mappers have finished. What is the reason for this?
Answer by xxqonline · Jul 16, 2011 at 09:14 PM
Is there source code for the MapR Hadoop examples (hadoop-0.20.2-dev-examples.jar)? I see it is 187526 bytes, but the standard hadoop-0.20.2-example.jar is 142466 bytes. And when running standard Hadoop, I do see the reducers start before the mappers finish. I think you guys changed Hadoop, because when I run the standard hadoop-0.20.2-example.jar on MapR, it does not start the reducers until the mappers finish. The reason I use the standard Hadoop example is fair play: I need to ensure the test app is the same, else the result is useless for comparison. My commands are:
1. Generate 1 TB of data with teragen in /teragen_1TB
2. bin/hadoop jar hadoop-0.20.2-example.jar terasort /teragen_1TB /terasort_1TB_formal -Dmapred.reduce.tasks=30
But there is still only 1 reducer.
Do I have to use your hadoop-0.20.2-dev-examples.jar?
Answer by xxqonline · Jul 17, 2011 at 12:35 AM
Did you really measure it to be 3 times faster than Hadoop? My Hadoop can finish in 2 hours. But with -Dmapred.reduce.slowstart.completed.maps=0.95 enabled, the mappers become very slow after they reach 95%, and the job takes more than 2 hours. I have now removed -Dmapred.reduce.slowstart.completed.maps=0.95.
The only improvement is the DFS I/O; writing really improved a lot. But for terasort, what is your test result?
Answer by xxqonline · Jul 17, 2011 at 03:07 AM
The job hangs there. The reducer copy thread has been hanging for 1 hour:
task_201107151459_0022_r_000000 33.07% reduce > copy (3697 of 3726 at 8.59 MB/s) >
There is also an exception:
11/07/17 03:41:04 INFO mapred.JobClient: map 95% reduce 0%
11/07/17 03:41:22 INFO mapred.JobClient: Task Id : attempt_201107151459_0022_m_002799_0, Status : FAILED on node db06b12
java.lang.Throwable: Child Error
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:261)
Caused by: java.io.IOException: Task process exit with nonzero status of 137.
    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:248)