|
(I manged to run TaskTracker properly as in http://answers.mapr.com/questions/1728/jobtracker-does-not-start-any-tasks and found another issue then.) I installed M3 version and tried the example wordcount job. It seems working fine now but tasktracker logs some wired errors(?) everytime like the following. 2012-02-23 18:17:58,298 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 56800: readAndProcess threw exception java.io.IOException: Connection reset by peer. Count of bytes read: 0
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
at sun.nio.ch.IOUtil.read(IOUtil.java:175)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at org.apache.hadoop.ipc.Server.channelRead(Server.java:1653)
at org.apache.hadoop.ipc.Server.access$2600(Server.java:93)
at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1024)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:480)
at org.apache.hadoop.ipc.Server$Listener.run(Server.java:389)
This is regarded as INFO. can be ignored ? |
|
Those are info messages and harmless. TaskTracker sends a kill signal to the jvm after it has completed the task. At the same time, there may be a local rpc (ping) from jvm to tasktracker. So by the time rpc thread inside tasktracker replies to this local rpc, jvm has died hence you will see this exception even if task is completed successfully. You can ignore these messages. OK, Thank you for the comments. Is it a MapR specific issue ? (I think MapR does not modify TaskTracker code at all.) I haven't seen the log before in normal hadoop.
(24 Feb '12, 17:02)
feeblefakie
|
|
This is the complete log. I can't tell if it completed successfully, but the results are written to the output file. (haven't checked if its the correct answer.) 2012-02-23 18:17:46,067 INFO org.apache.hadoop.mapred.TaskTracker: LaunchTaskAction (registerTask): attempt_201202231806_0002_m_000002_0 task's state:UNASSIGNED 2012-02-23 18:17:46,068 INFO org.apache.hadoop.mapred.TaskTracker: Trying to launch : attempt_201202231806_0002_m_000002_0 which needs 1 slots 2012-02-23 18:17:46,068 INFO org.apache.hadoop.mapred.TaskTracker: In TaskLauncher, current free slots : 12 and trying to launch attempt_201202231806_0002_m_000002_0 which needs 1 slots 2012-02-23 18:17:46,950 INFO org.apache.hadoop.mapred.JvmManager: In JvmRunner constructed JVM ID: jvm_201202231806_0002_m_92336420 2012-02-23 18:17:46,950 INFO org.apache.hadoop.mapred.JvmManager: JVM Runner jvm_201202231806_0002_m_92336420 spawned. 2012-02-23 18:17:46,951 INFO org.apache.hadoop.mapred.TaskLog: MapR: Setup Cmds: export JVM_PID= |
|
"Received 'KillJobAction' for job: job_201202231806_0002 2012-02-23 18:17:57,929 INFO org.apache.hadoop.mapred.UserLogCleaner: " Did you kill the jobs ? No i didn't.
(23 Feb '12, 02:53)
feeblefakie
|
|
JobTracker log seems OK. 2012-02-23 20:05:35,760 INFO org.apache.hadoop.mapred.PoolManager: Adding small job on express lane job_201202231957_0002 2012-02-23 20:05:35,761 INFO org.apache.hadoop.mapred.JobTracker: Job job_201202231957_0002 added successfully for user 'hiroyuki' to queue 'default'2012-02-23 20:05:35,761 INFO org.apache.hadoop.mapred.AuditLogger: USER=hiroyuki IP=192.168.110.1 OPERATION=SUBMIT_JOB TARGET=job_201202231957_0002 RESULT=SUCCESS 2012-02-23 20:05:35,762 INFO org.apache.hadoop.mapred.JobTracker: Initializing job_201202231957_00022012-02-23 20:05:35,762 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201202231957_0002 2012-02-23 20:05:35,800 INFO org.apache.hadoop.mapred.JobInProgress: jobToken generated and stored with users keys in /var/mapr/cluster/mapred/jobTracker/system/job_201202231957_0002/jobToken 2012-02-23 20:05:35,807 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_201202231957_0002 = 52520. Number of splits = 1 2012-02-23 20:05:35,807 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201202231957_0002_m_000000 has split on node:/default-rack/myhost 2012-02-23 20:05:35,807 INFO org.apache.hadoop.mapred.JobInProgress: job_201202231957_0002 LOCALITY_WAIT_FACTOR=1.0 2012-02-23 20:05:35,807 INFO org.apache.hadoop.mapred.JobInProgress: Job job_201202231957_0002 initialized successfully with 1 map tasks and 1 reduce tasks. 2012-02-23 20:05:35,934 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201202231957_0002_m_000002_0' to tip task_201202231957_0002_m_000002, for tracker 'tracker_myhost:myhost/127.0.0.1:52543' 2012-02-23 20:05:36,241 INFO org.apache.hadoop.mapred.FairScheduler: Moving job from express lane job_201202231957_0002 2012-02-23 20:05:38,372 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201202231957_0002_m_000002_0' has completed task_201202231957_0002_m_000002 successfully. 2012-02-23 20:05:39,589 INFO org.apache.hadoop.mapred.JobTracker: Adding task (MAP) 'attempt_201202231957_0002_m_000000_0' to tip task_201202231957_0002_m_000000, for tracker 'tracker_myhost:myhost/127.0.0.1:52543' 2012-02-23 20:05:39,590 INFO org.apache.hadoop.mapred.JobInProgress: Choosing data-local task task_201202231957_0002_m_000000 2012-02-23 20:05:41,415 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201202231957_0002_m_000000_0' has completed task_201202231957_0002_m_000000 successfully. 2012-02-23 20:05:42,024 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201202231957_0002_r_000000_0' to tip task_201202231957_0002_r_000000, for tracker 'tracker_myhost:myhost/127.0.0.1:52543' 2012-02-23 20:05:46,589 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201202231957_0002_r_000000_0' has completed task_201202231957_0002_r_000000 successfully. 2012-02-23 20:05:46,591 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_CLEANUP) 'attempt_201202231957_0002_m_000001_0' to tip task_201202231957_0002_m_000001, for tracker 'tracker_myhost:myhost/127.0.0.1:52543' 2012-02-23 20:05:47,809 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201202231957_0002_m_000001_0' has completed task_201202231957_0002_m_000001 successfully.2012-02-23 20:05:47,810 INFO org.apache.hadoop.mapred.JobInProgress: Job job_201202231957_0002 has completed successfully.2012-02-23 20:05:47,810 INFO org.apache.hadoop.mapred.JobInProgress$JobSummary: jobId=job_201202231957_0002,submitTime=1329995135701,launchTime=1329995135807,firstMapTaskLaunchTime=1329995139588,firstReduceTaskLaunchTime=1329995142024,firstJobSetupTaskLaunchTime=1329995135934,firstJobCleanupTaskLaunchTime=1329995146591,finishTime=1329995147810,numMaps=1,numSlotsPerMap=1,numReduces=1,numSlotsPerReduce=1,user=hiroyuki,queue=default,status=SUCCEEDED,mapSlotSeconds=3,reduceSlotsSeconds=4,clusterMapCapacity=18,clusterReduceCapacity=8 2012-02-23 20:05:47,813 INFO org.apache.hadoop.mapred.JobHistory: Moving file:/opt/mapr/hadoop/hadoop-0.20.2/logs/history/myhost_1329994674210_job_201202231957_0002_hiroyuki_word+count to maprfs://192.168.110.1:7222/var/mapr/cluster/mapred/jobTracker/history/done 2012-02-23 20:05:47,814 INFO org.apache.hadoop.mapred.JobInProgress: Cleaning up staging dir maprfs://192.168.110.1:7222/var/mapr/cluster/mapred/jobTracker/staging/hiroyuki/.staging/job_201202231957_0002 2012-02-23 20:05:47,815 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201202231957_0002_m_000000_0' 2012-02-23 20:05:47,815 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201202231957_0002_m_000001_0' 2012-02-23 20:05:47,815 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201202231957_0002_m_000002_0'2012-02-23 20:05:47,815 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201202231957_0002_r_000000_0' 2012-02-23 20:05:47,821 INFO org.apache.hadoop.mapred.JobHistory: Moving file:/opt/mapr/hadoop/hadoop-0.20.2/logs/history/myhost_1329994674210_job_201202231957_0002_conf.xml to maprfs://192.168.110.1:7222/var/mapr/cluster/mapred/jobTracker/history/done2012-02-23 20:05:47,828 INFO org.apache.hadoop.mapred.JobHistory: Copying jobhistory at /myvolume/out/_logs/history/myhost_1329994674210_job_201202231957_0002_hiroyuki_word+count Also, there are no error messages in the logs under userlogs. only tasktracker gives some error messages. |
|
I'd suggest looking into the stderr and stdout files inside the tasktracker attempts to check what went wrong. If its a failed task, the stderr might contain some exception or similar. If its successful, most likely you would see an empty stderr. That could point you into some direction. |
|
They are all empty. Sorry for late. Can you try running the jobs again ?
(23 Feb '12, 04:58)
Nabeel ♦♦
I have run several times and they were all the same results.
(23 Feb '12, 05:07)
feeblefakie
What is the final completion message for the job ?
(23 Feb '12, 05:18)
Nabeel ♦♦
|
|
You mean this ? $ hadoop jar /opt/mapr/hadoop/hadoop-0.20.2/hadoop-0.20.2-dev-examples.jar wordcount /myvolume/in /myvolume/out 12/02/23 20:05:35 INFO fs.JobTrackerWatcher: Current running JobTracker is: zushi.tkl.iis.u-tokyo.ac.jp/192.168.110.1:9001 12/02/23 20:05:35 INFO input.FileInputFormat: Total input paths to process : 1 12/02/23 20:05:35 WARN snappy.LoadSnappy: Snappy native library is not available 12/02/23 20:05:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library 12/02/23 20:05:35 WARN snappy.LoadSnappy: Snappy native library not loaded 12/02/23 20:05:35 INFO mapred.JobClient: Running job: job_201202231957_0002 12/02/23 20:05:36 INFO mapred.JobClient: map 0% reduce 0% 12/02/23 20:05:42 INFO mapred.JobClient: map 100% reduce 0% 12/02/23 20:05:47 INFO mapred.JobClient: map 100% reduce 100% 12/02/23 20:05:47 INFO mapred.JobClient: Job complete: job_201202231957_0002 12/02/23 20:05:47 INFO mapred.JobClient: Counters: 25 12/02/23 20:05:47 INFO mapred.JobClient: Job Counters 12/02/23 20:05:47 INFO mapred.JobClient: Launched reduce tasks=1 12/02/23 20:05:47 INFO mapred.JobClient: Aggregate execution time of mappers(ms)=3779 12/02/23 20:05:47 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0 12/02/23 20:05:47 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0 12/02/23 20:05:47 INFO mapred.JobClient: Launched map tasks=1 12/02/23 20:05:47 INFO mapred.JobClient: Data-local map tasks=1 12/02/23 20:05:47 INFO mapred.JobClient: Aggregate execution time of reducers(ms)=4347 12/02/23 20:05:47 INFO mapred.JobClient: FileSystemCounters 12/02/23 20:05:47 INFO mapred.JobClient: MAPRFS_BYTES_READ=83893 12/02/23 20:05:47 INFO mapred.JobClient: MAPRFS_BYTES_WRITTEN=44704 12/02/23 20:05:47 INFO mapred.JobClient: FILE_BYTES_WRITTEN=91981 12/02/23 20:05:47 INFO mapred.JobClient: Map-Reduce Framework 12/02/23 20:05:47 INFO mapred.JobClient: Map input records=278 12/02/23 20:05:47 INFO mapred.JobClient: Reduce shuffle bytes=0 12/02/23 20:05:47 INFO mapred.JobClient: Spilled Records=1120 12/02/23 20:05:47 INFO mapred.JobClient: Map output bytes=65941 12/02/23 20:05:47 INFO mapred.JobClient: CPU_MILLISECONDS=3000 12/02/23 20:05:47 INFO mapred.JobClient: Combine input records=3420 12/02/23 20:05:47 INFO mapred.JobClient: SPLIT_RAW_BYTES=115 12/02/23 20:05:47 INFO mapred.JobClient: Reduce input records=560 12/02/23 20:05:47 INFO mapred.JobClient: Reduce input groups=560 12/02/23 20:05:47 INFO mapred.JobClient: Combine output records=560 12/02/23 20:05:47 INFO mapred.JobClient: PHYSICAL_MEMORY_BYTES=448618496 12/02/23 20:05:47 INFO mapred.JobClient: Reduce output records=560 12/02/23 20:05:47 INFO mapred.JobClient: VIRTUAL_MEMORY_BYTES=2772979712 12/02/23 20:05:47 INFO mapred.JobClient: Map output records=3420 12/02/23 20:05:47 INFO mapred.JobClient: GC time elapsed (ms)=191 Nope, In the jobtracker log, do you see anything like INFO org.apache.hadoop.mapred.JobInProgress: Job job_201202231957_0002 has completed successfully
(23 Feb '12, 05:36)
Nabeel ♦♦
Please see the above log. There is a "Job job_201202231957_0002 has completed successfully" message.
(23 Feb '12, 05:41)
feeblefakie
Oh sorry..missed that the jobs in the two logs were different. Looks like this went fine. Do you find any issue in the output ?
(23 Feb '12, 05:54)
Nabeel ♦♦
I tested with a very small file which i can count words, and made sure the output is correct.
(23 Feb '12, 06:07)
feeblefakie
|