Hi,

After installing MapR on a three-node cluster, the services on one of the slave nodes always show as NOT CONFIGURED. This happens after every installation, and that one node is always marked critical. The MFS log shows that the MFS server was shut down, and it complains that the storage pool is offline. The createsystemvolumes log has the errors below:

FileServer hadoop6.eng.narus.com:5660 has not heartbeated with CLDB for 1329351197
2012-02-16 00:13:38.417 hadoop6 createsystemvolumes.sh(21633) Install CreateLocalVolumeDirectories:178 CreateLocalVolume: Retrying after 20 seconds. RetryCnt: 12
Do you have warden running on the node in question?
Yes, the warden is running on the node.
(15 Feb, 18:26)
ghousia
Could you paste warden.log from that node here?
(15 Feb, 18:33)
yufeldman ♦
Warden logs at the failed node:
Header: hostName: hadoop6.eng.narus.com, Time Zone: Greenwich Mean Time, processName: warden, processId: 25542, MapR Build Version: 1.2.0.12140GA
2012-02-16 03:39:24,274 INFO com.mapr.warden.WardenMain [main]: Log dir: /opt/mapr/hadoop/hadoop-0.20.2/logs
2012-02-16 03:39:24,321 INFO com.mapr.warden.WardenManager [main-EventThread]: Process path: null. Event state: SyncConnected. Event type: None
2012-02-16 03:39:24,321 INFO com.mapr.warden.WardenManager [main]: Connected to ZK: 172.31.2.204:5181,172.31.2.205:5181,172.31.2.206:5181
2012-02-16 03:39:24,338 INFO com.mapr.warden.WardenServer [main]: Connected to ZK: 172.31.2.204:5181,172.31.2.205:5181,172.31.2.206:5181
2012-02-16 03:39:24,339 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: null. Event state: SyncConnected. Event type: None
2012-02-16 03:39:24,370 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: /servers. Event state: SyncConnected. Event type: NodeChildrenChanged
2012-02-16 03:39:24,373 ERROR com.mapr.warden.WardenManager createConfigList [main]: Error while parsing out service and count: [Ljava.lang.String;@5b976011. Continue with next service parsing
2012-02-16 03:39:24,374 INFO com.mapr.warden.WardenManager [main]: Configured services: []
2012-02-16 03:39:24,374 INFO com.mapr.warden.WardenManager [main]: warden.conf content: {service.command.hbregion.start=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh start regionserver, service.command.nfs.monitorcommand=/etc/init.d/mapr-nfsserver status, nodes.mincount=1, service.command.hbmaster.heapsize.min=64, service.command.hbmaster.heapsize.percent=4, service.command.hbregion.heapsize.percent=25, service.command.cldb.stop=/etc/init.d/mapr-cldb stop, service.command.jt.heapsize.max=5000, service.command.warden.heapsize.percent=1, hoststats.port=5660, service.command.hbregion.heapsize.max=4000, service.command.cldb.monitorcommand=/etc/init.d/mapr-cldb status, service.command.mfs.type=BACKGROUND, kvstore.port=5660, service.command.hbregion.monitorcommand=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh status regionserver, hbmaster.port=60000, service.command.hbmaster.monitorcommand=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh status master, service.command.webserver.start=/opt/mapr/adminuiapp/webserver start, service.command.tt.heapsize.max=325, mfs.port=5660, service.command.jt.type=BACKGROUND, service.command.hbmaster.start=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh start master, service.command.hbmaster.heapsize.max=512, zookeeper.servers=172.31.2.204:5181,172.31.2.205:5181,172.31.2.206:5181, service.command.kvstore.monitor=server/mfs, service.command.jt.heapsize.percent=10, services.retries=3, service.command.hbregion.type=BACKGROUND, service.command.cldb.heapsize.min=256, service.command.mfs.start=/etc/init.d/mapr-mfs start, service.command.cldb.monitor=com.mapr.fs.cldb.CLDB, service.command.jt.monitor=org.apache.hadoop.mapred.JobTracker, service.command.cldb.start=/etc/init.d/mapr-cldb start, service.command.hbregion.monitor=org.apache.hadoop.hbase.regionserver.HRegionServer start, service.command.mfs.stop=/etc/init.d/mapr-mfs stop, service.command.hbmaster.monitor=org.apache.hadoop.hbase.master.HMaster start, service.command.hbmaster.type=BACKGROUND, service.command.tt.type=BACKGROUND, service.command.jt.stop=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh stop jobtracker, service.command.mfs.monitor=server/mfs, service.command.warden.heapsize.min=64, service.command.webserver.heapsize.min=512, mapr.home.dir=/opt/mapr, service.command.jt.monitorcommand=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh status jobtracker, service.command.tt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start tasktracker, nodeTotalMemory=24678244, service.command.webserver.type=BACKGROUND, service.command.hbregion.stop=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh stop regionserver, service.command.cldb.heapsize.max=4000, service.command.kvstore.type=BACKGROUND, service.command.nfs.type=BACKGROUND, service.command.hoststats.start=/etc/init.d/mapr-hoststats start, service.command.zk.heapsize.min=256, service.command.os.heapsize.min=256, service.command.tt.monitor=org.apache.hadoop.mapred.TaskTracker, service.command.tt.stop=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh stop tasktracker, service.command.hbmaster.stop=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh stop master, service.command.webserver.heapsize.percent=3, service.command.tt.heapsize.percent=2, currentHostID=2b30274fdff08b81, service.command.warden.heapsize.max=750, service.command.webserver.heapsize.max=750, jt.http.port=50030, service.command.nfs.heapsize.min=64, service.command.zk.heapsize.percent=1, service.command.hoststats.type=BACKGROUND, service.command.webserver.stop=/opt/mapr/adminuiapp/webserver stop, service.command.kvstore.monitorcommand=/etc/init.d/mapr-mfs status, service.command.nfs.heapsize.percent=3, jt.port=9001, service.command.mfs.heapsize.min=512, service.command.mfs.heapsize.percent=20, services=, service.command.kvstore.stop=/etc/init.d/mapr-mfs stop, service.command.nfs.start=/etc/init.d/mapr-nfsserver start, service.command.kvstore.start=/etc/init.d/mapr-mfs start, service.command.cldb.heapsize.percent=8, service.command.hoststats.monitorcommand=/etc/init.d/mapr-hoststats status, service.command.cldb.type=BACKGROUND, service.command.jt.heapsize.min=256, service.command.nfs.stop=/etc/init.d/mapr-nfsserver stop, service.command.hbregion.heapsize.min=1000, service.command.os.heapsize.percent=3, service.command.nfs.monitor=server/nfsserver, service.command.mfs.monitorcommand=/etc/init.d/mapr-mfs status, nodeTotalFreeMemory=19249716, service.command.zk.heapsize.max=1500, service.command.os.heapsize.max=750, service.command.webserver.monitorcommand=/opt/mapr/adminuiapp/webserver status, service.nice.value=-10, service.command.jt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start jobtracker, currentHost=hadoop6.eng.narus.com, cldb.port=7222, service.command.tt.heapsize.min=64, service.command.tt.monitorcommand=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh status tasktracker, service.command.hoststats.stop=/etc/init.d/mapr-hoststats stop, service.command.nfs.heapsize.max=1000}
2012-02-16 03:39:24,400 INFO com.mapr.warden.WardenManager [main]: Total node memory: 24678244
2012-02-16 03:39:24,400 INFO com.mapr.warden.WardenManager [main]: Total consumed memory: 1220
2012-02-16 03:39:24,400 INFO com.mapr.warden.WardenManager [main]: Still available free memory: 22879
2012-02-16 03:39:26,005 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: /servers. Event state: SyncConnected. Event type: NodeChildrenChanged
2012-02-16 03:39:26,011 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: /servers. Event state: SyncConnected. Event type: NodeChildrenChanged

It looks like you did not run configure.sh on that node and/or did not install any mapr-* packages there besides mapr-core.
(15 Feb, 21:29)
yufeldman ♦
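For anyone triaging the same symptom: the telltale line in the warden log above is the empty service list ("Configured services: []"). A minimal shell sketch for spotting it follows. The function name `warden_has_no_services` is hypothetical, the log path is supplied by the caller rather than assumed, and the match relies on the exact message text shown in this thread for build 1.2.0, which may differ in other versions.

```shell
# Hypothetical helper: succeed if the given warden log contains the
# empty service list message seen in this thread.
# -F matches the text literally, so "[]" is not treated as a pattern.
warden_has_no_services() {
    grep -qF 'Configured services: []' "$1"
}

# Example (replace the placeholder with your actual warden log path):
#   if warden_has_no_services "$WARDEN_LOG"; then
#       echo "warden started with no configured services"
#   fi
```

A match only says that warden found nothing to start; it does not say why, so the installed packages and the configure.sh invocation still need to be checked by hand.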
please also paste here first few lines of your /opt/mapr/conf/warden.conf
(15 Feb, 21:45)
yufeldman ♦
rpm -qa gives these rpms:
mapr-pig-1.2.0.12140GA-1
mapr-zk-internal-1.2.0.12140GA.v3.3.2-1
mapr-core-1.2.0.12140GA-1
mapr-hive-1.2.0.12140GA-1
mapr-hive-internal-1.2.0.12140GA.v0.7.1-1
mapr-tasktracker-1.2.0.12140GA-1
mapr-zookeeper-1.2.0.12140GA-1
mapr-fileserver-1.2.0.12140GA-1
mapr-pig-internal-1.2.0.12140GA.v0.9.0-1

Configure command:
/opt/mapr/server/configure.sh -c -C 172.31.2.204 -Z 172.31.2.204,172.31.2.205,172.31.2.206
(15 Feb, 22:13)
ghousia
You are configuring the node as a client ("-c") instead of as a cluster node. Please rerun configure.sh without the "-c" option:
/opt/mapr/server/configure.sh -C 172.31.2.204 -Z 172.31.2.204,172.31.2.205,172.31.2.206
(15 Feb, 22:37)
yufeldman ♦
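The same condition can be checked from the configuration side. The sketch below prints whatever follows `services=` in a warden.conf-style file. Both function names are made up for illustration, and the code assumes the plain `key=value` layout and the /opt/mapr/conf/warden.conf path mentioned in this thread. An empty value matches the client-style ("-c") configuration diagnosed here; that a correctly configured node always shows a non-empty value is an assumption, not something this thread confirms.

```shell
# Hypothetical helper: print the value of the "services=" line in a
# warden.conf-style file (prints an empty line if the key has no value,
# and nothing at all if the key is missing).
conf_services() {
    sed -n 's/^services=//p' "$1"
}

# Hypothetical helper: succeed only if at least one service is listed.
node_has_services() {
    [ -n "$(conf_services "$1")" ]
}

# Example (path taken from the thread):
#   node_has_services /opt/mapr/conf/warden.conf || echo "no services configured"
```

Running the check before and after rerunning configure.sh gives a quick way to see whether the rerun changed anything.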
Thanks a lot, it worked. The strange thing is that the same command works on the other slave nodes; only this one machine has the problem.
(15 Feb, 23:14)
ghousia
I doubt it. It probably only appears to work there: most likely configure.sh was run without "-c" at least once on those nodes, and the subsequent runs with "-c" simply did not change anything.
(15 Feb, 23:24)
yufeldman ♦
services=