Hi,

After installing MapR on a three-node cluster, the services on one of the slave nodes always show as NOT CONFIGURED. This happens after every installation, and one node is always in a critical state. According to the MFS log, the MFS server was shut down, and it complains that the storage pool is offline.

The createsystemvolume log has the following errors:

FileServer hadoop6.eng.narus.com:5660 has not heartbeated with CLDB for 1329351197
2012-02-16 00:13:38.417 hadoop6 createsystemvolumes.sh(21633) Install CreateLocalVolumeDirectories:178 CreateLocalVolume: Retrying after 20 seconds. RetryCnt: 12

asked 15 Feb, 16:50

ghousia


Do you have warden running on the node in question?

answered 15 Feb, 18:19

yufeldman ♦

Yes, the warden is running on that node.

(15 Feb, 18:26) ghousia

Could you paste warden.log from that node here?

(15 Feb, 18:33) yufeldman ♦

Warden logs at the failed node:

Header: hostName: hadoop6.eng.narus.com, Time Zone: Greenwich Mean Time, processName: warden, processId: 25542, MapR Build Version: 1.2.0.12140GA
2012-02-16 03:39:24,274 INFO com.mapr.warden.WardenMain [main]: Log dir: /opt/mapr/hadoop/hadoop-0.20.2/logs
2012-02-16 03:39:24,321 INFO com.mapr.warden.WardenManager [main-EventThread]: Process path: null. Event state: SyncConnected. Event type: None
2012-02-16 03:39:24,321 INFO com.mapr.warden.WardenManager [main]: Connected to ZK: 172.31.2.204:5181,172.31.2.205:5181,172.31.2.206:5181
2012-02-16 03:39:24,338 INFO com.mapr.warden.WardenServer [main]: Connected to ZK: 172.31.2.204:5181,172.31.2.205:5181,172.31.2.206:5181
2012-02-16 03:39:24,339 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: null. Event state: SyncConnected. Event type: None
2012-02-16 03:39:24,370 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: /servers. Event state: SyncConnected. Event type: NodeChildrenChanged
2012-02-16 03:39:24,373 ERROR com.mapr.warden.WardenManager createConfigList [main]: Error while parsing out service and count: [Ljava.lang.String;@5b976011. Continue with next service parsing
2012-02-16 03:39:24,374 INFO com.mapr.warden.WardenManager [main]: Configured services: []
2012-02-16 03:39:24,374 INFO com.mapr.warden.WardenManager [main]: warden.conf content: {service.command.hbregion.start=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh start regionserver, service.command.nfs.monitorcommand=/etc/init.d/mapr-nfsserver status, nodes.mincount=1, service.command.hbmaster.heapsize.min=64, service.command.hbmaster.heapsize.percent=4, service.command.hbregion.heapsize.percent=25, service.command.cldb.stop=/etc/init.d/mapr-cldb stop, service.command.jt.heapsize.max=5000, service.command.warden.heapsize.percent=1, hoststats.port=5660, service.command.hbregion.heapsize.max=4000, service.command.cldb.monitorcommand=/etc/init.d/mapr-cldb status, service.command.mfs.type=BACKGROUND, kvstore.port=5660, service.command.hbregion.monitorcommand=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh status regionserver, hbmaster.port=60000, service.command.hbmaster.monitorcommand=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh status master, service.command.webserver.start=/opt/mapr/adminuiapp/webserver start, service.command.tt.heapsize.max=325, mfs.port=5660, service.command.jt.type=BACKGROUND, service.command.hbmaster.start=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh start master, service.command.hbmaster.heapsize.max=512, zookeeper.servers=172.31.2.204:5181,172.31.2.205:5181,172.31.2.206:5181, service.command.kvstore.monitor=server/mfs, service.command.jt.heapsize.percent=10, services.retries=3, service.command.hbregion.type=BACKGROUND, service.command.cldb.heapsize.min=256, service.command.mfs.start=/etc/init.d/mapr-mfs start, service.command.cldb.monitor=com.mapr.fs.cldb.CLDB, service.command.jt.monitor=org.apache.hadoop.mapred.JobTracker, service.command.cldb.start=/etc/init.d/mapr-cldb start, service.command.hbregion.monitor=org.apache.hadoop.hbase.regionserver.HRegionServer start, service.command.mfs.stop=/etc/init.d/mapr-mfs stop, service.command.hbmaster.monitor=org.apache.hadoop.hbase.master.HMaster start, service.command.hbmaster.type=BACKGROUND, service.command.tt.type=BACKGROUND, service.command.jt.stop=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh stop jobtracker, service.command.mfs.monitor=server/mfs, service.command.warden.heapsize.min=64, service.command.webserver.heapsize.min=512, mapr.home.dir=/opt/mapr, service.command.jt.monitorcommand=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh status jobtracker, service.command.tt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start tasktracker, nodeTotalMemory=24678244, service.command.webserver.type=BACKGROUND, service.command.hbregion.stop=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh stop regionserver, service.command.cldb.heapsize.max=4000, service.command.kvstore.type=BACKGROUND, service.command.nfs.type=BACKGROUND, service.command.hoststats.start=/etc/init.d/mapr-hoststats start, service.command.zk.heapsize.min=256, service.command.os.heapsize.min=256, service.command.tt.monitor=org.apache.hadoop.mapred.TaskTracker, service.command.tt.stop=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh stop tasktracker, service.command.hbmaster.stop=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh stop master, service.command.webserver.heapsize.percent=3, service.command.tt.heapsize.percent=2, currentHostID=2b30274fdff08b81, service.command.warden.heapsize.max=750, service.command.webserver.heapsize.max=750, jt.http.port=50030, service.command.nfs.heapsize.min=64, service.command.zk.heapsize.percent=1, service.command.hoststats.type=BACKGROUND, service.command.webserver.stop=/opt/mapr/adminuiapp/webserver stop, service.command.kvstore.monitorcommand=/etc/init.d/mapr-mfs status, service.command.nfs.heapsize.percent=3, jt.port=9001, service.command.mfs.heapsize.min=512, service.command.mfs.heapsize.percent=20, services=, service.command.kvstore.stop=/etc/init.d/mapr-mfs stop, service.command.nfs.start=/etc/init.d/mapr-nfsserver start, service.command.kvstore.start=/etc/init.d/mapr-mfs start, service.command.cldb.heapsize.percent=8, service.command.hoststats.monitorcommand=/etc/init.d/mapr-hoststats status, service.command.cldb.type=BACKGROUND, service.command.jt.heapsize.min=256, service.command.nfs.stop=/etc/init.d/mapr-nfsserver stop, service.command.hbregion.heapsize.min=1000, service.command.os.heapsize.percent=3, service.command.nfs.monitor=server/nfsserver, service.command.mfs.monitorcommand=/etc/init.d/mapr-mfs status, nodeTotalFreeMemory=19249716, service.command.zk.heapsize.max=1500, service.command.os.heapsize.max=750, service.command.webserver.monitorcommand=/opt/mapr/adminuiapp/webserver status, service.nice.value=-10, service.command.jt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start jobtracker, currentHost=hadoop6.eng.narus.com, cldb.port=7222, service.command.tt.heapsize.min=64, service.command.tt.monitorcommand=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh status tasktracker, service.command.hoststats.stop=/etc/init.d/mapr-hoststats stop, service.command.nfs.heapsize.max=1000}
2012-02-16 03:39:24,400 INFO com.mapr.warden.WardenManager [main]: Total node memory: 24678244
2012-02-16 03:39:24,400 INFO com.mapr.warden.WardenManager [main]: Total consumed memory: 1220
2012-02-16 03:39:24,400 INFO com.mapr.warden.WardenManager [main]: Still available free memory: 22879
2012-02-16 03:39:26,005 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: /servers. Event state: SyncConnected. Event type: NodeChildrenChanged
2012-02-16 03:39:26,011 INFO com.mapr.warden.WardenServer [main-EventThread]: Process path: /servers. Event state: SyncConnected. Event type: NodeChildrenChanged

answered 15 Feb, 19:53

ghousia
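
The telling entry in a log like the one above is "Configured services: []". As a rough sketch (not a MapR tool), the Python below pulls that entry out of warden.log text. The function name `configured_services` is made up for illustration, and the comma-separated format assumed for a non-empty list is a guess, since the log above only shows the empty case.

```python
import re

# Matches the "Configured services: [...]" message seen in the warden log above.
CONFIGURED_RE = re.compile(r"Configured services: \[(.*?)\]")


def configured_services(log_text):
    """Return the names from the last 'Configured services' entry.

    Returns None if the log has no such entry. The comma-separated layout
    of a non-empty list is an assumption; the thread only shows "[]".
    """
    matches = CONFIGURED_RE.findall(log_text)
    if not matches:
        return None
    return [name.strip() for name in matches[-1].split(",") if name.strip()]


# Two entries copied from the warden log in this thread.
sample = (
    "2012-02-16 03:39:24,373 ERROR com.mapr.warden.WardenManager createConfigList [main]: "
    "Error while parsing out service and count: [Ljava.lang.String;@5b976011. "
    "Continue with next service parsing\n"
    "2012-02-16 03:39:24,374 INFO com.mapr.warden.WardenManager [main]: "
    "Configured services: []\n"
)

services = configured_services(sample)
if services == []:
    print("warden started with no configured services")
```

An empty list here matches the empty `services=` entry in the warden.conf dump from the same log.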

It looks like you did not run configure.sh on that node and/or did not install any mapr-* packages there besides mapr-core.

(15 Feb, 21:29) yufeldman ♦

Please also paste the first few lines of your /opt/mapr/conf/warden.conf here.

(15 Feb, 21:45) yufeldman ♦

rpm -qa gives these rpms:

mapr-pig-1.2.0.12140GA-1
mapr-zk-internal-1.2.0.12140GA.v3.3.2-1
mapr-core-1.2.0.12140GA-1
mapr-hive-1.2.0.12140GA-1
mapr-hive-internal-1.2.0.12140GA.v0.7.1-1
mapr-tasktracker-1.2.0.12140GA-1
mapr-zookeeper-1.2.0.12140GA-1
mapr-fileserver-1.2.0.12140GA-1
mapr-pig-internal-1.2.0.12140GA.v0.9.0-1

Configure command: /opt/mapr/server/configure.sh -c -C 172.31.2.204 -Z 172.31.2.204,172.31.2.205,172.31.2.206

(15 Feb, 22:13) ghousia
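
One of the two suggested causes was that nothing besides mapr-core is installed. A small Python sketch, assuming the rpm -qa output above and a made-up helper `package_name`, strips the version suffix from each entry and lists what is installed beyond mapr-core. It only inspects names; it does not know which packages MapR treats as services.

```python
import re


def package_name(rpm):
    """Strip the version-release suffix: everything from the first '-<digit>' onward."""
    return re.split(r"-(?=\d)", rpm, maxsplit=1)[0]


# Package list as pasted in this thread.
installed = """
mapr-pig-1.2.0.12140GA-1
mapr-zk-internal-1.2.0.12140GA.v3.3.2-1
mapr-core-1.2.0.12140GA-1
mapr-hive-1.2.0.12140GA-1
mapr-hive-internal-1.2.0.12140GA.v0.7.1-1
mapr-tasktracker-1.2.0.12140GA-1
mapr-zookeeper-1.2.0.12140GA-1
mapr-fileserver-1.2.0.12140GA-1
mapr-pig-internal-1.2.0.12140GA.v0.9.0-1
""".split()

names = sorted(package_name(p) for p in installed)
beyond_core = [n for n in names if n != "mapr-core"]
print(beyond_core)
```

Since mapr-fileserver and mapr-tasktracker show up in the result, missing packages look like an unlikely cause here, which leaves the configure.sh invocation.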

The "-c" option configures the machine as a client instead of a cluster node. Please rerun configure.sh without the "-c" option:

/opt/mapr/server/configure.sh -C 172.31.2.204 -Z 172.31.2.204,172.31.2.205,172.31.2.206

(15 Feb, 22:37) yufeldman ♦
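
The two commands differ only in the lowercase "-c", which is easy to miss next to the uppercase "-C". A minimal Python sketch of a check for it follows; the helper names `has_client_flag` and `without_client_flag` are hypothetical, and the code only looks at the command text without running configure.sh.

```python
import shlex


def has_client_flag(command):
    """True if the command line contains lowercase '-c' as its own token."""
    return "-c" in shlex.split(command)


def without_client_flag(command):
    """Return the same command line with every standalone '-c' token removed."""
    tokens = [t for t in shlex.split(command) if t != "-c"]
    return " ".join(shlex.quote(t) for t in tokens)


# Command as originally run on the failing node (from this thread).
original = (
    "/opt/mapr/server/configure.sh -c -C 172.31.2.204 "
    "-Z 172.31.2.204,172.31.2.205,172.31.2.206"
)

if has_client_flag(original):
    print("client mode requested; cluster-node form would be:")
    print(without_client_flag(original))
```

The comparison is case-sensitive on purpose, so the "-C 172.31.2.204" argument is left untouched.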

Thanks a lot, it worked. The strange thing is that the same command works on the other slave nodes; only this one machine has the problem.

(15 Feb, 23:14) ghousia

I doubt it. It probably only looks that way on the other nodes because configure.sh was most likely run there without "-c" at least once, and subsequent runs with "-c" simply did not change anything.

(15 Feb, 23:24) yufeldman ♦

services=
service.command.jt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start jobtracker
service.command.tt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start tasktracker
service.command.hbmaster.start=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh start master
service.command.hbregion.start=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh start regionserver
service.command.cldb.start=/etc/init.d/mapr-cldb start
service.command.kvstore.start=/etc/init.d/mapr-mfs start
service.command.mfs.start=/etc/init.d/mapr-mfs start
service.command.nfs.start=/etc/init.d/mapr-nfsserver start
service.command.hoststats.start=/etc/init.d/mapr-hoststats start
service.command.webserver.start=/opt/mapr/adminuiapp/webserver start
service.command.jt.stop=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh stop jobtracker
service.command.tt.stop=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh stop tasktracker
service.command.hbmaster.stop=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh stop master
service.command.hbregion.stop=/opt/mapr/hbase/hbase-0.90.4/bin/hbase-daemon.sh stop regionserver
service.command.cldb.stop=/etc/init.d/mapr-cldb stop
service.command.kvstore.stop=/etc/init.d/mapr-mfs stop
service.command.mfs.stop=/etc/init.d/mapr-mfs stop
service.command.nfs.stop=/etc/init.d/mapr-nfsserver stop
service.command.hoststats.stop=/etc/init.d/mapr-hoststats stop
service.command.webserver.stop=/opt/mapr/adminuiapp/webserver stop

answered 15 Feb, 22:17

ghousia

edited 15 Feb, 22:20
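
The first entry in that excerpt, `services=`, has no value. A minimal Python sketch for spotting this, assuming warden.conf holds one key=value pair per line (the helper `parse_properties` is illustrative and not part of MapR):

```python
def parse_properties(text):
    """Parse simple key=value lines into a dict, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# A few entries taken from the warden.conf excerpt in this thread.
sample = """\
services=
service.command.jt.start=/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop-daemon.sh start jobtracker
service.command.mfs.start=/etc/init.d/mapr-mfs start
"""

props = parse_properties(sample)
if not props.get("services"):
    print("services is empty in warden.conf")
```

To check a real node, the same function could be fed the contents of /opt/mapr/conf/warden.conf read from disk.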

