I was able to install Mahout. However, I am having an issue with path resolution.

I followed the instructions here: http://www.mapr.com/doc/display/MapR/Mahout

When I follow the steps through

$ cd $MAHOUT_HOME
$ ./examples/bin/build-20news-bayes.sh

I receive the following output. (I added echo statements, marked ====DEBUG====, to the script in order to double-check path resolution.)

root@ip-10-73-34-76:/opt/mapr/mahout/mahout-0.5# ./examples/bin/build-20news-bayes.sh 
Running on hadoop, using HADOOP_HOME=/opt/mapr/hadoop/hadoop-0.20.2/
No HADOOP_CONF_DIR set, using /opt/mapr/hadoop/hadoop-0.20.2//conf 
12/04/10 22:24:28 WARN driver.MahoutDriver: No org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups.props found on classpath, will use command-line arguments only
12/04/10 22:24:32 INFO driver.MahoutDriver: Program took 4379 ms
Running on hadoop, using HADOOP_HOME=/opt/mapr/hadoop/hadoop-0.20.2/
No HADOOP_CONF_DIR set, using /opt/mapr/hadoop/hadoop-0.20.2//conf 
12/04/10 22:24:36 WARN driver.MahoutDriver: No org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups.props found on classpath, will use command-line arguments only
12/04/10 22:24:38 INFO driver.MahoutDriver: Program took 2725 ms
====DEBUG==== resolving path
====DEBUG==== cwd:
/opt/mapr/mahout/mahout-0.5
====DEBUG==== train input:
/opt/mapr/mahout/mahout-0.5/examples/bin/work/20news-bydate/bayes-train-input
Running on hadoop, using HADOOP_HOME=/opt/mapr/hadoop/hadoop-0.20.2/
No HADOOP_CONF_DIR set, using /opt/mapr/hadoop/hadoop-0.20.2//conf 
12/04/10 22:24:42 INFO bayes.TrainClassifier: Training Bayes Classifier
12/04/10 22:24:43 INFO bayes.BayesDriver: Reading features...
12/04/10 22:24:43 INFO fs.JobTrackerWatcher: Current running JobTracker is: ip-10-2-65-155.ec2.internal/10.2.65.155:9001
12/04/10 22:24:43 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/10 22:24:44 INFO mapred.JobClient: Cleaning up the staging area maprfs://10.2.65.155:7222/var/mapr/cluster/mapred/jobTracker/staging/root/.staging/job_201204102116_0011
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: /user/root/examples/bin/work/20news-bydate/bayes-train-input
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:225)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:236)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1005)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:997)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:914)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:867)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1109)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:867)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:841)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1276)
    at org.apache.mahout.classifier.bayes.mapreduce.common.BayesFeatureDriver.runJob(BayesFeatureDriver.java:63)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesDriver.runJob(BayesDriver.java:47)
    at org.apache.mahout.classifier.bayes.TrainClassifier.trainNaiveBayes(TrainClassifier.java:54)
    at org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:162)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)

asked 10 Apr '12, 15:27

tristanls


Most likely, you don't have one of the necessary environment variables set.

Please make sure that you have all of the following environment variables set:

export MAHOUT_HOME=/opt/mapr/mahout/mahout-0.5
export HADOOP_HOME=/opt/mapr/hadoop/hadoop-0.20.2

Also, set JAVA_HOME accordingly.
RedHat/CentOS example: export JAVA_HOME=/usr/java/jdk1.6.0_24 
Ubuntu example: export JAVA_HOME=/usr/lib/jvm/java-6-sun
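A quick way to confirm the variables above are actually set in the shell that runs the example is a small check like the following. This is only a sketch: `check_env` is a hypothetical helper name, not part of Mahout or MapR, and it only tests that each variable is non-empty, not that the path it holds is correct.

```shell
#!/bin/sh
# check_env (hypothetical helper): report which of the named environment
# variables are unset or empty. Prints one "missing: NAME" line per missing
# variable and returns non-zero if any were missing.
check_env() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Check the three variables mentioned in this answer.
check_env MAHOUT_HOME HADOOP_HOME JAVA_HOME ||
    echo "export the variables listed above before running the example"
```

If nothing is printed, all three variables are set in the current shell.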

Note that the input path
/user/root/examples/bin/work/20news-bydate/bayes-train-input
is a path on the distributed filesystem, not the local filesystem.
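The path in the error message is consistent with the job receiving a relative path, which Hadoop resolves against the user's home directory on the distributed filesystem. The sketch below only illustrates that mapping with string handling; `resolve_dfs_path` is a hypothetical helper, and it assumes the home directory is /user/<username>, the usual Hadoop default.

```shell
#!/bin/sh
# resolve_dfs_path (hypothetical helper): show where a job path would land
# on the distributed filesystem.
# Assumption: the home directory is /user/<username>.
resolve_dfs_path() {
    path=$1
    user=${2:-$(id -un)}
    case "$path" in
        /*) echo "$path" ;;               # absolute paths are used as-is
        *)  echo "/user/$user/$path" ;;   # relative paths resolve against the home directory
    esac
}

# The relative path from the example script, run as root:
resolve_dfs_path examples/bin/work/20news-bydate/bayes-train-input root
# prints /user/root/examples/bin/work/20news-bydate/bayes-train-input
```

This matches the "Input path does not exist" message in the question, even though the script's working directory on the local filesystem is /opt/mapr/mahout/mahout-0.5.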

By default, you should already have the /user directory on your MapR filesystem:
hadoop fs -ls /user

You may or may not have /user/root.
Either way, the program will create /user/root/examples/bin/work/20news-bydate/bayes-train-input and copy over the input files.

answered 13 Apr '12, 16:09

ThanhMapR ♦♦

edited 13 Apr '12, 16:10

Re:
---
By default, you should already have the /user directory on your mapr filesystem:
hadoop fs -ls /user

You may or may not have /user/root.
No matter, the program will create /user/root/examples/bin/work/20news-bydate/bayes-train-input and copy over the input files.
---

I see that this is not true on the MapR Ubuntu VMs available for download from the MapR site. You can use the following workaround.

Manually create the necessary directory structure on MapR-FS:

# hadoop fs -mkdir /user/root/examples/bin/work/20news-bydate/bayes-train-input

Manually copy over the input files:

# hadoop fs -copyFromLocal /opt/mapr/mahout/mahout-0.5/examples/bin/work/20news-bydate/bayes-train-input/* /user/root/examples/bin/work/20news-bydate/bayes-train-input/

Then, you should be able to run /opt/mapr/mahout/mahout-0.5/examples/bin/build-20news-bayes.sh
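The two manual steps above can be wrapped in a small script so they can be previewed before anything is written to the cluster. This is a sketch only: `run`, `stage_input`, and the `DRY_RUN` variable are hypothetical names introduced here, and the `hadoop fs` commands and paths are exactly the ones from this answer.

```shell
#!/bin/sh
# Paths taken from the workaround above.
LOCAL_INPUT=/opt/mapr/mahout/mahout-0.5/examples/bin/work/20news-bydate/bayes-train-input
DFS_INPUT=/user/root/examples/bin/work/20news-bydate/bayes-train-input

# run (hypothetical helper): print the command when DRY_RUN=1,
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# stage_input (hypothetical helper): create the target directory on the
# distributed filesystem, then copy the local training input into it.
stage_input() {
    run hadoop fs -mkdir "$DFS_INPUT" &&
    run hadoop fs -copyFromLocal "$LOCAL_INPUT"/* "$DFS_INPUT/"
}

# Preview the commands without touching the cluster.
DRY_RUN=1 stage_input
```

Once the printed commands look right, calling `stage_input` without `DRY_RUN=1` runs them for real.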

answered 26 Apr '12, 01:57

ThanhMapR ♦♦
