I wanted to avoid exporting output files from a map-reduce program, so I set mapred.output.dir to a file:// location that points to an NFS-mounted partition.

Unfortunately, this means that the output files are owned by the hadoop user (with either CDH or stock Hadoop). What can I do?

I would like the resulting output dir to belong to "user" so that downstream processing works correctly. I can't run all jobs as the hadoop user.

asked 22 Jun '11, 08:41

FAQ ♦


This situation doesn't arise with MapR, because you can use the standard Hadoop output mechanisms to create the files in maprfs and then access them via NFS. This preserves file ownership in the way you want and gives you the access you want, while keeping the speed of normal map-reduce and of the distributed file system (maprfs in this case). Accessing map-reduce output over NFS also lets you use traditional Linux, Mac, or Windows tools running on a single node wherever such simpler tools suffice. No export or import step is needed in these cases.
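If downstream processing depends on ownership, it may be worth checking it on the NFS side before the next step runs. The following is only a minimal sketch using the Python standard library on a Unix-like client; the helper names `uid_for` and `paths_not_owned_by` are made up for illustration, and the mount path in the usage note is hypothetical.

```python
import pwd
from pathlib import Path


def uid_for(username):
    """Look up the numeric uid for a local user name (Unix only)."""
    return pwd.getpwnam(username).pw_uid


def paths_not_owned_by(output_dir, expected_uid):
    """Return every path under output_dir (the directory itself included)
    whose owner uid differs from expected_uid."""
    root = Path(output_dir)
    mismatched = []
    for path in [root, *root.rglob("*")]:
        # lstat() reports the owner of the entry itself, not a symlink target.
        if path.lstat().st_uid != expected_uid:
            mismatched.append(str(path))
    return mismatched
```

For example, `paths_not_owned_by("/mapr/my.cluster.com/user/alice/job-output", uid_for("alice"))` (path and user name are assumptions) should return an empty list if every output file belongs to alice.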


answered 30 Jun '11, 18:38

TedDunning ♦♦

