I am using bzip2 to compress large files. My custom InputFormat returns true from isSplitable, but my jobs fail on large input files.

BZip2 splitting support was added to Apache Hadoop in the 0.21.0 release, but I am running 0.20. Is that my problem?

asked 22 Jun '11, 08:57

FAQ ♦


With MapR, compression is built in and is entirely transparent.

This means that your compressed inputs are split just like any other file. No action is required on your part.
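
For comparison, the failure described in the question can be reproduced in miniature outside Hadoop. The sketch below is an illustration only: it uses Python's standard `bz2` module rather than any Hadoop or MapR API, and the function name `can_decompress_from` and the sample records are hypothetical. It shows that a bzip2 stream decodes from its start but not from an arbitrary byte offset, which is roughly the situation a reader is in when a bzip2 file is reported as splittable without a codec that understands bzip2 block boundaries.

```python
import bz2


def can_decompress_from(compressed: bytes, offset: int) -> bool:
    """Return True if the bytes starting at `offset` decode as a bzip2 stream."""
    try:
        bz2.decompress(compressed[offset:])
        return True
    except (OSError, ValueError):
        # OSError: the data does not begin with a valid bzip2 stream.
        # ValueError: the stream ended before its end-of-stream marker.
        return False


# Hypothetical sample input: many short text records, compressed as one file.
records = b"".join(b"record %d\n" % i for i in range(50000))
compressed = bz2.compress(records)

# Reading from byte 0 mimics one reader handling the whole file.
whole_file_ok = can_decompress_from(compressed, 0)

# Reading from the midpoint mimics a second split handed a raw byte offset.
mid_split_ok = can_decompress_from(compressed, len(compressed) // 2)

print("from start of file:", whole_file_ok)
print("from naive mid-file split:", mid_split_ok)
```

The first check should succeed and the second should fail. On a stock Hadoop release without bzip2 splitting support, the likely consequence is that such files need to be treated as non-splittable, so that each one is read from the beginning by a single task.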

answered 30 Jun '11, 18:51

TedDunning ♦♦
