Is there anything fundamentally wrong with setting the DFS block size to a low value, say 64K? Given that the "NameNode" is completely distributed, should this be an issue? Does this complicate the container reports in any way? In short, are there any issues with doing this?

asked 24 Jul '11, 22:31

John
1888
accept rate: 0%

retagged 10 Aug '11, 09:00

Andrew Wells
3416814


Container size (and hence the container report) is not tied to file block size. Making the block size small only fits more blocks into a container. On the other hand, too small a block size will hurt the performance of your Map/Reduce jobs. By default, the Map/Reduce framework splits input into blocks and hands those blocks out to mappers. Blocks are typically 64-128M so that each mapper has enough data to process. If you do make the block size small, you will have to change your input split calculation to overcome this problem.

answered 24 Jul '11, 22:42

Lohit ♦♦
2.1k313
accept rate: 44%
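
To put rough numbers on the split problem described in the answer above, here is a minimal sketch. It assumes one split (and so one map task) per block, which is only the default behaviour the answer describes; the 10 GB input size and the helper name `num_splits` are hypothetical, chosen for illustration.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def num_splits(input_bytes: int, block_bytes: int) -> int:
    """Number of block-sized splits for an input, assuming one split per block."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    # Integer ceiling division: a partial trailing block still needs its own split.
    return -(-input_bytes // block_bytes)


if __name__ == "__main__":
    input_size = 10 * GB  # hypothetical input, for illustration only
    for block in (64 * KB, 64 * MB, 128 * MB):
        splits = num_splits(input_size, block)
        print(f"block size {block // KB} KB -> {splits} splits")
```

With 64K blocks, each mapper would receive very little data unless the split calculation is changed to group many blocks into one split.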

So this is an issue only for Map/Reduce workloads? If MR is taken out of the picture, a lower block size doesn't affect the system's behavior in any way, shape, or form?

(24 Jul '11, 23:05) John

Block size does affect how many containers a file can live in. If you have a 1MB file with a block size of 2MB, that file is going to live in one container. If you decrease the block size to 100KB, then that file can be spread across about 10 containers, which means that reads of the file can be parallelized.

Of course, with such small reads it may be difficult to detect the difference.

(24 Jul '11, 23:08) TedDunning ♦♦
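
The 1MB example in the comment above can be sketched as byte ranges that could each be read independently. This is only an illustration: it assumes decimal units (1MB = 1,000,000 bytes) so the numbers match the comment, it treats every block as eligible to live in its own container (an upper bound, not a guarantee), and `block_ranges` is a hypothetical helper, not part of any real API.

```python
def block_ranges(file_bytes: int, block_bytes: int) -> list[tuple[int, int]]:
    """Return (offset, length) pairs, one per block of the file."""
    if block_bytes <= 0:
        raise ValueError("block size must be positive")
    ranges = []
    offset = 0
    while offset < file_bytes:
        # The final block may be shorter than the configured block size.
        length = min(block_bytes, file_bytes - offset)
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    file_size = 1_000_000  # "1MB" in decimal units (assumption)
    for block in (2_000_000, 100_000):
        ranges = block_ranges(file_size, block)
        print(f"block size {block}: {len(ranges)} independently readable range(s)")
```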

Of course, there's also overhead. If the system is asked to move every 64K to a new spot in the cluster, there's corresponding metadata overhead involved, triplicated, of about 300 bytes of disk per chunk. The number of RPCs needed to read the file also increases.

(24 Jul '11, 23:26) MC Srivas ♦♦
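
A back-of-the-envelope sketch of the overhead described in the comment above. It rests on two assumptions that the thread does not confirm: the quoted ~300 bytes is taken as the total metadata per chunk (replication already included), and a sequential read is modelled as one RPC per chunk. The 1 GB file size and the function names are hypothetical.

```python
METADATA_BYTES_PER_CHUNK = 300  # figure quoted in the thread; assumed to be the per-chunk total


def chunk_count(file_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed to hold the file (ceiling division)."""
    return -(-file_bytes // chunk_bytes)


def small_chunk_cost(file_bytes: int, chunk_bytes: int) -> dict:
    """Estimate metadata bytes and read RPCs for a file at a given chunk size."""
    chunks = chunk_count(file_bytes, chunk_bytes)
    return {
        "chunks": chunks,
        "metadata_bytes": chunks * METADATA_BYTES_PER_CHUNK,
        "read_rpcs": chunks,  # assumption: one RPC per chunk read
    }


if __name__ == "__main__":
    one_gb = 1024 ** 3  # hypothetical file size
    for chunk in (64 * 1024, 64 * 1024 * 1024):
        print(chunk, small_chunk_cost(one_gb, chunk))
```

Under these assumptions, the metadata for a single file stays small in absolute terms, but both it and the RPC count grow by a factor of 1024 when moving from 64M to 64K chunks.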

Asked: 24 Jul '11, 22:31

Seen: 3,037 times

Last updated: 10 Aug '11, 09:00
