|
We currently have 100 map slots available on a working cluster of 20 nodes. However, when we start a Hive job, only 14 maps are allocated. How can we control this and allow more maps per job? |
|
Typically, the number of maps for a job is determined by the size of the input data; more specifically, by the number of chunks. The chunk size is 256 MB by default. See the documentation for changing the chunk size: http://mapr.com/doc/display/MapR/hadoop+mfs

This is more of a scheduling question. Despite having 100 maps available, we are only ever allowed 14 concurrent maps per Hive job. We can, however, run additional jobs to take up the remaining available maps. What we'd like is some control over this limit.
(01 May '12, 17:07)
thurman
|
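As a rough illustration of the chunk-based estimate described in the answer, here is a minimal Python sketch. It assumes exactly one map per chunk and the 256 MB default mentioned above; the function name `estimate_map_tasks` and the 3500 MB input size are hypothetical, and the real split calculation may differ depending on the input format and job settings.

```python
MB = 1024 * 1024


def estimate_map_tasks(input_bytes, chunk_bytes=256 * MB):
    """Estimate the number of maps as one map per chunk (an assumption)."""
    if input_bytes <= 0:
        return 0
    # Integer ceiling division: a partially filled last chunk still needs a map.
    return -(-input_bytes // chunk_bytes)


# Hypothetical input size: about 3.4 GB of data at the default chunk size.
print(estimate_map_tasks(3500 * MB))  # 14
```

Under these assumptions, a job that only ever gets 14 maps would be consistent with an input of roughly 3.25 to 3.5 GB, in which case a smaller chunk size would yield more maps for the same data.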