I'm currently trying to parse a large number of sentences, and this requires a parsing model that uses about 2-4 GB of RAM per parser. Right now I've extended the heap size for each MapReduce job by using mapred.map.child.java.opts, but I'm seeing a lot of jobs crash due to a lack of available physical RAM.
After inspecting the MapR UI, I see that most of my nodes are using 60%-80% of their RAM. Does that percentage include the RAM available for MapReduce tasks, or am I limited to the remaining RAM for my MapReduce tasks? If the RAM MapR is using does include the memory for each task, which parameter in warden.conf should I change? Would it be one of the service.command.* options?
asked 26 Jul '11, 11:51
MapR auto-configures memory per service on each node. For MapReduce tasks, the memory reserved is the node's total memory minus the sum of the memory reserved for the other services and the operating system. This figure is shown on the TaskTracker web UI at http://tasktrackernode:50060. More information: TaskTracker on critical nodes and Reducing slots on TaskTrackers
Reduce the number of MapReduce slots based on the memory reserved for MapReduce tasks and the memory requirement per task. For example, if the TaskTracker has 14G of memory reserved and each task requires 3G, at most 4 tasks fit at once (14 / 3, rounded down), so you would want 3 map slots and 1 reduce slot. Please update the following parameters in mapred-site.xml on all nodes:
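For reference, the slot arithmetic from the example above can be sketched in a few lines of Java. This is only an illustration: the class and method names are hypothetical, and the rule for splitting slots between map and reduce (about a quarter for reduce) is an assumption chosen to reproduce the 3 map / 1 reduce example, not something MapR prescribes.

```java
public class SlotCalculator {
    // Number of whole tasks that fit in the memory reserved for MapReduce.
    static int totalSlots(int reservedGb, int perTaskGb) {
        if (perTaskGb <= 0) {
            throw new IllegalArgumentException("perTaskGb must be positive");
        }
        return Math.max(0, reservedGb / perTaskGb); // integer division rounds down
    }

    // Assumed split: roughly one quarter of the slots go to reduce,
    // with at least one reduce slot once there are two or more slots.
    static int reduceSlots(int total) {
        if (total < 2) {
            return 0;
        }
        return Math.max(1, total / 4);
    }

    // Whatever is not given to reduce is left for map tasks.
    static int mapSlots(int total) {
        return total - reduceSlots(total);
    }

    public static void main(String[] args) {
        int total = totalSlots(14, 3); // 14G reserved, 3G per task
        System.out.println(total + " total, " + mapSlots(total)
                + " map, " + reduceSlots(total) + " reduce");
        // prints "4 total, 3 map, 1 reduce"
    }
}
```

Plug in the reserved memory shown on your own TaskTracker UI and your real per-task heap size; if the parser needs closer to 4G per task, the same 14G only leaves room for 3 slots.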