Hi all, this question is about Hadoop cluster configuration. Every flavour of Linux (Ubuntu, CentOS, Red Hat) comes with its own configuration and, possibly, its own data processing speed. Say we have a 2-node cluster in which one node runs Red Hat Linux and the other runs CentOS. Since the two operating systems may differ in how quickly they process blocks of data, could this cause speculative execution to kick in every time we run a MapReduce job on the cluster?

asked 02 May '12, 05:36


Saurabh Desh...


Are you saying you always want every task of a job to run twice, once on each type of machine?


answered 02 May '12, 06:08


MC Srivas ♦♦

No, no. The job runs as usual: the input file is split into 64 MB blocks, which are handed to the datanodes for processing. Now suppose these datanodes (two, in our 2-node cluster) have different operating systems installed. Would the difference in operating system cause any difference in the time the datanodes take to process their share of the input? That is, could the node running Red Hat finish its block considerably earlier than the node running CentOS?


answered 02 May '12, 08:17


Saurabh Desh...

It is quite conceivable that you will see very different performance on these two nodes, but not so much because of the operating systems themselves. The simple, practical reason is that nodes running different operating systems were probably built at different times, on different hardware, with potentially differing (mis)configurations.

On identical hardware, running equivalent versions with identical configuration and no hardware failures or degradation, Red Hat and CentOS should produce essentially identical performance. If any of these conditions is violated, then any degree of difference could be observed. Absent misconfiguration, and within reasonable tolerances for hardware, the two nodes should run very nearly the same.

(02 May '12, 11:23) TedDunning ♦♦
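
To make the point above concrete, here is a small toy model of when a task might be flagged for speculative execution. It is only a sketch under stated assumptions: the rule used (a task is a candidate when its progress trails the average by at least a fixed gap) is a simplification for illustration and not Hadoop's actual scheduler logic, and the node names, timings, and the `gap` value of 0.2 are all hypothetical.

```python
from statistics import mean


def progress_after(seconds, secs_per_block):
    """Fraction of one block finished after `seconds`, capped at 1.0."""
    return min(1.0, seconds / secs_per_block)


def lagging_tasks(progress, gap=0.2):
    """Return the tasks whose progress trails the average by at least `gap`.

    `gap` is an illustrative threshold chosen for this sketch,
    not a value read from any Hadoop configuration.
    """
    avg = mean(progress.values())
    return sorted(task for task, p in progress.items() if avg - p >= gap)


# Two equivalent nodes: both need 60 s per block, so after 30 s
# each is at 50% and neither lags the average.
same = {
    "node-a": progress_after(30, 60),
    "node-b": progress_after(30, 60),
}

# Mismatched nodes (hypothetical): node-b sits on slower or misconfigured
# hardware and needs 120 s per block, while node-a needs only 40 s.
skewed = {
    "node-a": progress_after(30, 40),
    "node-b": progress_after(30, 120),
}

print(lagging_tasks(same))
print(lagging_tasks(skewed))
```

In this model nothing is flagged while the nodes keep pace with each other, whichever distribution they run; a task only becomes a candidate once one node falls well behind, which matches the argument that hardware and configuration differences matter far more than the choice between Red Hat and CentOS.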