|
Hi guys, this is a follow-up question to this post. We're having problems accessing a MapR filesystem on a remote network with the MapR client. It looks like the client can connect to the CLDB on port 7222, but the subsequent connection to a service on port 5660 fails because the client is given that service's internal IP. This is similar to a standard Hadoop distribution, where one can connect to the namenode but the namenode hands out the internal IPs of the datanodes. Is there a way to configure MapR so that it uses the external IP for the 5660 services? Best regards, Johannes |
|
Hi Johannes, on your client machine you will have to instruct the MapR client to use only the internal IPs exposed by the FileSystem. You can do this by exporting the environment variable MAPR_SUBNETS; you can add it to your bashrc or /etc/profile. The format is similar to subnet notation (a.b.c.d/shift), and you can specify a list of up to 4 subnets: MAPR_SUBNETS=1.2.3.4/12, 5.6/24 Let us know if that helps you. |
|
I wonder how this should work. To describe my problem more precisely: the client is on my laptop and the MapR cluster is on EC2, so my laptop doesn't know the internal IPs or subnets of the EC2 instances. The same problem exists with plain Hadoop. There I could work around it with a custom SocketFactory that translates internal IPs into external ones (I retrieve the mapping from the Amazon web service). Any ideas? Johannes
Now I understand the problem, thanks for explaining. You would see the same problem with MapR as well. Would your custom SocketFactory not work with MapR? There is a REST API to get a list of all nodes with their IPs; would that help?
(10 Jan '12, 06:54)
Lohit ♦♦
No, setting hadoop.rpc.socket.factory.class.default has no effect. Probably because most of the client code is native anyway?
(10 Jan '12, 09:28)
oae
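For readers who want to see what the SocketFactory workaround mentioned above for plain Hadoop might look like, here is a minimal sketch. It is not Johannes's actual code: the class name `AddressTranslatingSocketFactory`, the method names, and the way the mapping is supplied are all assumptions made for illustration. As noted in the thread, setting hadoop.rpc.socket.factory.class.default had no effect on the MapR client, so this only illustrates the plain-Hadoop approach.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;
import javax.net.SocketFactory;

/** Hypothetical sketch: rewrites cluster-internal IPs to external ones before connecting. */
class AddressTranslatingSocketFactory extends SocketFactory {

    // internal IP -> external IP. Assumption: the caller fills this map,
    // e.g. from the EC2 API; that lookup is not shown here.
    private final Map<String, String> internalToExternal;

    AddressTranslatingSocketFactory(Map<String, String> internalToExternal) {
        this.internalToExternal = new HashMap<>(internalToExternal);
    }

    /** Returns the mapped external IP, or the input unchanged if there is no mapping. */
    String translateIp(String ip) {
        return internalToExternal.getOrDefault(ip, ip);
    }

    /** Same translation, applied to an InetAddress. */
    InetAddress translate(InetAddress address) throws UnknownHostException {
        return InetAddress.getByName(translateIp(address.getHostAddress()));
    }

    @Override
    public Socket createSocket() {
        // Unconnected socket that translates the target address when connect() is called.
        return new Socket() {
            @Override
            public void connect(SocketAddress endpoint, int timeout) throws IOException {
                if (endpoint instanceof InetSocketAddress) {
                    InetSocketAddress isa = (InetSocketAddress) endpoint;
                    if (!isa.isUnresolved()) {
                        endpoint = new InetSocketAddress(translate(isa.getAddress()), isa.getPort());
                    }
                }
                super.connect(endpoint, timeout);
            }
        };
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return createSocket(InetAddress.getByName(host), port);
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return new Socket(translate(host), port);
    }

    @Override
    public Socket createSocket(String host, int port, InetAddress localHost, int localPort)
            throws IOException {
        return createSocket(InetAddress.getByName(host), port, localHost, localPort);
    }

    @Override
    public Socket createSocket(InetAddress host, int port, InetAddress localHost, int localPort)
            throws IOException {
        return new Socket(translate(host), port, localHost, localPort);
    }
}
```

To plug something like this into Hadoop via hadoop.rpc.socket.factory.class.default, the class would presumably also need to be public with a no-argument constructor that loads the mapping itself; that part is left out of the sketch.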
|