Quick Network Test for MapR Cluster.
- The script below picks one set of nodes as clients and uses the rpctest utility to open a single connection from each client to a node in a second set acting as servers, pushing data as fast as possible. In this example both nodes of the cluster are listed as servers and as clients, so every pair is also tested in the opposite direction. Each node in the cluster is therefore tested both sending and receiving data.
- Note: if the nodes have multiple network interfaces, you need to test each interface.
In the script below, add the hostnames/IPs of all cluster nodes to both arrays (half1 and half2).
_______________________________________________________________________________________
# Define array of server hosts (half of the cluster, or every host to test both directions)
half1=(10.10.70.109 10.10.70.110)
for node in "${half1[@]}"; do
    #ssh -n "$node" /root/iperf -s -i3 & # iperf alternative test, requires iperf binary pushed out to all nodes like rpctest
    ssh -n "$node" /opt/mapr/server/tools/rpctest -server &
done
echo Servers have been launched
sleep 10 # let the servers set up
# Define 2nd array of client hosts (other half of the cluster, or every host to test both directions)
half2=(10.10.70.109 10.10.70.110)
client_pids=()
for clientnode in "${half2[@]}"; do
    for servernode in "${half1[@]}"; do
        echo "Launching RPC test:$clientnode-$servernode"
        ssh -n "$clientnode" "rm -rf rpctest-$clientnode-to-$servernode.log"
        ssh -n "$clientnode" "/opt/mapr/server/tools/rpctest -client 50000 $servernode > rpctest-$clientnode-to-$servernode.log" & # Concurrent mode, if using sequential mode remove "&"
        client_pids+=($!) # remember each client job; remove this line too in sequential mode
    done
done
echo Clients have been launched
wait "${client_pids[@]}" # wait for every client, not only the last one launched; comment out for sequential mode
sleep 20
echo "The network bandwidth (mb/s is MB/sec) between nodes i.e clientnode-to-servernode (Baseline 1GbE=125MB/s, 10GbE=1250MB/s)"
tmp=${half2[@]}
clush -w ${tmp// /,} 'grep -i -e ^Rate -e error rpctest*log'
tmp=${half1[@]}
clush -w ${tmp// /,} pkill rpctest # Kill the servers
_______________________________________________________________________________________
For the script to run, you need passwordless SSH set up to all nodes and clush (ClusterShell) installed to retrieve the results. Please refer to the blogs below for details on setup.
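Before launching the test, it can help to confirm both prerequisites. The following is only a sketch and not part of the original script: the function names check_passwordless_ssh and check_clush are hypothetical, and it assumes the standard OpenSSH options BatchMode and ConnectTimeout are available on the node you run it from.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers (illustrative names, not from the original script).

# Print every host that does not accept a non-interactive SSH login.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless_ssh() {
    local host failed=0
    for host in "$@"; do
        if ! ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "SSH check failed: $host"
            failed=1
        fi
    done
    return "$failed"
}

# Report whether the clush command is available on the PATH.
check_clush() {
    command -v clush >/dev/null 2>&1 || { echo "clush not found"; return 1; }
}
```

One possible way to use it: check_passwordless_ssh 10.10.70.109 10.10.70.110 && check_clush && ./networktest.sh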
[root@node10 ~]# ./networktest.sh
Servers have been launched
Launching RPC test:10.10.70.109-10.10.70.109
Launching RPC test:10.10.70.109-10.10.70.110
Launching RPC test:10.10.70.110-10.10.70.109
Launching RPC test:10.10.70.110-10.10.70.110
Clients have been launched
The network bandwidth (mb/s is MB/sec) between nodes i.e clientnode-to-servernode (Baseline 1GbE=125MB/s, 10GbE=1250MB/s)
10.10.70.109: rpctest-10.10.70.109-to-10.10.70.109.log:Rate: 977.29 MB/s, time: 53.6491 sec, #rpcs 800031, rpcs/sec 14912.3
10.10.70.109: rpctest-10.10.70.109-to-10.10.70.110.log:Rate: 763.52 MB/s, time: 68.6703 sec, #rpcs 800031, rpcs/sec 11650.3
10.10.70.110: rpctest-10.10.70.110-to-10.10.70.109.log:Rate: 828.08 MB/s, time: 63.3162 sec, #rpcs 800031, rpcs/sec 12635.5
10.10.70.110: rpctest-10.10.70.110-to-10.10.70.110.log:Rate: 788.81 MB/s, time: 66.468 sec, #rpcs 800031, rpcs/sec 12036.3
The lines above give a good idea of the rate to expect between any two nodes in the cluster, and help weed out slow nodes that could become a bottleneck for the cluster.
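On a larger cluster the list of rates gets long, so filtering it can make slow pairs easier to spot. The sketch below is an assumption about how one might post-process the output, not part of the original script: baseline_mbs and flag_slow_links are hypothetical helper names, and the threshold is whatever you consider too slow for your network. The baseline uses the same conversion as the figures quoted above (link speed in Gbit/s x 1000 / 8 = MB/s).

```shell
#!/usr/bin/env bash
# Hypothetical post-processing helpers (illustrative names, not from the original script).

# Theoretical line rate in MB/s for a link speed given in Gbit/s,
# e.g. 1 -> 125 and 10 -> 1250.
baseline_mbs() {
    echo $(( $1 * 1000 / 8 ))
}

# Read the "Rate:" lines printed by clush/grep on stdin and print every
# client-to-server log whose rate is below the threshold (MB/s, first argument).
flag_slow_links() {
    awk -v min="$1" '
        /Rate:/ {
            for (i = 1; i < NF; i++) {
                if ($i ~ /Rate:$/) {
                    val = $(i + 1)            # the number right after "Rate:"
                    name = $i
                    sub(/:Rate:$/, "", name)  # keep only the log file name
                    if (val + 0 < min + 0)
                        print "SLOW (" val " MB/s): " name
                    next
                }
            }
        }'
}
```

For example, clush -w 10.10.70.109,10.10.70.110 'grep -i ^Rate rpctest*log' | flag_slow_links 800 would flag the 763.52 MB/s and 788.81 MB/s pairs from the run shown above.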