Thursday, January 5, 2012

Summary of the main new features in VMware vSphere 5.x

 

When VMware released vSphere 5 in July 2011, the talk of the town was the cost associated with the new memory-based licensing.

  
However, there are many new features that were not available in version 4:
  • Storage resource management is greatly improved with the introduction of Storage Distributed Resource Scheduler (Storage DRS) and Profile-Driven Storage.
  • Network I/O Control can help administrators prioritize VM traffic.
  • The mechanisms behind Storage vMotion were redesigned.
  • There's a new method for network packet receive processing that improves efficiency.
Here are some of the main differences in this major release, in more detail:

Improved storage resource management

In vSphere 5, storage resource management has greatly improved with the introduction of Storage Distributed Resource Scheduler (Storage DRS) and Profile-Driven Storage.

Storage DRS automatically load balances datastores and selects the best placement for VMs based on available disk space and I/O load.

These new capabilities address the limitations of DRS and Storage I/O Control in vSphere 4: DRS only considers CPU and memory usage when load balancing, and Storage I/O Control can prioritize and limit I/O on datastores, but it can't redistribute that I/O.

Storage DRS can also use Storage vMotion to load balance datastores, based on storage space utilization, I/O metrics and latency.
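
To make the idea of "placement based on free space and I/O load" a bit more concrete, here is a toy sketch in shell. It is not VMware's Storage DRS algorithm, just an illustration: it discards datastores whose latency is above a threshold and then picks the one with the most free space. The function name pick_datastore, the input format and all the numbers are made up for this example.

```shell
#!/bin/sh
# Toy placement heuristic, NOT VMware's Storage DRS algorithm.
# Input on stdin: one datastore per line as "name free_gb latency_ms".
# Hypothetical helper: prints the datastore with the most free space
# among those at or below the latency threshold given as $1.
pick_datastore() {
    max_latency_ms=$1
    awk -v max="$max_latency_ms" '
        $3 <= max && $2 > best_free { best_free = $2; best = $1 }
        END { if (best != "") print best }
    '
}

# Example with made-up numbers: ds03 has the most space but is too slow.
printf '%s\n' 'ds01 120 22' 'ds02 300 9' 'ds03 450 31' 'ds04 80 4' |
    pick_datastore 15
```

With the sample input, ds02 wins: it is the roomiest datastore that still meets the latency limit.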

Another feature is Storage Profiles, which lets you define classes of storage so that VMs are provisioned on, and migrated to, the proper storage type. Many infrastructures have multiple datastores with different performance characteristics. Storage Profiles makes sure a VM stays in the class of storage that the administrator specifies.


Fault Domain Manager

VMware High Availability (HA) has been completely overhauled; however, it's now quite complicated!

Previously, VMware HA relied on up to 5 primary nodes to maintain the cluster settings and node states. The other hosts were secondary nodes and sent their states to the primary nodes. Communication between the primary and secondary nodes involved heartbeats, which could detect outages.

In the new HA architecture, each host runs a special Fault Domain Manager (FDM) agent that's independent of the vpxa agent, which is used to communicate with vCenter Server.
It also uses a master/slave concept, with one host elected as the master and the other hosts acting as slaves. An election algorithm determines the master, and an election takes place in several situations: when HA is enabled, when a master fails or is shut down, or when a problem occurs with the management network.

Perhaps one of the best changes to HA is that it no longer relies only on the management network to monitor heartbeats. HA can now also use the storage subsystem for communication: heartbeat datastores are used as a communication channel only when the management network is lost. vCenter Server automatically chooses two datastores to use for this monitoring.
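
The value of the second heartbeat channel is easier to see as a small decision table. The shell sketch below is a simplified illustration with made-up function and state names, not VMware's implementation (the real FDM logic distinguishes more states): if the network heartbeat is missing but the datastore heartbeat is still there, the host is alive and only unreachable over the network, so it should not be treated as failed.

```shell
#!/bin/sh
# Simplified sketch of how a master might classify a slave host.
# Arguments are "yes" or "no": was the network heartbeat seen, and was
# the datastore heartbeat seen? Names and states are illustrative only.
classify_host() {
    network_hb=$1
    datastore_hb=$2
    if [ "$network_hb" = "yes" ]; then
        echo "live"
    elif [ "$datastore_hb" = "yes" ]; then
        # Still running, but unreachable over the management network.
        echo "isolated-or-partitioned"
    else
        echo "failed"
    fi
}

classify_host no yes
```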

ESX is a goner, and so is the Service Console!

Finally, after years of talk, the only hypervisor is now ESXi.
There are two major differences between ESX and ESXi: installation and command-line management. 

Manually installing ESXi is easier, and the wizard is simple (compared to ESX). 
For automated deployments, the new Auto Deploy feature can PXE (Preboot Execution Environment) boot hosts and load images for ESXi installations.

As for the CLI, there's no longer a full service console. Most management can be done remotely with vSphere CLI and vMA. The esxcli command has been expanded quite a bit in vSphere 5 to provide more manageability, and it's supposed to eventually replace the existing vicfg-* management commands.

Memory-based licenses

vSphere 5 licenses come with restrictions on the number of CPU sockets and the amount of memory that you can allocate to virtual machines (VMs), although VMware has lifted the limitation on the number of CPU cores that can be used.

Each license covers one CPU socket and a fixed amount of RAM that can be allocated to powered-on VMs, regardless of how much physical memory the host has:
  • Standard: 16GB of allocated RAM and 1 CPU socket
  • Enterprise: 32GB of allocated RAM and 1 CPU socket
  • Enterprise Plus: 48GB of allocated RAM and 1 CPU socket

For example, if a host has Enterprise Plus licenses for two physical processors, you can allocate 96GB of RAM to divide among VMs.
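
The arithmetic behind that example is simply the per-license allowance multiplied by the number of licenses (one per socket). A minimal shell sketch, where the function name vram_pool_gb is my own and the 48GB figure is the Enterprise Plus allowance quoted above:

```shell
#!/bin/sh
# Memory that can be allocated to powered-on VMs:
#   allowance per license (GB) x number of licenses (one per CPU socket)
# vram_pool_gb is a hypothetical helper name used only for this example.
vram_pool_gb() {
    per_license_gb=$1
    licenses=$2
    echo $(( per_license_gb * licenses ))
}

vram_pool_gb 48 2   # two Enterprise Plus licenses, prints 96
```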

It's going to be very expensive to scale up hosts by adding large amounts of memory. With these new licensing mechanisms, preventing VM sprawl and right-sizing the resources of individual VMs are much more important than before.


Upgraded vCenter Server and Web client

With this new release, vCenter Server can be deployed as a Linux virtual appliance. The appliance keeps all the regular vCenter Server features, except for Linked Mode.
The appliance comes packaged with a DB2 Express database. For external databases, it supports only Oracle or DB2.
VMware has also updated the Web client, which can be used for various administration tasks. The old Web interface was not very usable; I didn't know anyone who really used it.


Take a look at my article on Adding a VM to a VMware 5 

Tuesday, January 3, 2012

Switch a RedHat cluster from Broadcast mode to Multicast Mode


(this is for RHEL 5.2 to 5.6, not tested on 6.x)

Redhat Cluster bonded interfaces eth0/eth2

1. Edit /etc/cluster/cluster.conf and change the version number
[root@computer ~]# vi /etc/cluster/cluster.conf

Look at line 2 for the version number:

<?xml version="1.0"?>
<cluster config_version="11" name="ttrs">

Change it to version 12 so it looks like this:

<cluster config_version="12" name="ttrs">
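
If you'd rather not bump the number by hand, something like the following shell sketch can do it. The helper name bump_config_version is my own; it writes the modified file to stdout rather than editing in place, so nothing changes until you have reviewed the result. It assumes config_version appears once, on the cluster line, as in the example above.

```shell
#!/bin/sh
# Print the cluster.conf given as $1 with config_version incremented by 1.
# The original file is left untouched; redirect the output to a new file.
bump_config_version() {
    conf=$1
    current=$(sed -n 's/.*config_version="\([0-9][0-9]*\)".*/\1/p' "$conf" | head -n 1)
    [ -n "$current" ] || { echo "config_version not found in $conf" >&2; return 1; }
    next=$(( current + 1 ))
    sed "s/config_version=\"$current\"/config_version=\"$next\"/" "$conf"
}

# Usage (the output path is only an example):
#   bump_config_version /etc/cluster/cluster.conf > /tmp/cluster.conf.new
```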



 
2. Change cluster Join Delay

When the cluster is quorate and the fence domain is first created (by a fence daemon being started), any nodes not yet in the cluster will be fenced. By default there's a delay of 6 seconds in this case, to allow any nodes unnecessarily flagged for fencing to join the cluster and avoid being fenced.
This delay can be increased by setting post_join_delay in cluster.conf:
  <fence_daemon post_join_delay="30"/>

Change line 3 to look like this:

<fence_daemon post_fail_delay="0" post_join_delay="30"/>
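
A quick way to double-check the value after editing is to pull it back out of the file. This is a small sketch with a helper name of my own (get_post_join_delay); it only reads the file.

```shell
#!/bin/sh
# Print the post_join_delay value found in the cluster.conf given as $1.
# Prints nothing and returns 1 if the attribute is not set in the file.
get_post_join_delay() {
    value=$(sed -n 's/.*post_join_delay="\([0-9][0-9]*\)".*/\1/p' "$1" | head -n 1)
    [ -n "$value" ] || return 1
    echo "$value"
}

# Usage:
#   get_post_join_delay /etc/cluster/cluster.conf
```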


 
3. Continue Editing /etc/cluster/cluster.conf and change communication mode

Line 26 should look like this:

<cman broadcast="yes" expected_votes="1" two_node="1"/>

Remove broadcast="yes" from the cman line and add a multicast line like this:

<multicast addr="239.199.10.1" interface="bond1"/>

(This address is just an example; get a multicast IP that is unused on your network.)
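
When picking the address, it helps to stay inside 239.0.0.0/8, the administratively scoped block set aside for private multicast use. The shell sketch below (the function name is my own) only checks that an address falls in that block; it cannot tell you whether the address is already in use on your network.

```shell
#!/bin/sh
# Return 0 if $1 is a dotted-quad IPv4 address inside 239.0.0.0/8,
# the administratively scoped (private) multicast block.
is_private_multicast() {
    addr=$1
    # Reject anything that is not exactly four dot-separated numbers.
    echo "$addr" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || return 1
    old_ifs=$IFS
    IFS=.
    set -- $addr          # split the address into $1..$4 on the dots
    IFS=$old_ifs
    [ "$1" -eq 239 ] && [ "$2" -le 255 ] && [ "$3" -le 255 ] && [ "$4" -le 255 ]
}

is_private_multicast 239.199.10.1 && echo "239.199.10.1 is in 239.0.0.0/8"
```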

You need to add this line inside every clusternode in the cluster and inside the cman section, so the file looks like this:

<?xml version="1.0"?>
<cluster alias="clustertest" config_version="55" name="rhcluster">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="30"/>
     
        <clusternodes>
                <clusternode name="rhcluster01-priv" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="pdu_01a" option="off" port="13"/>
                                        <device name="pdu_02b" option="off" port="13"/>
                                        <device name="pdu_01a" option="on" port="13"/>
                                        <device name="pdu_02b" option="on" port="13"/>
                                </method>
                        </fence>
                        <multicast addr="239.199.10.1" interface="bond1"/>
                </clusternode>
                <clusternode name="rhcluster02-priv" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="pdu_13a" option="off" port="14"/>
                                        <device name="pdu_13b" option="off" port="14"/>
                                        <device name="pdu_13a" option="on" port="14"/>
                                        <device name="pdu_13b" option="on" port="14"/>
                                </method>
                        </fence>
                        <multicast addr="239.199.10.1" interface="bond1"/>
                </clusternode>
        </clusternodes>

        <cman expected_votes="1" two_node="1">
                <multicast addr="239.199.10.1" interface="bond1"/>
        </cman>
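
Before restarting anything, a quick sanity check of the edited file can catch the usual mistakes: a leftover broadcast="yes", a mistyped address on one node, or a clusternode that was missed. The following is a rough shell sketch, not an official tool; the function name is my own and it assumes one XML element per line, as in the listing above.

```shell
#!/bin/sh
# Sanity-check the cluster.conf given as $1 after the multicast edit:
#  - no broadcast="yes" left over
#  - every <multicast> element uses the same address
#  - one <multicast> per <clusternode>, plus one under <cman>
check_multicast_conf() {
    conf=$1
    if grep -q 'broadcast="yes"' "$conf"; then
        echo "broadcast=\"yes\" is still present" >&2
        return 1
    fi
    addrs=$(sed -n 's/.*<multicast addr="\([^"]*\)".*/\1/p' "$conf")
    total=$(printf '%s\n' "$addrs" | grep -c . || true)
    unique=$(printf '%s\n' "$addrs" | sort -u | grep -c . || true)
    nodes=$(grep -c '<clusternode ' "$conf" || true)
    [ "$unique" -eq 1 ] || { echo "expected 1 multicast address, found $unique" >&2; return 1; }
    [ "$total" -eq $(( nodes + 1 )) ] || { echo "expected $(( nodes + 1 )) multicast lines, found $total" >&2; return 1; }
    echo "ok: $total multicast entries, address $(printf '%s\n' "$addrs" | head -n 1)"
}

# Usage:
#   check_multicast_conf /etc/cluster/cluster.conf
```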



4. Stop the cluster on both nodes
[root@server1.net ~]# service rgmanager stop
[root@server1.net ~]# service fenced stop
[root@server1.net ~]# service cman stop

[root@server2.net ~]# service rgmanager stop
[root@server2.net ~]# service fenced stop
[root@server2.net ~]# service cman stop
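
Since the order matters, I find it handy to wrap the sequence in a small helper. The sketch below is only an illustration (the function name and the RUN variable are my own). It defaults to a dry run that just prints the commands, and it assumes that starting uses the reverse of the stop order, because rgmanager needs cman to be running first.

```shell
#!/bin/sh
# Print (dry run) or execute the cluster service commands in order.
# Stop order: rgmanager, fenced, cman. Start order is the reverse.
# RUN defaults to "echo" (dry run); set RUN="" to really run the commands.
RUN=${RUN-echo}

cluster_services() {
    action=$1
    case "$action" in
        stop)  order="rgmanager fenced cman" ;;
        start) order="cman fenced rgmanager" ;;
        *) echo "usage: cluster_services start|stop" >&2; return 1 ;;
    esac
    for svc in $order; do
        $RUN service "$svc" "$action" || return 1
    done
}

cluster_services stop
```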


 

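5. Start the cluster on both nodes, quickly (within 30 seconds)

Both nodes need to be up within the 30-second post_join_delay so that neither one gets fenced. Start the services in the reverse order of step 4, since rgmanager needs cman to be running:

[root@server1.net ~]# service cman start
[root@server1.net ~]# service fenced start
[root@server1.net ~]# service rgmanager start

[root@server2.net ~]# service cman start
[root@server2.net ~]# service fenced start
[root@server2.net ~]# service rgmanager start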


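6. Look at the log file /var/log/messages

You should see something like the output below, adjusted for the servers and application you are working on.
The important markers in these messages are the multicast socket lines, both node IPs listed under "New Configuration", "entering OPERATIONAL state", and a "got nodejoin message" line for each node: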


[root@server-01-01 cluster]# tail -100 /var/log/messages 
Feb 10 16:08:00 server-01-01 ccsd[21438]:  Built: Jul 28 2010 19:18:39
Feb 10 16:08:00 server-01-01 ccsd[21438]:  Copyright (C) Red Hat, Inc.  2004  All rights reserved.
Feb 10 16:08:00 server-01-01 ccsd[21438]: cluster.conf (cluster name = rhclustertest, version = 56) found.
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6'
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] AIS Executive Service: started and ready to provide service.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] token hold (386 ms) retransmits before loss (20 retrans)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1402
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] send threads (0 threads)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP token expired timeout (495 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP token problem counter (2000 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP threshold (10 problem count)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP mode set to none.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] heartbeat_failures_allowed (0)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] max_network_delay (50 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] The network interface [10.0.80.1] is now up.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Created or loaded sequence id 196.10.0.80.1 for this ring.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering GATHER state from 15.
Feb 10 16:08:01 server-01-01 openais[21448]: [CMAN ] CMAN 2.0.115 (built Jul 28 2010 19:18:41) started
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] Service initialized 'openais CMAN membership service 2.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais extended virtual synchrony service'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais cluster membership service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais availability management framework B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais checkpoint service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais event service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais distributed locking service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais message service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais configuration service'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais cluster closed process group service v1.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais cluster config database access v1.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SYNC ] Not using a virtual synchrony filter.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Creating commit token because I am the rep.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Saving state aru 0 high seq received 0
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Storing new sequence id for ring c8
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering COMMIT state.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering RECOVERY state.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] position [0] member 10.0.80.1:
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] previous ring seq 196 rep 10.0.80.1
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] aru 0 high delivered 0 received flag 1
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Did not need to originate any messages in recovery.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Sending initial ORF token
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] CLM CONFIGURATION CHANGE
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] New Configuration:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] Members Left:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] Members Joined:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] CLM CONFIGURATION CHANGE
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] New Configuration:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ]    r(0) ip(10.0.80.1) 
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] Members Left:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] Members Joined:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ]    r(0) ip(10.0.80.1) 
Feb 10 16:08:01 server-01-01 openais[21448]: [SYNC ] This node is within the primary component and will provide service.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering OPERATIONAL state.
Feb 10 16:08:01 server-01-01 openais[21448]: [CMAN ] quorum regained, resuming activity
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM  ] got nodejoin message 10.0.80.1
Feb 10 16:08:02 server-01-01 ccsd[21438]: Initial status:: Quorate
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering GATHER state from 11.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Creating commit token because I am the rep.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Saving state aru c high seq received c
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Storing new sequence id for ring cc
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering COMMIT state.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering RECOVERY state.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] position [0] member 10.0.80.1:
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] previous ring seq 200 rep 10.0.80.1
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] aru c high delivered c received flag 1
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] position [1] member 10.0.80.2:
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] previous ring seq 196 rep 10.0.80.2
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] aru a high delivered a received flag 1
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Did not need to originate any messages in recovery.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Sending initial ORF token
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] CLM CONFIGURATION CHANGE
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] New Configuration:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ]    r(0) ip(10.0.80.1) 
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] Members Left:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] Members Joined:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] CLM CONFIGURATION CHANGE
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] New Configuration:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ]    r(0) ip(10.0.80.1) 
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ]    r(0) ip(10.0.80.2) 
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] Members Left:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] Members Joined:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ]    r(0) ip(10.0.80.2) 
Feb 10 16:08:02 server-01-01 openais[21448]: [SYNC ] This node is within the primary component and will provide service.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering OPERATIONAL state.
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] got nodejoin message 10.0.80.1
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM  ] got nodejoin message 10.0.80.2
Feb 10 16:08:10 server-01-01 kernel: dlm: Using TCP for communications
Feb 10 16:08:11 server-01-01 kernel: dlm: connecting to 2
Feb 10 16:08:11 server-01-01 kernel: dlm: got connection from 2
Feb 10 16:08:11 server-01-01 clurgmgrd[21512]: <notice> Resource Group Manager Starting
Feb 10 16:08:17 server-01-01 clurgmgrd[21512]: <notice> Starting stopped service service:http_service
Feb 10 16:08:18 server-01-01 clurgmgrd[21512]: <notice> Service service:http_service started
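
Rather than scrolling through all of that, you can grep for the lines that matter. Here is a small shell sketch (the function name and output format are my own) that counts how many distinct node IPs sent a nodejoin message and checks whether TOTEM reached the OPERATIONAL state. It assumes log lines worded like the ones above.

```shell
#!/bin/sh
# Summarise the cluster markers in the syslog file given as $1:
# number of distinct IPs in "got nodejoin message" lines, and whether
# an "entering OPERATIONAL state" line is present.
check_cluster_log() {
    log=$1
    joined=$(sed -n 's/.*got nodejoin message \([0-9.]*\).*/\1/p' "$log" |
        sort -u | grep -c . || true)
    if grep -q 'entering OPERATIONAL state' "$log"; then
        state="operational"
    else
        state="not-operational"
    fi
    echo "nodes_joined=$joined totem=$state"
}

# Usage:
#   check_cluster_log /var/log/messages
```

On a two-node cluster you would want to see nodes_joined=2 and totem=operational.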

 
7. Verify the cluster and app are running:


[root@server-01 ~]# clustat
Cluster Status for trrs @ Fri Feb 10 21:12:01 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 server01-priv                                            1 Online, Local, rgmanager
 server02-priv                                            2 Online, rgmanager

 Service Name                                                     Owner (Last)                                                     State        
 ------- ----                                                     ----- ------                                                     -----        
 service:TRRS_VIP                                                 server02-priv                                         started      


[root@server-02 ~]# clustat
Cluster Status for trrs @ Fri Feb 10 21:32:01 2012
Member Status: Quorate

 Member Name                                                     ID   Status
 ------ ----                                                     ---- ------
 Server01-priv                                            1 Online, rgmanager
 Server02-priv                                            2 Online, Local, rgmanager

 Service Name                                                     Owner (Last)                                                     State        
 ------- ----                                                     ----- ------                                                     -----        
 service:TRRS_VIP                                                 server-02-priv                                         started      
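
If you want to script this check (for monitoring, say), the clustat output is easy enough to parse. The sketch below is an illustration only: the function name is my own, and it assumes the column layout shown above. It succeeds only when the cluster is quorate and every member line reports Online.

```shell
#!/bin/sh
# Read clustat output on stdin. Exit status 0 only if "Member Status"
# is Quorate and every line in the member table has a status starting
# with "Online".
clustat_healthy() {
    awk '
        /^Member Status:/     { quorate = ($3 == "Quorate") }
        /^ *Member Name/      { in_members = 1; next }
        /^ *Service Name/     { in_members = 0 }
        in_members && /^ *-+/ { next }
        in_members && NF >= 3 { members++; if ($3 !~ /^Online/) offline++ }
        END { exit !(quorate && members > 0 && offline == 0) }
    '
}

# Usage on a cluster node:
#   clustat | clustat_healthy && echo "cluster looks healthy"
```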




8. Done!