Thursday, 28 February 2013

Troubleshooting VMware HA - Cannot complete the HA Configuration - See more at: http://www.vmwarearena.com/2012/07/troubleshooting-vmware-ha.html#sthash.SL5yyVrQ.dpuf


What are the basic troubleshooting steps when the HA agent fails to install on hosts in an HA cluster?

If you are facing any issues related to hosts in the HA cluster, I recommend following the 10 basic troubleshooting steps below. Most of the time, they will resolve the issue.


The error message will be similar to "Cannot complete the HA configuration".






1. Check your environment for any temporary network problems.

2. Check that DNS is configured properly.

3. Check the VMware HA agent status on the ESX host using the command below:

  service vmware-aam status

4. Check that the ESX networks are properly configured and named exactly as on the other hosts in the cluster. Otherwise, you will get errors while installing or reconfiguring the HA agent.


5. Check that the HA-related ports are open in the firewall to allow communication:

     Incoming port: TCP/UDP 8042-8045
     Outgoing port: TCP/UDP 2050-2250

6. Try to restart, or stop and start, the VMware HA agent on the affected host using the commands below.
In addition, you can also try restarting vpxa and the management agent on the host.

service vmware-aam restart

service vmware-aam stop

service vmware-aam start

7. Right-click the affected host and click "Reconfigure for VMware HA" to re-install the HA agent on that particular host.



8. Remove the affected host from the cluster. Removing an ESX host from the cluster is not allowed until that host is put into maintenance mode.

9. An alternative to step 8 is to go to the cluster settings, uncheck VMware HA to turn off HA for that cluster, and then re-enable VMware HA so that the agent is installed from scratch.

10. For further troubleshooting, review the HA logs under the /var/log/vmware/aam directory.

For any VMware troubleshooting, visit http://www.vmwarearena.com.

McAfee ePO Manual Agent Installation


25 May 2010

From time to time, across several infrastructures, people report that they cannot push anti-virus agents to workstations from McAfee's ePolicy Orchestrator. This can be time-consuming and frustrating, and as you'll know, ePO isn't the most user-friendly or effective application. You can, however, manually install the McAfee Agent and then force it to comply with your anti-virus policies or client tasks. To do this, perform the following actions:
1. On the affected machine, browse to the following location: \\EPOSERVERNAME\C$\Program Files\McAfee\ePolicy Orchestrator\DB\Software\Current\EPOAGENT3000\Install\0409\
2. Double-click the FramePkg.exe file and let the agent install. Please note that you will need administrative rights on the workstation to do this.
3. Open a command prompt window and type the following: cd "C:\Program Files\McAfee\Common Framework"
4. Once in that directory, type the following at the command prompt and press Return: CmdAgent.exe /s
5. You will now be presented with the McAfee Agent console. Click "Collect and Send Props". This prompts the agent to advertise itself to the ePO server and enforce any policies or client tasks that may be set, which in my case is usually the installation of the anti-virus product itself.
I hope this eases your ePO frustrations; it certainly did mine.

Active Directory Replication issue


This post addresses the Active Directory replication concerns that we are having.

Within a single domain, we cannot have Active Directory replication working only within a site and not between sites.

A domain controller holds replicas of the domain, schema, configuration and application partitions. The domain partition is replicated between all the domain controllers of a domain.
However, the schema and configuration partitions are forest-wide and are replicated between all the domain controllers within a forest.

If we disable inbound or outbound replication on a domain controller, that domain controller will no longer replicate in that direction with any other domain controller.
We can take this step in situations where some Active Directory objects have been deleted accidentally and you do not want those changes to be replicated throughout the domain.
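
The post does not name a tool for disabling replication. One common way, assumed here, is the repadmin /options command with the DISABLE_INBOUND_REPL and DISABLE_OUTBOUND_REPL flags. The minimal Python sketch below only builds the command lines for review; the helper name and the domain controller name DC01 are hypothetical.

```python
def repl_option_commands(dc_name, disable=True):
    """Build the repadmin command lines that disable (or re-enable)
    inbound and outbound replication on one domain controller.

    A leading '+' sets the option and a leading '-' clears it.
    """
    sign = "+" if disable else "-"
    return [
        f"repadmin /options {dc_name} {sign}DISABLE_INBOUND_REPL",
        f"repadmin /options {dc_name} {sign}DISABLE_OUTBOUND_REPL",
    ]


# Usage (DC01 is a hypothetical domain controller name):
#   for command in repl_option_commands("DC01"):
#       print(command)
```

Calling the helper again with disable=False produces the commands that clear both options once the accidental deletion has been dealt with.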

To reiterate, we cannot have Active Directory replication working just within a site and not between sites.

Please let me know if you have any other concerns, I will be glad to assist you.

Server which requires persistent route change


Change steps:

Step 1: Log in to the servers listed below
Step 2: Cmd prompt> route print. Identify the interface number for the primary VLAN
Step 3: For Datacenter1 (Sydney) servers Cmd prompt> route -p add 10.10.6.0 mask 255.255.254.0 10.10.5.4 if <interface number from step 2>
              For Datacenter2 (San Jose) servers Cmd prompt> route -p add 10.20.6.0 mask 255.255.254.0 10.20.5.4 if <interface number from step 2>

No restart of the server/services required.

Roll back:
Step 1: Log in to the servers listed below
Step 2: For Sydney servers Cmd prompt> route delete 10.10.6.0
              For San Jose servers Cmd prompt> route delete 10.20.6.0
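
When the change has to be applied to many servers, it can help to generate the command lines and validate the addresses first. The following is a minimal Python sketch, not part of the original change plan: the function names are made up for illustration, and the interface number 12 in the usage comment is a placeholder for the value found with route print.

```python
import ipaddress


def route_add_command(destination, mask, gateway, interface):
    """Build the persistent 'route add' command used in step 3.

    Raises ValueError if destination/mask is not a valid network
    (for example, host bits set) or if the gateway is not an IP address.
    """
    ipaddress.ip_network(f"{destination}/{mask}")
    ipaddress.ip_address(gateway)
    return f"route -p add {destination} mask {mask} {gateway} if {interface}"


def route_delete_command(destination):
    """Build the roll-back command for the same destination."""
    ipaddress.ip_address(destination)
    return f"route delete {destination}"


# Usage (12 is a placeholder interface number):
#   print(route_add_command("10.10.6.0", "255.255.254.0", "10.10.5.4", 12))
#   print(route_delete_command("10.10.6.0"))
```

The validation catches a destination that does not match its mask before the command reaches a server.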

RDP issue





The following is a summary of the troubleshooting and the steps taken to resolve the RDP issue.

- Tried to RDP to localhost using the admin switch.
Error: Generic error.

- Ran tasklist and netstat -ano to check whether port 3389 is listening, since Terminal Services listens on port 3389.

a. Start Registry Editor.

b. Locate the following subkey in the registry, and then click Parameters:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

c. Double-click the MaxUserPort value, type 50000 (decimal) in the Value data box,
and then click OK.

d. Locate the TcpTimedWaitDelay value under the following key in the registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

e. Double-click the TcpTimedWaitDelay value, type 30 in the Value data box, and
then click OK.

f. Close Registry Editor and open a command prompt.

g. Run these commands:

netsh int tcp set global chimney=disabled
netsh int tcp set global rss=disabled
netsh int ip set global taskoffload=disabled
netsh int tcp set global autotuninglevel=disabled
netsh int tcp set global congestionprovider=none
netsh int tcp set global ecncapability=disabled
netsh int tcp set global timestamps=disabled

h. Reboot the server.
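
The netstat check from the summary above can be automated by saving the output of netstat -ano and scanning it for a listener on port 3389. This is a minimal Python sketch with a made-up function name. It assumes the English-language column layout of Windows netstat (Proto, Local Address, Foreign Address, State, PID).

```python
def listening_pids(netstat_output, port=3389):
    """Return the set of PIDs that have a TCP listener on the given port.

    Expects the text produced by 'netstat -ano' on Windows.
    """
    pids = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have five columns: Proto, Local, Foreign, State, PID.
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        # The local address ends in ':<port>' for both IPv4 and IPv6.
        if fields[1].rsplit(":", 1)[-1] == str(port):
            pids.add(int(fields[4]))
    return pids


# Usage: save the output first, for example
#   netstat -ano > netstat.txt
# and then:
#   print(listening_pids(open("netstat.txt").read()))
```

An empty result means nothing is listening on the port. A returned PID can be matched against the tasklist output to confirm which process owns the listener.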

Unable to RDP to "XXXXX" server from internal client machines and getting generic error


1) The following is a summary of the troubleshooting and the steps taken to resolve the secure channel issue.

- Reset the secure channel using nltest and rebooted the server, after which the issue did not recur.

- We also noticed that the Netlogon error reported that the RPC server was unavailable, because of which it was not able to establish a secure session. Increased the number of DNS servers configured on the network adapter.

- Also found that the network card drivers were outdated (v4.x); the latest were around v7.x. Have advised updating the network card drivers.

- We forced licensing server discovery over IP so that it does not depend on DNS in the event of a DNS failure.

- For the access denied problem, we confirmed that it was an issue by design, after not limiting the user to a single session.

We found nothing in the event viewer that points to an issue with a particular component. But since we know that the issue is with the terminal server only, we focused our troubleshooting on the terminal server itself and looked for the most probable causes of the problem.




Ctrl+Alt+Del key issue in Integrated Management Module (IMM) on IBM Server System


It’s not an IMM issue.

It’s caused by the virtual keyboard left over from the previous login/RDP session, so restarting the IMM won’t fix the issue. When we send Ctrl+Alt+Del from the IMM, there is no response from the server. It’s like pressing and holding a key on a physical keyboard: we can’t use Ctrl+Alt+Del, or any other keys.

Here’s what I did:

1. From the IMM, I opened the “On-Screen Keyboard” from the Ease of Access option.
2. Once the virtual keyboard appears, press all the keys on your laptop/desktop keyboard one by one until you have pressed every key. This is to calibrate the virtual keyboard and to release whichever key is causing the problem.
3. Close the virtual keyboard.
4. Go to Macros > Click Ctrl+Alt+Del
5. Done

I applied these steps to all other servers with the same issue and it worked.
I believe this issue can also be resolved by restarting the server, but it is not advisable to do so.