job.answiz.com

Why do my Hyper-V VMs randomly lose connectivity?

I have a strange intermittent connectivity problem happening about once every two weeks.

First, my configuration: I am running a Hyper-V failover cluster with two physical hosts (node01 and node02). Both hosts are running Windows Server 2008 R2 Hyper-V Server (the free one) with SP1. On those hosts I am running two VMs, each running Windows Server 2008 R2 Web edition with SP1. My storage server is Windows Storage Server 2008, connected via iSCSI. Both hosts and the storage server are running the latest network drivers downloaded directly from Intel's website.

Here's the problem: 99.99% of the time, everything works perfectly. About once every two to three weeks, both VMs simultaneously lose network connectivity, both incoming and outgoing. When this happens:

  1. I cannot RDP into either VM.
  2. I can RDP into either host.
  3. I can connect to either VM from the Failover Cluster Manager by right-clicking on the node and selecting 'Connect to Virtual Machine'.
  4. Once I connect to the VM as described in #3 above, I cannot get to any websites or machines on the LAN. Disabling and re-enabling the virtual network connection inside the VM doesn't fix the problem.
  5. If I move the VM to a different node, that fixes the problem (for the next two weeks).
  6. If I reboot the host and move the VM back onto it, that fixes the problem (for the next two weeks).
  7. When this happens, the failover cluster does NOT automatically fail over the VM.
  8. There are no unusual event log entries on any of the hosts or VMs.
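The symptom pattern above (hosts reachable over RDP, VMs not) is distinctive enough to check for automatically, which matters here because the cluster does not fail the VM over on its own. Below is a minimal sketch of such a check in Python, not something from the original setup. It assumes RDP listens on the default TCP port 3389; the VM names `web01` and `web02` are hypothetical placeholders, and `tcp_reachable` and `classify` are illustrative helpers.

```python
import socket
from typing import Dict


def tcp_reachable(host: str, port: int = 3389, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify(host_up: Dict[str, bool], vm_up: Dict[str, bool]) -> str:
    """Map reachability results onto the symptoms described in the question."""
    if not all(host_up.values()):
        # A host is unreachable too, so this is not the pattern from the question.
        return "host-or-lan-problem"
    if vm_up and not any(vm_up.values()):
        # Hosts answer but no VM does: symptoms 1 and 2 from the list.
        return "vm-network-outage"
    if all(vm_up.values()):
        return "ok"
    return "partial-vm-outage"


if __name__ == "__main__":
    hosts = ["node01", "node02"]   # host names from the question
    vms = ["web01", "web02"]       # hypothetical VM names
    result = classify(
        {h: tcp_reachable(h) for h in hosts},
        {v: tcp_reachable(v) for v in vms},
    )
    print(result)
```

Run on a schedule from a machine on the same LAN, a `vm-network-outage` result could trigger an alert or a manual move of the VM to the other node, which is the workaround described in item 5.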

This has happened about 5 times with the same symptoms as described above. I suspect a network driver or network hardware issue, but since I'm already running the latest drivers I'm not sure what to do about it.

This is a real head-scratcher ... any ideas?

Update

I found a very similar case here: Virtual Machine loses network connectivity on Hyper V Cluster

Update 7/29/2011

After installing hotfixes and updating network drivers, I am still experiencing the same problem. In response to the comment asking for hardware details, the server is an Intel SR1670HV, which is a 1U chassis containing two independent S5500HV motherboards. Communication is via the motherboards' integrated NICs, which are Intel 82574L. The network driver is version 16.2.49.0.

We had a weird issue where virtual machines randomly lost network connectivity on a Windows Server 2012 R2 Hyper-V host. The virtual servers were up and we could connect to the console, but we couldn't ping them and they couldn't ping out. It's rare to find an article that outlines the EXACT issue you are experiencing. Luckily for us, we found the following TechNet article, which helped us resolve this issue!

According to the TechNet article below, when Hyper-V runs on Windows Server 2012 R2 together with Broadcom NetXtreme 1-gigabit network adapters, VMs might randomly lose network connectivity. This was the exact scenario in our case: we already had the latest Broadcom drivers installed but still experienced the issue.


Consider the following scenario:

You’ve deployed your Virtual Machines on Hyper-V hosts that are running Windows Server 2012 or 2012 R2. Everything appears to be running swimmingly. However, you soon start to experience the following symptoms:

  • Virtual machines randomly lose network connectivity. The network adapter appears to be working in the virtual machine, but you cannot ping or access network resources from the virtual machine. Restarting the virtual machine does not resolve the issue.
  • You cannot ping or connect to a virtual machine from a remote computer; you can only connect via the Hyper-V console.

When this occurs, the only fix is to restart the Hyper-V host; restarting the VM does not help. Even then, the restart does not address the underlying reason the issue occurs in the first place.


We used to have a problem like this where I'm at. I don't remember the exact details, but the final solution had to do with a conflicting MAC address assigned dynamically to a virtual network adapter. Pinning those down so they weren't dynamic helped a lot. You normally don't want to do that because it can make it harder to move a virtual machine to a different host, but it helped us in this instance.
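If you want to check for this kind of conflict yourself, here is a rough sketch in Python. It assumes you have already collected a list of (VM name, MAC address) pairs from your hosts by whatever means you prefer. The VM names and addresses in the usage example are made up, and `normalize_mac` and `find_conflicts` are illustrative helpers, not part of any Hyper-V tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def normalize_mac(mac: str) -> str:
    """Normalize a MAC address to the form AA-BB-CC-DD-EE-FF."""
    digits = "".join(c for c in mac if c.isalnum()).upper()
    if len(digits) != 12 or any(c not in "0123456789ABCDEF" for c in digits):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return "-".join(digits[i:i + 2] for i in range(0, 12, 2))


def find_conflicts(adapters: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return each MAC used by more than one VM, with the VMs that share it."""
    vms_by_mac = defaultdict(set)
    for vm_name, mac in adapters:
        vms_by_mac[normalize_mac(mac)].add(vm_name)
    return {mac: sorted(vms) for mac, vms in vms_by_mac.items() if len(vms) > 1}


if __name__ == "__main__":
    # Hypothetical inventory; separators and letter case differ on purpose.
    inventory = [
        ("web01", "00-15-5D-01-02-03"),
        ("web02", "00:15:5d:01:02:03"),
        ("web03", "00155D010204"),
    ]
    print(find_conflicts(inventory))
```

Normalizing first matters because different tools print the same address with different separators and letter case, so a plain string comparison can miss a real conflict.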

The other part is that the physical NICs were made by Broadcom, and we also had a configuration error there: a previous admin had tried, incorrectly, to use the Broadcom utility to trunk the two NICs together on the host for improved bandwidth/throughput. We removed that setup and configured one of the NICs so it had no IP at all on the host machine but could still be used for passthrough to virtual guests. Then we set each virtual machine to use only one NIC or the other, balancing the load based on historical traffic. Of course, that means no failover if an adapter or connection goes down, and we haven't followed through well to see whether traffic has remained balanced over time, but it's been rock-solid stable since then.
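For what it's worth, splitting VMs across two NICs by historical traffic can be sketched as a simple greedy assignment: take the busiest VM first and put each one on whichever NIC currently carries less. This is only an illustration in Python of one way to do it, not what we actually ran. The VM names, NIC labels, and traffic figures below are invented, and `assign_vms_to_nics` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def assign_vms_to_nics(
    traffic: Dict[str, float], nics: List[str]
) -> Tuple[Dict[str, str], Dict[str, float]]:
    """Greedily assign each VM to the least-loaded NIC, busiest VM first.

    traffic maps a VM name to its historical average throughput (any
    consistent unit). Returns (vm -> nic, nic -> total assigned traffic).
    """
    loads = {nic: 0.0 for nic in nics}
    assignment = {}
    by_traffic = sorted(traffic.items(), key=lambda kv: kv[1], reverse=True)
    for vm_name, load in by_traffic:
        target = min(nics, key=lambda nic: loads[nic])  # least-loaded NIC so far
        assignment[vm_name] = target
        loads[target] += load
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical averages in Mbit/s.
    history = {"web01": 40, "web02": 25, "sql01": 30, "file01": 10}
    assignment, loads = assign_vms_to_nics(history, ["nic1", "nic2"])
    print(assignment)
    print(loads)
```

Re-running something like this against fresh traffic numbers now and then would show whether the split has drifted, which is the follow-up we admittedly never did.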
