VMware Migration of VMs and RDMs From VNX to XtremIO – Part 1

In today’s digital age, with virtualization leading the way, you will often find yourself dealing with VMs and RDMs. An RDM (Raw Device Mapping) is a way to present a physical LUN directly to a VM, as if the VM were accessing direct-attached storage. What often proves to be a daunting task is migrating the RDMs attached to those VMs. I’m going to discuss how to identify which VMs have RDMs, which storage array those RDMs belong to, and how to map each one back to the physical LUN on that array.

  • The first thing you will want to do is to scan vCenter for VMs with RDMs
    • You will need read access to vCenter and VMware PowerCLI installed on your desktop
    • Connect to vCenter through PowerCLI
      • Connect-VIServer yourvcenterhostname.domain.local
    • Run a Get-VM script selecting the VM hostname, raw device, NAA ID, and hard disk number
      • Get-VM | Get-HardDisk -DiskType "RawPhysical","RawVirtual" | Select Parent,Name,DiskType,ScsiCanonicalName,DeviceName | Format-Table | Out-File -FilePath "out-file-location-on-your-terminal"
  • Once the script completes, you should have a text file that can be imported into Excel as delimited or fixed-width text data
  • Use the data filter and sort by the NAA ID (ScsiCanonicalName)
  • Compare this list against the source array’s collects or logs to identify which devices pertain to your migration
    • In my example, I am migrating from a VNX to an XtremIO, so I will compare the SCSI Canonical Name to the LUN UID/WWN from the SP collect (see the sketch after this list)
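If you would rather filter the export down to your source array before opening it in Excel, a minimal PowerCLI sketch along these lines should work (the NAA prefix and output path are assumptions for illustration; VNX/CLARiiON LUN WWNs typically begin with 6006016):

    # Sketch: export RDM details to CSV, keeping only devices whose NAA ID
    # carries the VNX/CLARiiON prefix. Adjust the prefix and path as needed.
    Connect-VIServer yourvcenterhostname.domain.local

    Get-VM | Get-HardDisk -DiskType "RawPhysical","RawVirtual" |
        Select-Object Parent, Name, DiskType, ScsiCanonicalName, DeviceName |
        Where-Object { $_.ScsiCanonicalName -like "naa.6006016*" } |
        Export-Csv -Path "C:\temp\rdm-list.csv" -NoTypeInformation

The CSV opens directly in Excel, which saves the delimited-text import step.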

Example:

[Image: RDMs-list.jpg – sample RDM list output]

Once you have identified the VMs in the list that pertain to your migration, you are ready to begin planning next steps. In my scenario, I am migrating VMs residing on a VNX to an XtremIO. There is a mixture of virtual and physical RDMs, which means that, along with Storage vMotion, I will be using SANcopy to create incremental sessions and push the physical RDMs to the XtremIO.

Other tools such as Open Migrator and PPME (if PowerPath is present) can be used as an alternative host-based migration approach, but each tool has its caveats and may still require a reboot to cut over. I will discuss SANcopy from VNX to XtremIO in a future post.

Mounting an NFS share from Exagrid to a VMware host

After endlessly searching through Google (by “endlessly” I mean various searches, scanning only the first page or two of results and spending roughly 2–3 minutes on each page), we bit the bullet and called Exagrid support to assist us with mounting an NFS share in VMware. This is what I learned from that experience.

  • By default, Exagrid uses NFSv4 when using the directory path serverIPaddress:/NFSshare
  • To mount this in VMware, you need to force it to use NFSv3
  • When trying to mount the share using simply /Backup, we received the following error:
    • NFS mount ip-address:mountpoint failed: The mount request was denied by the NFS server. Check that the export exists and that the client is permitted to mount it.

To mount the NFS share in VMware, use the following path prefix in front of your share:

  • /home1/shares/
    • Example: my NFS share in Exagrid is Backup, so in VMware my path will be: /home1/shares/Backup

You should see that VMware is able to successfully add the NFS share as a datastore.
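For reference, the same mount can also be scripted; here is a minimal PowerCLI sketch, assuming hypothetical host and datastore names (New-Datastore mounts NFS shares as v3 by default, which is what we want here):

    # Sketch: mount the Exagrid share as an NFS datastore via PowerCLI.
    # The ESXi host, datastore name, and IP below are placeholders.
    $esx = Get-VMHost "esx01.domain.local"
    New-Datastore -Nfs -VMHost $esx -Name "Exagrid-Backup" -NfsHost "192.168.1.50" -Path "/home1/shares/Backup"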

[Image: mapping-to-exagrid – the share mapped as a datastore in VMware]

Celerra NAS pool maxed out – manually deleting a filesystem

I recently ran into an issue that I will share with you, since I was unable to find a solution online and ended up resolving it myself.

Issue: NAS pool maxed out and replications halted

When trying to issue a nas_fs -delete for a certain file system on a destination system, I received the following error: “file system has backups in use.” You get this error because the file system either has a checkpoint schedule created or has replication checkpoints in use. In my case, the replication checkpoints were preventing the deletion. Issue the following command to see the checkpoints associated with the file system:

fs_ckpt id=XX -list -all (where XX is the file system ID). Once you’ve identified the checkpoints that need to be deleted, issue the following command to delete them:

nas_fs -delete id=XX -o umount=yes -ALLOW_REP_INT_CKPT_OP (where XX is the checkpoint ID). Now you should be able to go back and delete the file system with the “nas_fs -delete” command. However, if you go back to the source system and try to delete the replication, you will get an error that the destination side of the replication could not be found:

[nasadmin@NS480 ~]$ nas_task -i 648886
Task Id = 648886
Celerra Network Server = NS480
Task State = Failed
Movers =
Description = Delete Replication VNX5700_FS2 [ id=295_APM00110000_520_APM00130000].
Originator = nasadmin@cli.localhost
Start Time = Wed Jun 11 13:26:17 EDT 2014
End Time = Wed Jun 11 13:26:19 EDT 2014
Schedule = n/a
Response Statuses = Error 13160415862: The destination side of the replication session could not be found.

When deleting the replication session, use the “-mode source” flag, and the replication session should then delete successfully.
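For reference, the full cleanup sequence looked roughly like this (the file system and checkpoint IDs are hypothetical placeholders; pull your own from the fs_ckpt output, and the session name comes from the task output above):

    fs_ckpt id=27 -list -all                                      # list checkpoints on the file system
    nas_fs -delete id=310 -o umount=yes -ALLOW_REP_INT_CKPT_OP    # delete each replication checkpoint
    nas_fs -delete id=27                                          # the file system itself will now delete
    nas_replicate -delete VNX5700_FS2 -mode source                # remove the session from the source side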

Recent Life Changes and EMC

It’s been a while since I’ve updated this blog, but I plan to update it on a regular basis now. Some major life changes have occurred since my last post. I have left my job as a systems administrator at the community bank and have joined the “storage guys” over at EMC! Currently, I am in Franklin, MA for EMC Global Services Associate Program (GSAP) training for an Associate Implementation Specialist (AIS) role. In just the first two days alone, I’ve realized how big EMC really is as a company, with new trainees flying in from all around the world. The company also demonstrates its commitment to employees with the in-depth Myers-Briggs Type Indicator personality training we received on the second day. They truly want you to understand yourself as a person and how to interact with others in your day-to-day dealings with customers and fellow employees. So far, I’m thoroughly impressed with the company, its dedication to its employees, and the passion for technology and storage!

The Ability to Think on Your Feet

Do you remember the scene from Apollo 13 where the NASA technicians were given the task of fixing the CO2 problem for the astronauts aboard Apollo 13 using only the equipment they had in the spacecraft (a box, air filters, plastic bags, and duct tape)? Here’s the scene in case you haven’t seen it:

[Embedded video: the Apollo 13 CO2 scrubber scene]

The ability to think quickly on your feet is crucial, especially when you are under pressure and stress to deliver a fix within a certain time. Let me fill you in on how my Monday went.

I came in this morning and was immediately told that a Domain Controller/File Server at our office in Northern Virginia was offline. Sounds like a simple enough fix, right? Get someone to power it on, or remote into the management port (iLO2) and send the power-on signal… tried that, but it didn’t work. Instead, a red health-indicator LED flashed whenever the power button was pressed. Not cool! To top it off, this server also manages DHCP, and the leases just coincidentally happened to expire for more than half of the users at this location… great…

After bouncing a few ideas off my teammates, I came up with the idea of enabling DHCP on the switch stack at that location. Success! Users were now able to obtain an IP address and access the company network/internet. As for the file server issue, once I arrived on-site with a replacement server, I noticed that the hard drives in the current server were physically bigger (not in GB, but in actual width and height).

How could that be, when the replacement server I brought with me was the EXACT same model as the server that was down? That question may never be answered. Before I gave up all hope of repairing the issue the same day, a light bulb turned on and the solution presented itself. I took the entire HDD enclosure bay out of the bad server and placed it in the new server. I prayed for driver compatibility and for the on-board RAID management to work when powering on the server, and it did. After successfully logging into the server, verifying access to all drives and files, and ensuring DHCP was working on the server, I stood down the temporary fix I had implemented by enabling DHCP on the switch. I sent a quick email to the office asking users to reboot their PCs so access to their files could be restored and so they would retrieve a proper IP address issued through DHCP on the server. Once that was done, I verified that everything was back to normal.

The ability to think quickly on my feet, along with common sense, saved me. The resolution took about 3 hours, and users were only partially impacted by not having access to their files. Victories like these give me a sense of accomplishment and further fuel my passion for the I.T. field!

Creating an Automated Server Disk Space Report

Difficulty: Intermediate

This guide will teach you how to take a PowerShell script, turn it into a scheduled task, and have the script’s output emailed to you. The script I’m going to demo here is a simple disk space check that runs against a list of servers you define. For every server in the list with a drive that has less than 20% free space, the drive and the amount of space remaining will be shown.

Before we begin working with the script, we must first prepare our server for PowerShell. Windows Server 2008 may already have PowerShell preinstalled; if it does, you will find it under Start > Accessories > Windows PowerShell. If it is not there, you will have to install it by going to Control Panel > Programs and Features > Turn Windows features on or off and enabling Windows PowerShell.
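As a starting point, here is a minimal sketch of the disk check itself (the server names are placeholders; Get-WmiObject is used because it is available out of the box on Windows Server 2008):

    # Sketch: report any fixed drive under 20% free on a list of servers.
    $servers = @("SERVER01", "SERVER02")   # placeholder names - use your own list

    foreach ($server in $servers) {
        # DriveType 3 = local fixed disks
        $disks = Get-WmiObject -Class Win32_LogicalDisk -ComputerName $server -Filter "DriveType=3"
        foreach ($disk in $disks) {
            $pctFree = $disk.FreeSpace / $disk.Size * 100
            if ($pctFree -lt 20) {
                "{0} {1} has {2:N1}% free ({3:N1} GB remaining)" -f $server, $disk.DeviceID, $pctFree, ($disk.FreeSpace / 1GB)
            }
        }
    }

Pipe that output to a file, or feed it to Send-MailMessage (PowerShell 2.0 and later), and you have the body of the scheduled task.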
