Syncing NSX Managers across vCenters

TL;DR

This script will sync the distributed firewall configuration from one NSX Manager to another. This is useful when you have a DR site that you need to keep in sync with production. Currently the script only syncs ipsets, security groups, services, service groups, and the Layer3 portion of the firewall. It would be easy enough to add support for tags, MACpools, and the Layer2 portion of the firewall, but it wasn’t a requirement for this project.

Download the script here.

Note: This script was written for NSX version 6.1.3. I haven’t seen anything in the release notes for later versions that would indicate compatibility issues, but I haven’t tested it against them either. I will update this post with any new issues I find after I have updated NSX.

Background

NSX is a network virtualization platform produced by VMware. Among its many features is a distributed firewall that can be used to segment traffic based on a variety of factors. My company is using NSX for a new project where we need to micro-segment traffic in a VDI deployment to separate 3rd-party users from internal users. One major issue we ran into was the lack of a native way to synchronize the configuration on the NSX Manager down to our DR site. Currently, VMware offers no way to export the configuration of your NSX deployment and import it into another NSX Manager/vCenter. What VMware does give you is a set of REST APIs, and since I am very comfortable using PowerShell, that was the route I decided to take.

The Process

Going through the NSX API guide, I had hoped to simply pull the global firewall configuration out of the source NSX Manager and upload it onto the destination. After banging my head against the wall with some not-too-descriptive error responses and a couple of phone calls with VMware, I found out that each object in the NSX configuration has its own unique ObjectId (ipset-27, servicegroup-32, etc.). The manager uses these IDs to reference objects internally, and the same object will generally have a different ObjectId on each manager. In order to sync the firewall I also needed to sync each of the objects referenced in the firewall config and ensure that every reference uses the ObjectId known to the destination.

One thing I quickly found was that the NSX API guide has a fair number of typos, inconsistencies, and errors. My two best pieces of advice (besides paying for SDK support with VMware) are to search for similar scripts online that you can use as a reference and to try your URIs with and without a trailing slash. For some reason the API is very inconsistent with its use of trailing slashes, so if a URI gives an error, try removing or adding a trailing slash. I would also recommend the Advanced Rest Client for Chrome, which was indispensable while troubleshooting.
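The trailing-slash advice is easy to automate. The following is a minimal Python sketch of the idea, not code from my PowerShell script: the helper names `trailing_slash_variants` and `first_success` are made up for illustration, and `send` stands in for whatever function actually performs the request.

```python
def trailing_slash_variants(uri):
    """Return the URI as given, followed by the same URI with the
    trailing slash toggled (added if missing, removed if present)."""
    alternate = uri[:-1] if uri.endswith("/") else uri + "/"
    return [uri, alternate]


def first_success(uri, send):
    """Call send() on each variant and return (uri_used, result) for the
    first one that does not raise OSError. urllib's HTTPError and URLError
    are both subclasses of OSError, so a urllib-based send() fits here."""
    last_error = None
    for candidate in trailing_slash_variants(uri):
        try:
            return candidate, send(candidate)
        except OSError as exc:
            last_error = exc
    raise last_error
```

Returning the URI that worked alongside the result makes it easy to log which form a given endpoint expects.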

I began writing a script that queries the source NSX Manager for each of the object types, then syncs those objects to the destination. After all of the objects are synced I can then copy the firewall config. I originally couldn’t find a way to get the raw XML out of the native PowerShell XML object in order to send it as the body in my POST and PUT calls. I was taking the XML that was returned by the GET call as a string and doing some regex to pull out the pieces I wanted. However, this led to a mess of comparisons and hash tables to track and match everything. I decided to go back to the drawing board and found that I had missed the .OuterXml property, which returns the raw XML of the object as a string. This solved almost all of my issues and quickly got me on the right track.

The Solution

In its final form the script moves through each of the object types one at a time. The script queries each NSX Manager and then uses the output of the source as the base of the new destination configuration. The script then loops through each object in the source and looks for a match in the destination based on the ‘Name’ attribute. If it doesn’t find a match it creates a new object. If it does find a match it replaces the ObjectId from the source with the ObjectId of the destination. Any object on the destination that has no match in the source is deleted. This method works quite well and turned out to be modular, which is nice because the same few functions can be reused for each of the object types. I also think it will scale nicely as we begin to use more of the features of NSX over time.

Note: This script is read-only on the source. Changes are only ever made to the destination NSX Manager; it’s basically a master-slave setup.

FUNCTIONS

I’ll talk briefly about each of the functions in the script.

First off, we need to tell PowerShell to ignore cert errors since NSX Managers use self-signed certs.
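For readers following along in another language, here is a minimal Python sketch of the same idea. It is not the PowerShell from my script, and the names `make_unverified_context` and `NSX_SSL_CONTEXT` are made up for illustration. Disabling verification like this is only reasonable against managers you control.

```python
import ssl


def make_unverified_context() -> ssl.SSLContext:
    """Build an SSL context that accepts self-signed certificates."""
    ctx = ssl.create_default_context()
    # check_hostname must be turned off before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


# Pass this as the `context` argument of urllib.request.urlopen().
NSX_SSL_CONTEXT = make_unverified_context()
```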

Next is a simple wrapper for GET calls to query the NSX Managers. I assign the result of each call to a variable cast as XML.
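A rough Python equivalent of such a wrapper might look like the sketch below. It is an illustration rather than the PowerShell from my script: the function names are hypothetical, and it assumes HTTP basic authentication and an XML response. The host name and path in the usage comment are example values only.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET


def build_get_request(host, path, user, password):
    """Build a GET request with a basic-auth header (assumed auth scheme)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{host}{path}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/xml"},
        method="GET",
    )


def nsx_get(host, path, user, password, context=None):
    """Send the GET and return the response body parsed as an XML element.
    `context` is an optional ssl.SSLContext, e.g. one that skips cert checks."""
    request = build_get_request(host, path, user, password)
    with urllib.request.urlopen(request, context=context) as response:
        return ET.fromstring(response.read())


# Example call (hypothetical host, example path):
# ipsets = nsx_get("nsx-prod.example.local",
#                  "/api/2.0/services/ipset/scope/globalroot-0",
#                  "admin", "secret")
```

Splitting request construction from sending keeps the part that needs no network easy to check on its own.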

I use the Compare-Object cmdlet to diff the two XML objects based on the ‘Name’ attribute. Then I loop through those differences and create or delete objects on the destination as needed. When new objects are created they are created as empty objects with only the name configured. This will generate a new ObjectId that we can use later to get everything synced up.
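To make the name-based diff concrete, here is a small Python sketch of the same logic using set differences instead of Compare-Object. It is an illustration, not the script itself: the function names are made up, and the element names (`ipset`, `name`, `objectId`) are simplified stand-ins for the real NSX payloads.

```python
import xml.etree.ElementTree as ET


def names(doc, tag):
    """Collect the <name> of every <tag> element in the document."""
    return {element.findtext("name") for element in doc.iter(tag)}


def diff_by_name(source, dest, tag):
    """Return (names to create on dest, names to delete from dest)."""
    source_names, dest_names = names(source, tag), names(dest, tag)
    return sorted(source_names - dest_names), sorted(dest_names - source_names)


def empty_object(tag, name):
    """XML body for a placeholder object that carries only a name, so the
    destination manager assigns it a fresh ObjectId."""
    element = ET.Element(tag)
    ET.SubElement(element, "name").text = name
    return ET.tostring(element, encoding="unicode")
```

The placeholder body deliberately leaves out the object's real contents; those are pushed later, once every object has an ID on the destination.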

We have to query both NSX Managers again to get the new configurations now that all of the adds and deletes have been done. Now we can replace all of the ObjectIds on the source objects.
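The ID replacement boils down to a lookup table keyed on name. Below is a minimal Python sketch of that step, again as an illustration rather than the PowerShell from my script. The function names are hypothetical and the XML layout (`objectId` and `name` children) is simplified.

```python
import xml.etree.ElementTree as ET


def build_id_map(source, dest, tag):
    """Map each source ObjectId to the ObjectId of the destination object
    with the same name. Objects missing on the destination are skipped."""
    dest_ids = {e.findtext("name"): e.findtext("objectId") for e in dest.iter(tag)}
    return {
        e.findtext("objectId"): dest_ids[e.findtext("name")]
        for e in source.iter(tag)
        if e.findtext("name") in dest_ids
    }


def remap_object_ids(doc, mapping):
    """Rewrite every <objectId> in place using the mapping, then return doc."""
    for element in doc.iter("objectId"):
        if element.text in mapping:
            element.text = mapping[element.text]
    return doc
```

After this step, `ET.tostring(doc, encoding="unicode")` gives the string to send as the request body, which plays the same role as .OuterXml in PowerShell.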

Now that we have all of the objects we want to sync with ObjectIds known to the destination NSX Manager we can push everything. If you are watching the console as this runs it will spit out some useful diagnostic info in the event the server returns an error code. It will also write all the changes to the log file along with the URI and return code.
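As a rough picture of what such a push function does, here is a Python sketch. It is not the PowerShell from my script: the function name `push`, the logger name, and the `opener` parameter (which only exists so the function can be exercised without a live manager) are all assumptions made for illustration.

```python
import logging
import urllib.error
import urllib.request

log = logging.getLogger("nsx-sync")


def push(method, uri, body, opener=urllib.request.urlopen):
    """Send an XML body with PUT or POST, log the URI and return code,
    and return the HTTP status whether the call succeeded or failed."""
    request = urllib.request.Request(
        uri,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/xml"},
    )
    try:
        with opener(request) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # The response body usually carries the server's explanation.
        detail = error.read().decode("utf-8", errors="replace")
        log.error("%s %s -> %s: %s", method, uri, error.code, detail)
        return error.code
    log.info("%s %s -> %s", method, uri, status)
    return status
```

Returning the status instead of raising lets a long sync run continue past a single bad object while still leaving a record of it in the log.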

Service group membership requires its own special function because it has to be done in a second pass with some logic that doesn’t fit into the other functions. It is essentially a microcosm of the rest of the script.
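The membership pass can be pictured as computing, for each group, which members to add and which to remove once every ID has been translated. The Python sketch below illustrates that idea only; it is not the function from my script. The name `membership_changes` and the dictionary shapes are hypothetical, and it assumes every source ID already has an entry in the ID map.

```python
def membership_changes(source_members, dest_members, id_map):
    """Return a list of (action, dest_group_id, dest_member_id) tuples.

    source_members / dest_members map a group ObjectId to a list of member
    ObjectIds on that manager; id_map translates source IDs to dest IDs."""
    changes = []
    for source_group, members in source_members.items():
        dest_group = id_map[source_group]
        wanted = {id_map[member] for member in members}
        current = set(dest_members.get(dest_group, ()))
        changes += [("add", dest_group, m) for m in sorted(wanted - current)]
        changes += [("remove", dest_group, m) for m in sorted(current - wanted)]
    return changes
```

Each tuple then becomes one API call against the destination, which is why this pass has to run after all services and service groups already exist there.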

Finally we get back to the firewall rules. Originally I was going to sync each individual firewall section, but unfortunately that had a tendency to write the rules out of order, and NSX has no method to reorder the rules of the distributed firewall. Instead this method gets everything synced up in terms of ObjectIds and then pushes the entire global firewall configuration onto the destination. A few things to note here. First, the API guide states that a new rule can have an Id of 0. I haven’t found this to be the case, and I suggest removing the Id attribute altogether when creating new rules. Second, I am only worried about the Layer3 rules, so the Layer2 section gets just enough work to sync the ObjectIds so that the destination will accept the configuration.
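Here is a small Python sketch of that preparation step, offered as an illustration rather than the code from my script. The function name is made up and the XML is a heavily simplified stand-in for a real firewall configuration. It also assumes that every rule in the pushed configuration should lose its id attribute, which is a simplification.

```python
import xml.etree.ElementTree as ET


def prepare_firewall_config(config, mapping):
    """Strip the id attribute from every <rule> and translate any <value>
    that holds a source ObjectId to the matching destination ObjectId.
    Elements are edited in place, so rule order is left untouched."""
    for rule in config.iter("rule"):
        rule.attrib.pop("id", None)
    for value in config.iter("value"):
        if value.text in mapping:
            value.text = mapping[value.text]
    return config
```

Because the whole document is edited in place and pushed in one call, the rules arrive on the destination in the same order they had on the source.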

And that’s it. The script will take quite a while to run, somewhere around 10 minutes in my experience. Most of that is due to the large number of services; almost 490 come standard. When it’s done the log file should give you a nice snapshot of all of the changes made. If you want to see what is happening on the NSX Managers themselves, connect using SSH and run:

show manager log follow

Unfortunately, the manager log is both quite verbose and ambiguous in its error messages, which is not a great combination.

I will say this has been a great experience for me and has definitely improved my PowerShell and REST skills quite a bit, especially in the XML manipulation department. I’d like to give props to my co-worker Martin Pugh, who was an indispensable source of PowerShell expertise, as well as the folks over at VMware, who were quite helpful in decoding some of the stranger errors I encountered.

Cheers.
