Today, we are making an important announcement about two new open-source projects that we are releasing as part of the launch of our new CSC Open Source Program: Hanlon and the Hanlon Microkernel. These projects are the next-generation versions of two projects that some of you might already be familiar with, Razor and Razor-Microkernel.
For those of you who don’t know me, my name is Tom McSweeney and I am now working as a Senior Principal in the Office of the CTO at CSC. I joined CSC last November; since then I’ve been leading the team that has been defining the processes and procedures behind a new Open Source Program at CSC. I am also one of the co-creators of the Razor and Razor-Microkernel projects, which Nick Weaver and I wrote together when we were at EMC – projects that we open-sourced through Puppet Labs almost exactly two years ago today.
So, with that announcement, I’m sure that those of you who have been following the Razor and Razor-Microkernel projects from the beginning have a number of questions for us. I’ll take my best shot at answering a few of them here. If there are others that you have, you know how to reach me…
What’s in a name?
To start, many of you might be asking yourselves: “Why the name change – from Razor to Hanlon – if it is basically the same project?” There are really two explanations for the name change, and both carried equal weight when we were making this decision. First, we decided to use a different name for “our Razor” in order to avoid confusion with the existing (Puppet Labs) Razor project. Without a name change we would always be left with a discussion of “our Razor” and “their Razor” (or worse, the “original Razor” and the “new Razor”). A simple change of names for our project removes that confusion completely.
Second, we felt that a name change would quickly highlight that “our Razor” was taking a new approach to solving the same problem as the “original Razor” that we released two years ago. We haven’t changed our emphasis on using an automated, policy-based approach for the discovery and provisioning of compute nodes, nor have we changed the basic structure of the interface: for example, we still talk of slices and we still support a RESTful API along with a CLI.
What has changed, however, is the structure and organization of the underlying codebase, along with how the RESTful API and CLI are implemented. There is a long tradition in many cultures of using name changes to highlight significant changes in the life of an individual, or in this case a project, and we felt that a name change needed to be made to signify this shift in how our server did what it did.
The next question that might come to mind is “Why Hanlon?” Of all of the possible names we could have chosen for these projects, why would we pick the last name of an American writer from Scranton, PA? To put it quite simply, we felt that the name we chose for the project should be tied to the original name (Razor) in some way, shape, or form. As those of you who have been with us from the beginning might recall, the original name (Razor) was chosen because the journey that Nick and I set out on when we wrote the original Razor was very much inspired by Occam’s (or Ockham’s) Razor, which for us was best represented by the concept that, when you are seeking an explanation or solution to a problem, “Everything should be made as simple as possible, but no simpler”. Unfortunately, we couldn’t use the name Occam (or Ockham), because that name had already been trademarked and we didn’t want to start out CSC’s first foray into the world of open-source by contributing two new projects whose names had to be changed shortly after they were released. After giving a bit of thought to many possible names for these two projects, we decided that we could easily link this project to the original Razor project by choosing another “Razor” from the many “Razors” that have been written down (in both modern and ancient times), and “Hanlon’s Razor” seemed to be a good fit.
Finally, many of you may be asking yourselves the following question: “If these two projects are really just the next-generation versions of Razor and the Razor-Microkernel why didn’t you just contribute your changes to the existing Puppet Labs projects?” The answer to this question is a bit more involved, and to provide an adequate answer, a bit of history is necessary.
In the beginning…
To say that Nick and I were pleasantly surprised by the reception that Razor received from the open-source community when we released the project two years ago would be an understatement. Nick and I were both familiar with using open-source software, but neither of us had spent much time contributing to open-source projects, much less creating software to release under an open-source license, so we really had no idea what we were getting ourselves into when we decided that Razor was something that should be released to the world as an open-source project. From the start, the response from the community to the open-source announcement was overwhelming. The first pull request for the Razor project was received a mere four hours after the announcement that we were open-sourcing the project, and by the end of the first month we had almost 100 forks of the project and many more watchers. It quickly became obvious that, whatever the gaps or weaknesses in the project were, the community longed for a solution like the one we had put together.
Over the next six months, there were many changes in Razor. The community continued to build and we went through a major rewrite of the RESTful API to make it more RESTful and remove inconsistencies in both the CLI and RESTful API that existed from slice to slice. The documentation for the project was greatly improved, and pull requests continued to pour in from the community. By the end of the year, we even had a pull request from the Chef community that added a Chef broker to Razor, although I have to say that the concept of providing support for both Puppet and Chef in a Puppet Labs open-source project did strike some users as odd, at least initially.
Nick Weaver demonstrated a proof-of-concept implementation of changes he’d made to support Windows provisioning during his keynote presentation at PuppetConf 2012, but left EMC shortly after that to take on a key (leadership) role on the Zombie team at VMware. At VMware, he and his team built an automation platform – Project Zombie – that is still being used today to automate the discovery and provisioning of servers for VMware’s VCloud Hybrid Services product. Deep down under the covers of that automation platform they are still using Razor to automate the discovery of servers added to their datacenters and to provision those servers with an ESX image so that they can be used to support customer workloads. I left EMC in early 2013, first to join Nick on the Zombie team at VMware and then to join Dan Hushon’s OCTO team at CSC. Throughout that time, we did not contribute much to the projects we had created, due to issues with CLAs that hadn’t been signed by the companies we were now working for. Even so, we were pleased with how Razor continued to grow and evolve, with features that we’d only dreamed of (or hadn’t even imagined) being added by the community.
A turning point was reached
All of that began to change last summer. Last June, the Puppet Labs team sent Nick and me a brief email outlining the changes that they wanted to make to Razor in order to improve it. Almost from day one, the Puppet Labs team that supported the Razor project had expressed grave concerns over some of the components that Nick and I had selected for the project. Most of their concern centered around our use of MongoDB and Node.js, which made bundling of Razor into a commercially supported product difficult.
There was also a serious scalability issue that we were aware of when we launched Razor as an open-source project that was caused by the design of Razor’s Node.js-based RESTful API. That RESTful API actually handled requests by forking off Ruby processes that used the Razor CLI to handle those requests, something that we knew would be a performance bottleneck but that we had planned on fixing after Razor was released. Now, a year after the launch of Razor, the Puppet Labs team was proposing that these “unsupportable” components be removed from Razor (and replaced by components that were more easily supported as part of a commercial offering) and they were proposing that the call order be reworked so that the RESTful API was called by the CLI, instead of the CLI being called by the RESTful API.
While these changes were being made, the Puppet Labs team also suggested that a number of other improvements should be made to Razor, and while Nick and I agreed that some of these changes were necessary, there were others that we simply did not agree with. In the end, the Puppet Labs team decided to move on with their changes to Razor, with or without the support of the project’s creators, and since we couldn’t reach agreement on the changes, Nick and I parted ways with the Puppet Labs team.
Since then, the Puppet Labs team has gone on to significantly rewrite the original Razor project under the name “razor-server” and it bears very little resemblance to the project Nick and I co-wrote two years ago. They’ve removed support for the underlying key-value store we were using to maintain Razor’s state and replaced it with a fixed-schema relational database. They’ve removed the state machines from our “model” slice and replaced them with an “installer” (which uses a simple templating mechanism to “control” the install process for a node). They removed the underlying Node.js server (something we applauded), and replaced it with a Torquebox instance (something we thought of doing a bit differently). In short, the Puppet Labs team made the Razor project into something that was much easier for them to include in and support as part of their Puppet Enterprise commercial offering, but Nick and I felt that with these changes they were leaving a significant portion of the Razor community behind.
CSC and Razor
About the time that I left EMC to join Nick at VMware, Dan Hushon left to join the CSC team as their new CTO. At CSC, Dan quickly became involved in discussions that led to the acquisition of InfoChimps by CSC. As part of that deal, Dan and his team were looking for a way to use DevOps-style techniques to automate the deployment of Big Data clouds and, naturally, they turned to Razor as part of that solution (Dan’s blog entry describing the Razor part of the solution that they built out can be found here).
And so a few seeds of change within CSC were planted. By using Razor, Dan and his team were able to quickly bootstrap the infrastructure they needed to build out Big Data clouds in an automated fashion, passing off the resulting systems to Puppet for final configuration as Hadoop clusters. As a result of that groundbreaking work by Dan and his team last year, and the interest that it generated, there was already a community of potential Razor users and developers in place when I joined CSC last November, and that community has continued to build since we started work on the server we would come to call Hanlon.
The rebirth of Razor as Hanlon
So, how did we get where we are today (from Razor to a new project named Hanlon)? As is usually the case in these sorts of situations, it all started with knowledge and experience that was picked up as part of another, only partly related, project. During my brief sojourn as part of the Zombie team, it became all too apparent that there were a few tools and techniques that we were using as part of Project Zombie that could solve some of the issues we were having with Razor. Specifically:
- We used Rackup/Sinatra for the underlying server (rather than the Ruby daemon that we had used in building out Razor)
- We built a Grape::API-based RESTful API for that server (an interface provided by the grape gem), instead of trying to build that RESTful API using Node.js and then integrating that API with the underlying Ruby server
- We based the server we wrote on JRuby instead of Ruby, and
- We used the warbler gem to allow for distribution of that server as a WAR file to any servlet container we might want to use (including the Apache Tomcat server)
After a bit of thought, it wasn’t too hard to see how we could take this same set of tools and techniques and, with a bit of work, use them to redesign Razor and remove many of the issues we’d been struggling with over the previous 18 months, especially the performance issues we had struggled with from the beginning.
So, late last December, I set out to rewrite large chunks of Razor and, in the process, created the server that we would come to call Hanlon. The underlying Ruby daemon that we had used in Razor was removed, along with the associated Node.js image and server services. In their place, I constructed a Grape::API-based RESTful API for our new Rackup/Sinatra-based server. I also inverted the call order between our two APIs (the RESTful API and CLI) so that the CLI called the RESTful API, instead of the other way around. The dependencies on components that wouldn’t translate well to a JRuby-based server were removed (like the underlying reliance on native database drivers and the reliance on the daemon gem for some services) and the warbler gem was introduced to give us the ability to build a distributable version of the Hanlon server in the form of a WAR file. In the end, what was left was a greatly simplified and much more performant codebase than we had started with in Razor.
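To make the inverted call order concrete, here is a minimal sketch of a CLI command acting as a thin client of the RESTful API. It is an illustration under stated assumptions, not the actual Hanlon code: the helper names (`slice_uri`, `list_slice`), the base URI, and the JSON response shape are all hypothetical, and only Ruby's standard library is used.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds the URI for a slice resource under a
# configurable base URI. Keeping the base configurable matters because the
# context path can change when the server is deployed as a WAR file.
def slice_uri(base_uri, slice, id = nil)
  URI.parse([base_uri.chomp('/'), slice, id].compact.join('/'))
end

# Hypothetical CLI command: instead of touching the object store directly,
# it issues an HTTP GET against the RESTful API and returns the parsed JSON.
# The http_get parameter exists only so the transport can be swapped out.
def list_slice(base_uri, slice, http_get: Net::HTTP.method(:get_response))
  response = http_get.call(slice_uri(base_uri, slice))
  unless response.code.to_i.between?(200, 299)
    raise "request failed with status #{response.code}"
  end
  JSON.parse(response.body)
end
```

With this shape, the server owns all of the logic and state, and the CLI needs nothing beyond the base URI of a running server, for example `list_slice('http://localhost:8080/hanlon/api', 'node')` (a made-up address and path).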
Since the CSC Hanlon team was now building the Hanlon server as a WAR file, we also decided that we should do a bit of refactoring to separate out the parts of the codebase that were used by the CLI – a simple Ruby script – from the parts of the codebase that were used by the Rackup/Sinatra-based server. The result was a much simpler and significantly flatter directory structure for the project. Finally, we simplified the Hanlon server’s configuration file by removing many unused or redundant configuration parameters that were contained in the Razor server’s configuration file. In the end, we feel that we struck a good balance between reworking the codebase to make it more supportable and performant and maintaining the existing functionality from the old Razor project. In short, Hanlon should support the needs of most of the users of the original Razor project, with very little change needed.
…and of the Razor-Microkernel as the Hanlon-Microkernel
Of course, for any of these changes to work, we also had to make some changes to the Microkernel that we used with Razor so that it would support our new Hanlon server; hence, a new Hanlon-Microkernel project. The biggest changes that we made to the Hanlon-Microkernel were changes to support the new URI structure used by the Hanlon server. We also made a few bug-fix type changes to properly support deployment of the Hanlon server as a WAR file to various servers (where the context of the RESTful API might change) and added support to the Hanlon-Microkernel for a few new DHCP options that were not supported in the old Razor-Microkernel project.
Finally, we added experimental support for gathering of BMC-related facts from the underlying node (if the node has a Baseboard Management Controller, of course). Our thought is that this will lead to changes to the node slice in Hanlon to support power-control of the node using that BMC-related meta-data, but that is a feature that will have to be added in the future; currently the facts are gathered, but the changes to the node slice have not yet been made. Of course, as was the case with the Hanlon project, the documentation for the Hanlon-Microkernel project in the project wiki was updated to reflect the changes that we had made.
We hope that those of you who have been using Razor to date will find Hanlon to be a preferable replacement. There are still a few rough edges to the project, but we have no doubt that with a bit of work most of the remaining gaps will be closed in short order.
The changes that we have made are a good start, but there are still other changes that are needed and that you, as the Razor community, can help with. Among them are the following:
- A script that can be used to migrate an existing Razor database (under either MongoDB or PostgreSQL) to a Hanlon database. Since the serialized objects in a Razor and Hanlon database contain the class names of the objects that were serialized, and since the root of that object hierarchy was changed when the root classes/modules were renamed (from Razor to Hanlon), an existing Razor database (and its objects) is not visible to a Hanlon server
- Changes to the node slice to support power-control of a node using the node’s BMC (and the BMC-related meta-data that is gathered from the node by the Hanlon Microkernel)
- Modifications to add support for the use of PostgreSQL for Hanlon’s underlying object store (up until now, our development and testing has been done with a MongoDB-based object store; the code to support the use of PostgreSQL is still in place, but we haven’t added in the appropriate non-native drivers to the project to support the use of PostgreSQL under JRuby).
- Adding support for provisioning of Windows using Hanlon
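As a starting point for the first item in this list (the migration script), the core transformation could look something like the sketch below. It assumes the serialized objects are available as plain Ruby hashes and that the root namespaces are `ProjectRazor` and `ProjectHanlon`; both of those, along with the `rename_root` helper and the field names in the usage example, are assumptions that should be checked against real database contents before relying on it.

```ruby
# Assumed root namespaces; verify against the actual serialized documents.
OLD_ROOT = 'ProjectRazor'
NEW_ROOT = 'ProjectHanlon'

# Recursively copies a deserialized document, rewriting any string value
# that names the old root namespace or a class beneath it. Other values,
# including strings that merely contain the old name, are left untouched.
def rename_root(value)
  case value
  when Hash   then value.transform_values { |v| rename_root(v) }
  when Array  then value.map { |v| rename_root(v) }
  when String then value.sub(/\A#{OLD_ROOT}(?=::|\z)/, NEW_ROOT)
  else value
  end
end

# Usage with a made-up document (the field names are illustrative only):
doc = { '@classname' => 'ProjectRazor::Node', '@tags' => ['rack-1'] }
migrated = rename_root(doc)
```

A complete script would still need to read each document from the Razor store and write the result to the Hanlon store, which this sketch deliberately leaves out.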
However, in spite of these gaps, we still feel that Hanlon is ready to release into the wild. We hope that you find it as useful as you found our initial foray – Razor – and we look forward to working with you to rebuild the formerly diverse and active Razor community around our two new CSC open-source projects: Hanlon and the Hanlon-Microkernel.