Cowboy in the desert.

How we are using SQL Server in Octopus 3.0

Paul Stovell

In a previous post I announced that we are switching from RavenDB to SQL Server for Octopus 3.0. That post talked about why we were leaving RavenDB, but didn't explain too much about how we plan to use SQL Server. In this post, I want to talk about how were using SQL Server, as well as discuss some minor breaking changes.

We've just finished most of the work involved in porting Octopus 3.0 to SQL Server. We have a suite of automated API tests that install and configure an Octopus server, register some Tentacles, and exercise the application using the REST API. These tests are now passing and running completely against SQL Server:

End to end tests passing

Editions

We are making sure Octopus works on:

  • SQL Server 2005, 2008, 2012, 2014 and above; any edition from Express to Enterprise
  • SQL Azure

To make the getting started experience smooth and easy, the Octopus installer will give you the option to automatically download and silently install SQL Server Express edition, which is free. Of course, you could always connect it to a clustered SQL Server Enterprise server instead, though the licensing costs for other SQL Server editions would be something you'd need to discuss with Microsoft ;-)

High availability

Today, Octopus actually uses a few different data stores:

  • Most data is stored in RavenDB
  • Deployment logs (which are always appended to) are stored on disk because it wasn't possible to append to attachments unless you are Schlemiel the Painter
  • State about in-progress deployments and other tasks was also stored on disk
  • NuGet packages in the built-in repository are stored on disk, with metadata in Lucene.NET indexes

And while we supported using an external (clustered) RavenDB instance, it's not something most customers are really able to set up and manage, so we nearly always use the embedded version of RavenDB. Because we also had data in so many places, we needed to build our own backup and restore features into the product.

For Octopus 3.0, we're going to make sure we have a great high availability story. Most enterprises are already familiar with setting up a clustered SQL Server instance, and have DBA's on site that can help to manage it. So our first design principle will be that everything (nearly) needs to be in SQL Server. Specifically:

  • All the documents we currently store in Raven will go to SQL Server
  • Deployment logs will be compressed (they compress very nicely) and also stored in SQL
  • In progress deployment state: we'll rely on this being in memory (see the breaking change section below)
  • NuGet packages will still be on disk (you'll be able to change where they are stored, and put them on a file share/SAN), but metadata will be stored in SQL

In addition, we're going to make sure that you can set up multiple Octopus Deploy servers, all pointing at the same SQL database/using the same packages directory. Installation wizards and command line tools will make it easy to set up a Siphonophore:

Octopus server load balanced

It won't exactly be web scale, but Stack Exchange have done a good job of demonstrating that you can get pretty far by scaling out application servers and scaling up the database.

Breaking change: There is one scenario that we won't be supporting any longer: restarting the Octopus server during a deployment.

Previously, you could kick off a long running deployment, then shut down the Octopus server, start it again, and there was a pretty good chance it would continue where it left off. I say "chance" because it's impossible to test all the scenarios, and we know some areas where it didn't work and deployments would be left in a weird state where they said they were running but actually weren't. We'll be able to simplify things and get far better performance by removing this feature, and since I don't think it ever fully worked reliably, it should be an OK change. If this affects you, let me know in the comments below!

SQL as a document store

The one feature we loved (and will miss) about using a document database like RavenDB was the ability to store and load large, deep object graphs without a ton of joins. For example, Octopus allows you to define variables, which are key/value pairs that can be scoped to many different fields. Some customers have thousands of these, and we snapshot them every release, so to model this with a traditional relational schema would make things very complicated. And we're never actually going to query against that data, we just need to load it all into memory during deployments.

Instead, we're treating SQL as a document store. Each document type gets its own table, and fields that we query on will be stored as regular columns. But all the fields and deep object graphs that we don't query on are stored as a JSON blob (a nvarchar(max)).

Storing documents in SQL

Since we don't do any joins, we don't need an ORM to help stitch object graphs together. Instead, we're staying close to the metal, essentially using some wrappers around SqlConnection/SqlCommand that use JSON.NET to deserialize the JSON blobs and then set the extra fields. A custom JSON.NET JsonContractResolver excludes properties that are mapped as table columns so the values aren't stored twice.

The only downside to this design is that there are a handful of places where we have to do LIKE %x% queries over tables - e.g., to find all machines tagged with a given role (the list of roles is stored as a pipe-separated nvarchar column on the Machine table). However in all of these cases we expect these tables to be < a few thousand items, so I really don't expect it to matter. If testing shows otherwise, we'll either use full text search or introduce a new table in a CQRS-like index table.

Backup, restore and maintenance

Since all of our data will either be in SQL Server or on a file share (NuGet packages), at this stage I expect to be able to remove our custom backup/restore features and to just rely on SQL Server backups. We'll provide some guidance on how to configure this, and some feedback in the Octopus UI if you have forgotten to do a SQL backup in some time, but in general I think SQL Server's built-in backup/restore features are better than anything we're likely to build.

Migration

The upgrade experience from 2.6 to 3.0 will be straightforward: you'll install 3.0, select/create a SQL Server database to use, and then choose an Octopus 2.6 backup to import. We'll convert the data as needed and then you'll be up and running in no time. It will feel much more like upgrading between 2.5 and 2.6 than upgrading from 1.6 to 2.0.

So far we've done nearly all of the conversion to SQL Server and haven't had to make any API changes, so any code against our 2.X REST API will work against 3.0.

Testing

We collect (opt-in) usage statistics, and there are some big Octopus installations out there - 300+ projects, 1000+ machines, with over 20,000 deployments. We'll be using this data to simulate similar environments and to ensure we don't release anything that is slower than what we already have.

We'll start by running our end-to-end tests and comparing the current 2.6 builds with the upcoming 3.0 builds to ensure that none of our current operations are any slower on smaller data sets. Then we'll move on to load testing to ensure that we can handle at least 5x larger installations than we currently have without crazy hardware requirements.

If anyone's interested in seeing some of these metrics, let me know in the comments and I'll do a third post in this series :-)


Tagged with: Architecture
Loading...