How we are using SQL Server in Octopus 3.0

In a previous post I announced that we are switching from RavenDB to SQL Server for Octopus 3.0. That post talked about why we were leaving RavenDB, but didn't explain much about how we plan to use SQL Server. In this post, I want to talk about how we're using SQL Server, as well as discuss some minor breaking changes.

We've just finished most of the work involved in porting Octopus 3.0 to SQL Server. We have a suite of automated API tests that install and configure an Octopus server, register some Tentacles, and exercise the application using the REST API. These tests are now passing and running completely against SQL Server:

End to end tests passing

Editions

We are making sure Octopus works on:

  • SQL Server 2005, 2008, 2012, 2014 and above; any edition from Express to Enterprise
  • SQL Azure

To make the getting started experience smooth and easy, the Octopus installer will give you the option to automatically download and silently install SQL Server Express edition, which is free. Of course, you could always connect it to a clustered SQL Server Enterprise server instead, though the licensing costs for other SQL Server editions would be something you'd need to discuss with Microsoft ;-)

High availability

Today, Octopus actually uses a few different data stores:

  • Most data is stored in RavenDB
  • Deployment logs (which are always appended to) are stored on disk because it wasn't possible to append to attachments unless you are Schlemiel the Painter
  • State about in-progress deployments and other tasks is also stored on disk
  • NuGet packages in the built-in repository are stored on disk, with metadata in Lucene.NET indexes

And while we support using an external (clustered) RavenDB instance, it's not something most customers are really able to set up and manage, so the embedded version of RavenDB is nearly always used. Because we also had data in so many places, we needed to build our own backup and restore features into the product.

For Octopus 3.0, we're going to make sure we have a great high availability story. Most enterprises are already familiar with setting up a clustered SQL Server instance, and have DBAs on site who can help to manage it. So our first design principle will be that (nearly) everything needs to be in SQL Server. Specifically:

  • All the documents we currently store in Raven will go to SQL Server
  • Deployment logs will be compressed (they compress very nicely) and also stored in SQL
  • In-progress deployment state: we'll rely on this being in memory (see the breaking change section below)
  • NuGet packages will still be on disk (you'll be able to change where they are stored, and put them on a file share/SAN), but metadata will be stored in SQL

In addition, we're going to make sure that you can set up multiple Octopus Deploy servers, all pointing at the same SQL database and using the same packages directory. Installation wizards and command line tools will make it easy to set up a Siphonophore:

Octopus server load balanced

It won't exactly be web scale, but Stack Exchange have done a good job of demonstrating that you can get pretty far by scaling out application servers and scaling up the database.

Breaking change: There is one scenario that we won't be supporting any longer: restarting the Octopus server during a deployment.

Previously, you could kick off a long running deployment, then shut down the Octopus server, start it again, and there was a pretty good chance it would continue where it left off. I say "chance" because it's impossible to test all the scenarios, and we know some areas where it didn't work and deployments would be left in a weird state where they said they were running but actually weren't. We'll be able to simplify things and get far better performance by removing this feature, and since I don't think it ever fully worked reliably, it should be an OK change. If this affects you, let me know in the comments below!

SQL as a document store

The one feature we loved (and will miss) about using a document database like RavenDB was the ability to store and load large, deep object graphs without a ton of joins. For example, Octopus allows you to define variables, which are key/value pairs that can be scoped to many different fields. Some customers have thousands of these, and we snapshot them for every release, so modeling this with a traditional relational schema would get very complicated. And we never actually query against that data; we just need to load it all into memory during deployments.
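To make that concrete, here's a rough sketch of what one of those deep documents might look like as plain C# classes (the class and property names are illustrative only, not our actual schema). The point is that the whole graph serializes naturally into a single JSON document:

using System.Collections.Generic;

// Hypothetical shape of a variable snapshot document. The whole graph,
// including every variable and its scopes, is serialized as one JSON blob.
class VariableSetSnapshot
{
    public string Id { get; set; }                 // e.g. "variablesets-123"
    public string ProjectId { get; set; }          // something we would query on
    public List<Variable> Variables { get; set; }  // potentially thousands of entries
}

class Variable
{
    public string Name { get; set; }
    public string Value { get; set; }

    // Scoping, e.g. "Environment" -> ["Production", "Staging"]
    public Dictionary<string, List<string>> Scope { get; set; }
}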

Instead, we're treating SQL Server as a document store. Each document type gets its own table, and fields that we query on are stored as regular columns. All the other fields and deep object graphs that we don't query on are stored as a JSON blob in an nvarchar(max) column.

Storing documents in SQL

Since we don't do any joins, we don't need an ORM to help stitch object graphs together. Instead, we're staying close to the metal, essentially using some wrappers around SqlConnection/SqlCommand that use JSON.NET to deserialize the JSON blob and then set the extra fields. A custom JSON.NET contract resolver excludes properties that are mapped as table columns so the values aren't stored twice.
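As a rough illustration (a simplified sketch, not our actual data access layer - the table name, column names, and the VariableSetSnapshot class from the earlier sketch are assumptions), the reader deserializes the JSON column and then overlays the mapped columns, while the contract resolver keeps those mapped properties out of the JSON:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Keeps properties that are mapped as table columns out of the JSON blob,
// so their values aren't stored twice.
class ExcludeMappedPropertiesResolver : DefaultContractResolver
{
    readonly HashSet<string> mappedColumns;

    public ExcludeMappedPropertiesResolver(IEnumerable<string> mappedColumns)
    {
        this.mappedColumns = new HashSet<string>(mappedColumns, StringComparer.OrdinalIgnoreCase);
    }

    protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
    {
        return base.CreateProperties(type, memberSerialization)
            .Where(p => !mappedColumns.Contains(p.PropertyName))
            .ToList();
    }
}

static class DocumentReader
{
    // Load a document: deserialize the JSON column, then overlay the mapped columns.
    public static VariableSetSnapshot Load(SqlConnection connection, string id)
    {
        using (var command = new SqlCommand(
            "SELECT [Id], [ProjectId], [Json] FROM dbo.VariableSet WHERE [Id] = @id", connection))
        {
            command.Parameters.AddWithValue("@id", id);
            using (var reader = command.ExecuteReader())
            {
                if (!reader.Read()) return null;

                var document = JsonConvert.DeserializeObject<VariableSetSnapshot>(reader.GetString(2));
                document.Id = reader.GetString(0);
                document.ProjectId = reader.GetString(1);
                return document;
            }
        }
    }
}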

The only downside to this design is that there are a handful of places where we have to do LIKE '%x%' queries over these tables - e.g., to find all machines tagged with a given role (the list of roles is stored as a pipe-separated nvarchar column on the Machine table). However, in all of these cases we expect the tables to contain at most a few thousand rows, so I really don't expect it to matter. If testing shows otherwise, we'll either use full-text search or introduce a CQRS-style index table.
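For example, the role lookup might look something like this (a sketch only - the exact table layout and the convention of storing roles with surrounding pipe delimiters are assumptions):

using System.Collections.Generic;
using System.Data.SqlClient;

static class MachineQueries
{
    // Find machines tagged with a given role. Assuming roles are stored as
    // "|web-server|app-server|", wrapping the search term in pipes avoids
    // matching "web-server-2" when looking for "web-server".
    public static List<string> FindMachineIdsWithRole(SqlConnection connection, string role)
    {
        var ids = new List<string>();
        using (var command = new SqlCommand(
            "SELECT [Id] FROM dbo.Machine WHERE [Roles] LIKE @role", connection))
        {
            command.Parameters.AddWithValue("@role", "%|" + role + "|%");
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                    ids.Add(reader.GetString(0));
            }
        }
        return ids;
    }
}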

Backup, restore and maintenance

Since all of our data will either be in SQL Server or on a file share (NuGet packages), at this stage I expect to be able to remove our custom backup/restore features and just rely on SQL Server backups. We'll provide some guidance on how to configure this, and some feedback in the Octopus UI if you haven't done a SQL backup in some time, but in general I think SQL Server's built-in backup/restore features are better than anything we're likely to build.

Migration

The upgrade experience from 2.6 to 3.0 will be straightforward: you'll install 3.0, select/create a SQL Server database to use, and then choose an Octopus 2.6 backup to import. We'll convert the data as needed and then you'll be up and running in no time. It will feel much more like upgrading between 2.5 and 2.6 than upgrading from 1.6 to 2.0.

So far we've done nearly all of the conversion to SQL Server and haven't had to make any API changes, so any code written against our 2.x REST API will work against 3.0.

Testing

We collect (opt-in) usage statistics, and there are some big Octopus installations out there - 300+ projects, 1000+ machines, with over 20,000 deployments. We'll be using this data to simulate similar environments and to ensure we don't release anything that is slower than what we already have.

We'll start by running our end-to-end tests and comparing the current 2.6 builds with the upcoming 3.0 builds to ensure that none of our current operations are any slower on smaller data sets. Then we'll move on to load testing to ensure that we can handle at least 5x larger installations than we currently have without crazy hardware requirements.

If anyone's interested in seeing some of these metrics, let me know in the comments and I'll do a third post in this series :-)

TeamCity 9 plugin compatibility

We're big fans of TeamCity here at Octopus Deploy, so we're as excited as everybody else about their recent release of TeamCity 9.

Everybody is talking about the ability to sync your project build settings with your version control system. It's something we've been thinking about a lot lately too, so it's going to be interesting to see what sort of uptake the feature gets.

The big question we've been getting asked over the last few days, of course, is whether our TeamCity plugin is compatible with the new release from JetBrains...

The answer, I'm happy to say, is...YES!

I've installed the plugin on a brand new, fresh-out-of-the-oven TeamCity 9 install, and everything works just fine.

Happy building!

In Octopus 3.0, we're switching from RavenDB to SQL Server

Early beta versions of Octopus used SQL Server with Entity Framework. In 2012, just before 1.0, I switched to using RavenDB, and wrote a blog post about how we use the Embedded version of RavenDB.

For over two years we've been developing on top of RavenDB. In that time we've had over 10,000 installations of Octopus, which means we've been responsible for putting RavenDB into production over 10,000 times. And since most customers don't have in-house Raven experts, we've been the first (and only) line of support when there are problems with Raven. We haven't just been kicking the tyres or "looking at" Raven; we bet the farm on it.

For Octopus 3.0, we are going to stop using RavenDB, and use SQL Server instead. Understandably many people are interested in "why". Here goes.

First, the good

RavenDB has a great development experience. Compared to SQL + EF or NHibernate, you can iterate extremely fast with RavenDB, and it generally "just works". If I were building a minimum viable product on a tight deadline, RavenDB would be my go-to database. We rewrote nearly all of Octopus in 6 months between 1.6 and 2.0, and I don't think we could have iterated that quickly on top of SQL + EF.

The bad

We handle most support via email/forums, but when there are big problems, we escalate them to a Skype/GoToMeeting call so we can help the customer. Usually that's very early in the morning, or very late at night, so minimizing the need to do them is critical to our sanity.

What's the cause of most of our support calls? Unfortunately, it's either Raven, or a mistake that we've made when using Raven. And it's really easy to make a mistake when using Raven. These problems generally fall into two categories: index/data corruption issues, or API/usage issues.

Above all else, a database needs to be rock-solid and perform reliably. Underneath, Raven uses ESENT, and we've generally not lost any data from the transactional side of Raven. But indexes are based on Lucene.NET, and that's a different story. Indexes that break and need to be rebuilt are so common that for 1.6 we wrote a blog post explaining how people could reset their indexes. We sent that blog post to so many people that in 2.0 we built an entire feature into the UI to do it for them.

Repair RavenDB

When I said we'd never lost the transactional data, that's not quite right. It's really easy in RavenDB to add an index that causes big, big problems. Take this:

  Map = processes => from process in processes
                     from step in process.Steps
                     select new {...}
  Reduce = results => from result in results
                      group result by ....

You can write this index, it works fine for you, and you put it into production. And then you find a customer with 10,000 process documents, each of which has, say, 40 steps.

While Raven uses Lucene for indexing, it also writes index records into ESENT. I don't know the internals, but there are various tables inside the Raven ESENT database, and some are used for temporarily writing these map/reduce records. For every item being indexed, it will write a huge number of records to these tables. So we get a support issue from a customer: they start Octopus, and their database file grows at tens or hundreds of MB per second until it fills up the disk. The database file becomes so large that they can't repair it; all they can do is restore from a backup. When we finally got a copy of one of these huge data files and explored it using some UI tools for ESENT, we found that these tables contained millions upon millions of records, just for 10,000 documents.

The RavenDB team realised this was a problem, because in 3.0 they added a new feature: if a map operation produces more than 15 output records, the document won't be indexed.

I mean, just read that paragraph again. You write some code, test it, and it works fine in development. You put it in production and it works fine there too, for everyone. And then you get a call from a customer: I just added a new process, and it's not appearing in the list. Only after many emails and a support call do you realise that it's because Raven decided that 15 is OK and 16 is not, and the item isn't being indexed. Your fault for not reading the documentation!

"Safe by default" is so painful in production

Raven has a "safe by default" philosophy, but the API makes it so easy to write "safe" code that breaks in production. For example:

session.Query<Project>().ToList();

Put this in production and you'll get a support call: "I just added my 129th project and it isn't showing up on screen". In order to protect you from the dreaded "unbounded result set" problem, Raven limits the number of items returned from any query (to 128 by default). Be thankful it wasn't this:

DeleteExcept(session.Query<Project>().Where(p => !p.KeepForever).ToList())

Unbounded result sets are bad, sure. But code that works in dev, and in production, until it suddenly behaves differently when the number of records changes, is much worse. If RavenDB believes in preventing unbounded result sets, it shouldn't let that query run at all - throw an exception when I do any query without calling .Take(). Make it a development problem, not a production problem.
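For what it's worth, the defensive pattern looks roughly like this (a sketch using the same session and Project document as the snippets above, not our production code): state the page size explicitly and page through the results so nothing is ever silently truncated.

// Page through all projects with an explicit page size so the 128-item
// default never silently truncates the results. Each page is a separate
// request, so Raven's 30-requests-per-session limit still applies.
const int pageSize = 128;
var projects = new List<Project>();
var page = 0;
while (true)
{
    var batch = session.Query<Project>()
        .Skip(page * pageSize)
        .Take(pageSize)
        .ToList();

    projects.AddRange(batch);
    if (batch.Count < pageSize) break;
    page++;
}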

You can only do 30 queries in a session. Result sets are bounded*. Only 15 map results per item being mapped. When you work with Raven, you need to keep these limits in mind every single time you interact with it, or you'll regret it.

These limits are clearly documented, but you'll forget about them. You only become aware of them when something strange happens in production and you go searching. Despite two years of production experience with Raven, these opinionated limits still bite us. It frustrates me to see posts like this come out, advocating solutions that will actively break in production if anyone tries them.

Conclusion

RavenDB is great for development. Maybe the problems we're experiencing are our fault. All databases have their faults, and perhaps this is a case of the grass being greener on the other side. Switching to SQL Server might seem like a step backwards, and might make development harder, but at this point I do feel we will have fewer problems in production with SQL Server. It has been around for a long time, and its pitfalls are at least well known and predictable.

That's enough about why we're leaving RavenDB. Next week I'll share some details about how we plan to use SQL Server in Octopus 3.0.

(*) You can disable the unbounded result set protection by telling it to return unlimited items, if you know where to turn it off. But you still have to explicitly call .Take(int.MaxValue) every single time you write a query.

%TEMP% has different values for a Windows Service running as Local System

You probably already know that environment variables can be defined at either machine scope or user scope. The value at user scope typically overrides the value defined at machine scope.
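A quick way to see the scopes from C# (a minimal console sketch; TEMP is just the variable this post happens to care about) is to read the same variable at each scope:

using System;

class Program
{
    static void Main()
    {
        // What the current process actually sees (user scope normally wins)
        Console.WriteLine("Process: " + Environment.GetEnvironmentVariable("TEMP"));

        // The value defined at user scope
        Console.WriteLine("User:    " + Environment.GetEnvironmentVariable("TEMP", EnvironmentVariableTarget.User));

        // The value defined at machine scope
        Console.WriteLine("Machine: " + Environment.GetEnvironmentVariable("TEMP", EnvironmentVariableTarget.Machine));
    }
}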

Environment variables

However, there's a special case for Windows Services that run as the SYSTEM account. Given the following Windows Service:

using System;
using System.IO;
using System.ServiceProcess;

public partial class Service1 : ServiceBase
{
    public Service1()
    {
        InitializeComponent();
    }

    protected override void OnStart(string[] args)
    {
        File.WriteAllText("C:\\Temp\\Service.txt", 
            "Temp:        " + Environment.GetEnvironmentVariable("Temp") + Environment.NewLine +
            "Temp (User): " + Environment.GetEnvironmentVariable("Temp", EnvironmentVariableTarget.User) + Environment.NewLine);
    }
}

When the service runs as my user account, I get what I'd expect:

Temp:        C:\Users\Paul\AppData\Local\Temp
Temp (User): C:\Users\Paul\AppData\Local\Temp

However, run the service as the built-in SYSTEM (Local System) account, and you get different behavior:

Temp:        C:\WINDOWS\TEMP
Temp (User): C:\WINDOWS\system32\config\systemprofile\AppData\Local\Temp

It appears that for Windows Services running under the SYSTEM account, even though there's a user-scoped TEMP variable, a different %TEMP% is used.

This caused a bug in yesterday's 2.6 pre-release, because we added a feature to automatically update environment variables prior to each script run (in case you've changed environment variables and don't want to restart the Tentacle Windows Service). Of course, no good deed goes unpunished :-)

I can't find any documentation on this behavior, but environment variables are inherited by processes from their parent. Services are owned by services.exe, which is owned by wininit.exe. Looking at wininit.exe in Process Explorer, its environment variables set TEMP to C:\Windows\TEMP. My guess is that this is a backwards-compatibility measure for old Windows Services that relied on using C:\Windows\TEMP.

(We'll release a patch to 2.6 tomorrow with a fix for this)

What's new in Octopus Deploy 2.6

Octopus Deploy 2.6 is now in pre-release! For those who like to live on the edge, you can Download the Octopus Deploy 2.6 pre-release. And who wouldn't want to live on the edge when this release contains so many new features? Here are the highlights:

  • Lifecycles to control promotion and automate deployments
  • Automatic release creation from NuGet push
  • Running steps in parallel
  • Up to 5x faster package uploads
  • Skipping offline machines

Lifecycles

This heading just does not have enough fanfare for this feature. Imagine balloons springing out, trumpeters trumpeting, and confetti cannons making a mess everywhere.

Okay I will stop, but I do love this feature!

Lifecycles main

Lifecycles let you specify and control the progression of deployments to environments. Not only can you order environments for deployment, you can also:

  • set environments to auto deploy when they are eligible for deployment
  • gate your workflow to be sure that N QA environments have been deployed to before moving on
  • group multiple environments into a single stage
  • deploy to more than one environment at a time

Yep, deploy to more than one environment at a time!

A Lifecycle consists of phases and retention policies. Let's start with phases.

Lifecycle Phases

Lifecycle phases

A Lifecycle can consist of many phases, and each phase can contain many environments. You can gate each phase to ensure that N environments have been deployed to before the next phase becomes eligible for deployment.

Lifecycle automagic

When adding an environment to a phase, you choose whether it will deploy manually or automatically. Environments set to automatic release will begin deploying as soon as their phase in the deployment chain is reached.

Lifecycles and Retention Policies

Lifecycles have their own retention policies. Each Lifecycle has an overall retention policy that every phase inherits, but you can override it per phase.

Lifecycle Retention Policy 1

This means that for development, where you might have 1300 releases a week, you can set a very strict retention policy that deletes all but the last 3 releases, while for production you can keep everything forever. Or somewhere in the middle, if your needs aren't so extreme.

Lifecycles and Projects

Lifecycles are assigned to projects via the Process screen.

Lifecycles project process

You might notice that your Projects Overview screen has had a bit of an overhaul.

Lifecycles project overview

It now displays your most recent releases, where each one is at per environment, and the relevant deployment or promotion buttons. You can see the current and previous deployments at a glance: solid green is your most recent deployment, medium faded green is your previous deployment, and light faded green is everything else.

Lifecycles, Releases, Deployments, and Promotions

Lifecycles releases

The release page now gives you a graphical tree showing where a deployment currently is: which phase it's in and what's deploying. You may notice a few things here. The deploy/promote button got a bit smarter - it knows what's next in the chain. It also allows you to deploy to any environment that has already been deployed to.

Lifecycles two at once

You can now release to multiple environments at the click of one button. Yep.

Lifecycles multiple environments

Or you can use this select box, and choose them all!

Lifecycles smart promote

And when you have finished a deployment, the promote button knows what's next and gives you the option to promote to the next environment.

Lifecycle deployment

The deployment screen got a little simpler too - it's much easier to find the Deploy Now button. But don't worry, everything is still available under Advanced, and if you're so inclined, you can tell it to always show the advanced settings.

Lifecycles and Blocking Deployments

If you have a bad release that has just done some bad stuff (tm) to your QA server, you might want to block that release from being promoted further down the chain until the issue is resolved. Now you can block a deployment.

block deployment

Step 1: block the deployment, giving a reason.

show blocked deployment

On your release screen you can see that the promote button has vanished and your Lifecycle deployment tree has a red icon for those environments it cannot promote to.


You will also now see a warning marker on your overview, and you will no longer see promotion buttons for that release - in fact, all promotion buttons are gone for it. At this point you can only deploy to environments that have already been deployed to. Unblocking the release when the issue is resolved gives you back full access to deploy it.

Lifecycles and Automatic Release Creation

Yes, there is still more!

Lifecycles automagic settings

On the project process screen, you can define the name of a package that, when pushed or uploaded to the internal repository, will automatically create a release for you.

lifecycles automagic create

And if the first environment in the Lifecycle is set up to deploy automatically, this means you can push a NuGet package to the internal repository and have it automatically create a release and deploy it! We're looking at you, TFS users!

As you can see, Lifecycles has its hands in many areas of Octopus. It's a large feature, and one we're very proud of. We've used this opportunity to listen to your feedback and suggestions on UserVoice and add even more value. We really hope you like it as much as we do!

Run Steps in Parallel

Another feature in 2.6 allows you to set up multiple project steps to run in parallel.

Project step trigger

You can set a project step to run in parallel with the previous step.

project step together

The process page has been updated to show these steps grouped together. If they run on the same machine, they will still queue unless you configure the project to allow multiple steps to run in parallel.

Retention Policies have moved

As seen above, retention policies have moved into Lifecycles. You will no longer find a Retention Policy tab under Configuration, and retention policies can no longer be set for Project Groups. That leaves the retention policy for the internal package repository.

repository retention settings

This has been moved to where the packages live, under Library -> Packages.

Package Upload Streaming

In 2.6, when Octopus downloads a package and sends it to Tentacles, it now does so via streaming. We have seen up to a 5x speed increase. This also reduces the memory overhead Tentacle previously used to store package chunks before saving them.

SNI Support

As well as fixing up some issues with SSL bindings, we added SNI support.

SNI support

Skip Offline Machines

Currently, when deploying to very large environments, offline machines can get in your way. We now give you the ability to continue with the deployment but skip the offline machines.

show offline machines

We now show offline machines (displayed in red) on the deploy screen. This allows you to go back and check the machine's connection, or you can use the "ignore offline machines" feature.

ignore offline machines

This will automatically include all machines except the offline ones.

This ends the tour of what's new in 2.6. We've only mentioned the big features in this release, but there were quite a few smaller changes and bug fixes as well, so please check the release notes for details on those smaller items. We hope you are as excited about Lifecycles as we are!

Download the Octopus Deploy 2.6 Pre-release now!

Invoking an executable from PowerShell with a dynamic number of parameters

Calling an executable from PowerShell is easy - most of the time, you just put an & in front. To illustrate, let's take this C# executable:

static void Main(string[] args)
{
    for (int i = 0; i < args.Length; i++)
    {
        Console.WriteLine("[" + i + "] = '" + args[i] + "'");
    }
}

If we call it like this:

& .\Argsy.exe arg1 "argument 2"

We get:

[0] = 'arg1'
[1] = 'argument 2'

PowerShell variables can also be passed to arguments:

$myvariable = "argument 2"
& .\Argsy.exe arg1 $myvariable

# Output:
[0] = 'arg1'
[1] = 'argument 2'

Note that the value of $myvariable contained a space, but PowerShell was smart enough to pass the whole value as a single argument.

This gets tricky when you want to conditionally or dynamically add arguments. For example, you might be tempted to try this:

$args = ""
$environments = @("My Environment", "Production")
foreach ($environment in $environments) 
{
    $args += "--environment "
    $args += $environment + " "
}

& .\Argsy.exe $args

However, you'll be disappointed with the output:

[0] = '--environment My Environment --environment Production '

The right way

The way to do this instead is to create an array. You can still use the += syntax in PowerShell to build the array:

$args = @() # Empty array
$environments = @("My Environment", "Production")
foreach ($environment in $environments) 
{
    $args += "--environment"
    $args += $environment
}
& .\Argsy.exe $args

Which outputs what we'd expect:

[0] = '--environment'
[1] = 'My Environment'
[2] = '--environment'
[3] = 'Production'

You can also mix regular strings with arrays:

& .\Argsy.exe arg1 "argument 2" $args

# Output:
[0] = 'arg1'
[1] = 'argument 2'
[2] = '--environment'
[3] = 'My Environment'
[4] = '--environment'
[5] = 'Production'

Edge case

There's a very odd edge case to what I said above about passing a single string with all the arguments. Take this example, which is similar to the one above:

$args = "--project Foo --environment My Environment --environment Production"
& .\Argsy.exe $args

# Output: 
[0] = '--project Foo --environment My Environment --environment Production'

Put quotes around just the first argument, though, and the behaviour changes completely! (The backticks are PowerShell's escape characters.)

$args = "`"--project`" Foo --environment My Environment --environment Production"
& .\Argsy.exe $args

# Output: 
[0] = '--project'
[1] = 'Foo'
[2] = '--environment'
[3] = 'My'
[4] = 'Environment'
[5] = '--environment'
[6] = 'Production'

The behaviour doesn't change if it's not the first argument that's quoted:

$args = "--project `"Foo`" --environment My Environment --environment Production"
& .\Argsy.exe $args

# Output: 
[0] = '--project Foo --environment My Environment --environment Production'

Ahh, PowerShell. Always full of surprises!

Dynamically setting TeamCity version numbers based on the current branch

When you are using TeamCity to build a project with multiple branches, it's desirable to have different build numbers depending on the branch. For example, instead of simple TeamCity build numbers like 15, 16, and so on, you might have:

  • Branch master: 1.6.15
  • Branch release-1.5: 1.5.15 (major/minor build from branch name)
  • Branch develop: 2.0.15 (different minor build)
  • Branch feature-rainbows: 2.0.15-rainbows (feature branch as a tag)

Here's how it looks:

TeamCity builds with build numbers based on the branch

Handling a branching workflow like GitFlow, and using these version formats, turns out to be pretty easy with TeamCity, and in this blog post I'll show you how. Your own versioning strategy is likely to be different, but hopefully this post will get you started.

Background

First, there are two built-in TeamCity parameters that we care about:

  • build.counter - this is the auto-incrementing build counter (15 and 16 above)
  • build.number - this is the full build number. By default it is %build.counter%, but it can be more complicated

The format of build.number and the value of build.counter are defined in the TeamCity UI:

Build number and build counter in TeamCity

However, you can also set it dynamically during the build, using service messages. That is, your build script can write the following text to stdout:

##teamcity[buildNumber '1.1.15']

This will override the build number, and the new value will then be passed to the rest of the steps in the build.

Putting it together

Depending on whether the branch name is master or develop, we will use different major/minor build numbers. To do this, we're going to define two parameters in TeamCity. These need to be "system" parameters in TeamCity so that they are available to build scripts.

Adding the major/minor build number parameters

To dynamically set the build number based on the branch name, I'm going to add a PowerShell script step as the first build step in my build:

Using a PowerShell script build step to set the build number

Finally, here's the PowerShell script:

# These are project build parameters in TeamCity
# Depending on the branch, we will use different major/minor versions
$majorMinorVersionMaster = "%system.MajorMinorVersion.Master%"
$majorMinorVersionDevelop = "%system.MajorMinorVersion.Develop%"

# TeamCity's auto-incrementing build counter; ensures each build is unique
$buildCounter = "%build.counter%" 

# This gets the name of the current Git branch. 
$branch = "%teamcity.build.branch%"

# Sometimes the branch will be a full path, e.g., 'refs/heads/master'. 
# If so we'll base our logic just on the last part.
if ($branch.Contains("/")) 
{
  $branch = $branch.substring($branch.lastIndexOf("/")).trim("/")
}

Write-Host "Branch: $branch"

if ($branch -eq "master") 
{
 $buildNumber = "${majorMinorVersionMaster}.${buildCounter}"
}
elseif ($branch -eq "develop") 
{
 $buildNumber = "${majorMinorVersionDevelop}.${buildCounter}"
}
elseif ($branch -match "release-.*") 
{
 $specificRelease = ($branch -replace 'release-(.*)','$1')
 $buildNumber = "${specificRelease}.${buildCounter}"
}
else
{
 # If the branch starts with "feature-", just use the feature name
 $branch = $branch.replace("feature-", "")
 $buildNumber = "${majorMinorVersionDevelop}.${buildCounter}-${branch}"
}

Write-Host "##teamcity[buildNumber '$buildNumber']"

Now that %build.number% is based on the branch, your TeamCity build has a consistent build number that can then be used in the rest of your build steps. If you are using OctoPack, for example, the build number can be used as the value of the OctoPackPackageVersion MSBuild parameter so that your NuGet packages match the build number.

Azure VM extension for Octopus Deploy

Today ScottGu announced that the Octopus Deploy Tentacle agent is now available as an extension for Azure VMs:

Octopus simplifies the deployment of ASP.NET web applications, Windows Services and other applications by automatically configuring IIS, installing services and making configuration changes. Octopus integration of Azure was one of the top requested features on Azure UserVoice and with this integration we will simplify the deployment and configuration of octopus on the VM.

Of course, even before this extension, you could always install Tentacle either manually or automatically via scripts. The extension just puts a pretty UI around that. Under the hood, the extension uses our open source PowerShell DSC resource for Tentacles.

The extension on Azure

Why Tentacles on Azure VMs?

There are many different ways to host applications on Microsoft Azure: websites, cloud services, or as regular .NET applications running on a virtual machine.

When you provision a VM on Azure, out of the box you get a running operating system, a remote desktop connection, and a PowerShell remoting connection. And that's about it. If you want to deploy, configure and re-deploy applications on the machine, you'll either need to do it manually, or write custom scripts to copy files, update configuration files, and so on.

Of course, these are all problems that Octopus Deploy solves, and solves well. By adding the Tentacle agent to your Azure VM, you can immediately start deploying to it just like any other machine in Octopus.

For more information on using the extension, or adding the extension from the command line via PowerShell, check out our documentation.

Docker on Windows and Octopus Deploy

Today, the Gu announced that Microsoft is partnering with Docker to bring Docker to Windows.

Microsoft and Docker are integrating the open-source Docker Engine with the next release of Windows Server. This release of Windows Server will include new container isolation technology, and support running both .NET and other application types (Node.js, Java, C++, etc) within these containers. Developers and organizations will be able to use Docker to create distributed, container-based applications for Windows Server that leverage the Docker ecosystem of users, applications and tools.

How exciting! I've spent the last few hours drilling into Docker and what this announcement might mean for the future of .NET application deployments. Here are my thoughts so far.

Containers vs. virtual machines

Apart from Scott's post, I can't find much information about the container support in Windows Server, so I'll prefix this by saying that this is all speculation, purely on the assumption that Windows containers will work similarly to Linux containers.

Once upon a time, you'd have a single physical server running IIS with a hundred websites. Now, with the rise of virtualization and cloud computing, we tend to have a single physical server running dozens of VMs, each of which runs a single application.

Why do we do it? It's really about isolation. Each application can run on different operating systems, have different system libraries, different patches, different Windows features (e.g., IIS installed), different versions of the .NET runtime, and so on. More importantly, if one application fails so badly that the OS crashes, or the OS needs to restart for an update, the other applications aren't affected.

In the past, we'd start to build an application on one version of the .NET framework (say, 3.5), only to be told there's no way anyone is putting 3.5 on the production server, because there are 49 other applications on that server using 3.0 that might break, and it would take forever to test them all. Virtualization has saved us from these restrictions.

From a deployment automation perspective, a build server compiles code, and produces a package ready to be deployed. The Octopus Deploy server pushes that package to a remote agent, the Tentacle, to deploy it.

Deployment today with Octopus on virtual machines

So, isolation is great. But the major downside is that each physical server ends up running many copies of the same OS kernel - a real shame, since that OS is a server-class OS designed for multitasking. In fact, assuming you run one main application per virtual machine, your physical box is actually running more operating systems than it is primary applications!

Containers are similar, but different: there's just one kernel, but each container remains relatively isolated from the others. There's plenty of debate about just how secure containers are compared to virtual machines, so VMs might always be preferred when completely different customers share the same hardware. However, assuming a basic level of trust exists, containers are a great middle ground.

The What is Docker page provides a nice overview of why containers are different to virtual machines. I've not seen much about how the containers in Windows Server will work, but for this post I'll assume they'll be pretty similar.

Where Docker fits

Docker provides a layer on top of these containers that makes it easier to build images to run in containers, and to share those images. Docker images are defined using a text-based Dockerfile, which specifies:

  • A base OS image to start from
  • Commands to prepare/build the image
  • Commands to call when the image is "run"

For a Windows Dockerfile, I imagine it will look something like:

  • Start with Windows Server 2014 SP1 base image
  • Install .NET 4.5.1
  • Install IIS with ASP.NET enabled
  • Copy the DLLs, CSS, JS and other files for your ASP.NET web application
  • Configure IIS application pools etc. and start the web site

Since it's just a small text file, your Dockerfile can be committed to source control. From the command line, you then build an "image" (i.e., execute the Dockerfile), which will download all the binaries and create a disk image that can be run later. You can then run instances of that image on different machines, or share it with others via Docker Hub.

The big advantage of Docker and containers isn't just the memory/CPU savings, but that the application you test in your test environment is far more likely to work in production, because it's configured exactly the same way - it is exactly the same image. This is a really good thing, taking the principle of building your binaries once to the extreme.

What it means for Octopus

First up, remember that Octopus is a deployment automation tool, and we're especially geared for teams that are constantly building new versions of the same application. E.g., a team building an in-house web application on two-week sprints, deploying a new release of the application every two weeks.

With that in mind, there are a few different ways that Docker and containers might be used with Octopus.

Approach 1: Docker is an infrastructure concern

This is perhaps the most basic approach. The infrastructure team would maintain Dockerfiles, and build images from them and deploy them when new servers are provisioned. This would guarantee that no matter which hosting provider they used, the servers would have a common baseline - the same system libraries, service packs, OS features enabled, and so on.

Instead of including the application as part of the image, the image would simply include our Tentacle service. The result would look similar to how Octopus works now, and in fact would require no changes to Octopus.

Octopus/Tentacle in a world of Docker

This has the benefit of making application deployments fast - we're just pushing the application binaries around, not whole images. And it still means the applications are isolated from each other, almost as if they were in virtual machines, without the overhead. However, it does allow for cruft to build up in the images over time, so it might not be a very "pure" use of Docker.

Approach 2: Build a new image per deployment

This approach is quite different. Instead of having lots of copies of Tentacle, we'd just need one on the physical server. On deployment, we'd create new images and run them in Docker.

  1. Build server builds the code, runs unit tests, etc. and creates a NuGet package
  2. Included in the package is a Dockerfile containing instructions to build the image
  3. During deployment, Octopus pushes that NuGet package to the remote machine
  4. Tentacle runs docker build to create an image
  5. Tentacle stops the instance if it is running, then starts the new instance using the new image

The downside of this is that since we're building a different image each time, we're losing the consistency aspect of Docker; each web server might end up with a slightly different configuration depending on what the latest version of various libraries was at the time.

On the upside, we do gain some flexibility. Each application might have different web.config settings etc., and Octopus could change these values prior to the files being put in the image.

Approach 3: Image per release

A better approach might be to build the Docker image earlier in the process, like at the end of the build, or when a Release is first created in Octopus.

Docker images in Octopus

  1. Build server builds the code, runs unit tests, etc.
  2. Build server (or maybe Octopus) runs docker build and creates an image
  3. The image is pushed, either to Octopus or to Docker Hub
  4. Octopus deploys that image to the remote machine
  5. Tentacle stops the instance if it is running, then starts the new instance using the new image

This approach seems to align best with Docker, and provides much more consistency between environments - production will be the same as UAT, because it's the exact same image running in production as was running in UAT.

There's one catch: how will we handle configuration changes? For example, how will we deal with different connection strings or API keys in UAT vs. production? Keep in mind that these values tend to change at a different rate than the application binaries or other files that would be snapshotted in the image.

In the Docker world, these settings seem to be handled by passing environment variables to docker run when the instance of the image is started. And while Node or Java developers might be conditioned to use environment variables, .NET developers rarely use them for configuration - we expect to get settings from web.config or app.config.
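If we did go down this path, I'd expect .NET applications to need a small shim like the following (purely a sketch of the idea; the environment variable name and connection string key are made up), preferring an environment variable when one is set and falling back to web.config/app.config otherwise:

using System;
using System.Configuration;

static class AppSettings
{
    // Prefer the Docker-style environment variable if present; otherwise fall
    // back to the connection string in web.config/app.config.
    public static string GetConnectionString()
    {
        var fromEnvironment = Environment.GetEnvironmentVariable("APP_CONNECTION_STRING"); // hypothetical name
        if (!string.IsNullOrEmpty(fromEnvironment))
            return fromEnvironment;

        return ConfigurationManager.ConnectionStrings["Default"].ConnectionString;         // "Default" is a placeholder
    }
}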

There's some other complexity too; at the moment, when deploying a web application, Octopus deploys the new version side-by-side with the old one, configures it, and then switches the IIS bindings, reducing the overall downtime on the machine. With Docker, we'd need to stop the old instance, start the new one, then configure it. Unless we build a new image with different configuration each time (approach #2), downtime is going to be tricky to manage.

Would Octopus still add value?

Yes, of course! :-)

Docker makes it extremely easy to package an application and all the dependencies needed to run it, and the containers provided by the OS make for great isolation. Octopus isn't about the mechanics of a single application/machine deployment (Tentacle helps with that, but that's not the core of Octopus). Octopus is about the whole orchestration.

Where Octopus provides value is for deployments that involve more than a single machine, or more than a single application. For example, prior to deploying your new web application image to Docker, you might want to back up the database. Then deploy to just one machine, and pause for manual verification, before moving on to the rest of the web servers. Finally, deploy another Docker image for a different application. The order of those steps is important, and some run in parallel while some are blocking. Octopus will provide those high-level orchestration abilities, no matter whether you're deploying NuGet packages, Azure cloud packages, or Docker images.

Future of Azure cloud service projects?

Speaking of Azure cloud packages, will they even be relevant anymore?

There's some similarity here. With Azure, there are websites (just push some files, and they're hosted for you on existing VMs), or you can provision entire VMs and manage them yourself. And then in the middle there are cloud services - web and worker roles - which involve provisioning a fresh VM for every deployment, and rely on the application and OS settings being packaged together. To be honest, in a world of Docker on Windows, it's hard to see much use remaining for these kinds of packages.

Conclusion

This is a very exciting change for Windows, and it means that some of the other changes we're seeing in Windows start to fit together. Docker leans heavily on other tools in the Linux ecosystem, like package managers, to configure the actual images. In the Windows world that didn't exist until very recently, with OneGet. PowerShell DSC will also be important, although I do feel that the syntax is still too complicated for it to gain real adoption.

How will Octopus fit with Docker? Time will tell, but as you can see we have a few different approaches we could take, with #3 being the most likely (#1 being supported already). As the next Windows Server with Docker gets closer to shipping we'll keep a close eye on it.

SSL 3.0 "POODLE" and Octopus Deploy

There's a newly discovered security vulnerability named POODLE:

The attack described above requires an SSL 3.0 connection to be established, so disabling the SSL 3.0 protocol in the client or in the server (or both) will completely avoid it. If either side supports only SSL 3.0, then all hope is gone, and a serious update required to avoid insecure encryption. If SSL 3.0 is neither disabled nor the only possible protocol version, then the attack is possible if the client uses a downgrade dance for interoperability.

As discussed in our post on Heartbleed and Octopus Deploy, we use the .NET framework's SslStream class to set up a secure connection whenever the Octopus Deploy server and Tentacle deployment agents communicate.

When creating an SslStream, you specify the protocols to use. .NET 4.0 supports SSL 2.0, 3.0, and TLS 1.0. .NET 4.5 supports SSL 2.0, 3.0, and TLS 1.0, 1.1 and 1.2.

Interestingly, the default protocol value (in both .NET 4.0 and 4.5) is Tls | Ssl3. In other words, TLS 1.0 is preferred, but if the client/server only supports SSL 3.0, then it will fall back to that. As discussed in the paper, this is a problem even if your client/server support TLS, since an attacker could force a downgrade.

But there's good news - in Octopus, when we construct our SslStream, we're explicit about the protocol to use: we limit the connection to TLS 1.0 (Octopus runs on .NET 4.0, so we can't do TLS 1.1/1.2 yet). Since we control both the client and server, we don't need to worry about falling back to SSL 3.0, so we don't allow it.
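On the client side, that looks something like the following (a simplified sketch rather than our actual code; the host name and port are placeholders, and the real implementation validates the certificate thumbprint rather than accepting anything):

using System.Net.Security;
using System.Net.Sockets;
using System.Security.Authentication;

// Connect and refuse to negotiate anything other than TLS 1.0, instead of
// relying on the default (Tls | Ssl3), which allows a downgrade to SSL 3.0.
var client = new TcpClient("octopus.example.com", 10933);
var ssl = new SslStream(client.GetStream(), false,
    (sender, certificate, chain, errors) => true);   // thumbprint check goes here in real code
ssl.AuthenticateAsClient("octopus.example.com", null, SslProtocols.Tls, false);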

We've actually been doing this for a long time now; in January 2013 we published an open source project called Halibut, which was a prototype that eventually morphed into the communication stack we use between Octopus and Tentacle. Even back then we were specific about only supporting TLS:

ssl.AuthenticateAsServer(serverCertificate, true, SslProtocols.Tls, false);

Things are a little different with the Octopus web portal (the HTML web front end used to manage your Octopus server). The portal is hosted on top of HTTP.sys, the kernel-mode driver behind IIS. Out of the box the portal uses HTTP, but you can configure it to be available over HTTPS if you prefer.

From what I understand, IIS and HTTP.sys use whatever protocols are supported by SChannel, which means they'll allow SSL 3.0. It looks like a registry change is necessary to disable SSL 3.0 in SChannel in order to prevent IIS/HTTP.sys from using it.

Microsoft also have a security advisory that uses Group Policy to disable SSL 3.0, but it seems focussed on Internet Explorer and not IIS.

TL;DR: Octopus/Tentacle communication isn't affected by POODLE. The web portal (if you expose it over HTTPS) might be, just as any other site you serve over HTTPS using IIS might be.

As always, Troy Hunt has a good write up on the POODLE bug.