Service Deployment

Deployment is one of those tasks that can often be left late in the development lifecycle, though it is a non-trivial problem. The adoption of continuous integration as part of an agile approach encourages the deployment aspects to be undertaken along side the development so that at the end of each sprint, the stakeholder has an installable piece of software delivered. When creating a service orientated architecture the deployment problem increases in complexity. Gone are the days of a SQL script for the database server and an installer for the client machine. Now there are often tens of servers interacting in a medium scale solution, often in a web or application server farm to provide both resilience and scale out capabilities. Almost two years ago I took a step back and looked at how we were deploying software and saw that there had to be a better way. We were installing early versions of the Golden Gate software onto customer sites and experiencing a lot teething problems getting the system running. Often the problems were due to the servers not having the required pre-requisites installed such as the .NET framework, they did not have the correct services running and so on. In an attempt to document the installation process we ended up with an installation guide that was rapidly approaching 100 pages. There had to be a better way…

Environment and Role Manifests
I’m on occasion reminded that I’m primarily paid to think so I took a deep breath and started to think about the problem. What would the ideal situation be? The first, and in many ways the biggest, realization is that we wanted to treat the deployment of the whole system as a unit of work. We wanted to allow an administrator to define where they wanted our software to be deployed into their site and then they simply click ‘go’.

The definition of the system would include a list of the servers they wanted to use and the roles they wanted the server to perform. Windows Server has the concept of a role, when setting up a new installation you choose what you want the server to do; is it the active directory controller, is it an application server, is it a file server, is it a web server? Depending upon which roles you allocate, different features are available. Some roles are incompatible on the same server, some roles are dependent upon other roles being satisfied by other servers. The role concept was something we also required as we had a number of different server components: configuration, security, workflow, messaging and application services. Each component was a unit of deployment, a server could be allocated the workflow role for example, which contained a number of services such as instance management and task management . We did not want to have to walk / remote onto to each server and perform an installation, we wanted a central process co-ordinate and manage the installation across all of the servers.

We needed a collective term for the definition of a complete deployment and in the end I chose the term environment. This came from my days working for an internet bank where we had a strictly defined set of staging platforms (environments) that code had to work its way through on the way to production; integration test, system test, user acceptance test, pre production. The environment is the root level object in a system deployment and contains information such as the environment name, the list of servers to install to, common file locations such as the install directory and others. A firm is expected to have multiple environments, as a minimum: development, test and production/live.

The concepts of the environment and the role are similar to the two manifests that ClickOnce uses to control client installations: the publisher manifest and the application manifest. The publisher manifest is owned by the company that is running the software and it includes information specific to them such as the installation URL. The application manifest is owned the the company who authored the software and includes all of the files required on the client to run the software (amongst other details). In fact I drew a lot of inspiration from ClickOnce, what we wanted was a ClickOnce mechanism for the server deployment. ClickOnce is driven from the two XML manifest files that declare what is required, these are given to the ClickOnce engine to action and the deployment takes place. I’m a big fan of both declarative programming and modeling so I wanted a deployment model that could be actioned. This was 12 months before all the excitement around Oslo and DSLs flared up (and then died down again). We had seen that both WPF and WF worked well as XAML driven runtimes (in .NET 3.X) and so the basic concepts of a deployment model and runtime took shape.

In summary an environment contains a mapping of servers to roles. A role represents an installable server component. Both the environment and role details are captured as manifest files which can be described in XML.

Environment Manifest
The environment manifest is quite simple and most easily explained with an example:

<environment    name="Local" 
  <expertDatabaseServer serverName="" serverInstance="">
    <databaseConnection     databaseName="Expert" 
                            password="eo4G3S2KLO05EzgQb3Q==" />
    <server name="" 
            skipPrerequisitesCheck="false" servicesWebsite="Default Web Site">
        <role type="configuration"/>
        <role type="customworkflows"/>
        <role type="employeeIntake"/>
        <role type="fileopening"/>
        <role type="identity"/>
        <role type="messaging"/>
        <role type="queryservice"/>
        <role type="security"/>
        <role type="workflow">
            <roleParameter name="defaultSmtpHost" value="" />
            <roleParameter name="defaultSmtpPort" value="25" />
            <roleParameter name="defaultFromEmailAddress" value="" />

This example manifest captures the environment details specific to the installing firm such as the server names, database details, installation source and so on. In this simple example only one application server is specified for brevity, which runs all of the roles. In reality there would be multiple servers listed each running the roles in a load balanced configuration.

Role Manifest
A role manifest defines the pre-requisites, the files and the services deployed as a unit.

Prerequisite Checking
As mentioned, the first problem we hit during a deployment was pre-requisites. How could we be sure that a server was capable of running our software? There were a number of aspects to this:
• was a supported OS installed
• were the correct operating system components installed
• were third party dependencies met
• were the correct supporting services running
• were the components correctly configured

The pre-requisites vary by component so in the role definition we have a section of checks that must all pass before the deployment can proceed. One of the first examples we saw was that the Microsoft Distributed Transaction Co-ordinator (MSDTC) was not enabled on many of the servers. If it was enabled, then the configuration was incorrect and the machine would not accept remote transactions. For Windows Services, the service control manager (SCM) can be queried to find the state of a service and the registry contained the configuration keys for the component settings. The big problem here was the poor support for remote processes in Windows, coming from a UNIX background it has always frustrated me. At the time Powershell v1 was full of promise but it did not support remote sessions, that was coming in v2. Powershell v2 was a CTP and did not look like it would be ready in time. While a number of shell commands have built-in support for running against a remote machine, there were enough gaps, version incompatibilities between 2003 and 2008 or performance issues that in the end I wrote a Windows service that would perform the checking. Using an xcopy deployment and the SC command it is possible to remotely deploy, register and start a Windows service. This service accepts a list of pre-requisite to check and returned a list of results: pass or fail. The pre-requisites required by a role are defined within the role manifest, examples are:

 description="MSDTC configured to allow remote access." />

      description="Ensure Windows Remote Management (WS-Management) service is available" />

Required Files
A role contains a list of the files required to be installed on the server and where the files need to go. An installation of Expert has a root directory specified by the installing administrator and then the structure is fixed under that:

Each file to be copied is captured in a files section in the role manifest, an example is:

    <file   filename="Aderant.Framework.Notes.dll"
            targetRelativePath="LegacyServices" />
    <file   filename="Aderant.Framework.Notes.Presentation.dll"
            targetRelativePath="LegacyServices" />
    <file   filename="Aderant.Framework.Notes.Services.dll"
            targetRelativePath="LegacyServices" />

In order to be flexible, the file specification allows the source and target paths to be specified as well as the source and target filenames. This allows us to perform any manipulation of the file structure that we need to.

In Golden Gate SP1 we support host services either as Windows Services under the SCM or in IIS under AppFabric. We are in the process of moving all of our services to AppFabric/IIS however this is not yet complete. Therefore a role manifest may contain a section for Windows Services:

  <serviceHost exeName="Expert.Notes.Service"
             displayName="ADERANT Notes Services ({{Name}} instance)"
             description="Host for Notes Services for the {{Name}} environment."
      <service name="Notes"
               serviceName="ADERANT Notes Service"
               port="[[notesServicePort]]" />

and AppFabric hosted services:

      <applicationPool name="[[workflowApplicationPool]]"
                       netVersion="V4.0" />
      <applicationPool name="[[workflowApplicationPool]]"
                       netVersion="V4.0" />
        allowWindowsAuthentication="true" />

In both cases, the information required to create an host a service is provided. For Windows based services we have a reusable service host exe, AppFabric extends IIS and WAS to provide the hosting.

Deployment Engine
Up to this point we really been looking at the deployment model and how it is captured in the two manifests. These manifests are just an XML serialization of a deployment model. When we load an environment we just map from the XML into an in memory object graph of the environment. We now need something to action the model, and this is the deployment engine.

The deployment engine itself is the coordinator that executes a number of deployment actions. A deployment action performs a piece of work required in a deployment, its interface is as follows:

namespace Aderant.Framework.Deployment.Actions {
    public interface IDeploymentAction: IDeploymentMessage {
        void Deploy(Environment environment);
        void Clean(Environment environment);
        void Validate(Environment environment);

The deployment engine supports a set of actions that can be performed to an environment. The three key actions are: deploy, remove (clean in the interface) and validate. When the deployment engine is asked to perform a ‘deploy’, it asks each of the deployment actions in turn to ‘deploy’. We have a library of around 30 deployment actions, examples are:

• AppFabricHostingAction
• FileDeploymentAction
• LoadBalancingConfigurationAction
• ServiceHostBuilderAction
• SQLScriptRunnerAction

Each action in turn knows how to deploy, remove and validate its role in a deployment. The validate action is very important, it allows an administrator to check to see if a pre-installed environment still meets the pre-requisites, still has the required files in place and has the required services up and running. For example it allows an administrator to easy see that a registry setting is no longer correctly set. The deployment actions in turn rely on a set of controller classes that interact with external components such as AppFabric, the file system, the Windows service manager, MSMQ and others. The separation of controller from the deployment actions also a high degree of code re-use as well as better unit testing.

While the deployment engine is currently C# code, it would be relatively easy to move it to a workflow. The deployment engine is a coordinator and therefore the control flow would be quite naturally captured as a workflow. The deployment actions would become an activity library.

As it stands the deployment engine is a command line utility, however it does have a WPF UI that calls through to it (in a very similar model to AppFabric calling the Powershell API from the IIS Manager add-in).

The environment manifest in the screenshot above shows a small load balanced environment being used to host multiple instances of our services.

The declarative deployment model and runtime is a good candidate for a DSL. In fact we prototyped a visual DSL using the Visual DSL toolkit for Visual Studio. This allowed an administrator to literally draw out the deployment diagram for an environment, which was then transformed via a T4 template into an environment XML file. This could then be executed via the deployment engine and used to deploy a full system.

Data over the Web

It’s been a little while since the last posting, in no small part due to my broadband quota exceeding the monthly allowance. Dial-up speed is just painful, and made me realize just how much I use the internet for media: music, movies, podcasts, blogs, … It was also a great reminder just how sensitive applications are when you have a constrained network connection.

One of the most significant changes made to the Expert architecture with SP1 is the introduction of a query service. Prior to SP1, the architectural layering required that data transfer objects (DTOs) were used to move data from a service boundary to the client. The domain model was mapped to whatever shape was required by the client requesting the entities.

Writing the DTO and mapper classes is very repetitive and quite dull and so it was automated using the Visual DSL Toolkit for Visual Studio (now renamed to the Visual Studio Visualization and Modeling SDK). A key component in the Expert framework is our software factory which builds code from 3 models: relational model, domain model and the view model. The view model provides a model and tooling to generate use case specific views of the domain model and the mappers required to transform from domain model to view model and back. An optimization we made when sending data back to the service to update the domain model was just to send back the changes. This required the view model to track any updates made to the model between the time it was fetched from a service and the time it was sent back to the service. The mechanism we wrote to achieve that is worth a few blog entries on its own and I’m going to skip over the details here.

One of the primary clients within this architecture is our workflow service which allows data from the business services to be managed within a long running workflow process before being updated back into the main line of business system. In the original Golden Gate release, the data associated with a workflow instance is sent out with every task within the workflow (a task is a workflow activity that requires human interaction). For very large workflow processes, this can be an issue, particularly over restricted network connections such as VPN or very remote sites. For SP1 we took a look at this particular areas and addressed it in the following ways:

• Tasks now have a data contract so that only the required data is sent.
• The way we fetch data is now via a dedicated query service rather than combining reads and write operations in the same service contract. The query service is http based and therefore can take advantage of out-of-the-box optimizations such as caching and compression.

The separation of query from command at the architectural level we found is currently being explored by a number of people, most vocally by Greg Young and Udi Dahan. The architectural pattern is Command Query Responsibility Segregation (CQRS) and is similar in spirit to the Command-Query separation CQS concept first discussed by Bertrand Meyer in Object Orientated Software Construction back in 1988. This is another topic worthy of blog posts and InfoQ has a great presentation from Greg Young.

Our query service implementation is a WCF Data Service which takes an Entity Framework 4 model and exposes it as a RESTful service.

All data required by a client is fetched via the query service and this is delivered over an http channel. The use of IIS and HTTP gives us the following:

• monitoring via AppFabric
• compression via the dynamic compression in IIS7
• caching using standard HTTP based caches
• cross platform capable data feed

The lifecycle for data is now:

There are a number of interesting aspects to this, not least of which we now use two different ORM technologies: NHibernate and Entity Framework. This is based on both historical use and feature set. Given the data model that we map to, we need the rich extension points available in NHibernate to support the desired object model. The WCF Data Services and EF4 features in .NET 4 / Visual Studio 2010 take most of the heavy lifting out of exposing a domain model via REST. Microsoft is now promoting an Open Data Protocol built on top of http/atom/json as a cross platform capable mechanism for interoperating with data over the web; an ODBC for the web perhaps. The Mix10 keynote included a chapter on OData and how Microsoft is tooling it.

Given that we have a software factory that already contains a domain model which includes the persistence mapping, we just needed to add additional T4 transforms to the software factory so that we could generate a query service and view model implementation from the existing domain model. Along the way we also simplified the change tracking approach, as the explicit task data contract reduced the potential for merge conflicts.

Now that we have a query service and Microsoft is doing all it can to promote OData as a cross platform solution, there are an interesting number of options opening up. One of which is the iPhone/iPad platform from Apple. As part of the OData SDK, Microsoft has released a library and tooling to make consuming OData feeds from the Apple platform straight forward. This includes a tool that generates objective-C classes from the meta data available from an OData stream (via the $metadata directive).