Co-ordinating deployments using the Parallel class in .NET 4.0
January 8, 2011
It’s been a long time since the last entry; the new year brings with it a fresh post based on some of the deployment work I’ve been looking at recently. This work has opened my eyes to the support for parallel co-ordination of work within .NET 4…
Recently I’ve been looking at the deployment approach we have for our services with an eye to reducing the time it takes for a full deployment. Two simple concepts leapt out: the first is to use a pull rather than a push model; the second is to deploy to all of the servers in parallel. This second point becomes increasingly important as more servers get involved in hosting the services.
Pull versus Push
One of the most basic operations performed by the deployment engine is the copying of files to the application servers that host the various services within our product. The file copying was originally implemented as a push: the deployment agent performs the copy to the target server using an administration share, e.g. \\appserver01.domain.com\d$\AderantExpert\Live\ . This requires the deployment engine to run with administrator privilege on the remote machines, which is not ideal.
An alternative is to send a script containing the copy commands to the target server; the target server is then responsible for pulling the files to its local storage from a network share (which can be secured appropriately). The deployment engine is responsible for creating the script from the deployment model and co-ordinating the execution of the scripts across the various application servers.
PowerShell remoting is a great option for the remote execution of scripts, and it’s quite straightforward to transform an object model into a PowerShell script using LINQ. I created a small script library class that provides common functions, for example:
internal class PowerShellScriptLibrary {
    internal static void ImportModules(StringBuilder script) {
        script.AppendLine("import-module WebAdministration");
        script.AppendLine("import-module ApplicationServer");
    }

    internal static void StopWindowsServices(string filter, StringBuilder script) {
        script.AppendLine("# Stop Windows Services");
        script.AppendLine(string.Format("Stop-Service {0}", filter));
    }

    internal static void CreateTargetDirectories(string rootPath, IEnumerable fileSpecifications, StringBuilder script) {
        script.AppendLine("# Create the required folder structure");
        fileSpecifications
            .Where(spec => !string.IsNullOrWhiteSpace(spec.TargetFile.TargetRelativePath))
            .Select(x => x.TargetFile)
            .Distinct()
            .ToList()
            .ForEach(targetFile => {
                string path = Path.Combine(rootPath, targetFile.TargetRelativePath);
                script.AppendLine(string.Format("if(-not(Test-Path '{0}'))", path));
                script.AppendLine("{");
                script.AppendLine(string.Format("\tNew-Item '{0}' -ItemType directory", path));
                script.AppendLine("}");
            });
    }
…
The library is then used to create the required script by calling the various functions. The examples below are for the patching approach, which allows updates to be installed without requiring a full remove and redeploy:
private string GenerateInstallScriptForPatch(Server server, IEnumerable filesToDeploy, Environment environment, string patchFolder) {
    StringBuilder powershellScript = new StringBuilder();

    PowerShellScriptLibrary.ImportModules(powershellScript);
    PowerShellScriptLibrary.StopWindowsServices("ADERANT*", powershellScript);
    PowerShellScriptLibrary.StopAppFabricServices(environment, powershellScript);
    PowerShellScriptLibrary.CreateTargetDirectories(server.ExpertPath, filesToDeploy, powershellScript);
    PowerShellScriptLibrary.CreatePatchRollback(server, patchFolder, filesToDeploy, powershellScript);
    PowerShellScriptLibrary.CopyFilesFromSourceToServer(environment, server, filesToDeploy, powershellScript);
    PowerShellScriptLibrary.UpdateFactoryBinFromExpertShare(server, environment.NetworkSharePath, powershellScript);
    PowerShellScriptLibrary.StartAppFabricServices(environment, powershellScript);
    PowerShellScriptLibrary.StartWindowsServices("ADERANT*", powershellScript);

    return powershellScript.ToString();
}
Though it is possible to treat NTFS as a transactional system (see http://msdn.microsoft.com/en-us/library/bb968806(v=VS.85).aspx ), and therefore have it participate in atomic actions, I didn’t walk this path. Instead I chose the compensation route: when the model is transformed into a script, I create both an install script and a compensate script, the latter being executed in the event of anything going wrong.
private string GenerateRollbackScriptForPatch(Server server, IEnumerable filesToDeploy, Environment environment, string patchFolder) {
    StringBuilder powershellScript = new StringBuilder();

    PowerShellScriptLibrary.ImportModules(powershellScript);
    PowerShellScriptLibrary.StopWindowsServices("ADERANT*", powershellScript);
    PowerShellScriptLibrary.StopAppFabricServices(environment, powershellScript);
    PowerShellScriptLibrary.RollbackPatchedFiles(server, patchFolder, filesToDeploy, powershellScript);
    PowerShellScriptLibrary.StartAppFabricServices(environment, powershellScript);
    PowerShellScriptLibrary.StartWindowsServices("ADERANT*", powershellScript);

    return powershellScript.ToString();
}
The scripts simply take a copy of the existing files that will be replaced before replacing them with the new versions. If anything goes wrong during the patch install, the compensating script is executed to restore the previous files.
Given that a server-specific script is now generated per application server (different servers host different roles and therefore require different files), the deployment engine has the opportunity to pass each script to its server, ask the server to execute it, and then wait for the OK from each one. If any server reports an error, the compensation script can be executed on all of them as required.
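This execute-and-compensate co-ordination can be sketched roughly as below. Note this is an illustrative sketch, not the actual deployment engine code: ExecuteScriptOnServer is a hypothetical stand-in for the remote execution mechanism (PowerShell remoting in my case), and the failure detection is simplified.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public class DeploymentCoordinator
{
    // Hypothetical stand-in for executing a script remotely via PowerShell remoting.
    public static void ExecuteScriptOnServer(string server, string script)
    {
        if (script.Contains("FAIL")) throw new InvalidOperationException(server + " failed");
    }

    // Run the install script on every server in parallel; if any server fails,
    // run the compensation script on all of them.
    public static bool TryDeploy(IDictionary<string, string> installScripts,
                                 IDictionary<string, string> rollbackScripts)
    {
        var failures = new ConcurrentQueue<Exception>();

        Parallel.ForEach(installScripts, pair =>
        {
            try { ExecuteScriptOnServer(pair.Key, pair.Value); }
            catch (Exception ex) { failures.Enqueue(ex); }
        });

        if (failures.IsEmpty) return true;

        // Something went wrong somewhere: compensate everywhere.
        Parallel.ForEach(rollbackScripts, pair => ExecuteScriptOnServer(pair.Key, pair.Value));
        return false;
    }
}
```

A real implementation would also need to decide what happens if a compensation script itself fails, which is where the simple model above ends.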
Parallelizing a deployment
Before looking at some co-ordination code for the deployment engine, I want to explicitly note that there are two different and often confused concepts:
• Asynchronous execution
• Parallel execution
An asynchronous execution involves a call to begin a method and then a callback from that method when the work is complete. IO operations are natural candidates for asynchronous calls, to ensure that the calling thread is not blocked waiting on the IO to complete. Single-threaded frameworks such as UI frameworks are the most common place to see a push for asynchronous programming. In .NET 3.0, Windows Workflow Foundation provided an excellent asynchronous programming model where asynchronous activities are co-ordinated by a single scheduler thread. It is bad practice to have this scheduler thread block or perform long-running operations, as doing so stalls the workflow’s progress when in a parallel activity. It is better to schedule multiple asynchronous activities in parallel when possible and have these execute on separate worker threads.
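As a small illustration of the begin/callback shape (not taken from the deployment code), here is the classic asynchronous IO pattern in .NET: BeginRead returns immediately and the callback fires on a worker thread when the read completes. The file name is invented for the demo.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

public class AsyncReadExample
{
    public static void Run()
    {
        File.WriteAllText("demo.txt", "hello");
        var done = new ManualResetEvent(false);
        var stream = new FileStream("demo.txt", FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, useAsync: true);
        var buffer = new byte[5];

        // BeginRead returns immediately; the callback runs when the IO completes,
        // leaving the calling thread free to do other work in the meantime.
        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            int bytesRead = stream.EndRead(ar);
            Console.WriteLine(Encoding.ASCII.GetString(buffer, 0, bytesRead));
            stream.Dispose();
            done.Set();
        }, null);

        done.WaitOne(); // block here only so the demo doesn't exit early
    }
}
```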
Parallel execution involves breaking a problem into small parts that can be executed in parallel, exploiting the multi-core nature of today’s CPUs. Rather than having a single core work towards an answer, many cores can participate in the calculation. To reduce the elapsed time of a calculation (the time experienced by the end user), it may be possible to execute a LINQ query over all available cores (typically 2, 4 or 8). LINQ now has the .AsParallel() extension method, which can be applied to queries to enable parallel execution. Of course, profiling is required to determine whether the query actually performs better in parallel for typical data sets.
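A minimal PLINQ example (illustrative only, not from the deployment code): the same query runs sequentially without .AsParallel(), which makes before/after profiling easy.

```csharp
using System;
using System.Linq;

public class PlinqExample
{
    public static long SumOfSquares()
    {
        // Partition the range across the available cores and sum the squares.
        return Enumerable.Range(1, 1000)
                         .AsParallel()
                         .Select(n => (long)n * n)
                         .Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares()); // 333833500
    }
}
```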
.NET 4 added the Task Parallel Library into the core runtime. This library adds numerous classes to the BCL to make parallel programming and the writing of co-ordination logic much simpler. In particular the Parallel class can be used to easily schedule multiple threads of work. For example:
Parallel.Invoke(
    () => Parallel.ForEach(updateMap, server =>
        serverInstallationScripts.Add(server.Key,
            GenerateInstallScriptForPatch(server.Key, server.Value, environment, patchFolder))),
    () => Parallel.ForEach(updateMap, server =>
        serverRollbackScripts.Add(server.Key,
            GenerateRollbackScriptForPatch(server.Key, server.Value, environment, patchFolder)))
);
The above code is responsible for creating the install and compensate PowerShell scripts from the deployment model discussed above. There are two levels of parallelism going on here. First, the generation of the install and compensate scripts is scheduled at the same time using a Parallel.Invoke() call. Then a Parallel.ForEach() is used to generate the required script for each application server defined in the environment in parallel. The runtime is responsible for figuring out how best to achieve this; as programmers we simply declare what we want to happen. In the above code, updateMap is a dictionary keyed on the server, mapping each server to the list of files to deploy to it.
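One caveat worth calling out: Dictionary.Add is not thread-safe, so collections populated from inside a Parallel.ForEach body need to be thread-safe types such as ConcurrentDictionary (or the additions must be synchronized some other way). A sketch of the same two-level pattern using ConcurrentDictionary, with hypothetical generator methods standing in for the real ones:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class ScriptGeneration
{
    // Hypothetical stand-ins for the real per-server script generators.
    public static string GenerateInstall(string server) { return "# install for " + server; }
    public static string GenerateRollback(string server) { return "# rollback for " + server; }

    public static int GenerateAll(string[] servers)
    {
        var installScripts = new ConcurrentDictionary<string, string>();
        var rollbackScripts = new ConcurrentDictionary<string, string>();

        // Two parallel branches, each of which fans out across the servers;
        // ConcurrentDictionary makes the concurrent additions safe.
        Parallel.Invoke(
            () => Parallel.ForEach(servers, s => installScripts.TryAdd(s, GenerateInstall(s))),
            () => Parallel.ForEach(servers, s => rollbackScripts.TryAdd(s, GenerateRollback(s))));

        return installScripts.Count + rollbackScripts.Count;
    }

    public static void Main()
    {
        Console.WriteLine(GenerateAll(new[] { "appserver01", "appserver02", "appserver03" })); // 6
    }
}
```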
I was simply blown away by how simple and yet how powerful this programming model is.