The case of the Fiddler heisenbug

There is a presentation I give to our graduates during their first week with us; the second slide is:

EverythingYouKnowIsWrong

This is taken from the multi-media overload that was U2’s Zoo TV tour. I use it to try to get our graduates to accept that they are really back at the start of their learning process. This is pretty much how I felt a week or two back when one of our consultants said that they were seeing lots of HTTP 401 authentication traffic while our application was running. I’d personally spent a lot of time over the years trying to make sure that we were as efficient as possible so I was sceptical to say the least…

Background

The services architecture for the product I work on follows the Command Query Responsibility Segregation (CQRS) approach which I’ve talked about before. In summary we fetch data from an OData service provided by WCF Data Services and then make updates via a suite of services implemented using regular SOAPy WCF. We closely monitor the message exchange between our applications and services to ensure that we aren’t too chatty, messages aren’t too big and so on – we do this using the excellent Fiddler. Many moons ago, I spent quite some time getting my head around how to correctly configure IIS and WCF to use Kerberos to allow the services to be scaled out over a web farm. By now I’ve run through this on numerous test environments and real world environments so I was pretty confident I knew how it worked.

The Problem

Our software runs on-premise within the walled garden of the corporate network. We support some of the largest law firms in the world and so on occasion have to deal with some very wide area networks. The connection from desktop to server can take place over long distances with the characteristics of high latency and low bandwidth; any messaging overhead can be painful. For years now we’ve used Fiddler to look at our services as all the call activated services use HTTP. At one client, Fiddler was not working [which turned out to be a conflict with the McAfee software they used] and so they used Wireshark instead. When observing the HTTP traffic in Wireshark, our consultants and the client saw many HTTP 401 authentication responses, far more than we expected. Each 401 response adds latency and requires additional messages to be exchanged between the client and the server. In our testing to date, we believed we had tuned the services to require only a single 401 authentication response and then to cache and present the credentials on each subsequent request.

 

TL;DR

To stop a WCF Data Services request, secured using Windows Authentication, from requiring authentication on every call, you need to set the PreAuthenticate flag to true on the HttpWebRequest via the SendingRequest2 event on the generated context. Fiddler (and Web Proxy in the Microsoft Message Analyzer) hides this from you because it implements a connection pool of Keep-Alive connections.

 

Reproducing the issue

The first task was to reproduce the behaviour inside one of our test environments. I’m fortunate to have a very well spec’d HP Z420 on my desk which is a great Hyper-V server. Inside Hyper-V I have a private domain set up which has a couple of load balanced application servers running our software. First off, I ran the client software on both Windows 7 and Windows 8.1 with Fiddler running in the background; no sign of the additional 401s. I then switched over to lower level network monitoring, but rather than using Wireshark I decided to try out the Microsoft Message Analyzer. This is Microsoft’s replacement for the Network Monitor tool; it provides a number of different filters, two of which were of interest:

  • web proxy – same deal as Fiddler, looking at HTTP
  • local link layer – all traffic on the NIC

Using the web proxy produced the same results as Fiddler, however using the local link layer filter showed lots of additional 401 responses – when I ran the Message Analyzer with both the web proxy and the local link layer filters there were no additional 401s. We had hit a Heisenbug: when observing the HTTP traffic through a web proxy, the proxy was changing the behaviour of the traffic.

Confirm our current understanding

My faith in our current collective understanding of what was happening was pretty shaken so I ran through the various settings that I previously thought would avoid these 401s:

1. Is the URL of the service trusted? Windows must consider the service URL to be trusted to pass Kerberos tickets. An easy way to check the zone of any URL is the following code snippet:

var zone = System.Security.Policy.Zone.CreateFromUrl("http://wsakl001013.ap.aderant.com/Expert_Local");
Console.WriteLine(zone.SecurityZone);

If necessary, add the service host URL or a matching pattern to the Local Intranet Zone via IE:

In this example, *.aderant.com has been added to the local intranet zone.

 

2. Are the load balanced services running as a domain account? Does this account have an appropriate HTTP SPN registered against it?
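For reference, registering an HTTP SPN against the service account follows the same pattern used later in this post for SQL Server; the host and account names below are placeholders, not our real environment:

> setspn -a HTTP/appserver.domain.local DOMAIN\service.account

> setspn -a HTTP/appserver DOMAIN\service.account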

 

3. Do the various IIS web applications have the useAppPoolCredentials flag set in configuration? This instructs IIS to expect the Kerberos SGT (service granting ticket) to be encrypted using the credentials of the account used by the mapped application pool, rather than the default machine account.

 

4. Is Kerberos configured to use a transport session rather than a connection per call for authentication? This is set in IIS against the web application using the authPersistNonNTLM setting.

This adds a Persistent-Auth header to the HTTP response (seen here using Message Analyzer):

image

These settings are available from within the IIS Manager using the Configuration Editor:

IISConfigEditor

Navigate to the system.webServer/security/authentication/windowsAuthentication settings:

image

Set the properties as required. If you want to programmatically set these values via script, IIS will helpfully generate the scripts for you. Look over on the right hand side of the Configuration Editor and you’ll see a ‘Generate Script’ option.

image

Clicking on this will generate a change script for you in a number of technologies, I tend to favour PowerShell:

image
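For reference, the two settings called out in steps 3 and 4 end up as attributes on the windowsAuthentication element; the fragment below is illustrative rather than a copy of our configuration:

<system.webServer>
  <security>
    <authentication>
      <windowsAuthentication enabled="true" useAppPoolCredentials="true" authPersistNonNTLM="true" />
    </authentication>
  </security>
</system.webServer>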

All this checked out on my environment but I wanted to ensure that NTLM was not in play (here). To do this I enabled NTLM logging on the domain controller using group policy. Using gpedit.msc, I enabled the ‘Network Security: Restrict NTLM: Audit Incoming NTLM Traffic’ and  ‘Network Security: Restrict NTLM: Audit NTLM authentication in this domain’ policies [under Windows Settings, Security Settings, Local Policies, Security Options]:

image

Interestingly, it showed that there was unexpected NTLM traffic – from the AppFabric services to the SQL Server. The MSSQL service was set up to run as a domain account, service.sql, but the appropriate SPN had not been mapped to that account:

> setspn -a MSSQLSvc/SqlServer2012.expert.local:1433 service.sql

> setspn -a MSSQLSvc/SqlServer2012:1433 service.sql

I mapped both the FQDN and the NETBIOS name formats just to be sure. This resolved the issue and I no longer saw NTLM traffic.

image

What Next?

At this point I thought the environment was configured as it should be but I was still seeing the additional 401s. After a lot of searching and head scratching I came across this post from Fiddler author, Eric Lawrence. The rub being:

Keep-Alive

In some cases, the time required to open a new network connection to the server is greater than the time required to send the request and download the response. Therefore, if the client opens a new connection for every request, the application’s performance is greatly degraded. The practice of reusing a single TCP/IP connection for multiple requests is called “keep-alive” and it’s the default behaviour in HTTP/1.1. However, clients or servers may choose to disable keep-alive by either sending a Connection: close header or by abruptly closing the connection after each transaction.

Fiddler maintains a “connection pool” of idle keep-alive connections to the server. When a client request comes in, this pool is first checked to determine if an existing connection is available on which the request can be sent. Even if the client specifies a Connection: close request header, that only causes Fiddler to close the client’s connection after the response is sent—the server connection is returned to the pool (unless it too disabled keep-alive).

What this means is that if your client isn’t using Keep-Alive connections, its performance can be severely impacted. However, when Fiddler is introduced, performance is improved because “expensive” server connections are reused. (Since Fiddler and the client are (typically) running on the same computer, establishing a new connection from the client to Fiddler is very fast.)

The fix for this problem is simple: Ensure that your client is using KeepAlive connections. That’s as simple as:

  1. Ensure that you’re using HTTP/1.1
  2. Ensure that you haven’t disabled Keep-Alive (e.g. set the KeepAlive property of the HTTPWebRequest object to true)
  3. Don’t send Connection: Close headers

Note that creating connections to servers can be even more expensive than the simple TCP/IP establishment cost. First, there’s TCP/IP Slow-Start, a congestion-management feature of the protocol that means that new connections have a slower transfer rate than longer-lived connections. Next, if you’re using HTTPS, there’s an expensive cryptographic handshake which must be performed on each new connection. Lastly, if your connections use either the NTLM or Negotiate authentication protocols, you may find that each new connection requires a 3-step handshake (e.g. the server sends a HTTP/401 challenge, the client resends the request, the server sends another HTTP/401 challenge, the client resends the request with a challenge-response, and the server finally sends a HTTP/200). Because these are “connection-oriented” authentication protocols, subsequent requests over an existing connection may be able to avoid these extra round-trips.

Here is the heisenbug: Fiddler is maintaining a Keep-Alive connection to the server even though my client may not be.

So how does this relate to the WCF service calls? For the basicHttpBinding, the Keep-Alive behaviour is enabled by default; it can optionally be turned off via a custom binding, see here.
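As a rough sketch (not our production configuration), the Keep-Alive behaviour can be controlled from code by building a custom binding around the HTTP transport; the contract and address below are illustrative only:

using System;
using System.Net;
using System.ServiceModel;
using System.ServiceModel.Channels;

// A custom binding over the HTTP transport with Windows authentication;
// KeepAliveEnabled controls whether connections are reused between requests.
var transport = new HttpTransportBindingElement {
    AuthenticationScheme = AuthenticationSchemes.Negotiate,
    KeepAliveEnabled = true // set to false to force a new connection per request
};
var binding = new CustomBinding(new TextMessageEncodingBindingElement(), transport);

var factory = new ChannelFactory<IService1>(binding,
    new EndpointAddress("http://localhost/TestService/Service1.svc"));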

Back to Basics

At this point I was still convinced I should not be seeing those additional 401s, so I decided to build a very simple secured WCF service and generate a proxy to the standard OData service we use.

Here is a WCF Service that simply says Hello to the calling Windows user.

image
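The screenshot above shows the service code; it amounted to little more than the following sketch (type and operation names are illustrative):

using System.ServiceModel;

[ServiceContract]
public interface IHelloService {
    [OperationContract]
    string Hello();
}

public class HelloService : IHelloService {
    public string Hello() {
        // Greet the authenticated Windows identity making the call.
        return "Hello " + ServiceSecurityContext.Current.WindowsIdentity.Name;
    }
}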

WCF Configuration as follows:

image

Visual Studio created a service reference for me and I simply called the service a number of times: both reusing the proxy as well as closing the proxy and recreating it:

image
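The client code was along these lines (a sketch; the generated proxy type name will differ):

// Reuse a single proxy for several calls...
var client = new HelloServiceClient();
for (int i = 0; i < 3; i++) {
    Console.WriteLine(client.Hello());
}
client.Close();

// ...then close and recreate the proxy for each call.
for (int i = 0; i < 3; i++) {
    var perCallClient = new HelloServiceClient();
    Console.WriteLine(perCallClient.Hello());
    perCallClient.Close();
}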

The link layer trace was as follows:

image

This was as expected, a single 401 but then 200s on subsequent calls. Kerberos was being used successfully and a transport level session was established! Just for completeness I could see the HTTP Keep-Alive header in the POST:

image

 

OK, on to the WCF Data Service. Again in Visual Studio I generated a service reference then:

image
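The test code was roughly the following (the service URI and entity set name are placeholders; ExpertDbContext is the generated context used later in this post):

using System;
using System.Linq;
using System.Net;

var context = new ExpertDbContext(new Uri("http://wsakl001013.ap.aderant.com/Expert_Local/Query.svc"));
context.Credentials = CredentialCache.DefaultNetworkCredentials;

// Each query becomes an HTTP GET against the OData service.
for (int i = 0; i < 3; i++) {
    var matters = context.Matters.Take(5).ToList();
    Console.WriteLine(matters.Count);
}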

This resulted in:

image

And the following trace:

image

At last here was the repeated 401/200 behaviour.

I checked for the Keep-Alive header in the request:

image

And looked for the Persistent-Auth header in the response:

image

Both present.

More head scratching.

More searching.

Then I posted this question to the Microsoft WCF Data Services forum.

While waiting for an answer, a colleague and I took a look at the System.Data.Services.Client.DataServiceContext base class for the generated context object. Working through that code, I came across the HttpWebRequest class which has a PreAuthenticate property which looked like exactly what I wanted. A little more digging and I found I could do this:

var context = new ExpertDbContext(…
context.Credentials = CredentialCache.DefaultNetworkCredentials;
context.SendingRequest2 += context_SendingRequest2;

static void context_SendingRequest2(object sender, SendingRequest2EventArgs e) {
    ((HttpWebRequestMessage)e.RequestMessage).HttpWebRequest.PreAuthenticate = true;
}

 

This was it!

Testing the code with this small change and the 401s were gone from the WCF Data Service traffic. Just as I was grabbing a celebratory cup of coffee, a colleague asked if I had seen the response to my question on the forum. I had not; it validated the above approach – thank you, Fred Bao.

 

Wrapping Up

This took about a week of elapsed time to work through. We’ve now updated our query service (OData) proxy to set the PreAuthenticate flag and can see improved system performance, particularly over constrained WAN connections. That Fiddler hid this really threw me; heisenbugs are really hard to deal to.

 

Windows Identity Foundation, first steps…

I’ve been slowly working through the excellent book Programming Windows Identity Foundation by Vittorio Bertocci. I was getting a little restless though and wanted to see some code so I found this walkthrough and decided to play along. Things didn’t go as smoothly as I had hoped but I did learn more than I bargained for…

One of the first requirements to get the sample running is to ensure that you have SSL enabled on your default web site. This is not a common task for most developers so I’ll elaborate a little:

Setting up HTTPS

IIS Manager supports the creation of a self-signed certificate which is sufficient for development purposes. The server configuration provides a ‘Server Certificates’ option as below; in the Actions menu there is a ‘Create Self-Signed Certificate…’ item.

image

There’s not much to the certificate creation process: enter a friendly name for the certificate. In my case I lacked imagination and went with ‘TestCertficateForWIF’. The certificate is created in the machine certificate store, so running certmgr.msc doesn’t help as it opens the user store. Instead I ran mmc.exe directly and added the certificate manager snap-in explicitly; when asked to choose a store I went with the local machine store.

image

image

image

Looking in the Personal | Certificates node reveals the newly created certificate.

image

Setting up an HTTPS binding for the Default Web Site is now possible. Select the site in IIS manager and then choose to Edit Bindings… from the context menu.

image

The dialog allows you to add a new HTTPS binding; you just select the certificate you want to use as part of the encryption process.

image

I next ran through the various steps in the walkthrough but when I tried to run the completed sample I got a KeySet error.

Additional Notes:

  • The certificate name for the DemoSTS web.config only requires a single CN= prefix, not two.

image

  • As we are using Windows authentication the console client does not need to pass credentials explicitly. When I set the credentials manually to a local test account I would see my domain account as the name in the returned claim.

image

Troubleshooting the Sample

A quick search suggested that the AppPool account my services were running as did not have access to the private key of the certificate. OK, back into the machine certificate store and ‘Manage Private Keys…’ for the certificate.

image

The web applications for the services were mapped to the ApplicationPoolIdentity (I’m running IIS7.5) so I tried adding the read right to the ‘IIS AppPool\DefaultAppPool’ account. This didn’t seem to help so I resorted to creating a specific service account and assigning it the read permission for the certificate.

image

I created a new application pool to run as this new ‘service.sts’ user and set the web applications to use this application pool. This was good and resolved my KeySet error, but I was now getting a fault back from my secure WCF service. After a little head scratching I fired up Fiddler to watch the traffic:

image

OK – I could see the secure WCF service calling the DemoSTS, the DemoSTS doing its work and then calling back to the secure service, then a 500 failure. Looking at the response message for the 500:

image

For some reason I was getting an ‘Invalid Security Token’ error. I knew the error was in the secure WCF service but not much more. While looking through the web.config for the service, I found commented out trace configuration:

image

So I enabled the tracing and re-ran the client. The WIFTrace.e2e file popped into the service directory and I used the Microsoft Service Trace Viewer to look at the log:

image

Looking at the error detail:

image

‘The issuer of the security token was not recognized by the IssuerNameRegistry…’, that looked familiar so back to the web.config.

<microsoft.identityModel>
  <service name="SecureWCFService.Service">
    <audienceUris>
      <add value="http://wsakl0001013.ap.aderant.com/SecureWCFService/Service.svc" />
    </audienceUris>
    <issuerNameRegistry type="Microsoft.IdentityModel.Tokens.ConfigurationBasedIssuerNameRegistry, Microsoft.IdentityModel, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35">
      <trustedIssuers>
        <add thumbprint="?????????????????????????????" name="http://wsakl0001013.ap.aderant.com/DemoSTS/Service.svc" />
      </trustedIssuers>
    </issuerNameRegistry>
  </service>
</microsoft.identityModel>

I’ve removed the actual thumbprint, but here was where the service was configured to accept tokens from an STS using a particular certificate identified by its thumbprint. I needed the thumbprint of the certificate I had created, easily done via PowerShell:

> $certificate = Get-ChildItem -Path Cert:\LocalMachine\My | where { $_.Subject -match 'CN\=WSAKL0001013.ap.aderant.com' }

> $certificate.Thumbprint

The thumbprint provided by PowerShell did not match my web.config so I updated the config.

Happy days, the sample now ran:

image

Workflow Services & MSMQ Revisited

I recently dusted off a WCF sample I’d written and blogged about a year or two ago. During the process of getting it to work again, I discovered the blog posting was incorrect, so I’m reposting with corrections and additional explanation.

Tom Hollander published a great set of posts on this topic which I needed…

http://blogs.msdn.com/b/tomholl/archive/2008/07/12/msmq-wcf-and-iis-getting-them-to-play-nice-part-1.aspx

http://blogs.msdn.com/b/tomholl/archive/2008/07/12/msmq-wcf-and-iis-getting-them-to-play-nice-part-2.aspx

http://blogs.msdn.com/b/tomholl/archive/2008/07/12/msmq-wcf-and-iis-getting-them-to-play-nice-part-3.aspx

We needed a quick proof of concept to show that a workflow service could be activated via a message sent over MSMQ. The first part was workflow design and coding; this was the easy part. All I wanted to do was accept a custom type, in this case a TimeEntry, from a SubmitTime service operation that belonged to an ITimeEntryContract. On receiving the time entry I would simply log the fact that it arrived into the event log. This is pretty much the “Hello, World!” of the services demos. The second part was getting the configuration correct…

One of the promises of WCF is that it gives us a unified communication model regardless of the protocol: net.tcp, http, msmq, net.pipe – and it does. The best description I’ve heard for WCF is that it is a channel factory, and you configure the channels declaratively in the .config file (you can of course use code too if you prefer). The key benefit is that the service contract and implementation, for the main part, can be channel agnostic. Of course there are exceptions to prove the rule, such as a void return being required by an MSMQ channel, but for the most part it holds. As it turns out, it was true that I needed to make no changes to code to move from a default http endpoint to an MSMQ endpoint. I did need to do a lot of configuration and setup though, which is not that well documented. This post hopes to correct that in some small way.
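For context, the contract exposed by the workflow service is shaped like the following one-way contract (a sketch – the real contract is inferred from the Receive activity rather than written by hand; MSMQ requires one-way, void operations):

using System.ServiceModel;

[ServiceContract]
public interface ITimeEntryContract {
    // MSMQ delivery is one-way: the operation cannot return a value to the caller.
    [OperationContract(IsOneWay = true)]
    void SubmitTime(TimeEntry timeEntry);
}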

First up the easy part, writing the code.

In Visual Studio 2010 I started a new ‘WCF Workflow Service Application’ project. First I define my TimeEntry model class:

using System;

namespace QueuedWorkflowService.Service {
    public class TimeEntry {
        public Guid TimekeeperId { get; set; }
        public Guid MatterId { get; set; }
        public TimeSpan Duration { get; set; }

        public override string ToString() {
            return string.Format("Timekeeper: {0}, Matter: {1}, Duration: {2}", TimekeeperId, MatterId, Duration);
        }
    }
}

Then I defined a code activity to write to the time entry provided into the event log:

using System;
using System.Diagnostics;
using System.Activities;

namespace QueuedWorkflowService.Service {
    public sealed class DebugLog : CodeActivity {
        public InArgument<string> Text { get; set; }

        protected override void Execute(CodeActivityContext context) {
            string message = string.Format("Server [{0}] - Queued Workflow Service - debug :{1}", DateTime.Now, context.GetValue(this.Text));
            Debug.WriteLine(message);
            EventLog.WriteEntry("Queued Service Example", message, EventLogEntryType.Information);
        }
    }
}

All the C# code is now written and I create my workflow:

I need a variable to hold my time entry so I define one at the scope of the service:

The project template creates the CorrelationHandle for me but we won’t be using it.

The receive activity is configured as follows:

With the Content specified as:

This is such a simple service that I don’t need any correlation between messages, it just receives and processes the message without communicating back to the sender of the message. Therefore I also cleared out the CorrelatesOn and the CorrelationInitializer properties.

Finally I set up the Debug activity to write the time entry to the event log:

That’s it! I’m done. This now runs using the default binding introduced in WCF4 (http and net.tcp). Starting up the project launches my service and the WCF test client is also opened pointing to my new service. The service is running in Cassini, the local web server built into the Visual Studio debugging environment.

15 minutes, or thereabouts, to build a workflow service. What follows is a summary of the steps discovered over the next 4 hours trying to convert this sample from using an http endpoint to an msmq endpoint.

Default Behaviour
One of the key messages Microsoft heard from the WCF 3 community was that configuration was too hard. To even get started using WCF you had to understand a mountain of new terms and concepts including: channels, address, binding, contract, behaviours,… the response to this in .NET 4 is defaults. If you don’t specify an endpoint, binding etc then WCF creates a default one for you based upon your machine configuration settings. This makes getting a service up and running a very straightforward experience. BUT as soon as you want to step outside of the defaults, you need the same knowledge that you needed in the WCF 3 world.

Here’s the web.config I ended up with after a couple of hours, the MSMQ settings were a voyage of personal discovery… (http://msdn.microsoft.com/en-us/library/ms731380.aspx)

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.web>
    <compilation debug="true" targetFramework="4.0" />
  </system.web>
  <system.serviceModel>
    <services>
      <service name="TimeEntryService">
        <endpoint
            binding="netMsmqBinding"
            bindingConfiguration="nonTxnMsmqBinding"
            address="net.msmq://localhost/private/QueuedWorkflowService/TimeEntryService.xamlx"
            contract="ITimeEntryContract" />

        <endpoint
            binding="netMsmqBinding"
            bindingConfiguration="txnMsmqBinding"
            address="net.msmq://localhost/private/QueuedWorkflowServiceTxn/TimeEntryService.xamlx"
            contract="ITimeEntryContract" />
        <endpoint
            address="mex"
            binding="mexHttpBinding"
            contract="IMetadataExchange" />
      </service>
    </services>
    <behaviors>
      <serviceBehaviors>
        <behavior>
          <serviceMetadata httpGetEnabled="true" />
        </behavior>
      </serviceBehaviors>
    </behaviors>
    <bindings>
      <netMsmqBinding>
        <binding
            name="nonTxnMsmqBinding"
            durable="false"
            exactlyOnce="false"
            useActiveDirectory="false"
            queueTransferProtocol="Native">
          <security mode="None">
            <message clientCredentialType="None" />
            <transport
                msmqAuthenticationMode="None"
                msmqProtectionLevel="None" />
          </security>
        </binding>

        <binding
            name="txnMsmqBinding"
            durable="true"
            exactlyOnce="true"
            useActiveDirectory="false"
            queueTransferProtocol="Native">
          <security mode="None">
            <message clientCredentialType="None" />
            <transport
                msmqAuthenticationMode="None"
                msmqProtectionLevel="None" />
          </security>
        </binding>
      </netMsmqBinding>
    </bindings>
  </system.serviceModel>
  <microsoft.applicationServer>
    <hosting>
      <serviceAutoStart>
        <add relativeVirtualPath="TimeEntryService.xamlx" />
      </serviceAutoStart>
    </hosting>
  </microsoft.applicationServer>
</configuration>

The two important sections are the endpoint and the netMsmqBinding sections. A single service is defined that exposes two MSMQ endpoints, a transactional endpoint and a non-transactional one. This was done to demonstrate the changes required in the netMsmqBinding to support a transactional queue over a non-transactional queue; namely the durable and exactlyOnce attributes. In both cases no security is enabled. I had to do this to get the simplest example to work. Note that the WCF address for the queue does not include a $ suffix on the private queue name and matches the Uri of the service.

We still have some way to go to get this to work, we need a number of services to be installed and running on the workstation:

Services
• Message Queuing (MSMQ)
• Net.Msmq Listener Adapter (NetMsmqActivator)
• Windows Process Activation Service (WAS)

I also ensured that AppFabric was running as this is the easiest way to start the debugging process:
• AppFabric Event Collection Service (AppFabricEventCollectionService)

If you don’t have these services registered on your workstation you will need to go into the ‘Programs and Features’ control panel, then ‘Turn Windows features on or off’ to enable them (Windows 7).

With the services installed and started you need to create a private message queue to map the endpoint to (see : http://msdn.microsoft.com/en-us/library/ms789025.aspx ). The queue name must match the Uri of the service.

image

The sample is configured to run without security on the queues, i.e. the queues are not authorized. You must allow the anonymous login ‘send’ rights on the queues. If you don’t, the messages will be delivered but the WAS listener will not be able to pick up the messages from the queue.

image
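If you prefer to script the queue creation and permissions rather than use the Computer Management snap-in, a System.Messaging sketch along these lines covers both steps (the queue names must match the service URIs above; this is illustrative, not the code we used):

using System.Messaging;

const string nonTxnQueue = @".\private$\QueuedWorkflowService/TimeEntryService.xamlx";
const string txnQueue = @".\private$\QueuedWorkflowServiceTxn/TimeEntryService.xamlx";

if (!MessageQueue.Exists(nonTxnQueue)) {
    using (MessageQueue queue = MessageQueue.Create(nonTxnQueue, false)) {
        // Grant the anonymous logon 'send' (write message) rights so the unauthenticated binding can deliver.
        queue.SetPermissions("ANONYMOUS LOGON", MessageQueueAccessRights.WriteMessage, AccessControlEntryType.Allow);
    }
}

if (!MessageQueue.Exists(txnQueue)) {
    using (MessageQueue queue = MessageQueue.Create(txnQueue, true)) { // transactional queue
        queue.SetPermissions("ANONYMOUS LOGON", MessageQueueAccessRights.WriteMessage, AccessControlEntryType.Allow);
    }
}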

If you have problems and do not see the message delivered to the correct queue, have a look in the system Dead Letter queues.

image

You also need to change your VS2010 project to use IIS as the host rather than Cassini. On the project properties dialog, open the Web tab:

As I wanted events from this service to be added to my AppFabric monitoring store, I also added a connection string to the mapped web application and then configured AppFabric monitoring to use that connection.


And in AppFabric configuration:

Finally you also need to enable the correct protocols on the web application (Manage Application… | Advanced Settings):

I’ve added in net.msmq for queuing support and also net.pipe for the workflow control endpoint.

Make sure that the user the application pool is running as has access to read and write to the queue.

With the server configured, I then wrote a simple WPF test application that used a service reference generated by VS2010; this creates the appropriate client side WCF configuration. The button click handler called the service proxy directly:

private void submitTimeEntryButton_Click(object sender, RoutedEventArgs e) {
    using (TimeEntryContractClient proxy = new TimeEntryContractClient("QueuedTimeEntryContract")) {
        TimeEntry timeEntry = new TimeEntry {
                                            TimekeeperId = Guid.NewGuid(),
                                            MatterId = Guid.NewGuid(),
                                            Duration = new TimeSpan(0, 4, 0, 0)
                                            };

        string message = string.Format("Client [{3}]- TimekeeperId: {0}, MatterId: {1}, Duration: {2}",
            timeEntry.TimekeeperId,
            timeEntry.MatterId,
            timeEntry.Duration,
            DateTime.Now);

        proxy.SubmitTimeEntry(timeEntry);
        EventLog.WriteEntry("Queued Service Example", message, EventLogEntryType.Information);
    }
}

And the awesome UI:

Click the button and you get entries in the event log, a client event and the server event:

image

I made no changes to the code to move from an http endpoint to a MSMQ endpoint, but it’s not as simple as tweaking the config and you’re good to go. I’d love to see some tooling in VS2010 or VS vNext to take some of the pain away from WCF config, similar to the tooling AppFabric adds into IIS. Until that happens, there are plenty of angle brackets to deal with.

Creating a generic ping proxy

In the previous post we walked through the steps required to implement a service monitoring ‘Ping’ operation as a WCF endpoint behavior. This allows us to add the ping functionality to an endpoint using WCF configuration alone. Following on from this, here’s a generic implementation of an HTTP proxy class to call Ping on any service.

Last time we left off having created a proxy to a test service using the WcfTestClient application.

Ping operation from metadata in the WCF Test Client

Switching to the XML view lets us see the SOAP request and reply message for a ping.

<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <Action s:mustUnderstand="1" xmlns="http://schemas.microsoft.com/ws/2005/05/addressing/none">http://aderant.com/expert/contract/TestServiceContract/IService1/Ping</Action>
  </s:Header>
  <s:Body>
    <Ping xmlns="http://aderant.com/expert/contract/TestServiceContract" />
  </s:Body>
</s:Envelope>

and the response:

<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header />
  <s:Body>
    <PingResponse xmlns="http://aderant.com/expert/contract/TestServiceContract">
      <PingResult>2011-10-18T01:28:57.0222683Z</PingResult>
    </PingResponse>
  </s:Body>
</s:Envelope>

The namespace shown in the Ping and PingResponse elements is the namespace we gave to our service as part of its ServiceContract attribution:

[ServiceContract(Namespace="http://aderant.com/expert/contract/TestServiceContract")]
public interface IService1 {

The SOAP message varies from service to service according to the namespace of the contract and the type of the contract. If we know these two values then we can construct the appropriate SOAP message for a generic client.

private static void CallWcfServiceUsingGenericProxy() {
    const string address = @"http://localhost/TestService/Service1.svc";
    const string serviceContractType = "IService1";
    const string serviceContractNamespace = @"http://aderant.com/expert/contract/TestServiceContract";
    Console.WriteLine("Pinging WCF service using the generic proxy...");
    DateTime utcStart = DateTime.UtcNow;
    string response = PingService(serviceContractType, serviceContractNamespace, address);
    DateTime utcFinished = DateTime.UtcNow;
    DateTime serverPingTimeUtc = ProcessPingResponse(response, serviceContractNamespace);
    WriteTimingMessage(utcStart, serverPingTimeUtc, utcFinished);
}

The calling of the generic proxy is separated from the processing of the response so that a timing can be made around just the communication.

private static string PingService(string serviceContractType, string contractNamespace, string address) {
    // SOAP envelope matching the Ping request shown earlier; the contract namespace is substituted in.
    string pingSoapMessage = string.Format(@"<s:Envelope xmlns:s=""http://schemas.xmlsoap.org/soap/envelope/""><s:Header /><s:Body><Ping xmlns=""{0}"" /></s:Body></s:Envelope>", contractNamespace);
    if (!contractNamespace.EndsWith("/")) { contractNamespace = contractNamespace + "/";}
    WebClient pingClient = new WebClient();
    pingClient.Headers.Add("Content-Type", "text/xml; charset=utf-8");
    pingClient.Headers.Add("SOAPAction", string.Format(@"""{0}{1}/Ping""", contractNamespace, serviceContractType));
    string response = pingClient.UploadString(address, pingSoapMessage);
    return response;
}

To Ping the service, we construct the SOAP message and use a WebClient to make the call. The web client requires headers to be added for the content type and the SOAPAction which tells the WCF Dispatcher which method we want to call.

To process the SOAP message returned from the Ping message we use:

private static DateTime ProcessPingResponse(string response, string contractNamespace) {
    XDocument responseXml = XDocument.Parse(response);
    XElement pingTime = responseXml.Descendants(XName.Get("PingResult", contractNamespace)).Single();
    DateTime serverPingTimeUtc = DateTime.Parse(pingTime.Value).ToUniversalTime();
    return serverPingTimeUtc;
}

Now we have the UTC DateTime from the server when it processed the response.

All good, if we know the contract namespace and interface type for the service we can ping it. As we saw, this information is attributed on the service contract class. To simplify the proxy code a little, we can use reflection to determine this information given just the contract type.

public class WcfServicePinger<TServiceContract> where TServiceContract : class {
    public DateTime Ping(string addressUri) {
        string @namespace = string.Empty;
        ServiceContractAttribute attribute = typeof(TServiceContract)
            .GetCustomAttributes(typeof(ServiceContractAttribute), true)
            .FirstOrDefault() as ServiceContractAttribute;
        if(attribute == null) {
            throw new ArgumentException(string.Format("The specified type {0} is not a WCF service contract.", typeof(TServiceContract).Name));
        }
        if(string.IsNullOrWhiteSpace(attribute.Namespace)) {
            @namespace = "http://tempuri.org";
        } else {
            @namespace = attribute.Namespace;
        }
        return new WcfServicePinger().Ping(addressUri, typeof(TServiceContract).Name, @namespace);
    }
}

The non-generic WcfServicePinger calls our previous code, as above:

public class WcfServicePinger {
    public DateTime Ping(string addressUri, string serviceContractTypename, string serviceContractNamespace) {
        string response = PingService(serviceContractTypename, serviceContractNamespace, addressUri);
        DateTime serverPingTimeUtc = ProcessPingResponse(response, serviceContractNamespace);
        return serverPingTimeUtc;
    }
}

So in the end we would use:

DateTime serverTime = new WcfServicePinger<IService1>().Ping("http://localhost/TestService/Service1.svc");

Note that we have constructed a generic proxy that calls an Http endpoint. I did try to construct a net.tcp generic proxy class too but it broke my time box. The Ping method can be called via net.tcp using a proxy generated from the endpoint metadata.
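For instance, with a service reference generated from the endpoint metadata, calling Ping over net.tcp is just an ordinary proxy call (the proxy type and endpoint configuration names below are placeholders):

// 'Service1Client' is the proxy generated by Add Service Reference;
// 'NetTcpBinding_IService1' is the name of the client endpoint in app.config.
var client = new Service1Client("NetTcpBinding_IService1");
DateTime serverPingTimeUtc = client.Ping();
client.Close();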

Checking WCF Service Availability using an Endpoint Behavior

A common operational requirement for an SOA is the ability to determine if a service is available. Just as it is common to ping a machine, we also want to be able to ping an individual service to determine that it is running and able to respond to messages.

Our first pass at solving this issue was to introduce an IPingable interface that each of our service facades would implement. The code was pushed into our service base class and the software factory updated as appropriate. However, this didn’t feel quite right. Other service extensions such as the metadata endpoint were not so invasive, it can be established by a simple addition to the service configuration file. We felt we wanted the same configurable nature for a ping mechanism and so set out to figure out how to create a custom WCF behavior.

This post walks through the creation of a WCF Endpoint Behavior which will add a Ping() method to a service. The method returns the DateTime, in UTC, stating when the server processed the request. This can be used to establish how long a basic roundtrip to the service is taking for both the request and the reply. The use of an endpoint behavior allows us to add this functionality to any service, therefore we can add it retrospectively to services that we have already created without needing to change them.

The source code for this post is available from my DropBox.

Let’s start by looking at how we would configure such an endpoint behavior. For an existing service, we would copy the assembly containing the endpoint behavior into the bin folder for the service and then add some entries into the web.config:

<configuration>
    <system.serviceModel>
        <extensions>
            <behaviorExtensions>
                <add name="Ping" type="CustomWcfBehaviors.PingEndpointBehavior, CustomWcfBehaviors" />
            </behaviorExtensions>
        </extensions>
        <behaviors>
            <endpointBehaviors>
                <behavior>
                    <Ping />
                </behavior>
            </endpointBehaviors>
        </behaviors>
    </system.serviceModel>
</configuration>

We register the type [CustomWcfBehaviors.PingEndpointBehavior in assembly CustomWcfBehaviors] that is our custom behavior as a behavior extension and then declare that we want to use the extension. The configuration above sets up a default behavior that will be applied to all endpoints since we haven’t explicitly named it. That’s it; fetch the metadata for the service and it will now contain a Ping() method.

The configuration of the behavior is reasonably straightforward; let’s look at the implementation…

namespace CustomWcfBehaviors {
    public class PingEndpointBehavior: BehaviorExtensionElement, IEndpointBehavior {
        private const string PingOperationName = "Ping";
        private const string PingResponse = "PingResponse";

The behavior needs to derive from BehaviorExtensionElement which allows it to be declared in the service configuration as an extension element. We need to introduce two overrides:

// factory method to construct an instance of the behavior
protected override object CreateBehavior(){
    return new PingEndpointBehavior();
}
// property used to determine the type of the behavior
public override Type BehaviorType{
    get { return typeof(PingEndpointBehavior); }
}

Now onto the real work, the IEndpointBehavior implementation:

public void Validate(ServiceEndpoint endpoint) {}
public void AddBindingParameters(ServiceEndpoint endpoint, BindingParameterCollection bindingParameters) {}
public void ApplyClientBehavior(ServiceEndpoint endpoint, ClientRuntime clientRuntime) {}

public void ApplyDispatchBehavior(ServiceEndpoint endpoint, EndpointDispatcher endpointDispatcher) {
    if(PingOperationNotDeclaredInContract(endpoint.Contract)) {
        AddPingToContractDescription(endpoint.Contract);
    }
    UpdateContractFilter(endpointDispatcher, endpoint.Contract);
    AddPingToDispatcher(endpointDispatcher, endpoint.Contract);
}

There are four methods that we need to implement but only one that we do any work in. The ApplyDispatchBehavior is where we add our code to manipulate how the endpoint invokes the service operations.

Endpoints map to the ABC of WCF [address, binding and contract]. What we want to do is extend the contract with a new service operation. A service host may allow multiple endpoints to expose the service contract on different protocols. For example, we may choose to host our service using both http and net.tcp. If we use IIS as our service host, this is configured via the Advanced Settings… of the Manage Application… context menu option.

The ‘Enabled Protocols’ lists the protocols we want to use. If we have more than one protocol listed then we have to take care that we don’t attempt to add the new Ping operation multiple times to our contract – once per binding type. To check that we haven’t already added the operation we call a small Linq query…

private bool PingOperationNotDeclaredInContract(ContractDescription contract) {
    return ! contract
        .Operations
        .Where(operationDescription =>
            operationDescription.Name.Equals(PingOperationName,
                StringComparison.InvariantCultureIgnoreCase))
        .Any();
}

If the Ping operation is not found then we need to add it:

private void AddPingToContractDescription(ContractDescription contractDescription) {
    OperationDescription pingOperationDescription = new OperationDescription(PingOperationName, contractDescription);
    MessageDescription inputMessageDescription = new MessageDescription(
        GetAction(contractDescription, PingOperationName), MessageDirection.Input);
    MessageDescription outputMessageDescription = new MessageDescription(
        GetAction(contractDescription, PingResponse), MessageDirection.Output);
    MessagePartDescription returnValue = new MessagePartDescription("PingResult",
        contractDescription.Namespace);
    returnValue.Type = typeof(DateTime);
    outputMessageDescription.Body.ReturnValue = returnValue;
    inputMessageDescription.Body.WrapperName = PingOperationName;
    inputMessageDescription.Body.WrapperNamespace = contractDescription.Namespace;
    outputMessageDescription.Body.WrapperName = PingResponse;
    outputMessageDescription.Body.WrapperNamespace = contractDescription.Namespace;
    pingOperationDescription.Messages.Add(inputMessageDescription);
    pingOperationDescription.Messages.Add(outputMessageDescription);
    pingOperationDescription.Behaviors.Add(new DataContractSerializerOperationBehavior(pingOperationDescription));
    pingOperationDescription.Behaviors.Add(new PingOperationBehavior());
    contractDescription.Operations.Add(pingOperationDescription);
}

Here we are creating an OperationDescription to add to our ContractDescription, this adds the operation specification for our Ping operation to the existing contract. The code to execute is encapsulated by the PingOperationBehavior().

public class PingOperationBehavior : IOperationBehavior {
    public void ApplyDispatchBehavior(OperationDescription operationDescription, DispatchOperation dispatchOperation) {
        dispatchOperation.Invoker = new PingInvoker();
    }
    public void Validate(OperationDescription operationDescription) {}
    public void AddBindingParameters(OperationDescription operationDescription, BindingParameterCollection bindingParameters) {}
    public void ApplyClientBehavior(OperationDescription operationDescription, ClientOperation clientOperation) {}
 }

Similar to the endpoint behavior, we need to implement an interface in this case the IOperationBehavior. The method signatures are similar and we need to fill out the ApplyDispatchBehavior method to call an IOperationInvoker to execute our Ping implementation:

internal class PingInvoker : IOperationInvoker {
    public object[] AllocateInputs() {
        return new object[0];
    }
    public object Invoke(object instance, object[] inputs, out object[] outputs) {
        outputs = new object[0];
        return DateTime.UtcNow;
    }
    public IAsyncResult InvokeBegin(object instance, object[] inputs, AsyncCallback callback, object state) {
        throw new NotImplementedException();
    }
    public object InvokeEnd(object instance, out object[] outputs, IAsyncResult result) {
        throw new NotImplementedException();
    }
    public bool IsSynchronous {
        get { return true; }
    }
 }

The ping operation is so simple that there is no asynchronous implementation. All we do is return the current date time on the server in UTC.

So where are we? Well, we’ve added the Ping method to our existing service contract and mapped it to an IOperationBehavior which will dispatch an IOperationInvoker to call the code.

Next up, we have to update the endpoint dispatcher so that it knows what to do if it receives a Ping request message. The endpoint dispatcher maintains a list of actions that it knows how to handle. We need to update this list so that it includes our new ping action. To do this we just refresh the action list from the contract description operations:

private void UpdateContractFilter(EndpointDispatcher endpointDispatcher, ContractDescription contractDescription) {
    string[] actions = (from operationDescription in contractDescription.Operations
                        select GetAction(contractDescription, operationDescription.Name)
                       ).ToArray();
    endpointDispatcher.ContractFilter = new ActionMessageFilter(actions);
}

Finally we need to add a new dispatch operation to the endpoint dispatcher so that it calls our PingOperation when the Ping action is received.

private void AddPingToDispatcher(EndpointDispatcher endpointDispatcher, ContractDescription contractDescription) {
    DispatchOperation pingDispatchOperation = new DispatchOperation(endpointDispatcher.DispatchRuntime,
        PingOperationName,
        GetAction(contractDescription, PingOperationName),
        GetAction(contractDescription, PingResponse));

    pingDispatchOperation.Invoker = new PingInvoker();
    endpointDispatcher.DispatchRuntime.Operations.Add(pingDispatchOperation);
}

private string GetAction(ContractDescription contractDescription, string name) {
    string @namespace = contractDescription.Namespace;
    if(!@namespace.EndsWith("/")) { @namespace = @namespace + "/"; }
    string action = string.Format("{0}{1}/{2}", @namespace, contractDescription.Name, name);
    return action;
}

Well, we are now done on the server side. We’ve created three new classes:
• PingEndpointBehavior
• PingOperationBehavior
• PingInvoker

These three classes allow us to add a Ping method to a service by adding the Ping behavior via the service configuration file.

To test this, you can use the WcfTestClient application. The sample code demonstrates this by calling Ping on a standard WCF and WF service created by Visual Studio, see my DropBox.

In the next post I’ll discuss how we create a generic service proxy to call Ping.

UPDATE: the code works for the basicHttpBinding but not for wsHttpBinding.

UPDATE: Code now available from github.com/stefsewell/WCFPing

Securing WF & WCF Services using Windows Authentication

To finish off the DEV404 session Pete and I presented at TechEd NZ, I gave a brief run through of the steps required to get Windows Authentication working in a load balanced environment using Kerberos. Given the number of camera phones that appeared for snaps I’m going to assume this is a common problem with a non-intuitive solution…

The product I work on is an on-premise enterprise solution that uses the Windows Identity to provide an authenticated credential against which to authorize user requests. We host our services in IIS/Windows Server AppFabric and take advantage of the Windows Authentication provided by IIS. This allows one of two protocols to be used: Kerberos and NTLM, which have quite separate characteristics.

Why Use Kerberos?
There are two main reasons we want to use Kerberos over NTLM:

1. Performance: NTLM uses a challenge response pattern for authentication which leads to high network utilization. During performance testing we saw a high volume of NTLM challenges which ultimately throttled our ability to serve requests. Kerberos uses tickets which can be cached, permitting a better performing protocol.

2. Double hops: NTLM does not flow credentials – the canonical example is a user requesting serviceA on server1 to access a secured resource on server2. Server1 cannot flow the user’s identity to server2.

Kerberos and Load Balancing
We want to run our services within a load balanced cluster to avoid single points of failure and to be able to grow resources to meet demand as required, without having to adopt bigger tin. The default configuration of IIS does not encourage this… the Application Pools run as a local machine account. This is a significant issue for Kerberos because of the manner in which the protocol encrypts the tickets passed between client, TGS and target server. The password of the account running the service is used to encrypt tickets so that only a process running under that account can decrypt the message. The default use of a machine specific account prevents a ticket granting access to serviceX on server A also being used to access serviceX on server B.

The following steps are required to fix this:

1. Use a common domain account for the applications pools.

We use a DOMAIN\service.expert account to run our services. This domain account is granted log on as a service and log on as a batch job rights on each of the application servers.

2. Register an SPN mapping the service class to the account.

We run our services on HTTP and so register the load balancer address with the domain account used to run the services:

>setspn -a HTTP/clusteraddress serviceAccount

We are using the WCF BasicHttpBinding which does not require the client to ensure the service is running as a particular user (to prevent man in the middle attacks). If you are using any other type of binding then the client needs to state who it expects the service to be running as.
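As an illustration of what that looks like from the client side with such a binding, the expected identity can be declared on the endpoint address (the contract, URL and SPN below are placeholders):

using System;
using System.ServiceModel;

// Tell WCF which identity the service is expected to run as; for a Kerberos secured,
// load balanced service this is the SPN registered against the service account.
var address = new EndpointAddress(
    new Uri("http://clusteraddress/Expert/Service.svc"),
    EndpointIdentity.CreateSpnIdentity("HTTP/clusteraddress"));

var factory = new ChannelFactory<IMyServiceContract>(new WSHttpBinding(), address);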

3. Configure IIS to use the application pool account rather than a machine account

system.webServer/security/authentication/windowsAuthentication useAppPoolCredentials must be set to true.

4. Configure IIS to allow Kerberos authentication tokens to be cached

system.webServer/security/authentication/windowsAuthentication authPersistNonNTLM must be set to true.

See also http://support.microsoft.com/kb/954873

5. Ensure the cluster address is considered to be in the Local Intranet zone


Kerberos tokens are not supported in the Internet zone, therefore the URL for your services must be considered to be trusted. The standard way to implement this is to roll out a group policy that adds your domain to the local intranet zone settings.

The slide deck for the talk is available from http://public.me.com/stefsewell/

PowerShell Part 2 – Installing a new service

Following on from the brief introduction to PowerShell, let’s walk through the installation script…

The script installs a simple Magic Eight Ball service that will return a pseudo-random answer to any question it’s given. The service is written as a WCF service in C#, the files to deploy are available from http://public.me.com/stefsewell/ , have a look in TechEd2010/DEV306-WindowsServerAppFabric/InstallationSource. The folder contains a web.config to set up the service activation and a bin folder with the service implementation. The PowerShell scripts are also available from the file share, look in Powershell folder in DEV306…

Pre-requisite Checking

The script begins by checking a couple of pre-requisites. If any of these checks fail then we do not attempt to install the service, instead the installing admin is told of the failed checks. There are a number of different checks we can make, in this script we check the OS version, that dependent services are installed and that the correct version of the .NET framework is available.

First we need a variable to hold whether or not we have a failure:

$failedPrereqs = $false

Next we move on to our first check: that the correct version of Windows is being used:

$OSVersion = Get-WmiObject Win32_OperatingSystem
if(-not $OSVersion.Version.StartsWith('6.1')) {
    Write-Host "The operating system version is not supported, Windows 7 or Windows Server 2008 required."
    $failedPrereqs = $true
    # See http://msdn.microsoft.com/en-us/library/aa394239(v=VS.85).aspx for other properties of Win32_OperatingSystem
    # See http://msdn.microsoft.com/en-us/library/aa394084(VS.85).aspx for additional WMI classes
}

The script fetches the Win32_OperatingSystem WMI object for interrogation using Get-WmiObject. This object contains a good deal of useful information, links are provided above to let you drill down into other properties. The script checks the Version to ensure that we are working with either Windows 7 or Windows Server 2008 R2, in which case the version starts with “6.1”.

Next we look for a couple of installed services:

# IIS is installed
$IISService = Get-Service -Name 'W3SVC' -ErrorAction SilentlyContinue
if(-not $IISService) {
    Write-Host "IIS is not installed on" $env:computername
    $FailedPrereqs = $true
}

# AppFabric is installed
$AppFabricMonitoringService = Get-Service -Name 'AppFabricEventCollectionService' -ErrorAction SilentlyContinue
if(-not $AppFabricMonitoringService) {
    Write-Host "AppFabric Monitoring Service is not installed on" $env:computername
    $FailedPrereqs = $true
}

$AppFabricMonitoringService = Get-Service -Name 'AppFabricWorkflowManagementService' -ErrorAction SilentlyContinue
if(-not $AppFabricMonitoringService) {
    Write-Host "AppFabric Workflow Management Service is not installed on" $env:computername
    $FailedPrereqs = $true
}

A basic pattern is repeated here using the Get-Service command to determine if a particular Windows Service is installed on the machine.

With the service requirements checked, we look to see if we have the correct version of the .NET framework installed. In our case we want the RTM of version 4 and go to the registry to validate this.

$frameworkVersion = get-itemProperty -Path 'HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full' -ErrorAction SilentlyContinue
if(-not($frameworkVersion) -or (-not($frameworkVersion.Version -eq '4.0.30319'))){
    Write-Host "The RTM version of the full .NET 4 framework is not installed."
    $FailedPrereqs = $true
}

The registry provider is used, HKLM: [HKEY_LOCAL_MACHINE], to look up a path in the registry that should contain the version. If the key is not found or the value is incorrect we fail the test.

Those are all the checks made in the original script from the DEV306 session, however there is a great feature in Windows Server 2008 R2 that allows very simple querying of the installed Windows features. I found this by accident:

>Get-Module -ListAvailable

This command lists all of the available modules on a system, the ServerManager module looked interesting:

>Get-Command -Module ServerManager

CommandType Name Definition
----------- ---- ----------
Cmdlet Add-WindowsFeature Add-WindowsFeature [-Name] [-IncludeAllSubFeature] [-LogPath ] [-...
Cmdlet Get-WindowsFeature Get-WindowsFeature [[-Name] ] [-LogPath ] [-Verbose] [-Debug] [-Err...
Cmdlet Remove-WindowsFeature Remove-WindowsFeature [-Name] [-LogPath ] [-Concurrent] [-Restart...

A simple add/remove/get interface which allows you to easily determine which Windows roles and features are installed – then add or remove as required. This is ideal for pre-requisite checking as we can now explicitly check to see if the WinRM IIS Extensions are installed for example:

import-module ServerManager

if(-not (Get-WindowsFeature 'WinRM-IIS-Ext').Installed) {
    Write-Host "The WinRM IIS Extension is not installed"
}

Simply calling Get-WindowsFeature lists all features and marks up those that are installed with [X]:

PS C:\Windows\system32> Get-WindowsFeature

Display Name Name
------------ ----
[ ] Active Directory Certificate Services AD-Certificate
[ ] Certification Authority ADCS-Cert-Authority
[ ] Certification Authority Web Enrollment ADCS-Web-Enrollment
[ ] Certificate Enrollment Web Service ADCS-Enroll-Web-Svc
[ ] Certificate Enrollment Policy Web Service ADCS-Enroll-Web-Pol
[ ] Active Directory Domain Services AD-Domain-Services
[ ] Active Directory Domain Controller ADDS-Domain-Controller
[ ] Identity Management for UNIX ADDS-Identity-Mgmt
[ ] Server for Network Information Services ADDS-NIS
[ ] Password Synchronization ADDS-Password-Sync
[ ] Administration Tools ADDS-IDMU-Tools
[ ] Active Directory Federation Services AD-Federation-Services
[ ] Federation Service ADFS-Federation
[ ] Federation Service Proxy ADFS-Proxy
[ ] AD FS Web Agents ADFS-Web-Agents
[ ] Claims-aware Agent ADFS-Claims
[ ] Windows Token-based Agent ADFS-Windows-Token
[ ] Active Directory Lightweight Directory Services ADLDS
[ ] Active Directory Rights Management Services ADRMS
[ ] Active Directory Rights Management Server ADRMS-Server
[ ] Identity Federation Support ADRMS-Identity
[X] Application Server Application-Server
[X] .NET Framework 3.5.1 AS-NET-Framework
[X] AppFabric AS-AppServer-Ext
[X] Web Server (IIS) Support AS-Web-Support
[X] COM+ Network Access AS-Ent-Services
[X] TCP Port Sharing AS-TCP-Port-Sharing
[X] Windows Process Activation Service Support AS-WAS-Support
[X] HTTP Activation AS-HTTP-Activation
[X] Message Queuing Activation AS-MSMQ-Activation
[X] TCP Activation AS-TCP-Activation
...

The right hand column contains the feature name to use with these cmdlets.
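If you just want to see what is already on a box, you can filter on the Installed property of each feature; a quick sketch (nothing here is specific to our install):

import-module ServerManager

# list only the roles and features currently installed on this server
Get-WindowsFeature | Where-Object { $_.Installed } | Format-Table DisplayName, Name -AutoSize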

I ended up writing a simple function to check for a list of features:

<#
.SYNOPSIS
Checks to see if a given set of Windows features are installed.    

.DESCRIPTION
Checks to see if a given set of Windows features are installed.

.PARAMETER featureSetArray
An array of strings containing the Windows features to check for.

.PARAMETER featuresName
A description of the feature set being tested for.

.EXAMPLE
Check that a couple of web server features are installed.

Check-FeatureSet -featureSetArray @('Web-Server','Web-WebServer','Web-Common-Http') -featuresName 'Required Web Features'

#>
function Check-FeatureSet{
    param(
        [Parameter(Mandatory=$true)]
        [array] $featureSetArray,
        [Parameter(Mandatory=$true)]
        [string]$featuresName
    )
    Write-Host "Checking $featuresName for missing features..."

    foreach($feature in $featureSetArray){
        if(-not (Get-WindowsFeature $feature).Installed){
            Write-Host "The feature $feature is not installed"
        }
    }
}

The function introduces a number of PowerShell features such as comment documentation, functions, parameters and parameter attributes. I don’t intend to dwell on any as I hope the code is quite readable.

Then to use this:

# array of strings containing .NET related features
$dotNetFeatureSet = @('NET-Framework','NET-Framework-Core','NET-Win-CFAC','NET-HTTP-Activation','NET-Non-HTTP-Activ')

# array of string containing MSMQ related features
$messageQueueFeatureSet = @('MSMQ','MSMQ-Services','MSMQ-Server')

Check-FeatureSet $dotNetFeatureSet '.NET'
Check-FeatureSet $messageQueueFeatureSet 'Message Queuing'

To complete the pre-requisite check, the failure variable is evaluated once all of the individual tests have run. If it is true then the script ends with a suitable message, otherwise we go ahead with the install.
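As a rough sketch, assuming the $FailedPrereqs variable set by the tests above, that final gate looks something like this:

if($FailedPrereqs) {
    Write-Host "One or more pre-requisites are missing on" $env:computername "- aborting the install."
    return
}

Write-Host "All pre-requisites found, continuing with the install..."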

Installing the Service

The first step in the installation is to copy the required files from a known location. This is a pull model – the target server pulls the files across the network, rather than having the files pushed on to the server via an administration share or such like [e.g. \\myMachine\c$\Services\].

$sourcePath = '\\SomeMachine\MagicEightBallInstaller\'
$installPath = 'C:\Services\MagicEightBall'

if(-not (Test-Path $sourcePath)) {
    Write-Host 'Cannot find the source path ' $sourcePath
    Throw (New-Object System.IO.FileNotFoundException)
}

if(-not (Test-Path $installPath)) {
    New-Item -type directory -path $installPath
    Write-Host 'Created service directory at ' $installPath
}

Copy-Item -Path (Join-Path $sourcePath "*") -Destination $installPath -Recurse

Write-Host 'Copied the required service files to ' $installPath

The file structure is copied from a network share onto the machine the script is running on. The Test-Path command determines whether a path exists and allows appropriate action to be taken. To perform a recursive copy the Copy-Item command is called, using the Join-Path command to establish the source path. These path commands can be used with any provider, not just the file system.
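To illustrate the provider point, the same Test-Path call works against the registry and IIS providers used elsewhere in this script (the IIS: drive needs the WebAdministration module loaded first):

Test-Path 'HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full' # registry provider
Test-Path 'IIS:\AppPools\NewAppPool' # IIS provider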

With the files and directories in place, we now need to host the service in IIS. To do this we need to use the PowerShell module for IIS:

import-module WebAdministration # requires admin-level privileges

Next…

$found = Get-ChildItem IIS:\AppPools | Where-Object {$_.Name -eq "NewAppPool"}
if(-not $found){
    New-WebAppPool 'NewAppPool'
}

We want to isolate our service into its own pool so we check to see if NewAppPool exists and if not we create it. We are using the IIS: provider to treat the web server as if it was a file system, again we just use standard commands to query the path.

Set-ItemProperty IIS:\AppPools\NewAppPool -Name ProcessModel -Value @{IdentityType=3;Username="MyServer\Service.EightBall";Password="p@ssw0rd"} # 3 = SpecificUser (a custom account)

Set-ItemProperty IIS:\AppPools\NewAppPool -Name ManagedRuntimeVersion -Value v4.0

Write-Host 'Created application pool NewAppPool'

Having created the application pool we set some properties. In particular we ensure that .NET v4 is used and that a custom identity is used. The @{} syntax allows us to construct new object instances – in this case a new process model object.
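To sanity check the pool afterwards you can read the same settings back through the IIS: provider; a small sketch:

$pool = Get-Item 'IIS:\AppPools\NewAppPool'
Write-Host 'Runtime version:' $pool.managedRuntimeVersion
Write-Host 'Identity type:  ' $pool.processModel.identityType
Write-Host 'Identity:       ' $pool.processModel.userName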

New-WebApplication -Site 'Default Web Site' -Name 'MagicEightBall' -PhysicalPath $installPath -ApplicationPool 'NewAppPool' -Force

With the application pool in place and configured, we next set-up the web application itself. The New-WebApplication command is all we need, giving it the site, application name, physical file system path and application pool.

Set-ItemProperty 'IIS:/Sites/Default Web Site/MagicEightBall' -Name EnabledProtocols -Value 'http,net.tcp' # do not include spaces in the list!

Write-Host 'Created web application MagicEightBall'

To enable both HTTP and net.tcp endpoints, we simply update the EnabledProtocols property of the web application. Thanks to default endpoints in WCF4, this is all we need to do to get both protocols supported. Note: do not put spaces into the list of protocols.

Configuring AppFabric Monitoring

We now have enough script to create the service host, but we want to add AppFabric monitoring. Windows Server AppFabric has a rich PowerShell API, to access it we need to import the module:

import-module ApplicationServer

Next we need to create our monitoring database:

[Reflection.Assembly]::LoadWithPartialName("System.Data")

$monitoringDatabase = 'MagicEightBallMonitoring'
$monitoringConnection = New-Object System.Data.SqlClient.SqlConnectionStringBuilder -argumentList "Server=localhost;Database=$monitoringDatabase;Integrated Security=true"
$monitoringConnection.Pooling = $true

We need a couple of variables: a database name and a connection string. We use the SqlConnectionStringBuilder out of the System.Data assembly to get our connection string. This demonstrates the deep integration between PowerShell and .NET.

Add-WebConfiguration -Filter connectionStrings -PSPath "MACHINE/WEBROOT/APPHOST/Default Web Site/MagicEightBall" -Value @{name="MagicEightBallMonitoringConnection"; connectionString=$monitoringConnection.ToString()}

We add the connection string to our web application configuration.

Initialize-ASMonitoringSqlDatabase -Admins 'Domain\AS_Admins' -Readers 'DOMAIN\AS_Observers' -Writers 'DOMAIN\AS_MonitoringWriters' -ConnectionString $monitoringConnection.ToString() -Force

And then we create the actual database, passing in the security groups. While local machine groups can be used, in this case I’m mocking up a domain group, which is more appropriate for load balanced scenarios.

Set-ASAppMonitoring -SiteName 'Default Web Site' -VirtualPath 'MagicEightBall' -MonitoringLevel 'HealthMonitoring' -ConnectionStringName 'MagicEightBallMonitoringConnection'

The last step is to enable monitoring for the web application, above we are setting a ‘health monitoring’ level which is enough to populate the AppFabric dashboard inside the IIS manager.

Set-ASAppServiceMetadata -SiteName 'Default Web Site' -VirtualPath 'MagicEightBall' -HttpGetEnabled $True

Last of all we ensure that metadata publishing is available for our service. This allows us to test the service using the WCFTestClient application.

Configuration for Kerberos

This is a summary of the voodoo required to get WCF services hosted in IIS to work with a load balancer and kerberos. This took me way longer than I had hoped to figure out so I hope I can save someone else that pain.

We have recently been running some load and stress tests against our latest Golden Gate SP1 product which supports the horizontal scale out of workflow services. This scale out capability is one of the core features of Windows Server AppFabric. Our software is designed to run in an ‘on premise’ scenario and leverages Windows integrated security for authorization of users. A major performance improvement we discovered during our original Golden Gate testing was to ensure kerberos was used rather than NTLM when performing Windows Authentication. We wanted to ensure that our new services were using kerberos for Windows authentication since we had moved some of our services from being hosted as a Windows Service to being hosted in IIS, in particular the workflow services.

Note: in addition to performance advantages, you need to use Kerberos if you want to achieve multi-hop delegation of credentials, NTLM does not support this. The resources at the end of this post discuss this further.

In this post I’m going to walk through a worked example and give a checklist to follow. In a later post I may drill down into a little more of the background, in the meantime I’ll include some additional resources at the end.

Scenario
The scenario involves three application servers that are configured into a network load balanced (NLB) cluster using NLB in Windows Server 2008. The machine names are:
• svexpgg310.ap.aderant.com
• svexpgg311.ap.aderant.com
• svexpgg312.ap.aderant.com

The virtual host name for the NLB is svnlb301.ap.aderant.com.

The NLB is set-up to load balance traffic on port 80, for our HTTP based services and the port range 18180-18199 for our Windows Services. Each of the servers runs all of the services that we support horizontal scale out for and one of the servers (310) runs the services that only support a single instance. In a typical installation we have around 15 services, rather than list out all of these I’ll concentrate on two types:
• services hosted in IIS that expose HTTP endpoints
• services hosted as Windows Services that expose net.tcp endpoints

Alongside the three application servers is a database server that hosts the ADERANT Expert database, the AppFabric monitoring database and the AppFabric workflow persistence database.

The basicHttpBinding configuration used to enable Windows authentication is as follows:

      <basicHttpBinding>
        <binding name="expertBasicHttpBinding" maxReceivedMessageSize="2147483647">
          <readerQuotas maxArrayLength="2147483647" maxStringContentLength="2147483647" />
          <security mode="TransportCredentialOnly">
            <transport clientCredentialType="Windows" proxyCredentialType="Windows">
              <extendedProtectionPolicy policyEnforcement="Never" />
            </transport>
          </security>
        </binding>
      </basicHttpBinding>

1. The servers must be in the local intranet zone of any calling machines.
As of Windows Server 2003, by default only the local intranet zone supports the passing of credentials for Windows Integrated authentication between machines. This makes sense as you rarely want to pass your Windows credentials beyond your own domain. At ADERANT we have a group policy set-up so that all machines have any machine with a name matching *.aderant.com registered in the local intranet zone.

You can explicitly name the servers for the zone, also ensure that the servers are not listed in the Trusted Sites zone.

2. Windows Services exposing WCF net.tcp endpoints must have SPNs registered for both the application server and the network load balancer addresses.

When a non-basicHttpBinding is used, such as net.tcp, the WCF infrastructure checks to ensure that the service is running under the identity that the client expects. This prevents ‘man-in-the-middle’ attacks where someone spoofs the service you want to call with their own for some nefarious purpose. When you generate a service proxy against a net.tcp endpoint you’ll see something similar to the following configuration snippet in the app.config:

<client>
  <endpoint
    address="net.tcp://myserver.mydomain.com:8003/servicemodelsamples/service/spnIdentity"
    binding="netTcpBinding"
    bindingConfiguration="netTcpBinding_ICalculator_Windows"
    contract="ICalculator"
    name="netTcpBinding_ICalculator">
    <identity>
      <servicePrincipalName value="CalculatorSvc/myServer.myDomain.com:8003" />
    </identity>
  </endpoint>
</client>

There is an identity element that specifies the expected identity of the service host. There are two different options supported: userPrincipalName and servicePrincipalName. If your service is published on a domain and you always expect the client calling the service to be online, then the userPrincipalName is easiest to configure. The value attribute contains the identity that the service is running as, e.g. value=“ADERANT_AP\service.expert”.

Alternatively you can set a servicePrincipalName, as above. The service principal name (SPN) is broken down into three parts:

serviceClassName / address [: portNumber]

The service class name is a token that uniquely represents the service. Common service classes are HTTP and HOST, the example above is using CalculatorSvc to uniquely identify a calculation service. At ADERANT we use class names such as ExpertConfigurationSvc. After the service class name comes the machine name, e.g. SVEXPGG310. Note that the NetBIOS name and the fully qualified domain name are considered to be different, it is commonplace to register both. For example:

ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG310:18180

Once we have an SPN, it must be registered in Active Directory (AD) against the user account used to run the service. We recommend a service account along the lines of myDomain\service.expert to run the ADERANT services. To register this account with an SPN there is a command line tool setspn:

setspn -A ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180 service.expert

As part of our deployment tooling we automatically generate a batch file containing all the SPNs that need to be registered in AD for a given environment. An SPN must not be registered twice; duplicates will cause errors. To see the SPNs currently registered against a user you can use the setspn tool with the -L option, passing the account name:

setspn -L service.expert

If we take our configuration service as an example, we need the following SPNs registered in AD for the scenario environment:

ExpertConfigurationSvc/SVNLB301.ap.aderant.com:18180
ExpertConfigurationSvc/SVNLB301:18180
ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG310:18180
ExpertConfigurationSvc/SVEXPGG311.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG311:18180
ExpertConfigurationSvc/SVEXPGG312.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG312:18180
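Rather than registering these by hand, the batch file mentioned above can just as easily be a small PowerShell loop that drives setspn; a sketch, using the service account and SPNs from the list above:

$account = 'service.expert'
$spns = @(
    'ExpertConfigurationSvc/SVNLB301.ap.aderant.com:18180',
    'ExpertConfigurationSvc/SVNLB301:18180',
    'ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180',
    'ExpertConfigurationSvc/SVEXPGG310:18180'
    # ... and so on for 311 and 312
)

foreach($spn in $spns) {
    setspn -A $spn $account # remember: duplicate registrations cause errors
}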

If you are running a development workstation, you will often see HOST/localhost as the SPN generated by the svcutil for locally hosted WCF services. This indicates that the service is expected to be running on the local machine.

If the service needs to support delegation then the AD account used to run the service must have this enabled:

The account must also be granted ‘Log on as a service’ rights on the application server hosting the service. This can be set-up using the local machine policies admin tool or pushed out via group policy.

3. Load balanced WCF Services hosted in IIS, using HTTP bindings, must have HTTP SPNs added for the account of the application pool.

By default an SPN is created in AD for the machine account of a server running IIS, for example HTTP/SVEXPGG310. In a load balanced scenario the machine account SPN cannot be used to issue a kerberos ticket because it is different for each machine in the application farm. Instead the kerberos ticket needs to be issued using the identity of the application pool that the web service is running under. If you have multiple application pools, these must all be running under the same account. The application pool account must have SPNs registered for the HTTP service as follows:

setspn -A HTTP/svnlb301.ap.aderant.com service.expert
setspn -A HTTP/svnlb301 service.expert
setspn -A HTTP/svexpgg310.ap.aderant.com service.expert
setspn -A HTTP/svexpgg310 service.expert
setspn -A HTTP/svexpgg311.ap.aderant.com service.expert
setspn -A HTTP/svexpgg311 service.expert
setspn -A HTTP/svexpgg312.ap.aderant.com service.expert
setspn -A HTTP/svexpgg312 service.expert

Here we have both the NetBIOS and FQDNs for the servers and the load balancer.

4. Load balanced WCF services hosted in IIS, using HTTP bindings, must use the Application Pool credentials to issue kerberos tickets.

In addition to adding the SPNs in 3, now change IIS so that it uses the app pool credentials for the kerberos ticket. This can be done either through the configuration manager in IIS or from the command line.

The (partially obscured) section path is system.webServer/security/authentication/windowsAuthentication.
From a command line:
appcmd set config /section:windowsAuthentication /useAppPoolCredentials:true

This has to be set on all of the application servers within the application farm.

While in IIS configuration, it is also worth setting authPersistNonNTLM to true, see http://support.microsoft.com/kb/954873 for details.
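If you prefer to stay in PowerShell rather than using appcmd, the WebAdministration module can make the same changes; a sketch of setting both values at the server (apphost) level:

import-module WebAdministration

Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter 'system.webServer/security/authentication/windowsAuthentication' -Name 'useAppPoolCredentials' -Value $true

Set-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' -Filter 'system.webServer/security/authentication/windowsAuthentication' -Name 'authPersistNonNTLM' -Value $true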

5. Enable Windows Authentication on the required web applications in IIS.
There are two parts to this, the first of which is to ensure that the Windows Authentication provider for IIS is installed. This can be checked in the Windows features control panel.

The next step is to enable Windows Authentication on the website itself. From the dashboard for the site, open the Authentication manager and then ensure that Windows Authentication is enabled:

While you are here, it’s worth checking the advanced properties of the Windows Authentication (available from the context menu) to ensure that Kernel-mode authentication is set.

This can also be set programmatically:

appcmd set config "Default Web Site/MyWebService" -section:system.webServer/security/authentication/windowsAuthentication /enabled:true /commit:apphost
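The effective setting can also be checked from PowerShell; a sketch against the MagicEightBall web application used earlier:

import-module WebAdministration

# read the effective value for the web application
(Get-WebConfigurationProperty -PSPath 'IIS:\Sites\Default Web Site\MagicEightBall' -Filter 'system.webServer/security/authentication/windowsAuthentication' -Name 'enabled').Value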

Wrap up & Testing
Those are the key steps required to get kerberos working in a load balanced environment:
1. ensure the servers are in the local intranet zone.
2. create and register SPNs for net.tcp services for all app servers and the load balancer.
3. create and register HTTP SPNs for all app servers and the load balancer.
4. take care to avoid duplicate SPNs.
5. understand that NetBIOS and FQDNs require separate SPNs.
6. set useAppPoolCredentials to true on all IIS servers in the app farm.
7. run all application pools using a common domain service account, give this account permission to delegate and log on as a service.
8. ensure the web applications for the services have Windows authentication enabled.

It’s mostly straightforward once you’ve been through the steps once.

The easiest tool to test with is a browser and Fiddler. From within Fiddler you can look at the authorization headers for the HTTP requests which will show you if kerberos or NTLM is used. We expose an OData service which requires Windows authentication, it was very easy to trace the authentication negotiation going on for this site within Fiddler.
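If you want to see what the server is offering without firing up a browser, you can send an anonymous request and read the 401 challenge headers; a rough sketch, using a hypothetical query service URL:

$request = [System.Net.WebRequest]::Create('http://svnlb301.ap.aderant.com/Expert/Query.svc/')
try {
    $request.GetResponse() | Out-Null
}
catch [System.Net.WebException] {
    # the schemes offered by the server, typically 'Negotiate' and 'NTLM'
    $response = $_.Exception.Response
    Write-Host 'Offered schemes:' ($response.Headers.GetValues('WWW-Authenticate') -join ', ')
}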

Resources
Security in WCF (MSDN Magazine): http://msdn.microsoft.com/en-us/magazine/cc163570.aspx

Patterns & Practices Kerberos Overview: http://msdn.microsoft.com/en-us/library/ff649429.aspx

Patterns & Practices WCF Security Guide: http://msdn.microsoft.com/en-us/library/ff650794.aspx

A Tale of Two Services

Now back in New Zealand after two weeks in the US, first week at TechEd and then a week in our US development centre. I finally feel free of jet lag and so it’s time to make good on a promise to write up a couple of samples I didn’t show at TechEd. The first is a quick introduction to authoring services…

The source code to accompany this post can be downloaded from http://public.me.com/stefsewell/ from the TechEd2010 folder. The sample code is in the archive ServiceAuthoringSample.zip.


A service is simply a piece of software that provides some functionality, access to this functionality is formalized into a contract. A service is often hosted in a separate process and utilized by a number of different consumers. The service does not know anything about the consumer, it just performs some work on their request. Between the consumer and service is most likely a process, machine and possibly a network boundary, therefore any data to be exchanged must be serializable. For the consumer to call the service, it must know where it lives, therefore the service has an address. The consumer must also be able to understand and be understood by the service, the supported communication protocols are captured as bindings. So there we have the ABC of Windows Communication Foundation; the Address, the Binding and the Contract.

Services in Code

With each release of Visual Studio, the key use cases that Microsoft is targeting with its tooling become easier to perform. In VS2010 the ease of service authoring and hosting has taken a leap forward and the code line count required to implement a service dropped. Let’s look at a very simple service that provides a random answer to a question, a Magic Eight Ball service. The contract for the magic eight ball is very simple and is captured as the following class:

using System.ServiceModel;

namespace MagicEightBall.CodedService {
    [ServiceContract]
    public interface MagicEightBallContract {
        [OperationContract]
        string AskQuestion(string question);
    }
}

There is a single method that takes a string containing a question and returns a string containing the answer. The System.ServiceModel namespace is the hint that we are going to use WCF to take care of our service. To provide an implementation of the service we have the following code.

using System;

namespace MagicEightBall.CodedService {
    public class MagicEightBallService : MagicEightBallContract {
        public string AskQuestion(string question) {
            return EightBall.Shake();
        }
    }

    internal sealed class EightBall {
        private readonly static Random random = new Random();
        private readonly static string[] answers = { "Yes", "No", "Ask again", "Definitely", "Bad idea", "Perhaps", "Unsure" };

        public static string Shake(){
            return answers[random.Next(0, answers.Length)];
        }
    }
}

The eight ball is captured as a simple class with a Shake method; to keep things simple the service is not enforcing any validation, such as ensuring a question is actually asked. Note that there is no System.ServiceModel using statement, this is vanilla .NET. We have a service contract and an implementation, our coding is complete.

The next step is to host the service and allow our consumers to call it. The service host can be implemented in a number of ways, for this example we are going to use WAS (Windows Process Activation Service) which uses the IIS infrastructure to host the service – we don’t need to write a host, we’ll just use one that Microsoft provides. To access the service, the host exposes an endpoint; the endpoint is composed of the address, binding and contract. One of the criticisms of WCF in .NET 3 was the steep initial learning curve required to get a service hosted and configured. In .NET 4, the idea of defaults has been introduced which greatly reduces the amount of WCF configuration required to get up and running (to the point where it is possible to have no explicit configuration). In the example below we have a little configuration due to a slightly non-standard approach.

<?xml version="1.0"?>
<configuration>
  <system.serviceModel>
    <serviceHostingEnvironment>
      <serviceActivations>
        <add relativeAddress="MagicEightBall.svc" service="MagicEightBall.CodedService.MagicEightBallService"/>
      </serviceActivations>
    </serviceHostingEnvironment>
    <behaviors>
      <serviceBehaviors>
        <behavior>
          <serviceMetadata httpGetEnabled="True"/>
          <serviceDebug includeExceptionDetailInFaults="False"/>
        </behavior>
      </serviceBehaviors>
    </behaviors>
  </system.serviceModel>
</configuration>

Here we are using the serviceActivations element to specify the last part of the address of the service rather than having a separate .svc file. Personally I think this is quite a tidy approach rather than having separate .config and .svc files. The serviceBehaviors section states that we want to publish metadata about this service and that we want to hide any exception details from consumers of our service. By publishing metadata about our service we allow tooling to generate a proxy class for us that allows our service to be easily called. Visual Studio provides such tooling, from within a project you can add a Service Reference:

The service reference needs to know the address of the service and then from the metadata it creates a class, the proxy, that allows the project to make use of the service. After clicking on OK, the service reference is listed as part of the project, in the sample below the MagicEightBall client is making use of two separate services.

I’m jumping a little bit ahead though, since we haven’t got the service host set up yet. We want to publish the service which we can do from within VS2010 by choosing Publish… from the context menu for the project:

A dialog pops up asking for a location to publish to, I used http://localhost/MagicEightBall which set up a new web application in IIS. By default the web application is set up to support the http protocol. If you want to change this you need to alter the ‘Enabled Protocols’ in the Advanced Settings dialog which is available from the web application context menu in IIS Manager [Manage application | Advanced Settings…].

In the example above I added the net.tcp protocol in addition to http. Note that there is no space between the comma and net.tcp. Putting a space in here will break the enabled protocols! Now we have created and published a WCF service, to test it, point your browser to http://localhost/MagicEightBall/MagicEightBall.svc. You should see the standard metadata page for your service instructing how to create a proxy class and consume it.

[Note that I have .NET 4 registered as the default framework version for IIS and so the default app pool uses .NET 4. The command C:\Windows\Microsoft.NET\Framework64\v4.0.30319\aspnet_regiis.exe -i registers .NET 4 as the default for IIS.]
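If you’d rather script that check than use a browser, something like the following pulls the metadata back (a sketch; adjust the address to wherever you published):

$client = New-Object System.Net.WebClient

# fetch the WSDL to prove that activation and hosting are working
$wsdl = $client.DownloadString('http://localhost/MagicEightBall/MagicEightBall.svc?wsdl')
Write-Host ('Retrieved ' + $wsdl.Length + ' characters of WSDL.')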

To test the service, create a console application, add a service reference called MagicEightBallService using the http url. Code to call the service is as follows:

using System;
using System.Text;

using MagicEightBall.Client.MagicEightBallService;

namespace MagicEightBall.Client {
    class Program {
        private const string CodeEndpointNameHttp = "BasicHttpBinding_MagicEightBallContract";

        static void Main(string[] args) {
            string question = "Will you answer my questions?";
            string answer = string.Empty;

            using (MagicEightBallContractClient client = new MagicEightBallContractClient(CodeEndpointNameHttp)) {
                answer = client.AskQuestion(question);
            }

            Console.WriteLine(answer);
        }
    }
}

In total there are fewer than 30 lines of code for us to write to define, implement, host and consume a WCF service.

Services as Workflows
There is an alternative way to author services which uses a workflow to define the service implementation. A functionally equivalent Magic Eight Ball service can be developed as a workflow service as follows…

First create a new project in VS2010 that is a ‘WCF Workflow Service Application’ which sets up the basic send / receive service template. We need to set up a couple of variables within our workflow so click on the variables button at the bottom left having selected the outer scope:

The handle is created by the template so we need to add in the question and answer strings. The variables are used to pass data into and out of activities, the activity is the equivalent of a program statement and acts on the data. In workflow it is possible to author new activities such as the EightBall in the example above. The code for the activity is as follows:

using System;
using System.Activities;

namespace MagicEightBall.WorkflowService {
    public sealed class EightBall : CodeActivity<string> {
        private static Random random = new Random();
        private static string[] answers = { "Yes", "No", "Ask again", "Definitely", "Bad idea", "Perhaps", "Unsure" };

        public InArgument<string> Question { get; set; }

        protected override string Execute(CodeActivityContext context) {
            string question = context.GetValue(this.Question);
            string answer = answers[random.Next(0, answers.Length)];

            return answer;
        }
    }
}

This activity is essentially the same code as the EightBall class in the original service. The question is captured as an InArgument<string> on the activity and the result is a string, specified via the CodeActivity<string> base class. Note the use of the CodeActivityContext to get the value of the question from the workflow runtime at execution time.

After compiling the project we get an EightBall activity in our toolbox and this can be dragged into the service workflow. The completed implementation looks as follows with the addition of the EightBall activity:

The EightBall activity needs to have its arguments mapped to variables. The properties of the activity are defined as follows:

In the receive activity, the operation name is changed to AskQuestion and the content is changed to:

Here the receive activity expects to get a string parameter called question which is mapped to the question variable we created earlier. The receive/send activity pairing is analogous to the AskQuestion method in our coded service.

The send activity returns a string and is paired with the Receive Question send activity as shown in the Request field.

Here we are returning the answer that we got from the EightBall activity. This workflow is now functionally equivalent to our original coded example: a string containing a question is passed in, a string containing an answer is returned.

To host the workflow service, the same steps are taken as before. You simply choose to publish the service from Visual Studio into IIS. The service exposes metadata in the same way as the coded service, therefore you can ask Visual Studio to generate a service reference for you and then consume the service in the same way as we did for the coded service.

So we have two ways to solve a problem – which is better? It depends on the work that the service is performing. If the service is co-ordinating work across multiple services then a workflow makes sense as it can be easier to visualize the intended flow of control. If the service co-ordination is long running and needs to be persisted then again a workflow makes sense as this long running, durable capability is built right into the workflow service host that Microsoft ships out of the box.

The sample code contains some additional concepts not discussed such as a separate activity library and instrumentation options for service code. The code is small and so hopefully this does not clutter the examples too much.

Migration from .NET 2/3/3.5 to .NET 4

During the TechEd session, the question was asked:

“How do I migrate my services from WCF3 to WCF4?”

The simple answer is that you recompile your source under .NET 4 and you should be done. .NET 4 is backwards compatible with .NET 2/3.X but you need to recompile for the new CLR (common language runtime).

TechEd NZ 2009 Sessions

This year Microsoft have opened up the TechEd sessions to the public and so you no longer have to be a TechEd attendee to be able to watch the sessions online. This includes sessions from previous years, which means the sessions I co-presented at New Zealand TechEd last year are now available.

A first look at WCF and WF in .NET 4.0
http://www.msteched.com/2009/NewZealand/SOA206

This session covered the new features in .NET 4 for WCF and WF. The slide deck was prepared and originally presented by Aaron Skonnard from Pluralsight. Mark, a colleague at ADERANT, and I were asked to present in New Zealand due to our .NET 4.0 TAP involvement (Technology Adoption Program). The demos were our own and so the content is slightly different to the original presentation.

Building declarative apps in .NET 4.0
http://www.msteched.com/2009/NewZealand/SOA306

In this session we wanted to show how Microsoft is choosing a declarative approach for much of its new technology, freeing the developer from the how and letting them concentrate on the what. Using the Visual Studio DSL toolkit it is possible to build your own visual DSLs and designers. From these models you can then use T4 to transform the model into code. This approach is at the heart of a software factory we use internally in ADERANT and has saved us from technology churn as well as speeding up product development.

Note: The DSL toolkit has been renamed for VS2010 and is now the Visual Studio Visualization and Modeling SDK.

Data over the Web

It’s been a little while since the last posting, in no small part due to my broadband quota exceeding the monthly allowance. Dial-up speed is just painful, and made me realize just how much I use the internet for media: music, movies, podcasts, blogs, … It was also a great reminder just how sensitive applications are when you have a constrained network connection.

One of the most significant changes made to the Expert architecture with SP1 is the introduction of a query service. Prior to SP1, the architectural layering required that data transfer objects (DTOs) were used to move data from a service boundary to the client. The domain model was mapped to whatever shape was required by the client requesting the entities.

Writing the DTO and mapper classes is very repetitive and quite dull and so it was automated using the Visual DSL Toolkit for Visual Studio (now renamed to the Visual Studio Visualization and Modeling SDK). A key component in the Expert framework is our software factory which builds code from 3 models: relational model, domain model and the view model. The view model provides a model and tooling to generate use case specific views of the domain model and the mappers required to transform from domain model to view model and back. An optimization we made when sending data back to the service to update the domain model was just to send back the changes. This required the view model to track any updates made to the model between the time it was fetched from a service and the time it was sent back to the service. The mechanism we wrote to achieve that is worth a few blog entries on its own and I’m going to skip over the details here.

One of the primary clients within this architecture is our workflow service which allows data from the business services to be managed within a long running workflow process before being updated back into the main line of business system. In the original Golden Gate release, the data associated with a workflow instance is sent out with every task within the workflow (a task is a workflow activity that requires human interaction). For very large workflow processes, this can be an issue, particularly over restricted network connections such as VPN or very remote sites. For SP1 we took a look at this particular area and addressed it in the following ways:

• Tasks now have a data contract so that only the required data is sent.
• The way we fetch data is now via a dedicated query service rather than combining reads and write operations in the same service contract. The query service is http based and therefore can take advantage of out-of-the-box optimizations such as caching and compression.

We found that the separation of query from command at the architectural level is currently being explored by a number of people, most vocally by Greg Young and Udi Dahan. The architectural pattern is Command Query Responsibility Segregation (CQRS) and is similar in spirit to the Command-Query Separation (CQS) concept first discussed by Bertrand Meyer in Object-Oriented Software Construction back in 1988. This is another topic worthy of blog posts and InfoQ has a great presentation from Greg Young.

Our query service implementation is a WCF Data Service which takes an Entity Framework 4 model and exposes it as a RESTful service.


All data required by a client is fetched via the query service and this is delivered over an http channel. The use of IIS and HTTP gives us the following:

• monitoring via AppFabric
• compression via the dynamic compression in IIS7
• caching using standard HTTP based caches
• cross platform capable data feed

The lifecycle for data is now:

There are a number of interesting aspects to this, not least of which we now use two different ORM technologies: NHibernate and Entity Framework. This is based on both historical use and feature set. Given the data model that we map to, we need the rich extension points available in NHibernate to support the desired object model. The WCF Data Services and EF4 features in .NET 4 / Visual Studio 2010 take most of the heavy lifting out of exposing a domain model via REST. Microsoft is now promoting an Open Data Protocol built on top of http/atom/json as a cross platform capable mechanism for interoperating with data over the web; an ODBC for the web perhaps. The Mix10 keynote included a chapter on OData and how Microsoft is tooling it.
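Because the feed is just HTTP plus Atom/JSON, anything that can make a web request can consume it. A tiny PowerShell sketch against a hypothetical query service address (the entity set name is illustrative only):

$client = New-Object System.Net.WebClient
$client.UseDefaultCredentials = $true

# the $metadata document describes the entity sets exposed by the service
# (single quotes stop PowerShell expanding $metadata and $top as variables)
$metadata = $client.DownloadString('http://myserver/Expert/Query.svc/$metadata')

# a simple OData query: the first 10 matters as an Atom feed
$matters = $client.DownloadString('http://myserver/Expert/Query.svc/Matters?$top=10')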

Given that we have a software factory that already contains a domain model which includes the persistence mapping, we just needed to add additional T4 transforms to the software factory so that we could generate a query service and view model implementation from the existing domain model. Along the way we also simplified the change tracking approach, as the explicit task data contract reduced the potential for merge conflicts.

Now that we have a query service and Microsoft is doing all it can to promote OData as a cross platform solution, there are an interesting number of options opening up. One of which is the iPhone/iPad platform from Apple. As part of the OData SDK, Microsoft has released a library and tooling to make consuming OData feeds from the Apple platform straight forward. This includes a tool that generates objective-C classes from the meta data available from an OData stream (via the $metadata directive).

How to diagnose errors in AppFabric monitoring configuration

It wasn’t the best Friday, my external hard drive died taking my work iTunes library with it and I wasn’t having much fun with AppFabric either. The dashboard showed no data and the Windows application event log kept filling up with login errors. Looking back, the afternoon was useful since I learned that little bit more about AppFabric though I didn’t get any ‘real’ work done.

I started off reading this: http://social.technet.microsoft.com/wiki/contents/articles/appfabric-items-to-check-when-configuring-appfabric-monitoring.aspx before getting stuck in.

AppFabric has two data stores: a monitoring store and a workflow persistence store. These stores are paired with two Windows services, an event collection service paired with the monitoring store and a workflow management service paired with the workflow persistence store.

Let’s start with the event collection service and monitoring store. This service is responsible for capturing the WF and WCF events emitted by services hosted in IIS/WAS and storing them in the monitoring store. These events are used to populate the dashboard that is integrated into IIS Manager. To enable capture of events you can use the ‘Manage WF and WCF Services | Configure…’ option in the web application context menu or the Powershell commands Set-ASAppMonitoring and Start-ASAppMonitoring. For help on these commands call get-help, e.g. ‘get-help Set-ASAppMonitoring’, from a Powershell command line.
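As a concrete sketch, reusing the site, virtual path and connection string name from the earlier install script, switching an application up to the most verbose capture level and back again looks like this:

# capture everything while reproducing the issue
Set-ASAppMonitoring -SiteName 'Default Web Site' -VirtualPath 'MagicEightBall' -MonitoringLevel 'Troubleshooting' -ConnectionStringName 'MagicEightBallMonitoringConnection'

# ...reproduce the problem, then return to the day to day level
Set-ASAppMonitoring -SiteName 'Default Web Site' -VirtualPath 'MagicEightBall' -MonitoringLevel 'HealthMonitoring' -ConnectionStringName 'MagicEightBallMonitoringConnection'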

When you set up monitoring you need to provide a connection string name and set the monitoring level. As a minimum, the level needs to be set to Health Monitoring to populate the AppFabric dashboard. Below this are the levels Off and Errors Only which are self explanatory. Above this level are End-to-End Monitoring and Troubleshooting, both of which capture additional information. End-to-End Monitoring adds a header into WCF traffic to allow a logical call sequence to be followed: when a WCF service calls another WCF service the header is flowed across the call, providing a correlation token to query by. Note that the capture levels are cumulative, the higher level setting includes all of the events from the settings below. The higher the setting, the greater the impact on the performance of the system as more resources are required to capture and log the monitored events. For day to day operations health monitoring is recommended, with the more verbose options used when required to aid troubleshooting.

The connection string is a named connection string value, set as a property of the web application (or one of its ancestors). The Connection Strings page is available from the ASP.NET section of the Features View for the web application.

Clicking on the Connection Strings option brings up the following:


Note that IIS configuration is hierarchical, the connection strings available to the Magic8Ball web application are both inherited which means they are defined at a higher node in the tree. In this case the strings are defined in the machine web.config found at %SystemDrive%\Windows\Microsoft.NET\Framework64\v4.0.30128\Config (I’m using 64-bit Windows and .NET 4.0 RC). When installing AppFabric the default connection strings are written into the machine level web.config. In my case, both connection strings are set-up to use integrated security.

The event collection service is a Windows Service and so managed through the services administration snap-in, services.msc. To help set up integrated security from Windows through to SQL Server, I run the services under a domain account. Note that if you plan to use a machine that is not always on a domain, you need to use a local machine account.


This account needs to have login rights to the SQL Server and should be mapped to the ASMonitoringDbWriter role. In my case I’ve mapped the user to all three roles set up in the monitoring store.

There are four Jobs managed by the SQL Agent that are used to populate and manage the tables in the monitoring database. These are:

The SQL Server Agent must be running for the tables to be populated. The Import*Events jobs run every 10 seconds by default; if they are not correctly set up your application event log soon fills up with errors and warnings (as I found). These jobs call stored procedures defined in the monitoring database: ASImportTransferEvents, ASImportWcfEvents, ASImportWFEvents and run as the AS_MonitoringDbJobsAdmin. The AutoPurge job is scheduled to run once every minute and calls the ASAutoPurge stored procedure. These stored procedures in turn call ASInternal_* versions of themselves and you can drill into the SQL to see exactly what they do. To housekeep the monitoring database you can use the Clear-ASMonitoringSqlDatabase command. Another option is to move the events to an archive database so that the queries feeding the dashboard remain responsive, see Set-ASMonitoringSqlDatabaseArchiveConfiguration. The archive database can then be managed as per any audit requirements you may have.

To monitor the SQL Agent jobs, you can use the Job Activity Monitor:

The Windows Event Viewer is a great help tracking down the cause of issues and AppFabric sets up a couple of custom logs.

To see the Debug and Analytic logs you need to set the following:

Right click on a debug or analytic log and enable it. Make sure you disable it when you are finished to prevent performance degradation due to high volume event capture.
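This can also be driven from PowerShell; a sketch that lists the WCF/WF channels and pulls recent errors from the Application log (the channel name is my assumption, so check Event Viewer for the exact name on your machine):

# list the event channels used by WCF/WF services (channel name assumed, verify in Event Viewer)
Get-WinEvent -ListLog 'Microsoft-Windows-Application Server-Applications/*' | Format-Table LogName, IsEnabled, RecordCount -AutoSize

# recent errors in the Application log, e.g. the SQL login failures from the event collection service
Get-EventLog -LogName Application -EntryType Error -Newest 20 | Format-Table TimeGenerated, Source, Message -AutoSize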

From these logs I could determine that my IIS configuration had invalid entries, the SQL Server login was failing for the Event Collector and so on. I’ll talk more about diagnosing IIS configuration issues and the workflow persistence store in the next post…