Configuration for Kerberos
July 11, 2010
This is a summary of the voodoo required to get WCF services hosted in IIS to work with a load balancer and Kerberos. It took me far longer to figure out than I had hoped, so I hope I can save someone else that pain.
We have recently been running load and stress tests against our latest Golden Gate SP1 product, which supports the horizontal scale-out of workflow services. This scale-out capability is one of the core features of Windows Server AppFabric. Our software is designed to run in an ‘on premise’ scenario and leverages Windows integrated security for authorization of users. A major performance improvement we discovered during our original Golden Gate testing was to ensure Kerberos was used rather than NTLM when performing Windows authentication. Because we had moved some of our services, in particular the workflow services, from being hosted as Windows Services to being hosted in IIS, we wanted to ensure that the new services were still using Kerberos for Windows authentication.
Note: in addition to the performance advantages, you need Kerberos if you want multi-hop delegation of credentials; NTLM does not support this. The resources at the end of this post discuss this further.
In this post I’m going to walk through a worked example and give a checklist to follow. In a later post I may drill down into a little more of the background; in the meantime I’ve included some additional resources at the end.
Scenario
The scenario involves three application servers that are configured into a network load balanced (NLB) cluster using NLB in Windows Server 2008. The machine names are:
• svexpgg310.ap.aderant.com
• svexpgg311.ap.aderant.com
• svexpgg312.ap.aderant.com
The virtual host name for the NLB is svnlb301.ap.aderant.com.
The NLB is set up to load balance traffic on port 80 for our HTTP-based services and on the port range 18180-18199 for our Windows Services. Each of the servers runs all of the services for which we support horizontal scale-out, and one of the servers (310) also runs the services that only support a single instance. A typical installation has around 15 services; rather than list them all, I’ll concentrate on two types:
• services hosted in IIS that expose HTTP endpoints
• services hosted as Windows Services that expose net.tcp endpoints
Alongside the three application servers is a database server that hosts the ADERANT Expert database, the AppFabric monitoring database and the AppFabric workflow persistence database.
The basicHttpBinding configuration used to enable Windows authentication is as follows:
<basicHttpBinding>
  <binding name="expertBasicHttpBinding" maxReceivedMessageSize="2147483647">
    <readerQuotas maxArrayLength="2147483647" maxStringContentLength="2147483647" />
    <security mode="TransportCredentialOnly">
      <transport clientCredentialType="Windows" proxyCredentialType="Windows">
        <extendedProtectionPolicy policyEnforcement="Never" />
      </transport>
    </security>
  </binding>
</basicHttpBinding>
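With around 15 services it is easy for one config file to drift from the binding shown above. The following is a minimal sketch of a check that deployment tooling could run. It is not part of our actual tooling; the function names and the bindings wrapper element in the sample are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample fragment for illustration; the <bindings> wrapper is assumed here
# and stands in for the real section of the service's config file.
SAMPLE_CONFIG = """
<bindings>
  <basicHttpBinding>
    <binding name="expertBasicHttpBinding" maxReceivedMessageSize="2147483647">
      <security mode="TransportCredentialOnly">
        <transport clientCredentialType="Windows" proxyCredentialType="Windows" />
      </security>
    </binding>
  </basicHttpBinding>
</bindings>
"""


def windows_auth_problems(config_xml, binding_name):
    """Return a list of problems for the named basicHttpBinding (empty if none)."""
    root = ET.fromstring(config_xml)
    for group in root.iter("basicHttpBinding"):
        for binding in group.findall("binding"):
            if binding.get("name") == binding_name:
                return _check_binding(binding)
    return [f"binding {binding_name!r} not found"]


def _check_binding(binding):
    """Check the security mode and client credential type of one binding."""
    problems = []
    security = binding.find("security")
    mode = security.get("mode") if security is not None else None
    if mode != "TransportCredentialOnly":
        problems.append(f"security mode is {mode!r}")
    transport = security.find("transport") if security is not None else None
    credential = transport.get("clientCredentialType") if transport is not None else None
    if credential != "Windows":
        problems.append(f"clientCredentialType is {credential!r}")
    return problems


if __name__ == "__main__":
    print(windows_auth_problems(SAMPLE_CONFIG, "expertBasicHttpBinding"))
```

An empty list means the binding matches the settings above; anything else names the attribute that differs.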
1. The servers must be in the local intranet zone of any calling machines.
As of Windows Server 2003, by default only the local intranet zone supports the passing of credentials for Windows Integrated authentication between machines. This makes sense, as you rarely want to pass your Windows credentials beyond your own domain. At ADERANT we have a group policy set up so that, on every machine, any machine with a name matching *.aderant.com is registered in the local intranet zone.
You can also name the servers explicitly for the zone. In either case, ensure that the servers are not listed in the Trusted Sites zone.
2. Windows Services exposing WCF net.tcp endpoints must have SPNs registered for both the application server and the network load balancer addresses.
When a binding other than basicHttpBinding is used, such as net.tcp, the WCF infrastructure checks that the service is running under the identity that the client expects. This prevents ‘man-in-the-middle’ attacks where someone spoofs the service you want to call with their own for some nefarious purpose. When you generate a service proxy against a net.tcp endpoint you’ll see something similar to the following configuration snippet in the app.config:
<client>
  <endpoint address="net.tcp://myserver.mydomain.com:8003/servicemodelsamples/service/spnIdentity"
            binding="netTcpBinding"
            bindingConfiguration="netTcpBinding_ICalculator_Windows"
            contract="ICalculator"
            name="netTcpBinding_ICalculator">
    <identity>
      <servicePrincipalName value="CalculatorSvc/myServer.myDomain.com:8003" />
    </identity>
  </endpoint>
</client>
There is an identity element that specifies the expected identity of the service host. Two different options are supported: <userPrincipalName> and <servicePrincipalName>. If your service is published on a domain and you always expect the client calling the service to be online, then the userPrincipalName is easiest to configure. The value attribute contains the identity that the service is running as, e.g. value="ADERANT_AP\service.expert".
Alternatively you can set a servicePrincipalName, as above. The service principal name (SPN) is broken down into three parts:
serviceClassName / address [: portNumber]
The service class name is a token that uniquely represents the service. Common service classes are HTTP and HOST; the example above uses CalculatorSvc to uniquely identify a calculation service. At ADERANT we use class names such as ExpertConfigurationSvc. After the service class name comes the machine name, e.g. SVEXPGG310. Note that the NetBIOS name and the fully qualified domain name are considered to be different, so it is commonplace to register both. For example:
ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG310:18180
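To make the three-part structure concrete, here is a small sketch that splits an SPN of the form serviceClassName/address[:portNumber] into its parts. The Spn type and the parse_spn and is_fqdn helpers are hypothetical names for illustration, and the sketch only handles the simple form described above.

```python
from typing import NamedTuple, Optional


class Spn(NamedTuple):
    service_class: str
    host: str
    port: Optional[int]


def parse_spn(text: str) -> Spn:
    """Split 'serviceClass/host[:port]' into its parts."""
    service_class, slash, rest = text.partition("/")
    if not slash or not service_class or not rest:
        raise ValueError(f"not a serviceClass/host[:port] SPN: {text!r}")
    host, colon, port = rest.partition(":")
    if not host or (colon and not port.isdigit()):
        raise ValueError(f"bad host or port in SPN: {text!r}")
    return Spn(service_class, host, int(port) if colon else None)


def is_fqdn(spn: Spn) -> bool:
    """Treat any host containing a dot as fully qualified (a simplification)."""
    return "." in spn.host


if __name__ == "__main__":
    for text in ("ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180",
                 "ExpertConfigurationSvc/SVEXPGG310:18180"):
        spn = parse_spn(text)
        print(spn, "FQDN" if is_fqdn(spn) else "NetBIOS")
```

The two example SPNs differ only in the host part, which is why each has to be registered separately.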
Once we have an SPN, it must be registered in Active Directory (AD) against the user account used to run the service. We recommend a service account along the lines of myDomain\service.expert to run the ADERANT services. To register an SPN against this account, use the command line tool setspn:
setspn -A ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180 service.expert
As part of our deployment tooling we automatically generate a batch file containing all the SPNs that need to be registered in AD for a given environment. An SPN must not be registered more than once; duplicates will cause errors. To see the SPNs currently registered against a user, run setspn with the -L option and pass the account name:
setspn -L service.expert
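Since duplicates cause errors, it can help to cross-check the lists collected for each account. The sketch below assumes you have already gathered the SPNs per account, for example by running setspn -L for each one. The function name, input shape and the second account name are hypothetical. SPNs are compared case-insensitively. On Windows Server 2008 and later, setspn -X should also report duplicates directly.

```python
from collections import defaultdict


def find_duplicate_spns(spns_by_account):
    """Return {spn: [accounts]} for every SPN registered on more than one account.

    spns_by_account maps an account name to the list of SPNs registered
    against it. Names and SPNs are lower-cased before comparison.
    """
    owners = defaultdict(set)
    for account, spns in spns_by_account.items():
        for spn in spns:
            owners[spn.lower()].add(account.lower())
    return {spn: sorted(accounts)
            for spn, accounts in owners.items()
            if len(accounts) > 1}


if __name__ == "__main__":
    registered = {
        "service.expert": ["HTTP/svnlb301", "HTTP/svexpgg310"],
        "service.other": ["HTTP/SVEXPGG310"],  # hypothetical second account
    }
    for spn, accounts in find_duplicate_spns(registered).items():
        print(f"{spn} is registered on: {', '.join(accounts)}")
```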
If we take our configuration service as an example, we need the following SPNs registered in AD for the scenario environment:
ExpertConfigurationSvc/SVNLB301.ap.aderant.com:18180
ExpertConfigurationSvc/SVNLB301:18180
ExpertConfigurationSvc/SVEXPGG310.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG310:18180
ExpertConfigurationSvc/SVEXPGG311.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG311:18180
ExpertConfigurationSvc/SVEXPGG312.ap.aderant.com:18180
ExpertConfigurationSvc/SVEXPGG312:18180
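A list like the one above is mechanical enough to generate. The sketch below is not our actual deployment tooling, only an illustration of the idea: it emits one setspn -A command per host name, in both FQDN and NetBIOS form. The function name is hypothetical, and it assumes that the NetBIOS name is simply the first label of the FQDN, which will not hold in every environment.

```python
def spn_commands(service_class, fqdns, account, port=None):
    """Build 'setspn -A' commands for the FQDN and short name of each host.

    Assumption: the NetBIOS name equals the first DNS label of the FQDN.
    """
    suffix = f":{port}" if port is not None else ""
    commands = []
    for fqdn in fqdns:
        short_name = fqdn.split(".", 1)[0]
        # Avoid emitting the same SPN twice when the name has no domain part.
        hosts = [fqdn] if short_name == fqdn else [fqdn, short_name]
        for host in hosts:
            commands.append(f"setspn -A {service_class}/{host}{suffix} {account}")
    return commands


if __name__ == "__main__":
    servers = [
        "svnlb301.ap.aderant.com",    # load balancer virtual host name
        "svexpgg310.ap.aderant.com",
        "svexpgg311.ap.aderant.com",
        "svexpgg312.ap.aderant.com",
    ]
    for line in spn_commands("ExpertConfigurationSvc", servers, "service.expert", port=18180):
        print(line)
```

Leaving out the port and passing HTTP as the service class produces the commands needed for services hosted in IIS.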
If you are running a development workstation, you will often see HOST/localhost as the SPN generated by svcutil for locally hosted WCF services. This indicates that the service is expected to be running on the local machine.
If the service needs to support delegation, then the AD account used to run the service must have delegation enabled in AD.
The account must also be granted ‘Log on as a service’ rights on the application server hosting the service. This can be set up using the local machine policies admin tool or pushed out via group policy.
3. Load balanced WCF Services hosted in IIS, using HTTP bindings, must have HTTP SPNs added for the account of the application pool.
By default an SPN is created in AD for the machine account of a server running IIS, for example HTTP/SVEXPGG310. In a load balanced scenario the machine account SPN cannot be used to issue a Kerberos ticket because it is different for each machine in the application farm. Instead the Kerberos ticket needs to be issued using the identity of the application pool that the web service is running under. If you have multiple application pools, they must all run under the same account. The application pool account must have SPNs registered for the HTTP service as follows:
setspn -A HTTP/svnlb301.ap.aderant.com service.expert
setspn -A HTTP/svnlb301 service.expert
setspn -A HTTP/svexpgg310.ap.aderant.com service.expert
setspn -A HTTP/svexpgg310 service.expert
setspn -A HTTP/svexpgg311.ap.aderant.com service.expert
setspn -A HTTP/svexpgg311 service.expert
setspn -A HTTP/svexpgg312.ap.aderant.com service.expert
setspn -A HTTP/svexpgg312 service.expert
Here we have both the NetBIOS and FQDNs for the servers and the load balancer.
4. Load balanced WCF services hosted in IIS, using HTTP bindings, must use the Application Pool credentials to issue Kerberos tickets.
In addition to adding the SPNs in step 3, IIS must be changed so that it uses the app pool credentials for the Kerberos ticket. This can be done either through the configuration editor in IIS or from the command line.
In the configuration editor, the section path is system.webServer/security/authentication/windowsAuthentication.
From a command line:
appcmd set config /section:windowsAuthentication /useAppPoolCredentials:true
This has to be set on all of the application servers within the application farm.
While in IIS configuration, it is also worth setting authPersistNonNTLM to true; see http://support.microsoft.com/kb/954873 for details.
5. Enable Windows Authentication on the required web applications in IIS.
There are two parts to this, the first of which is to ensure that the Windows Authentication provider for IIS is installed. This can be checked in the Windows features control panel.
The next step is to enable Windows Authentication on the website itself. From the dashboard for the site, open the Authentication manager and ensure that Windows Authentication is enabled.
While you are here, it’s worth checking the advanced properties of the Windows Authentication (available from the context menu) to ensure that Kernel-mode authentication is set.
This can also be set from the command line:
appcmd set config "Default Web Site/MyWebService" -section:system.webServer/security/authentication/windowsAuthentication /enabled:true /commit:apphost
Wrap up & Testing
Those are the key steps required to get Kerberos working in a load balanced environment:
1. ensure the servers are in the local intranet zone.
2. create and register SPNs for net.tcp services for all app servers and the load balancer.
3. create and register HTTP SPNs for all app servers and the load balancer.
4. take care to avoid duplicate SPNs.
5. understand that NetBIOS and FQDNs require separate SPNs.
6. set useAppPoolCredentials to true on all IIS servers in the app farm.
7. run all application pools using a common domain service account, give this account permission to delegate and log on as a service.
8. ensure the web applications for the services have Windows authentication enabled.
It’s mostly straightforward once you’ve been through the steps once.
The easiest way to test is with a browser and Fiddler. Within Fiddler you can look at the authorization headers of the HTTP requests, which show whether Kerberos or NTLM is being used. We expose an OData service that requires Windows authentication, and it was very easy to trace the authentication negotiation for this site within Fiddler.
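If you want to check captured headers in bulk rather than by eye, the heuristic can be sketched in a few lines. It rests on two observations: NTLM tokens contain the NTLMSSP signature, while an initial Kerberos token sent under the Negotiate scheme is a GSS-API structure starting with byte 0x60. The function name is hypothetical, and this is a rough classifier rather than a full SPNEGO parser.

```python
import base64


def classify_authorization(header_value):
    """Guess 'kerberos', 'ntlm' or 'unknown' from an Authorization header value."""
    scheme, _, token = header_value.strip().partition(" ")
    scheme = scheme.lower()
    if scheme == "ntlm":
        return "ntlm"
    if scheme != "negotiate" or not token.strip():
        return "unknown"
    try:
        raw = base64.b64decode(token.strip(), validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return "unknown"
    # NTLM messages carry this signature, whether bare or wrapped in SPNEGO.
    if b"NTLMSSP\x00" in raw:
        return "ntlm"
    # 0x60 is the ASN.1 tag that opens a GSS-API initial context token.
    if raw[:1] == b"\x60":
        return "kerberos"
    return "unknown"


if __name__ == "__main__":
    # Synthetic token for illustration only, not a captured header.
    ntlm_token = base64.b64encode(b"NTLMSSP\x00\x01\x00\x00\x00").decode()
    print(classify_authorization("Negotiate " + ntlm_token))
```

A Negotiate header that falls back to NTLM is usually the sign that an SPN is missing or duplicated, or that the server is not in the local intranet zone.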
Resources
Security in WCF (MSDN Magazine): http://msdn.microsoft.com/en-us/magazine/cc163570.aspx
Patterns & Practices Kerberos Overview: http://msdn.microsoft.com/en-us/library/ff649429.aspx
Patterns & Practices WCF Security Guide: http://msdn.microsoft.com/en-us/library/ff650794.aspx