Friday, January 30, 2009

Quick Deploys And Assembly Resolution

The Error

We recently ran into a bizarre error on a client's machine that hosts the Central Administration web site. Every 15 minutes on the dot, we'd get the following error for many different assemblies. The error looked like this:

Event Type: Error
Event Source: Windows SharePoint Services 3
Event Category: Runtime
Event ID: 6611
Date: 1/15/2009
Time: 1:11:26 PM
User: N/A
Computer: [MACHINE NAME]
Description:
Error: Failure in loading assembly: [AssemblyName], Version=1.0.0.0, Culture=neutral, PublicKeyToken=e376b6bc65267f90

It's worth noting that these errors named assemblies that were used in multiple sites (even though they were deployed into bin folders). The sites that used these assemblies were customized, and their Master Pages and Page Layouts referenced the assemblies that supposedly couldn't be loaded (as far as the error was concerned). Some of these dependencies were strong named and others were not. None of them lived in the GAC (all were bin folder deployed).

The Solution

We used the fact that the errors occurred every 15 minutes to link them to Quick Deploy jobs. When we changed the Quick Deploy schedule to every 10 minutes, the errors followed suit, recurring at the same interval.

At this point we were convinced that the SharePoint Timer (OWSTimer.exe) was trying to load these assemblies but couldn't find them. So how does assembly resolution happen in the .NET Framework? We were saved by an excerpt from Essential .NET, Volume 1: The Common Language Runtime, a book by Don Box and Chris Sells. Essentially, it works like this:

  1. When the assembly loader goes to load an assembly, it first checks whether the assembly is strong named. If it IS, then the loader first looks in the GAC (not the bin folder).
  2. If it can't find the assembly in the GAC, the loader will then look for <codeBase> hints in the application's configuration file. Remember that these settings inherit down from machine.config to app.config/web.config.
  3. If there are no <codeBase> hints, then the loader will resort to probing as a last-ditch effort to find the assembly. This includes bin folders and any other location dictated by <probing> elements.

If the above seems confusing, consider the following flow chart.

[Figure: flow chart of assembly resolution]

We first tried putting hints in the local web.configs, but because OWSTimer has nothing to do with our web applications, those settings were ignored. We finally helped the timer out by making the following modifications to the machine.config. To ensure that we didn't remap ALL lookups for the given assembly to that one location, we had every other application that used the same assembly name override the machine.config setting with a similar one in its own web.config. The overriding setting listed a location that made more sense for the particular web application (i.e., web applications should go hunting for .dlls in their own bin folder, not some other application's).

The machine.config was changed to read:

<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
    <dependentAssembly>
      <assemblyIdentity name="[NameOfAssembly]" publicKeyToken="9871993fa258bc6" culture="neutral" />
      <codeBase version="1.0.0.0" href="file://C:/folder/directory/bin/[NameOfAssembly].dll" />
    </dependentAssembly>
  </assemblyBinding>
</runtime>

The web.config was changed to read:

<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
    <dependentAssembly>
      <assemblyIdentity name="[NameOfAssembly]" publicKeyToken="9871993fa258bc6" culture="neutral" />
      <codeBase version="1.0.0.0" href="file://C:/webapplicationLocation/bin/[NameOfAssembly].dll" />
    </dependentAssembly>
  </assemblyBinding>
</runtime>

You can also provide <codeBase> hints for assemblies that aren't strongly named; you just need to omit the optional publicKeyToken and culture attributes. Another tip is that you can also use the <probing> element if you want to probe folders other than the bin folder.
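As a rough illustration of the <probing> tip, here is a minimal sketch of what such an entry might look like in an application's configuration file. The folder names (bin and bin\shared) are hypothetical and were not part of our setup; privatePath takes a semicolon-separated list of subdirectories, and they have to sit underneath the application's base directory.

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <!-- Hypothetical folders: each entry is a subdirectory of the application base -->
      <probing privatePath="bin;bin\shared" />
    </assemblyBinding>
  </runtime>
</configuration>
```

Because probing only reaches folders under the application base, it would not have helped OWSTimer.exe find assemblies sitting in a web application's bin folder, which is why we went the <codeBase> route above.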

We then restarted IIS, followed by the timer service. Lo and behold, our random error went away. This toast goes out to the highly configurable .NET Framework, and the expressiveness of app/web/machine.config files.

Best,
Tyler

3 comments:

Andrew said...

Craziness. You can also create an app.config and drop it right next to the owstimer.exe file and it will use it.

Tyler Holmes said...

Wow. That would have been a lot less invasive than a machine.config entry. I'll keep that in mind for next time.

Best,
Tyler

benzhi said...

Great post and comment, thanx! Microsoft votes against an owstimer.exe.config though: http://msdn.microsoft.com/en-us/library/cc406686(v=office.12).aspx
I'd really like to know about your experiences with these solutions, because both seem to be good solutions to a lot of problems I encountered with assembly loading in owstimer.exe