Thursday, May 29, 2008

Error When Installing Commerce Server: The Default Web Site does not exist on this computer

The Error

I was doing a Commerce Server 2007 install today. Initially I was extremely pleased to find a very accessible evaluation/developer edition available (thanks MS). The install experience however quickly became less than pleasant.

The install itself went ok, but while I was running the Commerce Server Configuration Wizard I got this bizarre error:

The Default Web Site does not exist on this computer. This feature requires the default website to be present.

I of course hit up Google and found one thread suggesting that Commerce Server merely polls IIS and looks for a site with a Site Identifier of 1. The solution offered there was less than stellar, though (uninstalling and then reinstalling IIS), so I decided to see if I couldn't improve upon it.

As it turns out all you need to do to get around this error is create a Web Site and make sure it has a Site Identifier of 1. We can do this by editing the IIS Metabase.xml. I've only tested this solution on my machine (running IIS 6 on Win2k3 Ent SP2). Proceed with some caution.

The Solution

These are the steps I took. You may have a slightly different approach; as far as I know you simply need a web site with an IIS Site Identifier of 1 for the config wizard to succeed.

  1. Create an IIS Web Site and name it Default Web Site, if one doesn't already exist.
  2. Open up the IIS Manager (Start->Run->inetmgr) and click on Web Sites to see the Identifiers for all the sites on your machine.
  3. Remember the Identifier (mine was 998577302) of the site you want to turn into the 'Default Web Site'. Now we're going to stop IIS (Start->Run->iisreset /stop).
  4. Navigate to C:\Windows\System32\Inetsrv and make a backup of your Metabase.xml. Seriously, back up this file. We're about to change it and you're going to want a backup if you mess it up.
  5. After you've made a backup, open up the original in a text editor like Notepad. We're going to find the XML nodes that have the old site Identifier number (mine was 998577302) and replace that number with 1 (the new site identifier). This is apparently how Commerce Server decides whether a given site is the "Default Web Site". Pretty arbitrary if you ask me, but at least it's something you can do something about.
  6. When this is done, save the file and start up IIS again (Start->Run->iisreset).
  7. Open up the IIS Manager again (Start->Run->inetmgr), click on Web Sites and notice that the identifier for your site is now 1. Run the Commerce Server Configuration Wizard again and you should get a success message.
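If you'd rather not do the find-and-replace from step 5 by hand, it could be sketched in C# roughly like this. This is a minimal sketch and not something I ran against a real server: the class name MetabaseSiteId is made up for illustration, and it assumes the identifier shows up in metabase paths of the form /LM/W3SVC/998577302 (optionally followed by /ROOT and so on), which is what I saw in my own Metabase.xml.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of IIS or Commerce Server.
public static class MetabaseSiteId
{
    // Replaces oldId with newId wherever it appears as a site identifier
    // in a metabase path such as /LM/W3SVC/998577302/ROOT.
    // Assumption: the id always follows "/W3SVC/" and is followed by '/' or '"'.
    public static string Rewrite(string xml, string oldId, string newId)
    {
        string pattern = "(?<=/W3SVC/)" + Regex.Escape(oldId) + "(?=[/\"])";
        // A MatchEvaluator is used so newId is inserted literally.
        return Regex.Replace(xml, pattern, m => newId, RegexOptions.IgnoreCase);
    }
}
```

With IIS stopped and the backup from step 4 in place, you would read Metabase.xml into a string, pass it through MetabaseSiteId.Rewrite(text, "998577302", "1") and write the result back, taking care to keep the file's original encoding. Anchoring on /W3SVC/ means an unrelated number that merely contains the same digits is left alone.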

Well that's it. Hope it helps someone.

Best,
Tyler

Monday, May 26, 2008

Http Headers and Caching: Cache-Control, Expires, Last-Modified and Pragma

HTTP Headers

There are a few topics that most web developers don't really get good exposure to early in their career. One of these topics is HTTP headers. There are certain tasks that are only really possible after developing a strong grasp on HTTP headers and how they work.
Not only do HTTP headers open up doors to certain browser behaviors, but I've found that learning them and troubleshooting them really solidifies an understanding of how the browser works and how the http request/response model operates.
This is the second of a 2 part article series dealing with HTTP headers. The first article focused on Content-Type and Content-Disposition. In this article we'll be talking about caching content and how browsers react differently to headers like Cache-Control, Expires, Last-Modified and Pragma.

Tools

Since HTTP headers are pretty much invisible without the proper tools, I'd strongly recommend one of the following utilities.
If you prefer using Firefox I'd recommend you download and get familiar with Tamper Data, a Firefox extension. If you prefer working in IE, I'd consider using HTTP Fiddler instead. All the screen caps of http headers you see here are from Tamper Data.

What Are Caches and How do they Work?

When we speak of caches, we're generally referring to Browser Caches, Proxy Caches and Gateway Caches. It's important to remember that just because you never intended for any of your content to be cached doesn't mean it's not going to be cached. You rarely have any control over what kinds of caches are downstream of your web site. You CAN however instruct these caches about how to handle your content to achieve the effect that you want (which may be to NOT cache your content).

Validation Headers

I like to think of validation headers as the kind that you get for free. Validation headers are most often those emitted by web servers like IIS and Apache that help caches discern whether or not the representation being cached is still valid.
If you put a file on disk and serve it directly off of the web server (for IIS 6) you'll get two validation http headers for free, ETag and Last-Modified.
Examples of Response ETag and Last-Modified headers.
Last-Modified is pretty self-explanatory: it changes whenever your content has been modified. I'm pretty sure it corresponds to the Last Modified date of the file on disk.
ETag is slightly more interesting. It was introduced with HTTP/1.1 and is used to help caches discern whether the content is the same. It is generated by the web server, and different servers generate it in different ways.
Ideally, the way they're supposed to work is that the browser looks at its own cache and then issues a request carrying If-Modified-Since and If-None-Match headers. If the If-None-Match header matches the ETag and the If-Modified-Since is still the same date as the Last-Modified header, then the web server responds with a 304 - Not Modified response. Below is an example of the browser's (Firefox) request and the web server's (IIS 6) response. I'll repeat: the 304 is sent because the ETags match AND the file's Last-Modified date hasn't changed. The moment you change the file, the cached copy becomes invalid and the web server issues a new 200 response and serves the file in its entirety.
Example of If-Modified-Since and If-None-Match request headers.
Example of a 304 - Not Modified response from IIS when the ETag and Modified-Since match the request.
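The decision the web server makes here can be sketched as a small function. This is only an illustration of the rule as I described it above (both validators have to match), not how IIS actually implements it; the ConditionalGet class and its parameter names are made up for the example.

```csharp
using System;

// Hypothetical helper illustrating the 304-vs-200 decision described above.
public static class ConditionalGet
{
    // Returns 304 when both validators sent by the browser still match,
    // otherwise 200 (meaning the full file should be sent again).
    public static int StatusFor(string ifNoneMatch, DateTime? ifModifiedSince,
                                string currentETag, DateTime lastModifiedUtc)
    {
        bool etagMatches = ifNoneMatch != null && ifNoneMatch == currentETag;

        // HTTP dates only have one-second resolution, so drop the
        // sub-second part of the file time before comparing.
        DateTime lastModified = new DateTime(
            lastModifiedUtc.Ticks - lastModifiedUtc.Ticks % TimeSpan.TicksPerSecond,
            DateTimeKind.Utc);
        bool notModifiedSince = ifModifiedSince.HasValue
            && lastModified <= ifModifiedSince.Value;

        return etagMatches && notModifiedSince ? 304 : 200;
    }
}
```

Note that the sketch treats "not modified since" as "last modified on or before the date the browser sent", which is a slightly looser check than an exact date match.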

Pragma

Pragma is misused pretty often. There's been a misconception on the streets that issuing an http header Pragma: no-cache will tell browsers not to cache your content. It doesn't work. In fact there's nothing in the HTTP spec about Pragma being used in Response headers at all. It's supposed to be used in Request headers.
Probably the most common use of Pragma is when you press CTRL-F5 in a web browser. Your request makes use of Pragma: no-cache and looks a little like this (below). It tells the caches (including intermediary caches) not to serve any cached content.
Example of proper use of Pragma: no-cache in an HTTP request.

Expires

The simplest way to instruct caches to cache your content is to set an Expires header. It doesn't come with a lot of options and so isn't all that versatile, but a lot of developers like it for its simplicity. The only valid value for an Expires header is an HTTP date, expressed in GMT (Greenwich Mean Time). Google makes good use of this header on their classic Google image. If you look at the http headers on http://www.google.com/intl/en_ALL/images/logo.gif you'll see an Expires header set to January of 2038 (below).
Example of Google setting the Expires response header on an http response.
An example of an HttpHandler that serves off a file and tells caches to cache the content for 20 days might look something like below. I checked and it does indeed cache (at least in FireFox v2.0.0.14).
public void ProcessRequest(HttpContext context)
{
    context.Response.ContentType = "image/jpeg";
    // Expires must be an HTTP date in GMT; "R" produces the RFC 1123 format.
    context.Response.AddHeader("Expires",
        DateTime.Now.AddDays(20).ToUniversalTime().ToString("R"));
    context.Response.WriteFile(context.Server.MapPath("stem.jpg"));
}
The associated Response looks like this (below):
Setting the Expires response header on an http response.
One of the troubles with Expires is that it's driven by an absolute date in GMT. So if the clocks on the web server and the cache are out of sync, it's possible your content won't be cached for as long as you intended.

Cache-Control

Cache-Control is the fully featured sibling of Expires. With Cache-Control you have the following possible values.
  • public — marks authenticated responses as cacheable. Normally if HTTP authentication is required (whether it's forms, NT, etc...) responses are not cached. Marking them public will allow them to be cached.
  • no-store — instructs caches not to keep a copy of the representation under any conditions.
  • no-cache — forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid freshness, without sacrificing all of the benefits of caching.
  • must-revalidate — tells caches that they must obey any freshness information you give them about a representation. HTTP allows caches to serve stale representations under special conditions; by specifying this header, you’re telling the cache that you want it to strictly follow your rules.
  • proxy-revalidate — similar to must-revalidate, except that it only applies to proxy caches.
  • max-age=[seconds] — specifies the maximum amount of time that a representation will be considered fresh. Like Expires it sets a freshness lifetime, but it is relative to the time of the request rather than an absolute date. [seconds] is the number of seconds from the time of the request you wish the representation to be fresh for.
  • s-maxage=[seconds] — similar to max-age, except that it only applies to shared (e.g., proxy) caches.
As you can see there's a tonne of different scenarios supported by the above values. One example might be:
Cache-Control: max-age=604800, must-revalidate
This tells the cache to treat the representation as fresh for a week (604800 seconds) and to strictly obey that limit once it has passed. Note that directives are separated by commas. Or consider:
Cache-Control: public, no-cache
This makes the cache check with the origin server before releasing a cached copy. It's popular when caching authenticated content, because the origin server gets the chance to make sure the user is authenticated before secured content is shown.
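To make the shape of the header a little more concrete, here is a minimal sketch of pulling a Cache-Control value apart into its directives. The CacheControlParser class is a made-up name for this example, and the sketch deliberately ignores quoted arguments that themselves contain commas.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; handles simple values only.
public static class CacheControlParser
{
    // Splits a Cache-Control value on commas and returns
    // directive -> argument (null when the directive takes none).
    public static Dictionary<string, string> Parse(string headerValue)
    {
        var directives = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string part in headerValue.Split(','))
        {
            string directive = part.Trim();
            if (directive.Length == 0) continue;

            int eq = directive.IndexOf('=');
            if (eq < 0)
                directives[directive] = null;
            else
                directives[directive.Substring(0, eq).Trim()] =
                    directive.Substring(eq + 1).Trim().Trim('"');
        }
        return directives;
    }
}
```

For a value like "max-age=604800, must-revalidate" this gives you a max-age entry holding "604800" and a must-revalidate entry with no argument.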

Rules of Thumb When it Comes to Caching

  1. If the Response tells the cache not to keep the content, it won't.
  2. If the Response is marked private (Cache-Control: private), it won't be cached by shared caches such as proxies. Browsers may still cache it.
  3. If the Response doesn't have any cache instructions (Cache-Control, Expires) and there's no validator (Last-Modified, ETag), it won't be cached.
  4. A cached representation is considered fresh (that is, able to be sent to a client without checking with the origin server) if:
    • It has an expiry time or other age-controlling header set, and is still within the fresh period.
    • A browser cache has already seen the representation, and has been set to check once a session.
    • A proxy cache has seen the representation recently, and it was modified relatively long ago.
      Fresh representations are served directly from the cache, without checking with the origin server.
  5. If a representation is stale, the origin server will be asked to validate it, that is, to tell the cache whether the copy it has is still good.
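A simplified version of these rules can be written down as a decision function. This is a rough sketch under my own assumptions: the CacheAction and FreshnessRules names are invented for the example, only explicit freshness information (max-age, Expires) is considered, and the heuristic cases from rule 4 are left out.

```csharp
using System;

// Hypothetical types for illustration only.
public enum CacheAction { DoNotStore, ServeFromCache, Revalidate }

public static class FreshnessRules
{
    public static CacheAction Decide(bool noStore, bool noCache,
                                     TimeSpan? maxAge, DateTime? expiresUtc,
                                     DateTime storedAtUtc, DateTime nowUtc)
    {
        // Rule 1: the response told the cache not to keep the content.
        if (noStore) return CacheAction.DoNotStore;

        // no-cache: always check with the origin server first.
        if (noCache) return CacheAction.Revalidate;

        // max-age is relative to when the copy was stored and wins over Expires.
        DateTime? freshUntil = null;
        if (maxAge.HasValue) freshUntil = storedAtUtc + maxAge.Value;
        else if (expiresUtc.HasValue) freshUntil = expiresUtc.Value;

        // Rule 4: still within the fresh period, so serve straight from cache.
        if (freshUntil.HasValue && nowUtc < freshUntil.Value)
            return CacheAction.ServeFromCache;

        // Rule 5: stale (or no freshness info), so ask the origin server.
        return CacheAction.Revalidate;
    }
}
```

A copy stored with max-age=604800 comes back as ServeFromCache one day later and as Revalidate eight days later.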

Summary

I remember back in the day when the stuff I wrote either didn't have a lot of users or was on machines farmed in such a way that caching didn't really matter. More and more these days I find my code is sharing a machine with MANY other CPU/Memory hungry applications and my sites need to be ever more efficient. One of the easiest ways to do this is to move things to the client, and that includes leveraging proven caching infrastructure at the proxy and client side.
Best,
Tyler

References:

http://www.mnot.net/cache_docs/

Saturday, May 17, 2008

Http Headers and Content: Content-Type and Content-Disposition

HTTP Headers

There are a few topics that most web developers don't really get good exposure to early in their career. One of these topics is http headers. There are certain tasks that are only really possible after developing a strong grasp on HTTP Headers and how they work.

Not only do HTTP Headers open up doors to certain browser behaviors, but I've found that learning them and troubleshooting them really solidifies an understanding of how the browser works and the http request/response model.

This is the first of a 2 part article series dealing with HTTP Headers. This first part focuses on Content-Type and Content-Disposition. The second part will talk more about Cache-Control and how cache headers are involved in browser/proxy cache.

Scenario

Let's say you were tasked with writing some page which returned a file to a user. That is, the user clicks some link and then receives the file.

You might notice on some browsers that the file opens right in the browser, while in other browsers users get prompted to download the file. Why?

Browser Behavior

When a browser gets handed a response that is a file it has two options. It can either prompt the user to download the file (Save As dialog), or it can try to embed the file in the browser (the file will open and the contents will be visible in the browser).

Normally most browsers (including IE and Firefox) will look at the MIME type (the Content-Type http header) and, if it's recognized, try to open/embed it in the browser. If the browser doesn't recognize the MIME type of the content, then it will prompt the user with a Save File dialog and the user can choose to either save it to disk or open it with a program of her choice.

Your only way of changing this behavior (if you wanted to say prompt the user to download the file every time) is to use the Content-Disposition and Content-Type http headers. Content-Type tells the browser what kind of content it IS, Content-Disposition tells the browser how the content should be handled.

It's also worth noting that even if you don't explicitly set the Content-Type header (MIME type), your web server will probably serve the content as text/html.

Tools

Since HTTP Headers are pretty much invisible without the proper tools, I'd strongly recommend one of the following utilities.

If you prefer using Firefox I'd recommend you download and get familiar with Tamper Data, a Firefox extension. If you prefer working with IE, I'd consider using HTTP Fiddler instead. All the screen caps of http headers you see here are from Tamper Data.

Some Code

Consider the following two cases. Let's say I write an http handler that serves off content and I write it like so (below). What really happens?

public void ProcessRequest(HttpContext context)
{
    // No Content-Type or Content-Disposition is set here.
    context.Response.WriteFile(context.Server.MapPath("Lipsum.txt"));
}

Well the file Lipsum.txt which is just a text file gets pulled off the disk and returned for the http request. What do the http headers look like?

The default Content-Type http header from most web servers, if you don't specify one, is text/html.

Notice how there's a Content-Type header which is "text/html"? This is because we didn't specify a content type, and so IIS 6 (through ASP.NET, whose Response.ContentType defaults to text/html) sent the "Content-Type: text/html" http header. Also notice that there is no Content-Disposition header. That is, we are not explicitly telling the browser how to handle this content.

Because the browser gets a Content-Type of "text/html" (and it knows how to display such content) it simply displays it in the browser.

Without a Content-Disposition set the browser tries to display the content for known Content-Types.

Content-Type

If we change the content type we can change the behavior. Let's assume we changed the content type to that of a PowerPoint (.ppt) document but still served up the same file (Lipsum.txt).

If we explicitly set a Content-Type then the browser will handle the document quite differently. Since the browser doesn't know how to display a PowerPoint document it prompts the user to use the OS to either save the document or open it in a program that supports the given Content Type.

public void ProcessRequest(HttpContext context)
{
    context.Response.ContentType = "application/vnd.ms-powerpoint";
    context.Response.WriteFile(context.Server.MapPath("Lipsum.txt"));
}

File being downloaded without Content-Disposition set. The browser doesn't know how to display the content.

A fairly complete list of content types maintained by W3Schools can be found here.
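If your handler serves more than one kind of file, you'd typically pick the Content-Type from the file extension. Here's a minimal sketch of that idea; the ContentTypes class is made up for this example, the table only holds a handful of well-known types, and falling back to application/octet-stream for anything unknown is my own choice.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical lookup table; extend it with the types you actually serve.
public static class ContentTypes
{
    private static readonly Dictionary<string, string> ByExtension =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".txt",  "text/plain" },
            { ".html", "text/html" },
            { ".jpg",  "image/jpeg" },
            { ".gif",  "image/gif" },
            { ".ppt",  "application/vnd.ms-powerpoint" }
        };

    // Returns the Content-Type for a file name, or a generic
    // binary type when the extension is not in the table.
    public static string For(string fileName)
    {
        string contentType;
        return ByExtension.TryGetValue(Path.GetExtension(fileName), out contentType)
            ? contentType
            : "application/octet-stream";
    }
}
```

In a handler you would then write something like context.Response.ContentType = ContentTypes.For("Lipsum.txt"); before calling WriteFile.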

Content-Disposition

So Content-Type allows us to tell the browser what kind of content we're handing it and it can behave accordingly. But what if we wanted to tell the browser how to handle the content? Content-Disposition allows us to name the file that comes down over the wire and to tell the browser whether to try to embed it in the browser or to prompt the user to save the document to disk/open it in another program.

Before when Lipsum.txt was offered up with a Content-Type of text/html it was embedded in the browser (see above). What if we wanted to instead prompt the user to save the content to disk even if the browser is capable of displaying it? And what if we wanted to name the file as it came down instead of having it be the name of the .aspx/.ashx file serving it off like the screen cap above (FileHandler.ashx)?

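public void ProcessRequest(HttpContext context)
{
    context.Response.ContentType = "text/plain";
    // "attachment" asks the browser to prompt for download;
    // "filename" is the name suggested in the Save As dialog.
    context.Response.AddHeader("Content-Disposition",
        "attachment; filename=Lipsum.txt");
    context.Response.WriteFile(context.Server.MapPath("Lipsum.txt"));
}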

File being downloaded (attached) with Content-Type and Content-Disposition set.

In the example above we set a Content-Disposition of "attachment", which specifies that the file should be downloaded as opposed to embedded in the browser. If we wanted it to be shown in the browser we would have set it to "inline". We also named the file coming over the wire by setting the "filename" parameter. We got the behavior from the screen cap above because the http headers looked like so:

File being downloaded with both a Content-Type and Content-Disposition header set.
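If you build this header in more than one place, a tiny helper keeps the format consistent. The sketch below is just one way to do it: the ContentDispositionHeader class is a made-up name, and it only quotes the file name so that spaces survive. It makes no attempt to handle non-ASCII file names, which browsers treat inconsistently.

```csharp
using System;

// Hypothetical helper for building a Content-Disposition value.
public static class ContentDispositionHeader
{
    public static string Build(bool asAttachment, string fileName)
    {
        string type = asAttachment ? "attachment" : "inline";

        // Quote the name so spaces are kept, and escape any
        // backslashes or quotes that appear inside it.
        string escaped = fileName.Replace("\\", "\\\\").Replace("\"", "\\\"");
        return type + "; filename=\"" + escaped + "\"";
    }
}
```

Calling ContentDispositionHeader.Build(true, "Lipsum.txt") produces attachment; filename="Lipsum.txt", which you can pass straight to context.Response.AddHeader("Content-Disposition", ...).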

Summary

In this article we talked briefly about how browsers handle content. If you don't set a Content-Type then the web server will often set one for you, and this is often "text/html".

If you don't set a Content-Disposition then the filename of the document will often default to the page serving the content. The browser will also try to open the document (if it recognizes the content type), otherwise it will prompt the user to download the content. The only way to set the file name of the file and how it should be handled is to set a Content-Disposition header.

Best,
Tyler