Saturday, May 17, 2008

Http Headers and Content: Content Type And Content-Disposition

HTTP Headers

There are a few topics that most web developers don't really get good exposure to early in their career. One of these topics is http headers. There are certain tasks that are only really possible after developing a strong grasp on HTTP Headers and how they work.

Not only do HTTP Headers open up doors to certain browser behaviors, but I've found that learning them and troubleshooting them really solidifies an understanding of how the browser works and the http request/response model.

This is the first of a two-part article series dealing with HTTP Headers. This first part focuses on Content-Type and Content-Disposition. The second part will talk more about Cache-Control and how cache headers are involved in browser/proxy caching.

Scenario

Let's say you were tasked with writing a page that returns a file to a user. That is, the user clicks a link and then receives the file.

You might notice on some browsers that the file opens right in the browser, while in other browsers users get prompted to download the file. Why?

Browser Behavior

When a browser gets handed a response that is a file, it has two options. It can either prompt the user to download the file (Save As dialog), or it can try to embed the file in the browser (the file will open and the contents will be visible in the browser).

Normally most browsers (including IE and Firefox) will look at the MIME type (the Content-Type HTTP header) and, if it's recognized, try to open/embed it in the browser. If the browser doesn't recognize the MIME type of the content, it will prompt the user with a Save File dialog and the user can choose to either save it to disk or open it with a program of her choice.

Your only way of changing this behavior (say, if you wanted to prompt the user to download the file every time) is to use the Content-Disposition and Content-Type HTTP headers. Content-Type tells the browser what kind of content it is; Content-Disposition tells the browser how the content should be handled.
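To make that decision a little more concrete, here's a rough sketch of the logic in C#. It is only a simplified model of the behavior described above, not how any real browser is implemented: the Handling enum, the BrowserSketch class and the short list of "displayable" types are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the two options a browser has for a response.
public enum Handling { DisplayInBrowser, PromptToSave }

public static class BrowserSketch
{
    // Assumption: a tiny stand-in for the MIME types a browser can render itself.
    private static readonly HashSet<string> Displayable =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "text/html", "text/plain", "image/png", "image/jpeg"
        };

    public static Handling Decide(string contentType, string contentDisposition)
    {
        // An explicit "attachment" disposition asks for a download,
        // even when the content type is one the browser could display.
        if (contentDisposition != null &&
            contentDisposition.TrimStart().StartsWith(
                "attachment", StringComparison.OrdinalIgnoreCase))
        {
            return Handling.PromptToSave;
        }

        // Drop parameters such as "; charset=utf-8" before the lookup.
        string mediaType = (contentType ?? string.Empty).Split(';')[0].Trim();

        // Recognized types are displayed; everything else falls back to a prompt.
        return Displayable.Contains(mediaType)
            ? Handling.DisplayInBrowser
            : Handling.PromptToSave;
    }
}
```

With this model, Decide("text/html", null) picks the in-browser display, while Decide("text/plain", "attachment; filename=Lipsum.txt") picks the save prompt. Real browsers layer more on top of this (plug-ins, user preferences, content sniffing), so treat it as a mental model only.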

It's also worth noting that even if you don't explicitly set the Content-Type header (MIME type), your web server will probably serve the content as text/html.

Tools

Since HTTP Headers are pretty much invisible without the proper tools, I'd strongly recommend one of the following utilities.

If you prefer using Firefox, I'd recommend you download and get familiar with Tamper Data, a Firefox extension. Otherwise, if you prefer working with IE, I'd consider using Fiddler, an HTTP debugging proxy. All the screen caps of HTTP headers you see here are from Tamper Data.

Some Code

Consider the following two cases. Let's say I write an HTTP handler that serves off content and I write it like so (below). What really happens?

public void ProcessRequest(HttpContext context)
{
    // No Content-Type or Content-Disposition is set here, so the defaults apply.
    context.Response.WriteFile(context.Server.MapPath("Lipsum.txt"));
}

Well, the file Lipsum.txt, which is just a text file, gets pulled off the disk and returned for the HTTP request. What do the HTTP headers look like?

The default Content-Type HTTP header from most web servers, if you don't specify one, is text/html.

Notice how there's a Content-Type header which is "text/html"? This is because we didn't specify a content type, so IIS 6 defaulted to using the "Content-Type: text/html" HTTP header. Also notice that there is no Content-Disposition header. That is, we are not explicitly telling the browser how to handle this content.

Because the browser gets a Content-Type of "text/html" (and it knows how to display such content) it simply displays it in the browser.

Without a Content-Disposition set the browser tries to display the content for known Content-Types.

Content-Type

If we change the content type we can change the behavior. Let's assume we changed the content type to a PowerPoint (.ppt) document but still served up the same file (Lipsum.txt).

If we explicitly set a Content-Type then the browser will handle the document quite differently. Since the browser doesn't know how to display a PowerPoint document it prompts the user to use the OS to either save the document or open it in a program that supports the given Content Type.

public void ProcessRequest(HttpContext context)
{
    // Claim the response is a PowerPoint document, even though the file is plain text.
    context.Response.ContentType = "application/vnd.ms-powerpoint";
    context.Response.WriteFile(context.Server.MapPath("Lipsum.txt"));
}

File being downloaded without Content-Disposition set. The browser doesn't know how to handle the content.

W3Schools maintains a fairly complete list of content types.
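If a handler serves more than one kind of file, it usually needs to pick the Content-Type from the file name. Here's a minimal sketch of one way to do that, assuming a small hand-maintained lookup table. The ContentTypes class and the handful of entries in it are hypothetical; a real table would be much longer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ContentTypes
{
    // Assumption: a deliberately short table, just enough for illustration.
    private static readonly Dictionary<string, string> ByExtension =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".txt", "text/plain" },
            { ".html", "text/html" },
            { ".pdf", "application/pdf" },
            { ".ppt", "application/vnd.ms-powerpoint" },
            { ".csv", "text/csv" }
        };

    public static string FromFileName(string fileName)
    {
        // Path.GetExtension returns the extension including the leading dot.
        string extension = Path.GetExtension(fileName) ?? string.Empty;

        string contentType;
        return ByExtension.TryGetValue(extension, out contentType)
            ? contentType
            // Generic "some bytes" type for anything we don't recognize.
            : "application/octet-stream";
    }
}
```

In a handler this could be used as context.Response.ContentType = ContentTypes.FromFileName("Lipsum.txt"). Falling back to application/octet-stream for unknown extensions is a design choice: since browsers generally can't display it, they tend to prompt the user to save the file.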

Content-Disposition

So Content-Type allows us to tell the browser what kind of content we're handing it so that it can behave accordingly. But what if we wanted to tell the browser how to handle the content? Content-Disposition allows us to name the file that comes down over the wire and to tell the browser whether to try to embed it in the browser or to prompt the user to save the document to disk/open it in another program.

Before, when Lipsum.txt was offered up with a Content-Type of text/html, it was embedded in the browser (see above). What if we wanted to instead prompt the user to save the content to disk even if the browser is capable of displaying it? And what if we wanted to name the file as it came down instead of having it take the name of the .aspx/.ashx file serving it, like in the screen cap above (FileHandler.ashx)?

public void ProcessRequest(HttpContext context)
{
    context.Response.ContentType = "text/plain";
    // "attachment" asks the browser to download; "filename" names the downloaded file.
    context.Response.AddHeader("Content-Disposition",
        "attachment; filename=Lipsum.txt");
    context.Response.WriteFile("Lipsum.txt");
}

File being downloaded (attached) with Content-Type and Content-Disposition set.

In the example above we set Content-Disposition to "attachment", which specifies that the file should be downloaded as opposed to embedded in the browser. If we wanted it to be shown in the browser we would have set it to "inline". We also named the file coming over the wire by setting the "filename" parameter. We got the behavior from the screen cap above because the HTTP headers looked like so:

File being downloaded with both a Content-Type and Content-Disposition header set.
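File names with spaces or other special characters are safer when the filename value is wrapped in double quotes. Here's a minimal sketch of a helper that builds the header value that way. The ContentDispositionSketch class is hypothetical, and it only handles plain ASCII names; non-ASCII names need the separate filename* parameter described in RFC 6266, which this sketch doesn't attempt.

```csharp
using System;

public static class ContentDispositionSketch
{
    // Builds a value such as: attachment; filename="Lipsum.txt"
    public static string Build(bool asAttachment, string fileName)
    {
        string type = asAttachment ? "attachment" : "inline";

        // Without a file name, the disposition type alone is a valid value.
        if (string.IsNullOrEmpty(fileName))
        {
            return type;
        }

        // Escape backslashes first, then quotes, so the name stays a valid quoted string.
        string escaped = fileName.Replace("\\", "\\\\").Replace("\"", "\\\"");

        return type + "; filename=\"" + escaped + "\"";
    }
}
```

In the handler above this would be used as context.Response.AddHeader("Content-Disposition", ContentDispositionSketch.Build(true, "Lipsum.txt")). Passing false instead of true produces an "inline" value, which asks the browser to display the content while still suggesting a file name.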

Summary

In this article we talked briefly about how browsers handle content. If you don't set a Content-Type then the web server will often set one for you, and this is often "text/html".

If you don't set a Content-Disposition then the filename of the document will often default to the page serving the content. The browser will also try to open the document (if it recognizes the content type), otherwise it will prompt the user to download the content. The only way to set the file name of the file and how it should be handled is to set a Content-Disposition header.

Best,
Tyler

8 comments:

Peter said...

Thanks for the post! I remember seeing a "webserver-generated Excel report" and wondered how it was done specifically. Looking forward to part 2.

Anonymous said...

what if you want to display the file from the database to browser that file is stored as bytea

Tyler Holmes said...

If that's the case I would recommend writing an HttpHandler that takes a URL with a series of params (maybe filename or FileId). This HttpHandler takes those params, runs off to the database and then Response.BinaryWrite() or Response.Write()s the contents into the response stream. There's a tonne of examples out there if you Google HttpHandler and File Handler. Just be sure to do it right and set appropriate content disposition and content type!

It's worth noting that when ASP.NET gets files uploaded, it's usually through an HttpPostedFile object, which has a ContentType property. It's usually an extremely good idea to store that content type in the table along with the file.

Best,
Tyler

Anonymous said...

If i don't have a web server, Is there a way to specify the content-type and content disposition using only either java script or html.

Tyler Holmes said...

Unfortunately no (at least not that I'm aware of). Content Disposition (and for the most part Content Type) are used to describe responses. The only way I could think of doing this client side is writing a browser plug in, but this would of course only work for the browser you authored it for in the first place.

Sorry if that's not helpful.

Best,
Tyler

Johnny said...

you can change the content-type and disposition in the META tags in your html:

meta http-equiv="Content-Type" content="text/html; charset=utf-8"

meta name="content-disposition" content="inline; filename=excel.csv"

etc...

Tyler Holmes said...

That's a good point, I completely forgot about Meta Tags. It might be worth noting that not all proxies look at meta tags so it might not be a great candidate to dictate caching, but it should be fine for content-disposition and content-type.

Michael Brenden said...

March 2015 and here we are all over again.

----- FAILURE CASE -----

For the URI "/pdf/a1b2c3"
this works on Opera 28 and FF 36

Content-Type: application/pdf
Content-Disposition: attachment; filename="test 1.pdf"

but it fails on Chrome b43 -- the filename is NEVER "test 1.pdf" but instead becomes "a1b2c3.pdf" (the trailing .pdf being magically added). So, WTF ?

----- SUCCESS CASE -----

For the URI "/pdf/a1b2c3"
this works on Opera 28, FF 36 and Chrome b43

Content-Type: application/octet-stream (or just remove this entire header line)
Content-Disposition: attachment; filename="test 1.pdf"

then filename is "test 1.pdf" as it should have been all along.


This is an example of everything else working great and Chrome failing.