Tuesday, January 13, 2009

Big Fat Content Deployment Jobs

The Blessings and Woes of Content Deployment

If you've spent a lot of time dealing with Incremental Content Deployment Jobs, then you've most likely either developed the patience of a saint, or a taste for 80 proof liquor.

This particular feature has alternated between working and not working across a series of SharePoint patches. For example, it didn't work under RTM for many sites, and then a hotfix corrected the behavior. It was broken again for most web applications under Service Pack 1, which was followed by a fix in the Infrastructure Update.

Because this particular feature has dealt the world so much hair loss, a lot of clients I've met with have resorted to using full content deployment jobs to get their content from one site collection to another.

The key difference between the two is that an incremental deployment job will deploy only changes (new records, updates, and deletes). A full content deployment job will deploy everything in the site, branch, or site collection that you select (specifically, a copy of the current version of each item, without the version history). If you select a site, you're about to get another copy of every asset in that site. If you select the entire site collection, there's even more content that's about to get duplicated.
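To make the distinction concrete, here's a toy model in Python. It is not SharePoint's API or how the deployment engine is actually implemented; the function names and the idea of representing a site as a dictionary of path-to-version entries are assumptions made purely for illustration.

```python
def incremental_payload(source, last_deployed):
    """Toy model: collect only the adds, updates, and deletes since the last run."""
    changes = {}
    for path, version in source.items():
        if path not in last_deployed:
            changes[path] = ("add", version)
        elif last_deployed[path] != version:
            changes[path] = ("update", version)
    for path in last_deployed:
        if path not in source:
            changes[path] = ("delete", None)
    return changes


def full_payload(source):
    """Toy model: ship the current version of every item, changed or not."""
    return {path: ("add", version) for path, version in source.items()}


# Hypothetical site contents: path -> current version number
source = {"deck.pptx": 2, "policy.pdf": 1, "news.aspx": 1}
last_deployed = {"deck.pptx": 1, "policy.pdf": 1, "old.aspx": 1}

print(len(incremental_payload(source, last_deployed)), "items in the incremental job")
print(len(full_payload(source)), "items in the full job")
```

In this sketch the unchanged policy.pdf is skipped by the incremental job but shipped again by the full job, which is exactly the kind of duplication that adds up on the destination.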

Hold Off On That Full Content Deployment

The real problem with resorting to full content deployments is that if you're prone to using them too often, your content database size can go through the roof. For those who normally deal with SharePoint capacity and planning, this can be a nightmare. The last thing you want is a bloated content database with duplicate copies of 40 MB PowerPoint presentations coupled with the HR department's prized PDF collection.

A particular client we work with ended up using full content deployments for sub sites about six months ago because they couldn't get incremental content deployments to work. Before they knew it, their destination site collection was 16 times the size of the source site collection. Luckily this particular content database was pretty lean (about 150 MB, so the destination only ended up being ~2.5 GB). Imagine if they had a source site collection of around 50 GB, which is very common for intranet site collections. They'd be looking at a destination site collection of ~0.8 TB, a near unmanageable (or at least painful) amount of data for most small to medium IT shops.
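If you want to run the same back-of-the-envelope math against your own numbers, here's a minimal Python sketch. The function name is made up for this post, and the 16x multiplier is simply the ratio observed at that one client, not a constant you should expect; your ratio will depend on how often you run full jobs and what you run them on. Sizes use decimal units (1 GB = 1000 MB).

```python
def projected_destination_mb(source_mb, growth_factor=16):
    """Estimate destination size if it ends up growth_factor times the source.

    growth_factor defaults to the 16x ratio seen at one client; treat it
    as an assumption and plug in your own observed ratio.
    """
    return source_mb * growth_factor


# The two cases discussed above: a lean 150 MB source and a 50 GB intranet
for source_mb in (150, 50 * 1000):
    dest_gb = projected_destination_mb(source_mb) / 1000
    print(f"{source_mb} MB source -> roughly {dest_gb:.1f} GB at the destination")
```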

No one really wins when it comes to regular full content deployments. Even if there are quotas on the site collection, the inherent duplication still steals space that users could otherwise use to store useful data. If getting incremental deployment patched and working isn't something you feel you can easily do yourself, consider contacting MS SharePoint Support. They're not half bad, and they're pretty cheap ($250 last time I checked).

If you're not already convinced, here are a couple of screen caps of data pulled from Red Gate SQL Data Compare detailing the differences on just a small site. The first cap is a before/after comparison of a sub site being deployed using an incremental job after a small change has taken place (a content edit).

The second screen cap is of the same site before/after a full content deployment job targeting the same sub site.

Record Difference Before and After an Incremental Content Deployment


Record Difference Before and After a Full Content Deployment

These results will of course change dramatically with the size of your site collection and what you're doing a full content deployment on. The point is just to offer a reminder that the unnatural growth of the content database may not be worth it in the long run. If you're currently wed to full content deployments, I'd suggest either getting a divorce ASAP or investing in some sizable disks to help manage the data explosion.

Buying more disks,

1 comment:

Shoukry said...

Hi Tyler,

Thank you for your interesting article. This is already the case on our production website. The production website is about 18 times the size of the staging website. We use a content deployment job between the two sites.
My question is: how can I reduce the production website size now?
From now on I will use incremental deployment, but I want to reduce the size of production for the time being.
Your reply is highly appreciated.