Sunday, November 30, 2014

Adjustments For SQL Azure – No OLE DB

Same Same, But Different

sql-azureLately, I’ve been working with SQL Azure a lot more. For some of our simpler projects the differences aren’t that palpable, but the more data centric applications have started to tease out some of the more notable differences.

Over next couple of months I’ll write up some of the more glaring ones and how they’re likely to impact development (e.g. throttling, connection dropping, missing system stored procedures, statistics on sample slowing of some our ETL on Basic, Standard, Professional, Web, and Business editions vs. on premise deployments).

Today I’d like to talk about one of the documented, but less obvious differences. SQL Azure does not support OLE DB (at least not at time of writing).

An OLE DB Diet

Even though connecting to SQL Azure via OLE DB is a documented limitation, it still works (i.e. no error is thrown, commands are processed, and result sets are returned).

Even commonly used sites like connectionstrings.com list OLE DB connections in their SQL Azure section, so it’s not obvious to most developers new to SQL Azure that this genre of providers are not supported. I wouldn’t be surprised if a lot of teams are using unsupported OLE DB providers.

In addition, most of the SSIS/ETL developers that I’ve worked with tend to favor OLE DB components. A lot of this stems from bad experiences with initial ADO.NET implementations when they first came out (lack of 3rd party support, initial implementations by ISVs tended to be slower than their OLE DB implementations), but pretty much all of these have been addressed these days, and ADO.NET providers are in most cases equally efficient.

As an alternative, when connecting to SQL Azure you’re encouraged to use ADO.NET as a provider. This shouldn’t cause much acid reflux for most .NET developers but what’s the difference and should I care?

What are These Technologies Doing For Me Anyways?

Remember that OLE DB and ADO.NET are both data access APIs designed by Microsoft to allow data access with common reusable paradigms across vendor specific implementations (e.g. DB2, PostgreSQL, Oracle, etc…).

OLE DB breaks up the API into contracts that should be implemented by OLE DB providers and OLE DB consumers, and is based on COM.

ADO.NET also dictates an API for providers and consumers, asking that they rally around the DbConnection, DbCommand, DbDataReader, and DbDataAdaptor classes. The implementations fore these are generally written in managed code.

These days, popular software products almost always have equivalently speedy implementations in both flavors (OLE DB and ADO.NET).

But if you’re looking to connect to either an older system or one that has very sparse support, you might have an easier time finding an OLE DB provider (but this scenario should be rare).

So…What’s the Difference?

This is one of those questions that seldom receives a straight forward answer. In the case of SQL Azure…one isn’t supported so use ADO.NET.

In the general case…it depends. When evaluating the differences between provider architectures, here are some points of interest that help make the decision:

  1. Tool Support: It’s probably still possible to find SSIS components (mostly from 3rd parties) that don’t support ADO.NET and DO support OLE DB, but this is starting to become relatively rare.
  2. Performance: It used to be that ADO.NET provider implementations were just managed facades around existing OLE DB implementations. But again for most well supported data stores and components, this is no longer the case.
  3. 32-bit vs. 64-bit Environments: If you have mixed environments (32-bit and 64-bit), a managed provider (like ADO.NET) is likely to simplify your deployments as you won’t need to worry about which assembly to deploy. With OLE DB you’ll need to ensure that the correct driver is deployed AND that there are both 32-bit and 64-bit implementations from the original vendor. If there’s not, you can still run the 32-bit driver in a 64-bit environment with WOW64, but you won’t be able to address more than 4GB of memory (32-bit limitation) and you can’t run mixed architectures (i.e. 32-bit and 64-bit) in the same process.

In most cases you’re fine to go with the managed providers, but it’s always worth a cursory web search to get a feel for any advantages or pitfalls with alternative providers (OLE DB, ODBC, etc…).

Hope That Helps,
Tyler