Since I wrote “Whose data is it anyway?” and “Whose data is it anyway? Part 2”, I’ve started to dive a bit deeper into the subject of web analytics data ownership, both as part of my vendor research for the just-released The 2009 Web Analytics Report and because I’m managing a vendor selection process for a Federal government entity.
It’s a complex issue and one that often gets overlooked during web analytics vendor selection and contract negotiations. Many analysts and web managers whom I speak to at large organizations either don’t read or don’t have access to their vendor contracts…and generally don’t ask about data ownership during the vendor evaluation process. Many people just assume that they own their web analytics data.
So what does data ownership really mean? Is it simply a “concept”, or does it translate into anything of business value? Well, it depends on your vendor…what they say, what they don’t say and what they put in writing.
Let’s look at the four aspects of “data ownership”:
Use of Your Data
As we’ve already written, Google Analytics and Yahoo! Web Analytics state in their contracts that they may use your data for a variety of commercial purposes. A few of the fee-based vendors, though not all, state that they’ll use the data for benchmarking or industry average information. This seems to be a more palatable pill to swallow for large, security-concerned enterprises, such as those in government and financial services.
All of the largest web analytics vendors have gone to great lengths to ensure the safety of your data from the perspective of storage, access, redundancy and server failover. Yet if there’s a data loss, you’re on your own. All agreements absolve the vendors of any liability. So, in this case, what does data ownership really do for you? Not much.
Being able to run reports on historical data and having access to unaggregated data as far back as your contract runs may be considered a sign of data ownership, and one that many take for granted, but it is nothing of the sort. There are basically two types of web analytics data you can retain: the data used to create the summary tables behind the reports you run in the core solution, and the unaggregated data that makes up your web analytics database. The length of time that this data is kept may or may not be in the contract. We find that it is usually part of your service agreement and can be completely negotiable. Your vendor will have a default length of time for both types of data…typically 12-36 months. If you want more than the default, you can pay for it. We know of one fee-based vendor that breaks out default retention on an hourly, daily, weekly and monthly basis…just a bit more complicated and just another way to add to your cost of ownership. The upshot of this: Data retention has real meaning for what “data ownership” is all about…or does it?
Data Access: API or File Extraction of Core Data Tables for Export to Another System
Before you get all excited about making a deal for keeping your data for 36 months, you have to think about how you’re going to export the data out of the web analytics solution. While it may be easy to pull old reports, what if you want to extract the unaggregated data to export to another database, or what if you want to use it with a business intelligence tool? How do you get it? Vendors that provide a data warehouse capability are happy to sell it to you, and with it your underlying data becomes available and extractable…typically through APIs. But if you don’t purchase a data warehouse capability, or your vendor doesn’t offer one, you could get your unaggregated data in a state that is really hard to deal with…it could be unprocessed, meaning that the data is in a “log file” state, or it could be in a CSV format. In either case, you’ll have a lot of work to do to get the data into a usable state.
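To give a feel for the work involved, here is a minimal sketch of turning a “log file” style export into structured rows that a database or BI tool could load. It assumes the export uses the NCSA Common Log Format, which is an assumption for illustration only; your vendor’s raw format may be quite different, and the names `parse_line` and `to_csv` are hypothetical helpers, not part of any vendor’s API.

```python
import csv
import io
import re
from datetime import datetime

# Assumption: each exported line follows the NCSA Common Log Format, e.g.
# 192.0.2.10 - - [10/Oct/2009:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

FIELDS = ["host", "ts", "method", "path", "status", "size"]


def parse_line(line):
    """Return one hit as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Normalise the timestamp to ISO 8601 so downstream tools can sort on it.
    parsed = datetime.strptime(record["ts"], "%d/%b/%Y:%H:%M:%S %z")
    record["ts"] = parsed.isoformat()
    record["status"] = int(record["status"])
    # A "-" size means no body was sent; store it as zero bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def to_csv(lines):
    """Convert raw lines to CSV text; also report how many lines were skipped."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    skipped = 0
    for line in lines:
        record = parse_line(line)
        if record is None:
            skipped += 1  # unparseable lines are counted, not silently dropped
            continue
        writer.writerow(record)
    return buffer.getvalue(), skipped
```

Even this toy version has to make decisions about timestamps, missing values and malformed lines, which is the kind of cleanup you inherit when the data arrives unprocessed.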
So, what’s “data ownership”? I think you have to figure out what elements of data ownership make the most sense. From my perspective, making sure that you can get your hands on your web data when you need it and with as little interference as possible are prime considerations. Security and privacy are paramount, but if you choose to go with an on-demand solution, you’ll need to be assured that the vendor meets your criteria, knowing that if there’s a data loss, you’ll have little recourse.