If you have spent any time researching industrial databases, you have likely run into several different labels for historians. So what’s the difference between them, and most importantly, which one do you need to get the job done?
It would be helpful if the answer were simply "they are all the same, it doesn't matter." In the case of Canary, that is exactly the answer; whether you need storage at the edge, site, enterprise, or in the cloud, Canary uses the same technology at the same scalable price point.
However, be warned: when looking outside the "Canary-verse," you do need to understand the differences, as they can greatly affect both the capability of the technology and what you have to pay to use it.
The Process Historian Is Born
A bit of background might be helpful. When time-series-specific databases were first being developed for industrial automation, two companies led the way: OSIsoft, founded in 1980, and Canary, founded in 1984. Given how limited the available technology was at the time, the effective use case for these databases was confined to localized operations. Since the data being collected from these sites represented the local process, the term process historian was coined and used interchangeably with data historian.
SQL Historian Shows Up
Meanwhile, the first commercially available SQL relational databases were being released (Oracle in 1979) and gaining more and more traction and use cases. In time, SQL-based solutions began to emerge in the data historian marketplace. To differentiate the data historian technology offered by OSIsoft's PI Archive and the Canary Historian from SQL, the term time series began to be emphasized. Time series simply refers to the vertical storage of data values keyed to a timestamp property. It has long been understood that for large volumes of records, SQL or relational databases will not perform to the same standards as a time series database. To make the underlying technology even more obvious to the potential customer, Canary and OSIsoft further distinguished themselves as 'NoSQL time series' databases.
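The performance gap comes down to data layout. A minimal sketch in Python can illustrate the idea (the `TagArchive` class and its layout are purely illustrative assumptions, not how Canary or PI actually store data): when each tag's samples are kept in timestamp order, a time-range query becomes a binary search plus a slice, rather than the scan-and-filter a general-purpose relational table would do.

```python
import bisect
from datetime import datetime, timedelta

class TagArchive:
    """Illustrative time-series store: one tag, samples kept in timestamp order."""

    def __init__(self):
        self.timestamps = []  # kept sorted; industrial samples arrive in time order
        self.values = []

    def append(self, ts, value):
        # Appending in time order keeps the list sorted with no extra work.
        self.timestamps.append(ts)
        self.values.append(value)

    def query_range(self, start, end):
        # Binary-search the sorted timestamps, then slice: O(log n) to locate,
        # no full scan of the archive.
        lo = bisect.bisect_left(self.timestamps, start)
        hi = bisect.bisect_right(self.timestamps, end)
        return list(zip(self.timestamps[lo:hi], self.values[lo:hi]))

# Simulate one minute of 1 Hz samples for a single tag.
t0 = datetime(2024, 1, 1)
archive = TagArchive()
for i in range(60):
    archive.append(t0 + timedelta(seconds=i), 100.0 + i)

# Fetch ten seconds of history without touching the rest of the archive.
window = archive.query_range(t0 + timedelta(seconds=10),
                             t0 + timedelta(seconds=19))
```

Real historians add compression, chunked files, and indexing on top, but the principle is the same: storage ordered by timestamp makes time-range retrieval cheap at any scale.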
Enterprise Births Site Historians
By the 1990s, process historians and data historians were commonly used tools and integral parts of most control systems. As networking became widespread within larger organizations, it became possible to share data collected at remote sites with team members at other locations. As a result, tools were built to replicate local data historians to corporate data historians, and the enterprise historian was born. In some cases, the technology deployed at the enterprise level was different; in other cases, the only difference was the price list. Either way, once the term enterprise historian was coined, it became natural to refer to the data historian at the plant level as a site historian.
The Cloud Historian Arrives
Remember a time when the only people who talked about clouds were meteorologists? With the increasing popularity of cloud-based solutions, it became clear in the 2010s that data historians would also have a place in the cloud, and the term cloud historian became popular. Immediately, confusion around historians and data lakes ensued. Do you need both? Can one take the place of the other? You really don't need a data lake; in fact, most would agree they should be called data swamps! Why not just land your historian in the cloud and consume data on demand as needed? What data lakes always lack is context and structure. A historian can provide more structure and better performance.
The Trappings of Historian Labels
So now that the terms have been defined, a question: why continue to offer separate solutions rather than one install that can fill every role? Do we really need a separate product for each of these categories, or would we be better served by one flexible solution that can be applied wherever necessary?
Vendors that continue to sell various versions of historians can cause you some serious growing pains. When edge or site technology is limited in capacity, scaling a solution over time requires additional licensing, new implementation, and new integration work, ouch! And when enterprise-grade technology doesn't move easily to cloud applications, or vice versa, you can't be as flexible in your deployments as you may want to be.
Ideally, you should be able to install the same technology on the edge, replicate that history to other historian instances across your sites, and simultaneously move that data to enterprise and cloud solutions.
One solution that works anywhere you need to deploy it sounds simple, but realistically, it's extremely difficult to do well. Consider that it has to be simple enough for fast deployment (small systems on the edge or at small sites), scalable enough for enterprise applications (large tag counts with heavy standardization and contextualization requirements), and open enough to make moving data into other cloud applications easy. But the technology is just the first piece; it also has to be affordable for every industry vertical and use case.