The first edition of a weekly Question and Answer column with Gary Stern, President and Founder of Canary Labs. Have a question you would like me to answer? Email askgary@canarylabs.com
Dear Curious,
As historical data is added to an SQL database, the database adds more and more rows to its tables. The larger those tables grow, the more memory and disk space the database needs to load and search the data, and performance gets slower and slower. When an SQL database gets too slow, the DBA has to roll off older data. That data is either abandoned, “rolled up” to a lower resolution (losing detail), or placed off-line (no longer easily accessible).
To better handle both performance and resources, Canary uses a custom, non-SQL database designed to handle the high data volumes typical of historian applications without requiring a DBA. To achieve the highest performance, Canary builds “time and dataset segmentation” into the design.
The overall database is physically stored in pieces called datasets (logical groupings of commonly collected tags) and, within each dataset, in time-range segments. Segmentation control, what we call “rollover”, is normally set to daily, but several other intervals are available. The historian keeps track of all these segments, moving portions of them in and out of memory as needed, based on demand. This allows us to manage an extremely large database without needing a large and expensive server, and it keeps performance consistent regardless of database size. We have customers with 20 years of raw, unchanged data “on-line” that is immediately and quickly accessible.
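To make the segmentation idea more concrete, here is a minimal sketch in Python. It is purely illustrative (the class and names are hypothetical, not Canary’s actual implementation): data is keyed by dataset and by daily rollover segment, and only the most recently used segments are held in memory.

```python
from collections import OrderedDict
from datetime import datetime, timedelta

class SegmentedHistorian:
    """Illustrative only: data is keyed by (dataset, day) segment; an LRU
    cache models which segments are currently held in memory."""

    def __init__(self, max_loaded_segments=3):
        self._disk = {}                        # (dataset, day) -> {tag: [(timestamp, value), ...]}
        self._loaded = OrderedDict()           # LRU set of segments currently "in memory"
        self._max = max_loaded_segments

    def _get_segment(self, dataset, ts):
        key = (dataset, ts.date())             # daily "rollover": one segment per dataset per day
        seg = self._disk.setdefault(key, {})
        self._loaded[key] = seg                # mark this segment as most recently used
        self._loaded.move_to_end(key)
        while len(self._loaded) > self._max:   # evict the least recently used segment from memory;
            self._loaded.popitem(last=False)   # it stays on disk and remains "on-line"
        return seg

    def write(self, dataset, tag, ts, value):
        self._get_segment(dataset, ts).setdefault(tag, []).append((ts, value))

    def read(self, dataset, tag, start, end):
        out, day = [], datetime.combine(start.date(), datetime.min.time())
        while day.date() <= end.date():        # only touch segments covered by the query range
            seg = self._get_segment(dataset, day)
            out += [(t, v) for t, v in seg.get(tag, []) if start <= t <= end]
            day += timedelta(days=1)
        return out
```

A query for last week’s data then only has to open the handful of segments that cover that week, which is why performance stays flat no matter how many years of history are on-line.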
It is worth noting, however, that there are two sides to the performance issue, “reading” and “writing”, and both go back to the underlying design of how the data is stored. Some designs can write data quickly, but their read performance is terrible. Other designs are optimized for data retrieval, but then writing the data is more difficult and much slower, limiting the number of tags and the amount of data a single server can handle.
Here, time and experience favor Canary. The “Historian Core” is a fourth-generation design that began in 1987. Each new generation incorporated field operating experience from our customers and leveraged new technological improvements. We continue to improve the Historian service for easier access, security, administration, and more, ensuring that all of Canary’s surrounding products are built around a rock-solid and time-tested solution.
So what does this mean to you? Simply put, the Canary Historian performs. Canary tests the Historian’s writing performance by requiring a single machine to write one million tags, with each tag value changing every second. For reading performance, we require the Historian to return more than four million TVQs (Timestamp, Value, Quality) per second. This raw horsepower means you never compromise performance and always have your data accessible.
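For readers unfamiliar with the term, a TVQ is simply the basic unit a historian stores and returns. Here is a small illustrative sketch in Python (not Canary’s internal format; the field layout is an assumption for illustration only):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class TVQ:
    """One historized sample: when it happened, what the value was,
    and how trustworthy the source said it was."""
    timestamp: datetime   # when the value was recorded
    value: float          # the process value itself
    quality: int          # e.g. an OPC-style quality code (192 = good)

# A read request for one tag over a time range comes back as a series of TVQs:
samples = [
    TVQ(datetime(2017, 5, 1, 8, 0, 0), 72.4, 192),
    TVQ(datetime(2017, 5, 1, 8, 0, 1), 72.6, 192),
]
```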
Another issue affecting performance is the amount of storage required. We believe Canary has the best compression and the lowest storage needs of any historian on the market; we have never seen anything better. Besides the ability to “dead-band” the incoming data at the logger level, the historian uses a storage format that eliminates redundant information. And because all Canary compression is “lossless”, the data that goes in is exactly the same as the data that comes out, which is an important feature when you later use analytical tools such as predictive analysis.
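To illustrate what logger-level dead-banding means, here is a simple sketch (hypothetical Python, not the Canary logger’s actual logic) that skips any sample staying within a configured band around the last value that was stored:

```python
class DeadbandFilter:
    """Illustrative dead-band filter: a new sample is logged only when it
    moves outside +/- deadband of the last value actually stored."""

    def __init__(self, deadband):
        self.deadband = deadband
        self._last_stored = None

    def should_store(self, value):
        if self._last_stored is None or abs(value - self._last_stored) > self.deadband:
            self._last_stored = value
            return True
        return False          # change too small to matter: skip it, saving storage

# With a 0.5-unit dead-band, a slowly drifting signal stores far fewer samples:
f = DeadbandFilter(deadband=0.5)
readings = [10.0, 10.1, 10.2, 10.7, 10.8, 11.4, 11.5]
stored = [v for v in readings if f.should_store(v)]
print(stored)   # [10.0, 10.7, 11.4]
```

Note that dead-banding happens at the logger, before the data ever reaches the historian; the compression applied to whatever the logger does send remains lossless.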
I know this is a lengthy reply to what seems like a relatively simple question, but it’s really quite a complicated issue.
Sincerely,
Gary Stern