In 2014 Gary Stern and the engineering team at Canary Labs sought to test the full capabilities of the Canary Data Historian. Published below are the complete findings from their study. As this was originally intended for use as an internal document, please excuse any grammatical or spelling errors.
Scaling and Capacity
Scaling and capacity are complex, multi-layered issues. Generally, when scalability and total capacity are questioned, the goal is to discover whether the system will have the capacity and performance to meet the needs and demands of the end user's system.
To answer this basic question, a complete system analysis must be performed to determine whether the data historian can store data as fast as needed and then, after storing, provide data to clients at an acceptable rate.
At the heart of overall system performance is the Canary Historian, so testing started with our proprietary database.
For testing purposes, the following benchmark machine was used:
System Specifications
Dell Precision T7600 Workstation
OS: Windows 7 Pro
CPU: Xeon 2 GHz processor
Cores: 6 physical (12 logical)
Memory: 32 GB
Disk Storage: RAID, 320 GB SATA 7200 RPM drives
Cost to Buy: Less than $3000 in 2014
Canary Historian: Version 10.1 (64-bit)
Historian Raw Write Performance:
2.80 Million TVQs/second
Benchmark Parameters: 100 Million TVQs across 10,000 tags in 8 Datasets, Data type: R4
Historian CPU usage was around 35%.
Based on our analysis, we believe that for this test the limiting factor is disk performance and better performance may be achieved by using solid state drives.
Question: Does the total number of tags affect performance?
Answer: Yes, but not by much. Writing takes longer because of the structure of the historian: data is stored on a per-tag basis, so a higher tag count means writing to a higher number of locations within the file. For instance, test results show that 100,000-tag write times are only 18% slower than 10,000-tag write times.
When 100 Million TVQs are sent to the Historian:
| Tag Count | Total .hdb File Size | Average Bytes/TVQ | Throughput |
|---|---|---|---|
| 10,000 | 524.2 MB | 5.36 | 2.80M TVQs/sec |
| 100,000 | 524.2 MB | 5.36 | 2.37M TVQs/sec |
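The 18% figure follows directly from the two throughput numbers in the table. A quick sketch of the arithmetic, using the benchmark values above:

```python
# Reproduce the ~18% slowdown figure from the tag-count benchmark.
# Throughput numbers are taken directly from the table above.
TVQS = 100_000_000

rate_10k = 2.80e6   # TVQs/sec with 10,000 tags
rate_100k = 2.37e6  # TVQs/sec with 100,000 tags

time_10k = TVQS / rate_10k    # ~35.7 sec
time_100k = TVQS / rate_100k  # ~42.2 sec

slowdown = time_100k / time_10k - 1
print(f"100,000-tag writes take {slowdown:.0%} longer")  # -> about 18%
```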
Noteworthy:
Canary has tested a single historian with one million tags successfully updating at one million TVQs per second.
Question: Does the data type affect performance?
Answer: Yes. An R8 value is twice the number of bytes of an R4 value; thus, throughput is approximately 3% slower.
When 100 Million TVQs are sent to the Historian:
| Data Type | Total .hdb File Size | Average Bytes/TVQ | Throughput |
|---|---|---|---|
| R4 | 524.2 MB | 5.36 | 2.80M TVQs/sec |
| R8 | 914.8 MB | 9.36 | 2.70M TVQs/sec |
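The per-TVQ storage delta between the two rows is consistent with the raw size difference between the two data types. A quick check, using the averages from the table above:

```python
import struct

# The per-TVQ storage delta between R8 and R4 in the table above matches
# the raw size difference between an 8-byte double and a 4-byte float.
r4_size = len(struct.pack("<f", 0.0))  # 4 bytes
r8_size = len(struct.pack("<d", 0.0))  # 8 bytes

avg_bytes_r4 = 5.36  # from the table
avg_bytes_r8 = 9.36

delta = avg_bytes_r8 - avg_bytes_r4
print(f"extra bytes per TVQ: {delta:.2f}")  # -> 4.00, i.e. r8_size - r4_size
```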
Question: Does the amount of historical data already stored in the Historian affect performance?
Answer: No. The design of the Canary Historian is such that the amount of data already stored does not affect writing performance. For instance, if there are ten years of data stored versus only one day, there will be no difference in system performance. This is because the historian "rolls over" the .hdb file on a regular basis (usually daily). Since live TVQ updates occur in the last, or currently open, .hdb file, writing performance is very consistent.
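The rollover idea can be sketched minimally; the file-per-day naming scheme below is illustrative only, not Canary's actual convention:

```python
from datetime import date

# Minimal sketch of the "rollover" behavior described above: the historian
# opens a new .hdb file each day, and live TVQ writes always go to the
# currently open file, so the depth of stored history never affects writes.
# The naming scheme here is a hypothetical illustration.
def current_hdb_file(dataset: str, day: date) -> str:
    return f"{dataset}_{day:%Y%m%d}.hdb"

# No matter how many historical files exist, a write touches only today's file:
print(current_hdb_file("Dataset1", date(2014, 6, 1)))  # -> Dataset1_20140601.hdb
```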
Question: How does the Data Profile affect performance?
Answer: For the purposes of this benchmark test, every TVQ written to the historian was a different value. In real-world operations, we typically see change rates in process control systems of 20-30%. For example, if you are monitoring the temperature of a tank, the reading changes slowly, and a different reading occurs every minute instead of every second. The Canary Historian is optimized with "update by exception" logic to reduce disk space, thus improving performance.
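The update-by-exception logic can be sketched as follows; `filter_by_exception` is a hypothetical stand-in for the historian's internal filtering, not Canary's actual code:

```python
# Sketch of "update by exception": only store a TVQ when its value or
# quality actually changes, rather than on every scan.
def filter_by_exception(tvqs):
    """Yield only TVQs whose value or quality differs from the last stored one."""
    last = None
    for t, v, q in tvqs:
        if last is None or (v, q) != last:
            yield (t, v, q)
            last = (v, q)

# A slowly changing tank temperature sampled once per second:
samples = [(0, 72.4, 192), (1, 72.4, 192), (2, 72.4, 192), (3, 72.5, 192)]
print(list(filter_by_exception(samples)))  # keeps only t=0 and t=3
```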
Question: What is the maximum number of tags the Canary Historian can address?
Answer: A single Canary Historian was tested with 25 million tags without issue. More tags could have been tested; however, since we have never needed more than a five million tag capacity, we stopped there. A single, stand-alone historian with 25 million tags would be a feasible solution for a "smart meter" electric utility where the data rate is at fifteen-minute intervals and the data is loaded in blocks three or four times per day. It is not recommended for a one-second-interval system; to accomplish that, you would need several historian servers.
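The interval arithmetic behind that recommendation can be checked quickly; the 2.8M TVQs/sec figure is the write benchmark from earlier in this document:

```python
# Back-of-the-envelope check of the scenarios above: 25 million tags
# reporting at 15-minute intervals versus every second.
tags = 25_000_000

rate_15min = tags / (15 * 60)  # average TVQs/sec at 15-minute intervals
rate_1sec = tags               # TVQs/sec at 1-second intervals

print(f"15-min intervals: {rate_15min:,.0f} TVQs/sec")  # ~27,778 -- well under 2.8M/sec
print(f"1-sec intervals:  {rate_1sec:,.0f} TVQs/sec")   # 25,000,000 -- far beyond one historian
```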
For larger systems (250,000 tags or more, or higher TVQ data volumes), it is recommended to configure a "Server Class" machine for the Canary Historian with the appropriate number of cores, memory, and disk storage, and to consider the overall data collection system and client requirements.
The Canary Historian has often been deployed as a VM (Virtual Machine). When run in a VM environment, historian performance was approximately 25% lower than on a dedicated server machine.
Data Historian Raw Read Performance:
4.6 Million TVQs/second
Benchmark Parameters: 100 Million Total TVQs across 1,000 tags, 8 Datasets, Data type: R4
Most client applications do not connect directly to the Canary Historian; they connect to the Historian Web Service, which provides additional functionality and communicates with clients via WCF instead of COM/DCOM. This allows access to historian data through firewalls and, because it uses a single port, is IT friendly.
When performance is measured at the client side and data is accessed through the Historian Web Service, two components influence overall performance: the interaction time between the Web Service and the Historian, and the marshaling and transmission of the results from the Web Service to the client via WCF.
The Historian Web Service is multi-threaded, and multiple clients may access the underlying historian simultaneously. Since clients communicate with the Web Service via WCF, they must also consider the speed of the communication channel and the protocol overhead of WCF. The fastest WCF communication is via the "net.tcp" protocol; various security and encryption levels add further overhead.
A function in the WebServiceHelper called "CommunicationSpeedCheck" returns an indicator of the raw communication speed between the client application and the Historian Web Service.
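The general idea behind such a speed check can be sketched generically; this is not the actual CommunicationSpeedCheck implementation, only an illustration of the kind of measurement it performs, with a stand-in transport:

```python
import time

# Generic sketch of a round-trip speed check: time the transfer of a
# payload of known size and derive an effective bit rate. Hypothetical
# illustration only -- not Canary's actual implementation.
def measure_bps(send_and_receive, payload_bytes: int) -> float:
    payload = b"\x00" * payload_bytes
    start = time.perf_counter()
    send_and_receive(payload)
    elapsed = time.perf_counter() - start
    return payload_bytes * 8 / elapsed  # bits per second

# Example with a stand-in transport that simulates 10 ms of link latency:
bps = measure_bps(lambda data: time.sleep(0.01), 100_000)
print(f"effective speed: {bps:,.0f} bps")
```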
Reading Historian data through the Historian Web Service from a client application
| | Total TVQs | Client Local to Web Service | Client on LAN | Client with Wireless Access | Client with Internet Connection |
|---|---|---|---|---|---|
| Read Raw | 432,000 | 2.1 sec | 2.4 sec | 4.1 sec | 28.2 sec |
| Read Processed (1 Minute Aggregate) | 7,200 | 0.94 sec | 0.96 sec | 1.2 sec | 2.6 sec |
| Client/Web Service ComSpeed | | 4.5M bps | 1.2M bps | 324K bps | 16K bps |
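Dividing the Read Raw TVQ count by each elapsed time gives the effective client-side read rates implied by the table above, from roughly 206K TVQs/sec locally down to about 15K TVQs/sec over the internet link:

```python
# Effective client-side read rates implied by the Read Raw row above.
tvqs = 432_000
times_sec = {"local": 2.1, "LAN": 2.4, "wireless": 4.1, "internet": 28.2}

for link, seconds in times_sec.items():
    print(f"{link}: {tvqs / seconds:,.0f} TVQs/sec")
```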
Have unique system requirements? We would love to help you solve your process data problems.