In an ideal world, data flows through your system piece by piece, with each piece processed and passed on quickly to minimise latency and keep the system responsive.
Sometimes, though, the overhead of processing can kill this flow.
I've had two such examples recently on customer projects. Very different causes, but the same outcome:
- One customer needs to replace an instrument, but the new instrument is not very automation friendly. Where the old instrument sent an event when new data was ready, the new one only provides a file which we must open, read, and close each time we access it. This open-read-close cycle adds significant overhead, which would limit the rate of acquisition.
- Another customer has very high data throughput and wants to use GPU processing to calculate properties of high-speed images. However, their system captures images at thousands of frames per second, and transferring each image to the GPU individually can take milliseconds - limiting the overall throughput.
Very different problems, but similar solutions - we must spread the overhead across multiple pieces of data with batching.
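To see the pattern in code, here is a minimal Python sketch of the file-based case. The file path, frame size, and raw-bytes format are placeholder assumptions, not the instrument's actual interface - the point is simply that the fixed open/close overhead is paid once per batch instead of once per frame.

```python
FRAME_SIZE = 1024  # bytes per frame - a placeholder value

def read_batch(path, batch_size):
    """Read batch_size frames in a single open/read/close cycle.

    The fixed overhead of opening and closing the file is paid once
    per batch, so its cost is spread across every frame in it.
    """
    with open(path, "rb") as f:  # the expensive open/close happens once here
        return [f.read(FRAME_SIZE) for _ in range(batch_size)]
```

Calling this with a batch size of 1 for every frame is the slow case; raising the batch size amortises the same fixed cost over more frames.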
Terminology
Let's define some terms:
- Frame - A single data/image capture.
- Throughput - The frames per second that the system can process from an input.
- Latency - The time from the system capturing a frame to outputting it in whatever form is required.
- Overhead - A fixed minimum time for an operation, regardless of the amount of work it is doing.
Batching
The overhead in the file example above was around 100 ms (plus a negligible time per frame). If we grab an individual frame on each read, we can achieve 1 frame per 100 ms, or 10 frames/second throughput.
But we are targeting 100 frames/second, so instead we grab 10 frames in each read, which gives us 10 frames per 100 ms, or 100 frames/second throughput. What we give up is latency: instead of loading a frame every 10 ms, we get 10 frames every 100 ms, which means some frames are delayed through the system.
(and yes this is exactly like a manufacturing production flow)
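To make the arithmetic concrete, here is a quick sketch using the numbers above (the 100 ms fixed overhead from the file example, with per-frame time treated as negligible):

```python
OVERHEAD = 0.100  # fixed cost of each read in seconds; per-frame time ignored

for batch in (1, 10, 100):
    throughput = batch / OVERHEAD  # frames/second the read loop can sustain
    print(f"batch of {batch:>3}: {throughput:>6.0f} frames/s, "
          f"delivered as {batch} frames every {OVERHEAD * 1000:.0f} ms")
```

A batch of 1 gives the original 10 frames/second; a batch of 10 hits the 100 frames/second target, at the cost of frames arriving in 100 ms bursts.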
Can you take the Latency Hit?
Whether latency is important depends on your application.
If you are using the data to make real-time decisions, then latency becomes very important. This is the case for the first customer, so we will be carefully testing different batch sizes to keep the throughput high enough without introducing too much delay.
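Some of that testing can be done on paper first. Here is a sketch of the search, again with the file example's 100 ms overhead and an assumed (not measured) small per-frame cost - we want the smallest batch that still meets the throughput target, because any larger batch only adds latency:

```python
OVERHEAD = 0.100    # fixed read cost in seconds (file example)
PER_FRAME = 0.0001  # assumed per-frame read cost in seconds
TARGET = 100        # required frames/second

# The smallest batch that sustains the target rate gives the
# lowest latency that still meets the throughput requirement.
for batch in range(1, 1001):
    rate = batch / (OVERHEAD + batch * PER_FRAME)
    if rate >= TARGET:
        print(f"smallest viable batch: {batch} frames ({rate:.0f} frames/s)")
        break
```

With a genuinely non-zero per-frame cost, the viable batch comes out slightly larger than the idealised calculation above - exactly the kind of thing real testing shakes out.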
The GPU customer is sending this data to a user interface and to data storage, so we can tolerate larger delays.
A 100 ms delay doesn't mean much going to a user interface. But if you need 10-second batches, the interface updates will look juddery (think video buffering when the internet was younger).
For storage we can tolerate much higher latency - the key concern is how much data might be lost if there is a system failure. The latency represents a quantity of buffered data that could be lost in the case of a power failure, for example.
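Putting rough numbers on that risk is simple arithmetic. Taking the GPU customer's frame rate and the 10-second batch window mentioned above as illustrative values:

```python
frame_rate = 1000    # frames/second captured (from the GPU example)
batch_window = 10.0  # seconds of data buffered before it reaches storage

# Anything still in the buffer is lost if the power goes out.
frames_at_risk = int(frame_rate * batch_window)
print(f"up to {frames_at_risk} frames lost on failure")  # 10,000 frames
```

That number is what you weigh against the throughput gains when choosing a batch size for the storage path.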