Here's the follow-up to the previous post, in which we show the performance results of building a complete State History for very large CTF traces, but this time with bigger block sizes, namely 1 and 4 MB. The default of 32 KB led to a small degradation in performance (nothing dramatic, but worse than what one would expect from a logarithmic curve). Without any further drumrolls, here are the results:

[Charts, 32 KB vs 1 MB vs 4 MB blocks: construction time, full queries (log X), single queries (log X), history file size, State Tree depth]


Curiously, using bigger blocks increases the time taken just to build the history. If anything, this shows that we are indeed bottlenecked by the writing-to-disk part, which is usual for complete histories. Could this be a side effect of the BTRFS filesystem?
At smaller scales, varying the block size didn't affect construction times much, as we can see in this older chart (that older test was run on ext4, however).

For the query tests, going to 1 MB blocks did help the larger traces, for both types of queries. Going to 4 MB improved single queries slightly further, but decreased the performance of full queries overall. It seems 1 MB is a better size for state histories that end up over ~20 GB (with my current settings, that means traces larger than about 10 GB).
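
To make that tuning knob concrete, here is a small hypothetical sketch of how the block size could be chosen when setting up the history. The class and constant names are illustrative only, not the project's actual API; the threshold just encodes the observation above.

```java
/* Hypothetical configuration holder; names and defaults are illustrative. */
class HistoryTreeConfig {
    static final int KB = 1024;
    static final int MB = 1024 * KB;

    final int blockSize;     // on-disk size of each tree node

    HistoryTreeConfig(int blockSize) {
        this.blockSize = blockSize;
    }

    /*
     * Pick a block size from the expected history size, following the
     * results above: 32 KB is fine for small histories, 1 MB seems to be
     * the sweet spot once the history file grows past ~20 GB.
     */
    static HistoryTreeConfig forExpectedHistorySize(long expectedBytes) {
        long threshold = 20L * 1024 * MB;   // ~20 GB
        return new HistoryTreeConfig(expectedBytes > threshold ? 1 * MB : 32 * KB);
    }
}
```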


I've also added the sizes of the history files, as well as the resulting tree's depth. The file size stays roughly the same (1 MB and 4 MB are practically identical), despite taking a bit more time to be written. As usual, the event handler used here results in histories almost twice as big as the original trace. The state history's content is similar to the state information saved in LTTV, PLUS one set of uncompressed statistics (number of events of each type, so one extra state change per event). As we've seen before, a partial history is a really good way of saving on history space, without too big an impact on performance.
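
Just to illustrate where that roughly-2x ratio comes from, here is a back-of-the-envelope estimate. All the numbers below are assumptions made up for the example (event count, bytes per interval), not measurements; the only thing it encodes is the "two intervals per event" idea from the paragraph above.

```java
class HistorySizeEstimate {
    /* One interval for the LTTV-like state change, one for the statistics counter. */
    static long estimateHistoryBytes(long nbEvents, int avgIntervalBytes) {
        long intervalsPerEvent = 2;
        return nbEvents * intervalsPerEvent * avgIntervalBytes;
    }

    public static void main(String[] args) {
        // Hypothetical figures: a 10 GB trace, ~100 M events, ~100 bytes per interval.
        long traceBytes = 10L * 1024 * 1024 * 1024;
        long nbEvents = 100_000_000L;
        long historyBytes = estimateHistoryBytes(nbEvents, 100);
        System.out.printf("history / trace ratio: %.1f%n",
                (double) historyBytes / traceBytes);   // prints ~1.9
    }
}
```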

The tree depth chart is completely reassuring, since going to bigger blocks does reduce the chance of parent nodes getting filled before their children. The two points not shown on the 32K curve are at 144 and 614, respectively. It's interesting to note that the 11 GB trace seems to have some peculiar content, probably a big process dying in the middle or something similar, which would end a lot of states at the same time. (I try to keep the machine as idle as possible when taking traces or running tests. However, since it's doubling as the apt proxy for the whole lab, there might be some disturbances sometimes...) In any case, moving to 4 MB blocks completely absorbed this irregularity.
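
For readers unfamiliar with how the tree fills up, here is a deliberately simplified toy model of the part that matters for depth. It is not the real implementation: it ignores sibling creation and node time ranges, and only keeps the rule that an interval is stored in the deepest node of the current branch that still has room, bubbling up to the parent when a block is full, with a new level added only when the whole branch is full. Bigger blocks simply make that bubbling-up, and hence the extra levels, rarer.

```java
import java.util.ArrayList;
import java.util.List;

/* One node of the toy tree: only a byte counter against the block size. */
class ToyNode {
    int usedBytes = 0;
}

/* Toy "latest branch" of a history tree: root first, current leaf last. */
class ToyHistoryTree {
    private final int blockSize;
    private final List<ToyNode> branch = new ArrayList<>();

    ToyHistoryTree(int blockSize) {
        this.blockSize = blockSize;
        branch.add(new ToyNode());   // the initial root is also the initial leaf
    }

    /* Insert a closed interval of the given serialized size. */
    void insert(int intervalBytes) {
        // Try the deepest node first, then walk up towards the root.
        for (int i = branch.size() - 1; i >= 0; i--) {
            ToyNode node = branch.get(i);
            if (node.usedBytes + intervalBytes <= blockSize) {
                node.usedBytes += intervalBytes;
                return;
            }
        }
        // The whole branch is full: add a new root, the tree gets one level deeper.
        ToyNode newRoot = new ToyNode();
        newRoot.usedBytes = intervalBytes;
        branch.add(0, newRoot);
    }

    int depth() {
        return branch.size();
    }
}
```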


I think this completes the "scalability testing" part. While the State History's performance scales a bit worse at very big sizes (it doesn't follow a perfect logarithmic curve), it is still quite reasonable: ~150 ms to rebuild the complete state from a state history tree of almost 1 terabyte. Blink, and it's done! ;)

The non-perfect scaling is probably unavoidable, since we benefit less and less from the swap and filesystem caches as we move to bigger sizes. Another factor might be that, in my tests, I pick 200 random locations in the history and run queries on them; as we move to bigger histories, there's less and less chance for those locations to be close enough to one another to benefit from the cache. Note that in a real-life scenario, users would normally examine one particular area of the trace at a time, and would therefore benefit from the cache a bit more.
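
For reference, here is roughly what such a query benchmark looks like. The StateHistorySystem interface and the method names below are stand-ins I made up for the example, not the actual API; the point is simply the 200-random-timestamps loop described above.

```java
import java.util.Random;

/* Stand-in interface: these names are made up for the example, not the real API. */
interface StateHistorySystem {
    long getStartTime();
    long getEndTime();
    void queryFullState(long timestamp);   // rebuild the complete state at 'timestamp'
}

class QueryBenchmark {
    /* e.g. run(history, 200, 42L) to reproduce the "200 random locations" test */
    static void run(StateHistorySystem history, int nbQueries, long seed) {
        Random rnd = new Random(seed);
        long start = history.getStartTime();
        long range = history.getEndTime() - start;
        long totalNanos = 0;

        for (int i = 0; i < nbQueries; i++) {
            // Pick a random location in the trace's time range.
            long t = start + (long) (rnd.nextDouble() * range);

            long before = System.nanoTime();
            history.queryFullState(t);
            totalNanos += System.nanoTime() - before;
        }
        System.out.printf("Average full query: %.2f ms%n",
                totalNanos / (nbQueries * 1_000_000.0));
    }
}
```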