A study of lookup strategies for digital forensic databases, published last month, concluded that flat hash maps (fhmap) have considerable advantages over Hashdatabase (hashdb) and hierarchical Bloom filter trees (hbft) in terms of runtime performance and applicability. Yet the other lookup strategies still have their place in the digital forensics toolbox.

Hashdb performed particularly poorly in single-core scenarios, but it is still the only one of the three strategies that offers full parallelization, transactional features, and single-level storage. Hbft has problems with extensibility and maintenance, but it still delivers efficient lookups.

Here is a link to the ScienceDirect article:

We discussed and evaluated three different implementations of artifact lookup strategies in the context of digital forensics. We proposed several extensions so that we could perform a comprehensive performance evaluation of hashdb, hbft, and fhmap. We introduced concepts to handle multihits for hbft and fhmap by implementing deduplication and filtration features. Moreover, we interfaced fhmap with a rolling-hash-based extraction of chunks. For a better comparison to hashdb, we additionally parallelized the extraction of chunks.
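To make the paragraph above more concrete, here is a minimal Python sketch of rolling-hash chunk extraction feeding a hash-map lookup with multihit filtering. It is not the paper's implementation: a plain `dict` stands in for fhmap, and the window size, boundary mask, hash parameters, and the function names `chunk_boundaries`, `build_db`, and `lookup` are all assumptions chosen for illustration. It also treats a "multihit" as a chunk that belongs to more than one known file, which is one plausible reading of the term.

```python
import hashlib
from collections import defaultdict

# All parameters below are illustrative assumptions, not values from the paper.
WINDOW = 16            # size of the rolling window in bytes
MASK = (1 << 6) - 1    # cut a chunk when the low 6 bits of the hash are zero
BASE = 257             # polynomial base of the rolling hash
MOD = (1 << 31) - 1    # modulus of the rolling hash


def chunk_boundaries(data: bytes):
    """Yield (start, end) spans of content-defined chunks."""
    pow_out = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    h = 0
    start = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Remove the contribution of the byte that slides out of the window.
            h = (h - data[i - WINDOW] * pow_out) % MOD
        h = (h * BASE + byte) % MOD
        # Enforce a minimum chunk length of WINDOW bytes before cutting.
        if i + 1 - start >= WINDOW and (h & MASK) == 0:
            yield start, i + 1
            start = i + 1
    if start < len(data):
        yield start, len(data)  # trailing chunk without a boundary match


def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


def build_db(files: dict) -> dict:
    """Map each chunk digest to the set of file names containing it.

    Using a set stores every (chunk, file) pair once (deduplication).
    """
    db = defaultdict(set)
    for name, data in files.items():
        for s, e in chunk_boundaries(data):
            db[digest(data[s:e])].add(name)
    return db


def lookup(db: dict, data: bytes, filter_multihits: bool = True) -> dict:
    """Count matching chunks per known file for the given input."""
    hits = defaultdict(int)
    for s, e in chunk_boundaries(data):
        owners = db.get(digest(data[s:e]))
        if not owners:
            continue
        if filter_multihits and len(owners) > 1:
            continue  # chunk is shared by several files, so it is ambiguous
        for owner in owners:
            hits[owner] += 1
    return dict(hits)
```

With `filter_multihits=True`, chunks shared by several known files no longer contribute to any file's hit count, so only chunks that identify exactly one file remain. A real tool would likely also bound the maximum chunk size and use a faster rolling hash.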

Results show that fhmap outperforms hbft in most of the considered performance evaluations. While hbfts are faster than hashdb in nearly all evaluations, the concept introduces false positives through the Bloom filters it uses. Even though hbfts have small advantages in terms of memory and storage efficiency, their complexity, fixed parametrization, and limited scope of features make such an advantage negligible. However, specific use cases with tight memory constraints could still make hbfts valuable.
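The false-positive trade-off mentioned above can be illustrated with a single Bloom filter. The sketch below is a generic textbook construction, not the hbft code from the study, and it does not model the tree hierarchy. The class name `BloomFilter`, the helper `false_positive_rate`, and the double-hashing scheme derived from SHA-256 are assumptions made for this example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        d = hashlib.sha256(item).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1  # keep the step odd and non-zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def false_positive_rate(bloom: BloomFilter, exact: set, probes: list) -> float:
    """Fraction of probes that are not in `exact` but are reported by `bloom`."""
    negatives = [p for p in probes if p not in exact]
    if not negatives:
        return 0.0
    return sum(1 for p in negatives if p in bloom) / len(negatives)
```

Because the filter size and the number of hash functions are fixed at construction time, adding more entries than planned only raises the false-positive rate. An exact hash map avoids the problem at the cost of storing every key in full, which matches the memory advantage the authors attribute to hbft.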

Discussions of hashdb's performance should take into account the underlying concept of single-level stores. When the discussion shifts to the features offered and to long-term use with ongoing maintenance, hashdb and fhmap are more suitable. Notably, hashdb is the only implementation that can deal with databases that do not fit into main memory. In addition, it supports transactional features.