7 Commits

SHA1 Message Date
06b379494f apply new code formatter and save action 2019-11-24 10:20:43 +01:00
4367323fcd replace deprecated dependency configurations
Using api and implementation instead of the
deprecated compile configuration.

Update to Gradle 6.0.
2019-11-10 11:08:50 +01:00
57ad6a1cee update SpringBoot to 2.1.9
Also remove direct dependencies on log4j-api and log4j-core where
possible. log4j-slf4j-impl is enough in many cases.
2019-10-04 20:15:09 +02:00
cc0157fe0b update java 3rd-party libs 2018-11-20 19:13:59 +01:00
b06ccb0d00 update 3rd party libs
spring boot to 2.0.1
guava to 24.1-jre
jackson to 2.9.5
log4j2 to 2.10.0 (same version as pulled by spring boot)
testng to 6.14.3
2018-04-21 20:01:39 +02:00
347f1fdc74 update 3rd-party libraries 2017-09-23 18:24:51 +02:00
ac1ee20046 replace ludb with data-store
LuDB has a few disadvantages:
  1. Disk space, most notably. H2 wastes a lot of valuable disk space.
     For my test data set with 44 million entries, it takes 14 MB
     (sometimes a lot more, depending on H2's internal cleanup). With
     data-store it is 15 KB.
     Overall, I could reduce the disk space from 231 MB to 200 MB (13.4 %
     in this example). That is an average of 4.6 bytes per entry.
  2. Speed:
     a) Liquibase is slow. The first time, it takes approx. three seconds.
     b) Query and insertion: with data-store we can insert entries
        up to 1.6 times faster.

Data-store uses a few tricks to save disk space:
  1. We encode the tags into the file names.
  2. To keep them short, we translate the key and value of each tag into
     short numbers. For example, "foo" -> 12 and "bar" -> 47, so the
     tag "foo"/"bar" becomes 12/47.
     We then write each number in base 62 (digits a-zA-Z0-9), so it
     can be used in file names and is shorter.
     That way we only have to store the mapping of string to int.
  3. We store that mapping in a simple tab-separated file.
2017-04-16 09:07:28 +02:00
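
The base 62 step described in the data-store commit above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the class name TagNameSketch, the method names, and the digit order (a-z, then A-Z, then 0-9) are assumptions; the commit message only states which characters are used.

```java
/** Hypothetical sketch of base 62 encoding; not the actual data-store code. */
final class TagNameSketch {

    // Assumed digit order: a-z (0..25), A-Z (26..51), 0-9 (52..61).
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final int BASE = ALPHABET.length(); // 62

    private TagNameSketch() {
    }

    /** Encodes a non-negative number as a base 62 string. */
    static String encode(long number) {
        if (number < 0) {
            throw new IllegalArgumentException("number must not be negative");
        }
        if (number == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }
        final StringBuilder result = new StringBuilder();
        long remaining = number;
        while (remaining > 0) {
            // Least significant digit first; the result is reversed at the end.
            result.append(ALPHABET.charAt((int) (remaining % BASE)));
            remaining /= BASE;
        }
        return result.reverse().toString();
    }

    /** Decodes a string produced by {@link #encode(long)}. */
    static long decode(String encoded) {
        long result = 0;
        for (int i = 0; i < encoded.length(); i++) {
            final int digit = ALPHABET.indexOf(encoded.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("not a base 62 digit: " + encoded.charAt(i));
            }
            result = result * BASE + digit;
        }
        return result;
    }
}
```

Under these assumptions, the numbers from the example, 12 and 47, each fit into a single character, and decode(encode(n)) returns n for any non-negative n. How the encoded key and value are combined into a file name is not shown here, because the commit message does not specify it.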