Commit Graph

41 Commits

Author SHA1 Message Date
07ad62ddd9 use JUnit 5 instead of TestNG
We want to be able to use @SpringBootTest tests that fully initialize
the Spring application. This is much easier with JUnit than with TestNG.
Gradle does not support running JUnit and TestNG tests side by side
(at least not easily), so we switch all tests to JUnit.
The original reason for using TestNG was that JUnit did not support
data providers. That finally changed in JUnit 5 with
ParameterizedTest.
2019-12-13 14:33:20 +01:00
85679ca0c8 send CSV file via REST 2019-12-08 18:39:43 +01:00
06b379494f apply new code formatter and save action 2019-11-24 10:20:43 +01:00
0e9e2cd53a remove dependency to Guava 2019-09-01 15:44:36 +02:00
3252fcf42d improve trace logging
- Add filename for trace logs for read/write operations.
2019-08-18 09:25:49 +02:00
0b3eb97b96 Fix toString for maps with values of type Empty
The MAX_KEY inserted into the tree had a value of one byte. This
triggered an assertion for maps with values of type Empty, because they
expected values to be empty.
Fixed by using an empty array for the value of the MAX_KEY.
2019-08-12 08:35:40 +02:00
95f2f26966 handle IOExceptions earlier 2019-03-17 11:13:46 +01:00
cbcb7714bb split BSFile into a TimeSeries and a LongStream file
BSFile was used to store two types of data, which made
the API complex. I split the API into two files with
simpler and clearer APIs. Interestingly, the API of
BSFile is still rather complex and has to consider both
use cases.
2019-02-10 09:59:16 +01:00
2e48061793 add LRU cache to PersistentMap
This should speed up fetching and inserting of values
that are used often.
2019-02-02 17:26:25 +01:00
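The LRU caching mentioned in commit 2e48061793 can be illustrated with a minimal sketch built on java.util.LinkedHashMap in access order. The class name LruCache and its capacity parameter are assumptions made for this example; the commit does not show how the project's cache is actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache for illustration; not the project's actual class. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: entries are ordered from least to most recently accessed
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after every put; evicts the least recently used entry
        // as soon as the cache grows beyond its capacity.
        return size() > capacity;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was used more recently. Note that this sketch is not thread-safe.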
76e5d441de rewrite query completion
The old implementation searched for all possible values and then
executed each query to see what matches.
The new implementation uses several indices to find only
the matching values.
2019-02-02 15:35:56 +01:00
72e9a9ebe3 prepare more efficient query completion
Adds an index that answers the question:
given a query "a=b and c=", what are the possible values
for c?
2019-01-13 10:22:17 +01:00
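The kind of index described in commit 72e9a9ebe3 might look roughly like the following in-memory sketch. The class CompletionIndexSketch, its methods, and the representation of tags as a Map of field to value are all assumptions made for illustration; the real index is not shown in the commit and may be organized quite differently.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical completion index; names and layout are illustrative only. */
final class CompletionIndexSketch {
    // known tag "a=b" -> other field "c" -> values of c seen together with a=b
    private final Map<String, Map<String, SortedSet<String>>> index = new HashMap<>();

    /** Registers one set of tags, e.g. {a=b, c=1}. */
    void add(Map<String, String> tags) {
        for (Map.Entry<String, String> known : tags.entrySet()) {
            String knownTag = known.getKey() + "=" + known.getValue();
            for (Map.Entry<String, String> other : tags.entrySet()) {
                if (other.getKey().equals(known.getKey())) {
                    continue; // a field does not complete itself
                }
                index.computeIfAbsent(knownTag, k -> new HashMap<>())
                        .computeIfAbsent(other.getKey(), k -> new TreeSet<>())
                        .add(other.getValue());
            }
        }
    }

    /** Answers: given "knownField=knownValue and fieldToComplete=", which values are possible? */
    SortedSet<String> complete(String knownField, String knownValue, String fieldToComplete) {
        Map<String, SortedSet<String>> byField =
                index.getOrDefault(knownField + "=" + knownValue, Collections.emptyMap());
        return byField.getOrDefault(fieldToComplete, Collections.emptySortedSet());
    }
}
```

The lookup touches only the values recorded for the known tag, so nothing has to be executed per candidate value.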
3dca7483de utility that generates a csv with many different tags 2019-01-05 08:33:57 +01:00
593752470c cleanup 2018-11-25 07:46:58 +01:00
5404253bc6 use TreeMap in PersistentMapDiskNode instead of list 2018-11-24 15:57:05 +01:00
d67e452a91 cache disk blocks in an LRU cache
Improves read access by a factor of 4 for small trees.
2018-11-24 15:07:37 +01:00
218ea9ed68 use custom date parser
A specialized date parser that can only handle ISO-8601 like dates
(2011-12-03T10:15:30.123Z or 2011-12-03T10:15:30+01:00) but does this
roughly 10 times faster than DateTimeFormatter and 5 times
faster than the FastDateParser of commons-lang3.
2018-11-19 19:23:57 +01:00
6f48a25d53 do not force changes to disk
diskBlock.force() makes insertion very slow, because it adds a
two-digit number of milliseconds to each tree change. I disabled it
for now. The tree is not crash-resistant anyway.
2018-11-19 19:22:27 +01:00
135ab42cd8 tags are now stored as variable length byte sequences of longs
Replaced Tags.filenameBytes with a SortedSet<Tag>. Tags are now
stored as longs (variable length encoded) in the PersistentMap.
Tags.filenameBytes was introduced to reduce memory consumption when
all tags were held in memory. Tags are now stored in a PersistentMap
and only read when needed.

Moved the VariableByteEncoder into its own project, because it was
needed by pdb-api.
2018-11-17 20:03:46 +01:00
b2107acf4e synchronize access to the PersistentMap
The map is not (yet) thread-safe. Eventually we'll replace the
synchronized blocks with read/write locks on the nodes.
2018-11-17 10:02:29 +01:00
fce0f6a04d use PersistentMap in DataStore
Replaces the use of in-memory data structures with the PersistentMap.
This is the crucial step in reducing memory usage for both persistent
storage and main memory.
2018-11-17 09:45:35 +01:00
3ccf526608 PersistentMap now requires only a path instead of a DiskStorage
This makes the PersistentMap easier to use.
2018-11-10 10:08:21 +01:00
e90506c1b0 add visitor that finds all values by a prefix of the key 2018-11-10 09:48:36 +01:00
807257d330 remove the unused node visitor 2018-11-04 10:44:05 +01:00
008f0db377 add generics to PersistentMap 2018-11-04 10:42:05 +01:00
f2d5c27668 insertion of many values into the persistent map 2018-11-04 10:11:10 +01:00
c6782df0e5 the root node can have more than two children if it is an inner node
It is not yet possible to split inner nodes or the root node.
2018-10-27 10:17:45 +02:00
8b48b8c3e7 add a pointer to the root node
Previously, the offset of the root node was hard-coded.
Now the offset of the pointer to the root node is hard-coded.
That allows us to replace the root node.
2018-10-27 08:55:15 +02:00
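The indirection described in commit 8b48b8c3e7 can be sketched as follows. The class RootPointerSketch, the pointer position 0, and the use of a ByteBuffer as storage are assumptions for illustration only; the commit does not state the actual offset or storage API.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a fixed-position pointer that refers to a movable root node. */
final class RootPointerSketch {
    /** Hard-coded position of the pointer itself (assumed value). */
    static final int ROOT_POINTER_OFFSET = 0;

    private final ByteBuffer storage;

    RootPointerSketch(ByteBuffer storage) {
        this.storage = storage;
    }

    /** Reads the current offset of the root node through the pointer. */
    long rootNodeOffset() {
        return storage.getLong(ROOT_POINTER_OFFSET);
    }

    /** Replaces the root node by redirecting the pointer; the node data itself is not moved. */
    void replaceRoot(long newRootNodeOffset) {
        storage.putLong(ROOT_POINTER_OFFSET, newRootNodeOffset);
    }
}
```

Because only the pointer has a fixed position, a new root node can be written anywhere in the file and made current with a single 8-byte update.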
8bb98deb1e PersistentMap can store data in multiple nodes 2018-10-26 18:35:32 +02:00
bb4514c940 insert values into root node 2018-10-14 19:53:02 +02:00
3855d03ead BSFile uses a wrapper for DiskBlock to add BSFile specific stuff
This keeps the DiskBlock class clean, so that it can be used
for PersistentMap.
2018-10-14 17:13:33 +02:00
c83b6e11e2 Add first part of a persistent map implementation. 2018-10-14 16:47:17 +02:00
bd88c63aff ensure BSFiles use blocks that are aligned to 512-byte offsets 2018-10-14 09:00:26 +02:00
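Aligning to 512-byte offsets, as in commit bd88c63aff, comes down to rounding a byte offset up to the next multiple of the block size. The helper below is a sketch with hypothetical names (BlockAlignmentSketch, alignUp, isAligned); it is not taken from the project.

```java
/** Hypothetical helper that rounds byte offsets up to the next 512-byte boundary. */
final class BlockAlignmentSketch {
    static final long BLOCK_SIZE = 512;

    /** Returns the smallest multiple of BLOCK_SIZE that is greater than or equal to offset. */
    static long alignUp(long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must not be negative: " + offset);
        }
        return (offset + BLOCK_SIZE - 1) / BLOCK_SIZE * BLOCK_SIZE;
    }

    /** Returns true if offset already lies on a block boundary. */
    static boolean isAligned(long offset) {
        return offset % BLOCK_SIZE == 0;
    }
}
```

For example, alignUp(513) yields 1024, while an offset of 512 is already aligned and is returned unchanged.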
a2520c0238 move method only used in tests to the tests 2018-10-13 20:03:02 +02:00
b42fec8fe2 use var keyword 2018-10-13 10:14:52 +02:00
b42bb88dff DiskStorage can allocate and free blocks of arbitrary sizes 2018-10-13 10:03:41 +02:00
0539080200 use byte offsets instead of block numbers
We want to allow arbitrary allocations in DiskStorage. The
first step was to change the hard-coded block size into a
dynamic one.
2018-10-12 08:10:43 +02:00
e03fccbdf7 support for negative values in variable byte encoding
We now support negative values. This will allow us to
store time/value sequences that are not monotonically
increasing, so that we do not have to create multiple
files just because some values were sent out of order.

This is done by first transforming the values into
positive values by using interleaved encoding (there
is a name for it, but I don't remember it). We are
mapping values like this:
 0 -> 1
 1 -> 2
-1 -> 3
 2 -> 4
-2 -> 5
...

Renamed LongSequenceEncoderDecoder to VariableByteEncoder.
Made methods static.
2018-09-29 19:48:57 +02:00
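The mapping listed in commit e03fccbdf7 (0 -> 1, 1 -> 2, -1 -> 3, 2 -> 4, -2 -> 5) can be reproduced with the sketch below. It is similar in spirit to the zigzag encoding used by Protocol Buffers, although the exact mapping differs. The class and method names are hypothetical, and the sketch covers only the sign mapping, not the variable byte encoding itself or the actual VariableByteEncoder code.

```java
/** Hypothetical sketch of the interleaved mapping from the commit message. */
final class SignedMappingSketch {

    /**
     * Maps a signed value to a positive one: 0 -> 1, 1 -> 2, -1 -> 3, 2 -> 4, -2 -> 5, ...
     * Positive values become even numbers; zero and negative values become odd numbers.
     * Overflows for values whose magnitude exceeds Long.MAX_VALUE / 2.
     */
    static long encode(long value) {
        return value > 0 ? 2 * value : -2 * value + 1;
    }

    /** Inverse of encode: even numbers decode to positive values, odd numbers to zero or negative values. */
    static long decode(long encoded) {
        if (encoded < 1) {
            throw new IllegalArgumentException("encoded value must be positive: " + encoded);
        }
        return (encoded & 1) == 0 ? encoded / 2 : -(encoded - 1) / 2;
    }
}
```

Small magnitudes map to small positive numbers regardless of sign, which keeps them short under a variable byte encoding.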
84350c4dfb move TimeStampDeltaDecoder to BSFile
Now the encoding and decoding code is in the same class.
2018-09-13 13:08:45 +02:00
1182d76205 replace the FolderStorage with DiskStorage
- The DiskStorage uses only one file instead of millions.
  Also, the block size is only 512 bytes instead of 4 KB, which
  helps to reduce the memory usage for short sequences.
- Update primitiveCollections to get the new LongList.range
  and LongList.rangeClosed methods.
- BSFile now stores Time&Value sequences and knows how to
  encode the time values with delta encoding.
- Doc had to do some magic tricks to save memory. The path
  was initialized lazily and stored as a byte array. This is no
  longer necessary. The path was replaced by the
  rootBlockNumber of the BSFile.
- Had to temporarily disable the 'in' queries.
- The stored values are now processed as a stream of LongLists
  instead of Entry. The overhead for creating Entries is
  gone, so is the memory overhead, because Entry was an
  object and had a reference to the tags, which is
  unnecessary.
2018-09-12 09:35:07 +02:00
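The delta encoding of time values mentioned in commit 1182d76205 can be illustrated with a minimal sketch. The class DeltaEncodingSketch and the use of plain long arrays are assumptions for this example; BSFile's actual on-disk format is not described in the commit.

```java
/** Hypothetical sketch of delta encoding for a sequence of timestamps. */
final class DeltaEncodingSketch {

    /** Stores the first timestamp as-is and every later one as the difference to its predecessor. */
    static long[] encode(long[] timestamps) {
        long[] deltas = new long[timestamps.length];
        long previous = 0;
        for (int i = 0; i < timestamps.length; i++) {
            deltas[i] = timestamps[i] - previous;
            previous = timestamps[i];
        }
        return deltas;
    }

    /** Restores the original timestamps by summing up the deltas. */
    static long[] decode(long[] deltas) {
        long[] timestamps = new long[deltas.length];
        long current = 0;
        for (int i = 0; i < deltas.length; i++) {
            current += deltas[i];
            timestamps[i] = current;
        }
        return timestamps;
    }
}
```

Encoding 1000, 1005, 1007, 1007 gives 1000, 5, 2, 0. Timestamps that arrive out of order produce negative deltas, which a storage format has to be able to represent.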
26dc052b95 cleanup 2018-08-26 09:40:38 +02:00
b7ebb8ce6a new implementation of an integer storage
It can store multiple streams of integers in a single
file. It uses blocks of 512 bytes, which is only 1/8th
of the block size the file-based data store uses. This
reduces the overhead and waste of memory for short
integer streams significantly. Storing data in one big
file, instead of many small files, makes backups much
more efficient.
2018-08-26 09:37:56 +02:00