Commit Graph

383 Commits

SHA1 Message Date
37207d67ab use utf-8 as resource encoding 2018-11-25 07:29:29 +00:00
593752470c cleanup 2018-11-25 07:46:58 +01:00
5404253bc6 use TreeMap in PersistentMapDiskNode instead of list 2018-11-24 15:57:05 +01:00
d67e452a91 cache disk blocks in an LRU cache
Improves read access by a factor of 4 for small trees.
2018-11-24 15:07:37 +01:00
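A minimal sketch of an LRU cache for disk blocks, assuming blocks are keyed by their byte offset. The class name BlockLruCache and the use of LinkedHashMap are illustrative assumptions; the commit does not say how the cache is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache keyed by block offset; not the project's actual class. */
final class BlockLruCache<V> extends LinkedHashMap<Long, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    BlockLruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Long, V> eldest) {
        // called after each insertion; returning true drops the least recently used entry
        return size() > capacity;
    }
}
```

LinkedHashMap is not thread-safe, so a cache like this would need external synchronization if several threads read blocks.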
9889252205 use only one thread for evictions
Instead of spawning a new thread for every cache, we use a single thread
that will evict entries from all caches.
The thread keeps a weak reference to the caches, so that they can be
garbage collected.
2018-11-24 08:32:05 +01:00
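The idea of one eviction thread holding weak references to all caches could look roughly like this. Evictable, SharedEvictor and the polling interval are hypothetical names and choices made for this sketch; the real classes are not shown in the log.

```java
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical callback a cache exposes to the shared evictor. */
interface Evictable {
    void evictExpired();
}

/** Sketch: one daemon thread serves all registered caches. */
final class SharedEvictor {
    // weak references, so a cache that is no longer used elsewhere can be collected
    private final List<WeakReference<Evictable>> caches = new CopyOnWriteArrayList<>();

    void register(Evictable cache) {
        caches.add(new WeakReference<>(cache));
    }

    /** One pass over all caches; returns how many live caches were visited. */
    int runOnce() {
        int visited = 0;
        for (WeakReference<Evictable> ref : caches) {
            Evictable cache = ref.get();
            if (cache == null) {
                // cache was garbage collected; the iterator is a snapshot, so removal is safe
                caches.remove(ref);
            } else {
                cache.evictExpired();
                visited++;
            }
        }
        return visited;
    }

    Thread start(long intervalMillis) {
        Thread thread = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    runOnce();
                    Thread.sleep(intervalMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "cache-evictor");
        thread.setDaemon(true); // do not keep the JVM alive just for evictions
        thread.start();
        return thread;
    }
}
```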
64771417e4 only iterate over elements when at least one element can be evicted 2018-11-23 07:23:38 +01:00
f78f69328b add cache for docId to Doc mapping
A Doc does not change once it is created, so it is easy to cache.
Lookup time went from 1ms per Doc to 3ms for 444 Docs (0.00675ms/Doc).
2018-11-22 19:51:07 +01:00
6c546bd5b3 update primitiveCollections
The new version comes with an improved removeAll method that is O(n+m)
on sorted lists.
2018-11-21 18:55:54 +01:00
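An O(n+m) removeAll on sorted input can be done with a single merge-style pass. The sketch below works on plain long arrays and is only an illustration of the technique; it is not the primitiveCollections API, and SortedRemoveAll is a made-up name.

```java
import java.util.Arrays;

/** Sketch of a linear-time removeAll; both arrays must be sorted ascending. */
final class SortedRemoveAll {
    static long[] removeAll(long[] values, long[] toRemove) {
        long[] result = new long[values.length];
        int size = 0;
        int j = 0; // cursor into toRemove, only ever moves forward
        for (long v : values) {
            // skip removal candidates that are smaller than the current value
            while (j < toRemove.length && toRemove[j] < v) {
                j++;
            }
            // keep v unless the cursor points at an equal element
            if (j == toRemove.length || toRemove[j] != v) {
                result[size++] = v;
            }
        }
        return Arrays.copyOf(result, size);
    }
}
```

Each cursor advances at most n or m times, which gives the O(n+m) bound.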
cc0157fe0b update java 3rd-party libs 2018-11-20 19:13:59 +01:00
218ea9ed68 use custom date parser
A specialized date parser that can only handle ISO-8601 like dates
(2011-12-03T10:15:30.123Z or 2011-12-03T10:15:30+01:00) but does this
roughly 10 times faster than DateTimeFormatter and 5 times
faster than the FastDateParser of commons-lang3.
2018-11-19 19:23:57 +01:00
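A specialized parser of this kind can read digits at fixed positions instead of interpreting a pattern. The following is a sketch under the assumption that only the two layouts named in the commit occur (optional three-digit milliseconds, then Z or a +hh:mm/-hh:mm offset). FastIsoDateSketch is a hypothetical name, separators are not validated, and no performance claim is made for this code.

```java
import java.time.LocalDate;

/** Sketch of a fixed-position ISO-8601 parser returning epoch milliseconds. */
final class FastIsoDateSketch {
    static long parseEpochMilli(String s) {
        int year = digits(s, 0, 4);
        int month = digits(s, 5, 2);
        int day = digits(s, 8, 2);
        int hour = digits(s, 11, 2);
        int minute = digits(s, 14, 2);
        int second = digits(s, 17, 2);

        int pos = 19;
        int millis = 0;
        if (s.charAt(pos) == '.') { // optional ".SSS"
            millis = digits(s, pos + 1, 3);
            pos += 4;
        }

        long offsetSeconds = 0;
        char zone = s.charAt(pos);
        if (zone != 'Z') { // "+hh:mm" or "-hh:mm"
            int sign = zone == '-' ? -1 : 1;
            offsetSeconds = sign * (digits(s, pos + 1, 2) * 3600L + digits(s, pos + 4, 2) * 60L);
        }

        long epochDay = LocalDate.of(year, month, day).toEpochDay();
        long localSeconds = epochDay * 86400L + hour * 3600L + minute * 60L + second;
        // local time minus the offset gives UTC
        return (localSeconds - offsetSeconds) * 1000L + millis;
    }

    private static int digits(String s, int start, int count) {
        int value = 0;
        for (int i = start; i < start + count; i++) {
            int d = s.charAt(i) - '0';
            if (d < 0 || d > 9) {
                throw new IllegalArgumentException("not a digit at index " + i + ": " + s);
            }
            value = value * 10 + d;
        }
        return value;
    }
}
```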
6f48a25d53 do not force changes to disk
diskBlock.force() makes insertion very slow, because it adds a
two-digit number of milliseconds to each tree change. I disabled it for
now. The tree is not crash resistant anyway.
2018-11-19 19:22:27 +01:00
afd1e36066 fix unsupported operation exception when adding to an unmodifiable set 2018-11-19 19:19:51 +01:00
135ab42cd8 tags are now stored as variable length byte sequences of longs
Replaced Tags.filenameBytes with a SortedSet<Tag>. Tags are now
stored as longs (variable length encoded) in the PersistentMap.
Tags.filenameBytes was introduced to reduce memory consumption when
all tags were held in memory. Tags are now stored in a PersistentMap
and only read when needed.

Moved the VariableByteEncoder into its own project, because it was
needed by pdb-api.
2018-11-17 20:03:46 +01:00
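For readers unfamiliar with variable length encoding of longs, here is a sketch of one common scheme: 7 payload bits per byte, with the high bit marking that more bytes follow. The byte layout used by the project's VariableByteEncoder is not shown in the log and may differ; VarByteSketch is a hypothetical class.

```java
import java.io.ByteArrayOutputStream;

/** Sketch of a 7-bit variable length encoding for non-negative longs. */
final class VarByteSketch {
    static byte[] encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("this sketch handles non-negative values only");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (value >= 0x80) {
            // low 7 bits plus continuation bit
            out.write((int) (value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write((int) value); // last byte has the continuation bit cleared
        return out.toByteArray();
    }

    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // no continuation bit: value is complete
            }
            shift += 7;
        }
        return result;
    }
}
```

Small values take a single byte, which is why ids stored as longs usually need far fewer than eight bytes each.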
b2107acf4e synchronize access to the PersistentMap
The map is not (yet) thread-safe. Eventually we'll replace the
synchronized blocks with read/write locks on the nodes.
2018-11-17 10:02:29 +01:00
fce0f6a04d use PersistentMap in DataStore
Replaces the use of in-memory data structures with the PersistentMap.
This is the crucial step in reducing memory usage for both persistent
storage and main memory.
2018-11-17 09:45:35 +01:00
3ccf526608 PersistentMap now requires only a path instead of a DiskStorage
This makes the PersistentMap easier to use.
2018-11-10 10:08:21 +01:00
e90506c1b0 add visitor that finds all values by a prefix of the key 2018-11-10 09:48:36 +01:00
807257d330 remove the unused node visitor 2018-11-04 10:44:05 +01:00
008f0db377 add generics to PersistentMap 2018-11-04 10:42:05 +01:00
f2d5c27668 insertion of many values into the persistent map 2018-11-04 10:11:10 +01:00
c6782df0e5 the root node can have more than two children if it is an inner node
It is not yet possible to split inner nodes or the root node.
2018-10-27 10:17:45 +02:00
8b48b8c3e7 add a pointer to the root node
Previously, the offset of the root node was hard-coded.
Now the offset of the pointer to the root node is hard-coded.
That allows us to replace the root node.
2018-10-27 08:55:15 +02:00
8bb98deb1e PersistentMap can store data in multiple nodes 2018-10-26 18:35:32 +02:00
bb4514c940 insert values into root node 2018-10-14 19:53:02 +02:00
3855d03ead BSFile uses a wrapper for DiskBlock to add BSFile specific stuff
This keeps the DiskBlock class clean, so that it can be used
for PersistentMap.
2018-10-14 17:13:33 +02:00
c83b6e11e2 Add first part of a persistent map implementation. 2018-10-14 16:47:17 +02:00
bd88c63aff ensure BSFiles use blocks that are aligned to 512 Byte offsets 2018-10-14 09:00:26 +02:00
a2520c0238 move method only used in tests to the tests 2018-10-13 20:03:02 +02:00
b42fec8fe2 use var keyword 2018-10-13 10:14:52 +02:00
b42bb88dff DiskStorage can allocate and free blocks of arbitrary sizes 2018-10-13 10:03:41 +02:00
0539080200 use byte offsets instead of block numbers
We want to allow arbitrary allocations in DiskStorage. The
first step was to change the hard coded block size into a
dynamic one.
2018-10-12 08:10:43 +02:00
eaa234bfa5 rename put to putEntries
The method name put is used so often that Eclipse has a
hard time finding references.
2018-10-11 19:25:01 +02:00
979e001efd TcpIngestor can handle csv files 2018-10-11 18:56:16 +02:00
6d4e3da672 add test for sending entries with negative values to the ingestor 2018-10-07 09:08:25 +02:00
c2ba395015 remove date.js
All references to date.js were replaced with moment.js.
2018-10-04 19:02:06 +02:00
979d3269fa remove obsolete classes and methods 2018-10-04 18:46:51 +02:00
8939332004 remove the wrapper class PdbDB
It did not serve any purpose and could be replaced by DataStore.
2018-10-04 18:43:27 +02:00
01b93e32ca replace EhCache with a custom implementation
The cache must remove/evict writers after a few seconds, but EhCache
only evicts entries when a new entry is added. That is not acceptable
for us, because that would leave lots of files open and we would need
a second mechanism to close them.
Therefore I wrote a simple wrapper for a ConcurrentHashMap that evicts
entries after timeToLive+5s.
2018-10-03 20:22:45 +02:00
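A wrapper around ConcurrentHashMap with time-based eviction might be sketched as follows. Everything here is an assumption made for illustration: the name ExpiringCache, measuring the time to live from the last access, the injectable clock, and the onEvict callback (which would close a writer). The "+5s" in the commit message would then correspond to how often evictExpired() is called, which is also a guess.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

/** Sketch of a cache that evicts idle entries even when nothing new is added. */
final class ExpiringCache<K, V> {
    private static final class Slot<V> {
        final V value;
        volatile long lastAccessMillis;

        Slot(V value, long now) {
            this.value = value;
            this.lastAccessMillis = now;
        }
    }

    private final Map<K, Slot<V>> map = new ConcurrentHashMap<>();
    private final long timeToLiveMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private final Consumer<V> onEvict;  // e.g. close the evicted writer

    ExpiringCache(long timeToLiveMillis, LongSupplier clock, Consumer<V> onEvict) {
        this.timeToLiveMillis = timeToLiveMillis;
        this.clock = clock;
        this.onEvict = onEvict;
    }

    void put(K key, V value) {
        map.put(key, new Slot<>(value, clock.getAsLong()));
    }

    V get(K key) {
        Slot<V> slot = map.get(key);
        if (slot == null) {
            return null;
        }
        slot.lastAccessMillis = clock.getAsLong(); // refresh on access
        return slot.value;
    }

    /** Meant to be called periodically by a background thread. */
    void evictExpired() {
        long now = clock.getAsLong();
        for (Map.Entry<K, Slot<V>> entry : map.entrySet()) {
            Slot<V> slot = entry.getValue();
            // remove(key, value) only succeeds if the slot was not replaced meanwhile
            if (now - slot.lastAccessMillis >= timeToLiveMillis && map.remove(entry.getKey(), slot)) {
                onEvict.accept(slot.value);
            }
        }
    }

    int size() {
        return map.size();
    }
}
```

A get() that races with evictExpired() can still return a value that is evicted a moment later, so a real implementation would need to decide how to handle that case.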
0e5a47ac10 make sure serialized tags are always sorted the same way 2018-10-03 16:50:09 +02:00
c9dcc77b53 reuse existing PdbFiles 2018-10-03 16:49:46 +02:00
60578b45ec PdbWriters are now closed by the cache TagsToFile
We do not have to close the files when the input streams are idle.
2018-10-03 16:47:29 +02:00
ad630fc6b2 simplify caching in TagsToFile
- PdbFiles no longer require dates to be monotonically
  increasing. Therefore TagsToFile does not have to ensure
  this. => We only have one file per Tags.
- Use EhCache instead of HashMap.
2018-09-30 10:38:25 +02:00
d799682b4d Fix build issue with Java 11.
For some reason the Gradle build with Java 11 failed
because of an inner class. After extracting it the build
no longer fails.
2018-09-29 19:50:05 +02:00
e03fccbdf7 support for negative values in variable byte encoding
We now support negative values. This will allow us to
store time/value sequences that are not monotonically
increasing, so that we do not have to create multiple
files just because some values were sent out of order.

This is done by first transforming the values into
positive values by using interleaved encoding (there
is a name for it, but I don't remember it). We are
mapping values like this:
 0 -> 1
 1 -> 2
-1 -> 3
 2 -> 4
-2 -> 5
...

Renamed LongSequenceEncoderDecoder to VariableByteEncoder.
Made methods static.
2018-09-29 19:48:57 +02:00
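The mapping listed in this commit (0 -> 1, 1 -> 2, -1 -> 3, 2 -> 4, -2 -> 5) can be written as a pair of small functions. This is a sketch derived only from the table in the message, not the project's VariableByteEncoder code; SignedMapping is a hypothetical name. Schemes of this kind are commonly called zigzag encoding; the variant used by Protocol Buffers assigns the values differently (0 -> 0, -1 -> 1, 1 -> 2).

```java
/** Sketch of the signed-to-positive mapping described in the commit message. */
final class SignedMapping {
    /** Positive values map to even numbers, zero and negative values to odd numbers. */
    static long encode(long value) {
        // note: overflows for magnitudes above roughly Long.MAX_VALUE / 2
        return value > 0 ? 2 * value : -2 * value + 1;
    }

    static long decode(long encoded) {
        return (encoded & 1) == 0 ? encoded / 2 : -((encoded - 1) / 2);
    }
}
```

Because every result is at least 1, the encoded value can be fed into an encoder that only handles positive numbers.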
f07977c27a update java, gradle and third party libs 2018-09-29 09:08:29 +02:00
24fcfd7763 prepare the addition of a date index 2018-09-28 19:07:01 +02:00
1d88c8dfd7 update spring-boot to 2.0.5.RELEASE
update commons-lang3 to 3.8
2018-09-13 18:58:07 +02:00
bd54a8ad8d update gradle to 4.10.1 2018-09-13 18:47:48 +02:00
84350c4dfb move TimeStampDeltaDecoder to BSFile
Now the encoding and decoding code is in the same class.
2018-09-13 13:08:45 +02:00
861797acf7 zoom by mouse wheel 2018-09-13 09:26:43 +02:00