The writerCache in DataStore did not use the partitionId
in its cache key. Therefore the cache could return the
wrong writer and events were written to the wrong
partition.
Fixed by changing the cache key.
To make it easier/possible to write stable unit test the CSV upload
can optionally wait until all entries have been flushed to disk.
This is necessary for tests that ingest data and then read the data.
We want to be able to use @SpringBootTest tests that fully initialize
the Spring application. This is much easier done with Junit than TestNG.
Gradle does not support (at least not easily) to run Junit and TestNG
tests. Therefore we switch to Junit with all tests.
The original reason for using TestNG was that Junit didn't support
data providers. But that finally changed in Junit5 with
ParameterizedTest.
Recommind is using a pseudo ISO date format for their
log files. It uses a comma instead of a dot for the
second to milli second separator and it does not
add a timezone. Dates without timezone are assumed to be UTC.
Refactoring to prepare the addition of CSV parsing rules. The parsing
rules will contain information about which columns to ingest or ignore.
This will be used to add the ability to upload files via HTTP post in
addition to the TcpIngestor.
We are now reading the CSV input without transforming
the data into strings. This reduces the amount of bytes
that have to be converted and copied.
We also made Tag smaller. It no longer stores pointers
to strings, instead it stored integers obtained by
compressing the strings (see StringCompressor). This
reduces memory usage and it speeds up hashcode and
equals, which speeds up access to the writer cache.
Performance gain is almost 100%:
- 330k entries/s -> 670k entries/s, top speed measured over a second
- 62s -> 32s, to ingest 16 million entries
Compared to FastISODateParser.parse, which returns an
OffsetDateTime object, parseAsEpochMilli returns the
epoch time millis. The performance improvement for
date parsing alone is roughly 100% (8m dates/s to
18m dates/s).
Insertion speed improved from 13-14s for 1.6m entries
to 11.5-12.5s.
A specialized date parser that can only handle ISO-8601 like dates
(2011-12-03T10:15:30.123Z or 2011-12-03T10:15:30+01:00) but does this
roughly 10 times faster than DateTimeFormatter and 5 times
faster than the FastDateParser of commons-lang3.
- split the 'sortby' select field into two fields
- sort by average
- legend shows plotted and total values in the date range
- removed InlineDataSeries, because it was not used anymore