ahr
829fddf88c
merge key and value arrays
...
we have several hundred thousand of those MiniMaps and this reduces
the memory requirement by 8 bytes per instance
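
The log does not show the MiniMap class itself, so the following is only a sketch of the general idea: keys and values share one interleaved array, so each instance holds a single array reference instead of two. The class name, the linear lookup, and the non-null-key assumption are all hypothetical.

```java
import java.util.Arrays;

/** Hypothetical small map that interleaves keys and values in one array. */
final class InterleavedMiniMap {
    // Layout: [key0, value0, key1, value1, ...]. Keys are assumed non-null.
    private Object[] entries = new Object[0];

    Object get(Object key) {
        for (int i = 0; i < entries.length; i += 2) {
            if (entries[i].equals(key)) {
                return entries[i + 1];
            }
        }
        return null; // key not present
    }

    void put(Object key, Object value) {
        for (int i = 0; i < entries.length; i += 2) {
            if (entries[i].equals(key)) {
                entries[i + 1] = value; // overwrite the existing value
                return;
            }
        }
        // Grow by exactly one key/value pair so no slots stay unused.
        entries = Arrays.copyOf(entries, entries.length + 2);
        entries[entries.length - 2] = key;
        entries[entries.length - 1] = value;
    }

    int size() {
        return entries.length / 2;
    }
}
```

A linear scan is only reasonable here because such maps are assumed to hold a handful of entries each.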
2018-03-09 08:40:12 +01:00
ahr
7e5b762c0d
pre-compute firstByteMaxValue
...
this operation is executed very often during ingestion
2018-03-09 08:38:58 +01:00
ahr
5a9aae70af
handle corrupt json
...
Entries must be separated by a newline. This allows
us to handle corrupt json entries, because we know
that entries only start at a line beginning.
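
A minimal sketch of this recovery strategy, assuming one entry per line. The class name and the parser callback are hypothetical stand-ins; the project's real JSON parsing code is not shown in this log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class LineEntryReader {
    /**
     * Parses each line independently. A line the parser rejects is skipped,
     * and reading resumes at the start of the next line.
     */
    static <T> List<T> readEntries(String input, Function<String, T> parser) {
        List<T> entries = new ArrayList<>();
        for (String line : input.split("\n")) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            try {
                entries.add(parser.apply(line));
            } catch (RuntimeException e) {
                // Corrupt entry: drop it. The next line starts a fresh entry.
            }
        }
        return entries;
    }
}
```

Because an entry can never span a line break, a parse failure costs at most one entry.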
2018-03-03 09:58:50 +01:00
ahr
9d4eb660a5
update gradle and spring
...
gradle to 4.6
spring to 1.5.10.RELEASE
2018-03-03 08:34:38 +01:00
ahr
6b60fd542c
add percentile plots
2018-03-03 08:19:26 +01:00
b4a0514267
draw points and percentile lines with the same color
2018-01-21 14:09:34 +01:00
bb7701e7c4
use enum for line type instead of string
2018-01-21 11:01:30 +01:00
8f15aba0d5
replace individual percentile aggregates with a single one for all
2018-01-21 10:54:13 +01:00
b439c9d79a
update third-party libs
...
antlr4: 4.7 -> 4.7.1
commons-lang3: 3.6 -> 3.7
2018-01-21 08:44:30 +01:00
469dc20411
add gradle plugin: gradle-versions-plugin
2018-01-21 08:43:45 +01:00
9f45eb24ca
add trace logging for creation of new writer
2018-01-21 08:36:40 +01:00
ahr
740cb1cb2d
print metrics every 10 seconds, not every 10.001 seconds
2018-01-14 09:52:08 +01:00
ahr
d98c45e8bd
add index for tags-to-documents
...
Now we can find the writer much faster, because we don't have to execute
a query for documents that match the tags. We can just look up the
documents in the map.
Speedup: 2-4ms -> 0.002-0.01ms
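
A sketch of what such an index could look like, assuming tags are modeled as a string-to-string map and documents are identified by a path. Both choices, and the class name, are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TagsToDocsIndex {
    // Key: the complete tag set; value: documents carrying exactly those tags.
    private final Map<Map<String, String>, List<String>> docsByTags = new HashMap<>();

    void add(Map<String, String> tags, String docPath) {
        // Copy the key so later changes by the caller cannot corrupt the index.
        docsByTags.computeIfAbsent(new HashMap<>(tags), k -> new ArrayList<>())
                  .add(docPath);
    }

    List<String> lookup(Map<String, String> tags) {
        // A single hash lookup replaces a query over all documents.
        return docsByTags.getOrDefault(tags, Collections.emptyList());
    }
}
```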
2018-01-14 09:51:37 +01:00
ahr
64613ce43c
add metric logging for getWriter
2018-01-13 10:32:03 +01:00
ahr
0f2fcc3c9c
extract long_to_string converter
2018-01-06 08:40:58 +01:00
ahr
c5c7c03c66
add example logger
2017-12-30 10:09:19 +01:00
ahr
bcc30f0f3f
add trace logging and make set of proposals synchronized
...
I checked if computing the proposals with a parallel stream would be
beneficial. Turns out the stream uses several threads, but the overall
computation is not faster, because each individual computation is
slower.
2017-12-30 10:08:54 +01:00
ahr
3cc512f73d
update third party libs
...
testng 6.11 -> 6.13.1
jackson-databind 2.9.1 -> 2.9.3
guava 23.0 -> 23.6-jre
2017-12-30 10:06:57 +01:00
ahr
fc30ffd928
sort IntLists in DataStore
...
The IntLists were no longer sorted since we made the initialization run
in parallel. Therefore a much slower implementation for
intersection/union was used.
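
To illustrate why sortedness matters: two ascending lists can be intersected in a single linear merge pass. This is a generic sketch on plain int arrays, not the primitiveCollections implementation.

```java
import java.util.Arrays;

final class SortedIntersection {
    /** Intersects two ascending, duplicate-free arrays in O(a.length + b.length). */
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0;
        int j = 0;
        int n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++; // a[i] cannot occur in the rest of b
            } else if (a[i] > b[j]) {
                j++; // b[j] cannot occur in the rest of a
            } else {
                out[n++] = a[i]; // common element
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

On unsorted input this merge gives wrong results, so an implementation has to fall back to a slower strategy there.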
2017-12-30 09:45:50 +01:00
ahr
5617547d63
add percentile plots
2017-12-30 09:15:26 +01:00
ahr
9b9554552d
Extract interface for DataSeries
...
This will make it possible to have DataSeries that do not require a csv
file on disk.
2017-12-29 09:15:29 +01:00
ahr
cc70f45c12
add different plot types
...
Step 1:
Added PlotType enum and a drop down to the UI.
Extracted the code for scatter plots.
2017-12-29 08:57:34 +01:00
ahr
2df66c7b2f
update primitiveCollections
...
This fixes a performance issue where the IntLists were not sorted and
therefore slow union/intersection algorithms were chosen.
2017-12-29 08:20:52 +01:00
ahr
e060e9761d
cleanup
2017-12-23 10:06:52 +01:00
ahr
8037212145
synchronize docIdToDoc list
...
When we parallelized the initialization we forgot to
synchronize the docIdToDoc list.
Luckily, there is a high probability that queries return
results that are obviously wrong.
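
A sketch of this kind of fix, assuming a parallel stream fills the list. The class and method names are hypothetical, and integers stand in for the Doc objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

final class ParallelCollect {
    static List<Integer> collectDocIds(int count) {
        // The wrapper serializes add() calls. A plain ArrayList may lose
        // elements or throw when several threads write to it at once.
        List<Integer> docIds = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, count).parallel().forEach(i -> docIds.add(i));
        return docIds;
    }
}
```

The resulting order is not deterministic, so anything that relies on sorted content has to sort afterwards.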
2017-12-23 10:06:45 +01:00
ahr
888d25f7ea
trim docIdToDoc list
...
This reduces memory usage by 1 or 2 MB.
Up to 33% of an ArrayList's backing array can be unused. If the list is 1 million entries long,
then the list wastes 2.6 MB.
The Doc objects in the list are much bigger.
2017-12-23 09:42:08 +01:00
ahr
e59caa0f02
parallelize initialization of DataStore
...
When the files are already in the OS cache, the initialization time
for 750k files went down from 35 seconds to 15 seconds.
2017-12-23 08:58:42 +01:00
ahr
a6251074cf
add trace logging to ExpressionToDocIdVisitor
2017-12-20 11:14:41 +01:00
ahr
6509391059
sometimes plots are missing
...
The csv generation is running in parallel, but the
list that collects the results was not synchronized.
2017-12-16 19:22:56 +01:00
ahr
cafaa7343c
remove obsolete method
2017-12-16 19:20:38 +01:00
ahr
fd1479760a
use same log format for console and file
2017-12-16 19:20:24 +01:00
ahr
a359652f8b
log stdout/stderr of the gnuplot process
2017-12-16 19:19:35 +01:00
ahr
04b029e1be
add trace logging
2017-12-16 19:19:12 +01:00
ahr
6ef4e7a96b
reduce memory footprint of index by trimming IntLists
...
Reduced the memory usage of the IntLists in the index by 4.1MB (19.9MB
to 15.8MB) for 683,390 files and 4,046,250 values in the IntLists.
2017-12-16 17:57:15 +01:00
ahr
8225dd2077
update primitiveCollections to 0.1.20171216143737
...
Use intersection and union methods from IntList.
2017-12-16 17:35:16 +01:00
ahr
a2512b210f
update to gradle 4.4
2017-12-16 17:33:45 +01:00
ahr
d63fabc85d
prevent parallel plot requests
...
Plotting can take a long time and use a lot of resources.
Multiple plot requests can cause the machine to run OOM.
We are now allowing plots for 500k files again. The limit mainly exists to
prevent unwanted plots of everything.
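
One way to allow only a single plot at a time is a one-permit semaphore. The log does not say whether a second request waits or is rejected; this sketch assumes rejection, and all names are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class SinglePlotGuard {
    private final Semaphore permit = new Semaphore(1);

    /** Runs the plot task if none is running; otherwise fails fast. */
    <T> T runExclusively(Supplier<T> plotTask) {
        if (!permit.tryAcquire()) {
            throw new IllegalStateException("another plot request is already running");
        }
        try {
            return plotTask.get();
        } finally {
            permit.release(); // always free the permit, even if plotting fails
        }
    }
}
```

Replacing tryAcquire() with acquire() would queue requests instead of rejecting them.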
2017-12-15 17:20:12 +01:00
ahr
8d48726472
remove unnecessary mapping to TagSpecificBaseDir
2017-12-15 16:52:20 +01:00
ahr
eb1f026c2f
update spring-boot to 1.5.9
2017-12-11 08:28:21 +01:00
ahr
8860a048ff
remove call of listRecursively on a file
...
The call was needed in a very early version.
2017-12-10 17:55:16 +01:00
ahr
c4dce942a6
parallelize csv generation
...
speedup of 50% or more
2017-12-10 17:53:53 +01:00
ahr
3ee6336125
log time of query execution
2017-12-10 17:52:32 +01:00
ahr
f17bc55a8f
hide prev/next image buttons when splitBy is not active
2017-12-10 17:28:29 +01:00
ahr
f2dfa92966
add refresh button
2017-12-10 17:21:59 +01:00
ahr
8e3213e2fc
split by field
2017-12-10 17:00:45 +01:00
ahr
84084c3e08
remove logo
2017-12-10 09:34:10 +01:00
ahr
159c5ff371
write logs to logfile
2017-12-10 09:22:49 +01:00
ahr
06d25e7ceb
do not allow search results with more than 100k docs
...
a) they take a long time to compute
b) danger of OOM
c) users should drill down instead
2017-12-10 09:19:28 +01:00
a6a2236d18
do not compute counts when proposing all keys
2017-11-18 13:03:45 +01:00
14d1367c4a
remove duplicate enums
2017-11-18 12:30:45 +01:00