Commit Graph

572 Commits

Author SHA1 Message Date
ahr
9b9554552d Extract interface for DataSeries
This will make it possible to have DataSeries that do not require a csv
file on disk.
2017-12-29 09:15:29 +01:00
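A minimal sketch of what such an extraction could look like. The names `DataSeriesSketch`, `FileBackedSeries`, `InMemorySeries`, and the methods `title()` and `csvLines()` are all hypothetical; the commit message does not describe the actual interface.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical interface: callers only need a title and the csv rows.
interface DataSeriesSketch {
    String title();
    List<String> csvLines();
}

// Variant backed by a csv file on disk.
final class FileBackedSeries implements DataSeriesSketch {
    private final String title;
    private final Path csvFile;

    FileBackedSeries(String title, Path csvFile) {
        this.title = title;
        this.csvFile = csvFile;
    }

    @Override public String title() { return title; }

    @Override public List<String> csvLines() {
        try {
            return Files.readAllLines(csvFile, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Variant that keeps its rows in memory and never touches the disk.
final class InMemorySeries implements DataSeriesSketch {
    private final String title;
    private final List<String> lines;

    InMemorySeries(String title, List<String> lines) {
        this.title = title;
        this.lines = Collections.unmodifiableList(new ArrayList<>(lines));
    }

    @Override public String title() { return title; }

    @Override public List<String> csvLines() { return lines; }
}
```

Code that plots a series would depend only on the interface, so either variant can be passed in.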
ahr
cc70f45c12 add different plot types
Step 1: 
Added PlotType enum and a drop down to the UI.
Extracted the code for scatter plots.
2017-12-29 08:57:34 +01:00
ahr
2df66c7b2f update primitiveCollections
This fixes a performance issue where the IntLists were not sorted and
therefore slow union/intersection algorithms were chosen.
2017-12-29 08:20:52 +01:00
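Why sorting matters here can be shown with a small sketch. It assumes plain `int[]` arrays that are ascending and free of duplicates; the class `SortedIntSets` is hypothetical and is not the primitiveCollections API. With sorted input, both operations are a single linear merge pass.

```java
import java.util.Arrays;

final class SortedIntSets {
    // Intersection of two ascending, duplicate-free arrays in O(n + m).
    static int[] intersection(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Union of two ascending, duplicate-free arrays in O(n + m).
    static int[] union(int[] a, int[] b) {
        int[] out = new int[a.length + b.length];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                out[n++] = a[i++];
            } else if (a[i] > b[j]) {
                out[n++] = b[j++];
            } else {
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        while (i < a.length) out[n++] = a[i++];
        while (j < b.length) out[n++] = b[j++];
        return Arrays.copyOf(out, n);
    }
}
```

Without the sorted-input guarantee, an implementation has to fall back to something slower, such as sorting first or probing one list for every element of the other.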
ahr
e060e9761d cleanup 2017-12-23 10:06:52 +01:00
ahr
8037212145 synchronize docIdToDoc list
When we parallelized the initialization we forgot to
synchronize the docIdToDoc list.
Luckily, there is a high probability that queries return
results that are obviously wrong.
2017-12-23 10:06:45 +01:00
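One possible way to make such a shared list safe, as a sketch: wrap it with `Collections.synchronizedList` before worker threads append to it. The class and method names are hypothetical, and the integers stand in for the real Doc objects; the commit may have used a different mechanism.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

final class SynchronizedListSketch {
    // Adds 'count' entries from several threads without losing any.
    static List<Integer> loadInParallel(int count) {
        // A plain ArrayList here could drop entries or throw under concurrent add().
        List<Integer> docs = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, count).parallel().forEach(i -> docs.add(i));
        return docs;
    }
}
```

The wrapper only guarantees that no entry is lost; the insertion order is still nondeterministic.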
ahr
888d25f7ea trim docIdToDoc list
This reduces memory usage by 1 or 2 MB.
Up to 33% of an ArrayList's capacity can be unused. If the list is 1 million entries long,
then the list wastes 2.6 MB.
The Doc objects in the list are much bigger.
2017-12-23 09:42:08 +01:00
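A minimal sketch of the trimming step, assuming the list is a `java.util.ArrayList` that is no longer modified once loading is done. `TrimSketch` and the string entries are hypothetical stand-ins.

```java
import java.util.ArrayList;

final class TrimSketch {
    static ArrayList<String> loadAndTrim(int count) {
        ArrayList<String> docs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            docs.add("doc" + i); // the backing array grows in steps and may overshoot
        }
        // Shrinks the backing array to exactly size() elements.
        docs.trimToSize();
        return docs;
    }
}
```

Trimming only pays off for long-lived lists; the next `add()` after a trim forces the backing array to grow again.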
ahr
e59caa0f02 parallelize initialization of DataStore
When the files are already in the OS cache, then the initialization time
for 750k files went down from 35 seconds to 15 seconds.
2017-12-23 08:58:42 +01:00
ahr
a6251074cf add trace logging to ExpressionToDocIdVisitor 2017-12-20 11:14:41 +01:00
ahr
6509391059 sometimes plots are missing
The csv generation is running in parallel, but the 
list that collects the results was not synchronized.
2017-12-16 19:22:56 +01:00
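Besides synchronizing the list, this kind of race can also be avoided by not sharing a list at all and letting the stream collect the results. The sketch below shows that alternative; it is not necessarily what the commit did, and `generateAll` and its string mapping are hypothetical placeholders for the real csv generation.

```java
import java.util.List;
import java.util.stream.Collectors;

final class ParallelCollectSketch {
    static List<String> generateAll(List<String> seriesNames) {
        return seriesNames.parallelStream()
                .map(name -> name + ".csv")    // stand-in for the real csv generation
                .collect(Collectors.toList()); // thread-safe merge, keeps encounter order
    }
}
```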
ahr
cafaa7343c remove obsolete method 2017-12-16 19:20:38 +01:00
ahr
fd1479760a use same log format for console and file 2017-12-16 19:20:24 +01:00
ahr
a359652f8b log stdout/stderr of the gnuplot process 2017-12-16 19:19:35 +01:00
ahr
04b029e1be add trace logging 2017-12-16 19:19:12 +01:00
ahr
6ef4e7a96b reduce memory footprint of index by trimming IntLists
Reduced the memory usage of the IntLists in the index by 4.1MB (19.9MB
to 15.8MB) for 683,390 files and 4,046,250 values in the IntLists.
2017-12-16 17:57:15 +01:00
ahr
8225dd2077 update primitiveCollections to 0.1.20171216143737
Use intersection and union methods from IntList.
2017-12-16 17:35:16 +01:00
ahr
a2512b210f update to gradle 4.4 2017-12-16 17:33:45 +01:00
ahr
d63fabc85d prevent parallel plot requests
Plotting can take a long time and use a lot of resources. 
Multiple plot requests can cause the machine to run OOM.

We are now allowing plots for 500k files again. This is mainly to
prevent unwanted plots of everything.
2017-12-15 17:20:12 +01:00
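A sketch of one way to allow only a single plot at a time, assuming that a second request should be rejected rather than queued (the commit message does not say which). `SinglePlotGuard` and `runExclusively` are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class SinglePlotGuard {
    private final Semaphore permit = new Semaphore(1);

    // Runs the task if no other plot is in progress; otherwise fails fast.
    <T> T runExclusively(Supplier<T> plotTask) {
        if (!permit.tryAcquire()) {
            throw new IllegalStateException("another plot request is already running");
        }
        try {
            return plotTask.get();
        } finally {
            permit.release(); // always free the permit, even if plotting fails
        }
    }
}
```

A web controller could translate the `IllegalStateException` into an error response so that the user can retry later.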
ahr
8d48726472 remove unnecessary mapping to TagSpecificBaseDir 2017-12-15 16:52:20 +01:00
ahr
eb1f026c2f update spring-boot to 1.5.9 2017-12-11 08:28:21 +01:00
ahr
8860a048ff remove call of listRecursively on a file
The call was needed in a very early version.
2017-12-10 17:55:16 +01:00
ahr
c4dce942a6 parallelize csv generation
speedup 50% and more
2017-12-10 17:53:53 +01:00
ahr
3ee6336125 log time of query execution 2017-12-10 17:52:32 +01:00
ahr
f17bc55a8f hide prev/next image buttons when splitBy is not active 2017-12-10 17:28:29 +01:00
ahr
f2dfa92966 add refresh button 2017-12-10 17:21:59 +01:00
ahr
8e3213e2fc split by field 2017-12-10 17:00:45 +01:00
ahr
84084c3e08 remove logo 2017-12-10 09:34:10 +01:00
ahr
159c5ff371 write logs to logfile 2017-12-10 09:22:49 +01:00
ahr
06d25e7ceb do not allow search results with more than 100k docs
a) they take a long time to compute
b) danger of OOM
c) they should drill down
2017-12-10 09:19:28 +01:00
a6a2236d18 do not compute counts when proposing all keys 2017-11-18 13:03:45 +01:00
14d1367c4a remove duplicate enums 2017-11-18 12:30:45 +01:00
f2868fcc1b reduce memory footprint: old generation by 100 MB
This reduces the size of the old generation by 100MB (300MB down to
200MB). Unfortunately the total JVM size didn't change and is still
512MB.

Doc stores the path as byte array instead of Path.
2017-11-18 10:39:01 +01:00
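A sketch of the idea: keep only the encoded bytes and build the `Path` on demand. `CompactDoc` is a hypothetical stand-in for the real Doc class, and the UTF-8 encoding is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

final class CompactDoc {
    // The encoded path is the only long-lived state kept per document.
    private final byte[] pathBytes;

    CompactDoc(Path path) {
        this.pathBytes = path.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Rebuilds the Path on demand; the result is short-lived garbage.
    Path path() {
        return Paths.get(new String(pathBytes, StandardCharsets.UTF_8));
    }
}
```

The trade-off is a small amount of CPU time and short-lived allocations on every `path()` call in exchange for less long-lived heap.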
cc49a8cf2a open PdbReaders only when reading
We used to open all PdbReaders in a search result and then iterate over
them. This used a lot of heap space (> 8GB) for 400k files.
Now the PdbReaders are only opened while they are used. Heap usage was
less than 550 while reading more than 400k files.
2017-11-18 10:12:22 +01:00
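The pattern can be sketched as follows, using `BufferedReader` as a stand-in for PdbReader, whose API is not shown in this log. `LazyReaderSketch` and `countLines` are hypothetical. Each reader is opened right before use and closed right after, so at most one is open at any time.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class LazyReaderSketch {
    // Counts lines across all files while holding at most one reader open.
    static long countLines(List<Path> files) {
        long total = 0;
        for (Path file : files) {
            // try-with-resources closes the reader before the next file is opened
            try (BufferedReader reader = Files.newBufferedReader(file)) {
                while (reader.readLine() != null) {
                    total++;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return total;
    }
}
```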
a636f2b9bd update primitive collections to 0.1.20171007100354 2017-11-18 10:09:47 +01:00
0555691864 update gradle to 4.3.2 and spring boot to 1.5.8 2017-11-18 09:32:49 +01:00
995558588a add median and 90th percentile 2017-11-18 09:28:41 +01:00
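For reference, a sketch of how both values can be computed with the nearest-rank definition, where the median is the 50th percentile. Which definition the project actually uses is not stated; `PercentileSketch` is a hypothetical name.

```java
import java.util.Arrays;

final class PercentileSketch {
    // Nearest-rank percentile: the value at rank ceil(p / 100 * n) in sorted order.
    static long percentile(long[] values, int p) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        if (p < 1 || p > 100) {
            throw new IllegalArgumentException("p must be in 1..100");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        // Integer ceiling division avoids floating-point rounding surprises.
        int rank = (int) (((long) p * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }
}
```

With this definition, the median of an even number of values is the lower of the two middle values and not their average.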
ahr
f8c03c434e print thousand delimiter (or whatever they are called) 2017-11-06 17:21:45 +01:00
ahr
78671a2d8c use linespoints instead of line and make linewidth 2 instead of 1 2017-11-06 17:04:56 +01:00
ahr
64db4c48a2 add plots for percentiles 2017-11-06 16:57:22 +01:00
ahr
92dde94443 preparation to add plots for percentiles 2017-11-05 09:21:34 +01:00
ahr
870ff492d9 enable logging of metrics 2017-11-05 08:52:33 +01:00
ahr
27db9f934d increase entry buffer 2017-11-05 08:52:10 +01:00
11b3610971 make invaders better
add kill count
do not move all invaders at once
2017-10-01 19:08:59 +02:00
08f1961f51 replace spinner with a little game 2017-10-01 17:23:59 +02:00
386f211377 make it possible to draw the legend outside of the plot area 2017-09-30 17:51:33 +02:00
d4fd25dc4c replace LinkedHashMap with a more memory efficient implementation
This saves approximately 50MB of heap space.
2017-09-30 17:51:02 +02:00
7e00594382 add helper class that returns the size of objects 2017-09-30 17:49:21 +02:00
e0655f66fa skip invalid entries 2017-09-24 17:21:20 +02:00
a7cd918fc6 skip empty files 2017-09-24 17:12:17 +02:00
dc8262c37e replace deprecated API 2017-09-24 17:00:08 +02:00
4b53baacae add scrollbar for proposals, again 2017-09-24 13:32:51 +02:00