Commit Graph

102 Commits

Author SHA1 Message Date
979d3269fa remove obsolete classes and methods 2018-10-04 18:46:51 +02:00
8939332004 remove the wrapper class PdbDB
It did not serve any purpose and could be replaced by DataStore.
2018-10-04 18:43:27 +02:00
f07977c27a update java, gradle and third party libs 2018-09-29 09:08:29 +02:00
24fcfd7763 prepare the addition of a date index 2018-09-28 19:07:01 +02:00
1d88c8dfd7 update spring-boot to 2.0.5.RELEASE
update commons-lang3 to 3.8
2018-09-13 18:58:07 +02:00
a2e63cca44 cleanup 2018-09-13 08:11:15 +02:00
c6a1291ee6 the pattern must match the property value exactly
when matching property values to the query. This is important when
you have a property value that is a prefix of another property value,
e.g., AuditService.logEvent and AuditService.logEvents.
2018-09-13 07:55:13 +02:00
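Note on the commit above: the prefix problem can be illustrated with java.util.regex. This is only a minimal sketch. The class and method names are hypothetical and the project's actual matching code is not shown in this log. It contrasts a match anchored only at the start (Matcher.lookingAt) with a match over the whole value (Matcher.matches).

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the project's matching code.
final class ExactValueMatch {

    // Prefix-style matching: true if the pattern matches at the start of the value,
    // even when the value continues afterwards.
    static boolean matchesPrefix(String queryValue, String propertyValue) {
        return Pattern.compile(Pattern.quote(queryValue))
                .matcher(propertyValue)
                .lookingAt();
    }

    // Exact matching: the pattern has to cover the entire property value.
    static boolean matchesExactly(String queryValue, String propertyValue) {
        return Pattern.compile(Pattern.quote(queryValue))
                .matcher(propertyValue)
                .matches();
    }
}
```

With the query value AuditService.logEvent, matchesPrefix also accepts AuditService.logEvents, while matchesExactly rejects it.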
61f131571a add CamelCase matching to the query language 2018-09-12 13:42:23 +02:00
86b8f93752 replace 'in' queries with a simpler syntax
field in (val1, val2)
was replaced with
field=val1,val2
or
field=(val1, val2)
2018-09-12 10:10:01 +02:00
1182d76205 replace the FolderStorage with DiskStorage
- The DiskStorage uses only one file instead of millions.
  Also the block size is only 512 bytes instead of 4 KB, which
  helps to reduce the memory usage for short sequences.
- Update primitiveCollections to get the new LongList.range
  and LongList.rangeClosed methods.
- BSFile now stores Time&Value sequences and knows how to
  encode the time values with delta encoding.
- Doc had to do some magic tricks to save memory. The path
  was initialized lazily and stored as a byte array. This is no
  longer necessary. The path was replaced by the
  rootBlockNumber of the BSFile.
- Had to temporarily disable the 'in' queries.
- The stored values are now processed as a stream of LongLists
  instead of Entry. The overhead for creating Entries is
  gone, so is the memory overhead, because Entry was an
  object and had a reference to the tags, which is
  unnecessary.
2018-09-12 09:35:07 +02:00
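Note on the commit above: delta encoding stores each timestamp as the difference to its predecessor, so monotonically increasing time values become small numbers. The following is a minimal sketch on plain long arrays. The class name is hypothetical, and the real on-disk format of BSFile (for example any variable-length byte encoding) is not described in this log.

```java
// Hypothetical sketch of delta encoding; BSFile's real format is not shown in this log.
final class DeltaEncodingSketch {

    // Replaces each timestamp by its difference to the previous one.
    // The first entry is kept as is (its delta to 0).
    static long[] encode(long[] timestamps) {
        long[] deltas = new long[timestamps.length];
        long previous = 0;
        for (int i = 0; i < timestamps.length; i++) {
            deltas[i] = timestamps[i] - previous;
            previous = timestamps[i];
        }
        return deltas;
    }

    // Restores the original timestamps by summing up the deltas.
    static long[] decode(long[] deltas) {
        long[] timestamps = new long[deltas.length];
        long current = 0;
        for (int i = 0; i < deltas.length; i++) {
            current += deltas[i];
            timestamps[i] = current;
        }
        return timestamps;
    }
}
```

For the timestamps 1000, 1005, 1007, 1020 the encoded form is 1000, 5, 2, 13. Small deltas are what make a compact byte representation possible afterwards.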
ea5e16fad5 expressions now support in-queries 2018-08-18 10:31:49 +02:00
b01d267300 update primitiveCollections
The new version of primitiveCollections requires Java 10.
2018-08-18 08:32:27 +02:00
99dbf31d8a update 3rd party libs 2018-08-09 07:20:09 +02:00
182d1edd97 add a datetime picker
Unfortunately the datetime picker does not support seconds. But it is
one of the few that support date and time and are flexible enough to
be used with VueJS.
2018-08-04 08:32:04 +00:00
daaa0e6907 update dependencies
gradle to 4.8
jackson to 2.9.6
spring-boot to 2.0.3
guava to 25.1-jre
gradle-versions-plugin to 0.19.0
2018-06-17 08:59:48 +02:00
b61a34a0e6 use existing RandomAccessFile when updating the listing file
Ingestion speed dropped drastically with the old implementation.
In some situations it fell to 7 entries per second over a 10-second period
(sic!). When using the already opened RandomAccessFile the speed
is back to previous values of 40k-50k entries per second on my 10-year-old
machine with an encrypted spinning disk.
2018-05-10 17:41:50 +02:00
911062e26b use RandomAccessFile in FolderStorage.getPathByOffset()
The old implementation opened a new buffered reader every time
getPathByOffset was called. This took 1/20th of a second or
longer. For queries that visited thousands of files this could
take a long time.
We are now using a RandomAccessFile that is opened once. The
average time spent in getPathByOffset is now down to 0.11ms.
2018-05-10 10:22:25 +02:00
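Note on the commit above: the idea of keeping one RandomAccessFile open and seeking to a known offset can be sketched as follows. The class name ListingFileReaderSketch and the method readLineAt are hypothetical, and the sketch assumes that each path occupies one line of an ASCII listing file. The real FolderStorage.getPathByOffset() is not shown in this log.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Hypothetical sketch; not the project's FolderStorage implementation.
final class ListingFileReaderSketch implements AutoCloseable {

    // Opened once and reused for every lookup.
    private final RandomAccessFile file;

    ListingFileReaderSketch(Path listingFile) throws IOException {
        this.file = new RandomAccessFile(listingFile.toFile(), "r");
    }

    // Jumps to the byte offset and reads the line that starts there.
    // Synchronized because seek and readLine share the file pointer.
    // Note: RandomAccessFile.readLine maps each byte to one char,
    // so this sketch is only safe for ASCII content.
    synchronized String readLineAt(long offset) throws IOException {
        file.seek(offset);
        return file.readLine();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Compared to opening a new reader per call, the lookup avoids the cost of opening the file and of skipping everything before the offset.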
82b8a8a932 reduce memory footprint by lazily initializing the path in Doc
The path in Doc is now optional. This reduces memory consumption,
because we only have to store a long (the offset in the listing file).
This assumes that only a small percentage of Docs is requested.
2018-05-06 12:58:10 +02:00
e3102c01d4 use listing.csv instead of iterating through all folders
The hope is that reading a single file is faster than listing
hundreds of folders.
2018-05-05 10:46:16 +02:00
6d85c56cb0 range definitions for the y-axis
Sometimes it is useful to specify a certain y-axis range. For example
when you are only interested in the values that take longer than a
threshold. Or when you want to exclude some outliers. When you want to
compare plots in a gallery, it is very handy when all plots have the
same data-area.
2018-05-01 10:18:06 +02:00
b06ccb0d00 update 3rd party libs
spring boot to 2.0.1
guava to 24.1-jre
jackson to 2.9.5
log4j2 to 2.10.0 (same version as pulled by spring boot)
testng to 6.14.3
2018-04-21 20:01:39 +02:00
57938d5269 do not check if we can find values when proposing keys
Counting the available values is quite expensive and there are only a
few corner cases where this makes sense. One of them is when the query
is for a method that is not project specific and therefore no project
values can be found.
2018-04-14 10:38:00 +02:00
23e16ff61d ignore null values in tags 2018-04-14 09:58:51 +02:00
1755562a84 do not move the cursor to the end when applying a proposal 2018-04-08 14:06:13 +02:00
68ee88bce0 rewrite autocomplete in vue.js 2018-04-08 08:44:28 +02:00
22c99f8517 fix null pointer exception
Filenames were generated without '$', but the parsing code expected
the '$'.
2018-03-28 19:34:48 +02:00
de0f8412bd show proposals for empty terminals 2018-03-25 19:17:49 +02:00
5343c0d427 reduce memory usage
Reduce memory usage by storing the filename as string instead of
individual tags.
2018-03-19 19:21:57 +01:00
b439c9d79a update third-party libs
antlr4: 4.7 -> 4.7.1
commons-lang3: 3.6 -> 3.7
2018-01-21 08:44:30 +01:00
ahr
d98c45e8bd add index for tags-to-documents
Now we can find the writer much faster, because we don't have to execute
a query for documents that match the tags. We can just look up the
documents in the map.
Speedup: 2-4ms -> 0.002-0.01ms
2018-01-14 09:51:37 +01:00
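Note on the commit above: a tags-to-documents index can be as simple as a hash map keyed by the complete tag set. The sketch below is an assumption about the general shape only. The class TagsIndexSketch, its methods, and the use of Map<String, String> for tags are hypothetical, and the project's own types are not shown in this log.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a tags-to-document index; D stands for the document type.
final class TagsIndexSketch<D> {

    // Key: the full set of tags, value: the document written with exactly these tags.
    private final Map<Map<String, String>, D> index = new HashMap<>();

    // Stores an immutable copy of the tags so later changes by the caller
    // cannot corrupt the key. Map.copyOf requires Java 10 and rejects nulls.
    void put(Map<String, String> tags, D document) {
        index.put(Map.copyOf(tags), document);
    }

    // A single hash lookup replaces the query over all documents.
    Optional<D> find(Map<String, String> tags) {
        return Optional.ofNullable(index.get(tags));
    }
}
```

The lookup cost depends only on the number of tags in the key, not on the number of documents.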
ahr
bcc30f0f3f add trace logging and make set of proposals synchronized
I checked if computing the proposals with a parallel stream would be
beneficial. Turns out the stream uses several threads, but the overall
computation is not faster, because each individual computation is
slower.
2017-12-30 10:08:54 +01:00
ahr
fc30ffd928 sort IntLists in DataStore
The IntLists were no longer sorted since we made the initialization run
in parallel. Therefore a much slower implementation for
intersection/union was used.
2017-12-30 09:45:50 +01:00
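Note on the commit above: sorted lists matter because two sorted lists can be intersected in a single linear pass. The sketch below uses plain int arrays as a stand-in for IntList. The class name is hypothetical, and the algorithm used by primitiveCollections is not shown in this log.

```java
import java.util.Arrays;

// Hypothetical sketch; int[] stands in for IntList from primitiveCollections.
final class SortedIntersectionSketch {

    // Intersects two arrays that are both sorted in ascending order.
    // Runs in O(a.length + b.length) by advancing two cursors.
    static int[] intersection(int[] a, int[] b) {
        int[] result = new int[Math.min(a.length, b.length)];
        int i = 0;
        int j = 0;
        int n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++; // a[i] cannot occur in the rest of b
            } else if (a[i] > b[j]) {
                j++; // b[j] cannot occur in the rest of a
            } else {
                result[n++] = a[i]; // common element
                i++;
                j++;
            }
        }
        return Arrays.copyOf(result, n);
    }
}
```

Without the sorting guarantee an implementation has to fall back to something slower, such as hashing one list or comparing every pair.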
ahr
cc70f45c12 add different plot types
Step 1: 
Added PlotType enum and a drop down to the UI.
Extracted the code for scatter plots.
2017-12-29 08:57:34 +01:00
ahr
2df66c7b2f update primitiveCollections
This fixes a performance issue where the IntLists were not sorted and
therefore slow union/intersection algorithms were chosen.
2017-12-29 08:20:52 +01:00
ahr
e060e9761d cleanup 2017-12-23 10:06:52 +01:00
ahr
8037212145 synchronize docIdToDoc list
When we parallelized the initialization we forgot to
synchronize the docIdToDoc list.
Luckily there is a high probability that queries return
results that are obviously wrong.
2017-12-23 10:06:45 +01:00
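Note on the commit above: one common way to make a list safe for concurrent appends is Collections.synchronizedList. Whether the project uses this wrapper or explicit locking is not stated in this log, so the following is only a sketch with a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical sketch; the project's docIdToDoc handling is not shown in this log.
final class SynchronizedListSketch {

    // Appends the numbers 0..count-1 from several threads.
    // The wrapper serializes the add calls, so no element is lost.
    // The order of the elements is not deterministic.
    static List<Integer> fillInParallel(int count) {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, count).parallel().forEach(i -> list.add(i));
        return list;
    }
}
```

With a plain ArrayList the same code can lose elements or throw an ArrayIndexOutOfBoundsException, because concurrent add calls race on the internal array and size field.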
ahr
888d25f7ea trim docIdToDoc list
This reduces memory usage by 1 or 2 MB.
33% of an ArrayList can be free. If the list is 1 million entries long,
then the list wastes 2.6 MB.
The Doc objects in the list are much bigger.
2017-12-23 09:42:08 +01:00
ahr
e59caa0f02 parallelize initialization of DataStore
When the files are already in the OS cache, then the initialization time
for 750k files went down from 35 seconds to 15 seconds.
2017-12-23 08:58:42 +01:00
ahr
a6251074cf add trace logging to ExpressionToDocIdVisitor 2017-12-20 11:14:41 +01:00
ahr
04b029e1be add trace logging 2017-12-16 19:19:12 +01:00
ahr
6ef4e7a96b reduce memory footprint of index by trimming IntLists
Reduced the memory usage of the IntLists in the index by 4.1MB (19.9MB
to 15.8MB) for 683,390 files and 4,046,250 values in the IntLists.
2017-12-16 17:57:15 +01:00
ahr
8225dd2077 update primitiveCollections to 0.1.20171216143737
Use intersection and union methods from IntList.
2017-12-16 17:35:16 +01:00
a6a2236d18 do not compute counts when proposing all keys 2017-11-18 13:03:45 +01:00
f2868fcc1b reduce memory footprint: old generation by 100 MB
This reduces the size of the old generation by 100MB (300MB down to
200MB). Unfortunately the total JVM size didn't change and is still
512MB.

Doc stores the path as byte array instead of Path.
2017-11-18 10:39:01 +01:00
a636f2b9bd update primitive collections to 0.1.20171007100354 2017-11-18 10:09:47 +01:00
347f1fdc74 update 3rd-party libraries 2017-09-23 18:24:51 +02:00
c9ff8b5586 only propose value if the existing prefix is a real prefix 2017-09-23 13:31:34 +02:00
87858a79c1 compute proposals for blank strings
Previously we only provided proposals for empty strings.
But blank and empty are not that different.
2017-04-20 19:05:21 +02:00
bcb2e6ca83 add query completion
We are using ANTLR listeners to find out where in the
query the cursor is. Then we generate a list of keys/values
that might fit at that position. With that information we
can generate new queries and sort them by the number
of results they yield.
2017-04-17 16:25:14 +02:00
f6a9fc2394 propose for an empty query 2017-04-16 10:39:17 +02:00