perfdb

Author	SHA1	Message	Date
Andreas Huber	b01d267300	update primitiveCollections The new version of primitiveCollections requires Java 10.	2018-08-18 08:32:27 +02:00
Andreas Huber	99dbf31d8a	update 3rd party libs	2018-08-09 07:20:09 +02:00
Andreas Huber	182d1edd97	add a datetime picker Unfortunately the datetime picker does not support seconds. But it is one of the few that support date and time and are flexible enough to be used with VueJS.	2018-08-04 08:32:04 +00:00
Andreas Huber	daaa0e6907	update dependencies gradle to 4.8 jackson to 2.9.6 spring-boot to 2.0.3 guava to 25.1-jre gradle-versions-plugin to 0.19.0	2018-06-17 08:59:48 +02:00
Andreas Huber	b61a34a0e6	use existing RandomAccessFile when updating the listing file Ingestion speed dropped drastically with the old implementation. In some situations it fell to 7 entries per second over a 10 second period (sic!). When using the already opened RandomAccessFile the speed is back to previous values of 40k-50k entries per second on my 10-year-old machine on an encrypted spinning disk.	2018-05-10 17:41:50 +02:00
Andreas Huber	911062e26b	use RandomAccessFile in FolderStorage.getPathByOffset() The old implementation opened a new buffered reader every time getPathByOffset was called. This took 1/20th of a second or longer. For queries that visited thousands of files this could take a long time. We are now using a RandomAccessFile that is opened once. The average time spent in getPathByOffset is now down to 0.11ms.	2018-05-10 10:22:25 +02:00
Andreas Huber	82b8a8a932	reduce memory footprint by lazily initializing the path in Doc The path in Doc is now optional. This reduces memory consumption, because we only have to store a long (the offset in the listing file). This assumes that only a small percentage of Docs is requested.	2018-05-06 12:58:10 +02:00
Andreas Huber	e3102c01d4	use listing.csv instead of iterating through all folders The hope is that it is faster to read a single file instead of listing hundreds of folders.	2018-05-05 10:46:16 +02:00
Andreas Huber	6d85c56cb0	range definitions for the y-axis Sometimes it is useful to specify a certain y-axis range. For example when you are only interested in the values that take longer than a threshold. Or when you want to exclude some outliers. When you want to compare plots in a gallery, it is very handy when all plots have the same data-area.	2018-05-01 10:18:06 +02:00
Andreas Huber	b06ccb0d00	update 3rd party libs spring boot to 2.0.1 guava to 24.1-jre jackson to 2.9.5 log4j2 to 2.10.0 (same version as pulled by spring boot) testng to 6.14.3	2018-04-21 20:01:39 +02:00
Andreas Huber	57938d5269	do not check if we can find values when proposing keys Counting the available values is quite expensive and there are only a few corner cases where this makes sense. One of them is when the query is for a method that is not project specific and therefore no project values can be found.	2018-04-14 10:38:00 +02:00
Andreas Huber	23e16ff61d	ignore null values in tags	2018-04-14 09:58:51 +02:00
Andreas Huber	1755562a84	do not move the cursor to the end when applying a proposal	2018-04-08 14:06:13 +02:00
Andreas Huber	68ee88bce0	rewrite autocomplete in vue.js	2018-04-08 08:44:28 +02:00
Andreas Huber	22c99f8517	fix null pointer exception Filenames were generated without '$', but the parsing code expected the '$'.	2018-03-28 19:34:48 +02:00
Andreas Huber	de0f8412bd	show proposals for empty terminals	2018-03-25 19:17:49 +02:00
Andreas Huber	5343c0d427	reduce memory usage Reduce memory usage by storing the filename as string instead of individual tags.	2018-03-19 19:21:57 +01:00
Andreas Huber	b439c9d79a	update third-party libs antlr4: 4.7 -> 4.7.1 commons-lang3: 3.6 -> 3.7	2018-01-21 08:44:30 +01:00
ahr	d98c45e8bd	add index for tags-to-documents Now we can find writers much faster, because we don't have to execute a query for documents that match the tags. We can just look up the documents in the map. Speedup: 2-4ms -> 0.002-0.01ms	2018-01-14 09:51:37 +01:00
ahr	bcc30f0f3f	add trace logging and make set of proposals synchronized I checked if computing the proposals with a parallel stream would be beneficial. Turns out the stream uses several threads, but the overall computation is not faster, because each individual computation is slower.	2017-12-30 10:08:54 +01:00
ahr	fc30ffd928	sort IntLists in DataStore The IntLists were no longer sorted since we made the initialization run in parallel. Therefore a much slower implementation for intersection/union was used.	2017-12-30 09:45:50 +01:00
ahr	cc70f45c12	add different plot types Step 1: Added PlotType enum and a drop down to the UI. Extracted the code for scatter plots.	2017-12-29 08:57:34 +01:00
ahr	2df66c7b2f	update primitiveCollections This fixes a performance issue where the IntLists were not sorted and therefore slow union/intersection algorithms were chosen.	2017-12-29 08:20:52 +01:00
ahr	e060e9761d	cleanup	2017-12-23 10:06:52 +01:00
ahr	8037212145	synchronize docIdToDoc list When we parallelized the initialization we forgot to synchronize the docIdToDoc list. Luckily there is a high probability that queries return results that are obviously wrong.	2017-12-23 10:06:45 +01:00
ahr	888d25f7ea	trim docIdToDoc list This reduces memory usage by 1 or 2 MB. 33% of an ArrayList can be free. If the list is 1 million entries long, then the list wastes 2.6 MB. The Doc objects in the list are much bigger.	2017-12-23 09:42:08 +01:00
ahr	e59caa0f02	parallelize initialization of DataStore When the files are already in the OS cache, then the initialization time for 750k files went down from 35 seconds to 15 seconds.	2017-12-23 08:58:42 +01:00
ahr	a6251074cf	add trace logging to ExpressionToDocIdVisitor	2017-12-20 11:14:41 +01:00
ahr	04b029e1be	add trace logging	2017-12-16 19:19:12 +01:00
ahr	6ef4e7a96b	reduce memory footprint of index by trimming IntLists Reduced the memory usage of the IntLists in the index by 4.1MB (19.9MB to 15.8MB) for 683,390 files and 4,046,250 values in the IntLists.	2017-12-16 17:57:15 +01:00
ahr	8225dd2077	update primitiveCollections to 0.1.20171216143737 Use intersection and union methods from IntList.	2017-12-16 17:35:16 +01:00
Andreas Huber	a6a2236d18	do not compute counts when proposing all keys	2017-11-18 13:03:45 +01:00
Andreas Huber	f2868fcc1b	reduce memory footprint: old generation by 100 MB This reduces the size of the old generation by 100MB (300MB down to 200MB). Unfortunately the total JVM size didn't change and is still 512MB. Doc stores the path as byte array instead of Path.	2017-11-18 10:39:01 +01:00
Andreas Huber	a636f2b9bd	update primitive collections to 0.1.20171007100354	2017-11-18 10:09:47 +01:00
Andreas Huber	347f1fdc74	update 3rd-party libraries	2017-09-23 18:24:51 +02:00
Andreas Huber	c9ff8b5586	only propose value if the existing prefix is a real prefix	2017-09-23 13:31:34 +02:00
Andreas Huber	87858a79c1	compute proposals for blank strings Before we would only provide proposals for empty strings. But blank and empty is not that different.	2017-04-20 19:05:21 +02:00
Andreas Huber	bcb2e6ca83	add query completion We are using ANTLR listeners to find out where in the query the cursor is. Then we generate a list of keys/values that might fit at that position. With that information we can generate new queries and sort them by the number of results they yield.	2017-04-17 16:25:14 +02:00
Andreas Huber	f6a9fc2394	propose for an empty query	2017-04-16 10:39:17 +02:00
Andreas Huber	44f30aafee	add a new facade in front of DataStore This is done in preparation for the proposal API. In order to compute proposals we need to consume the API of the DataStore, but the code does not need to be in the DataStore. Extracting the API allows us to separate these concerns.	2017-04-16 10:11:46 +02:00
Andreas Huber	ac1ee20046	replace ludb with data-store LuDB has a few disadvantages. 1. Most notably disk space. H2 wastes a lot of valuable disk space. For my test data set with 44 million entries it is 14 MB (sometimes a lot more; depends on H2 internal cleanup). With data-store it is 15 KB. Overall I could reduce the disk space from 231 MB to 200 MB (13.4 % in this example). That is an average of 4.6 bytes per entry. 2. Speed: a) Liquibase is slow. The first time it takes approx. three seconds. b) Query and insertion. With data-store we can insert entries up to 1.6 times faster. Data-store uses a few tricks to save disk space: 1. We encode the tags into the file names. 2. To keep them short we translate the key/value of the tag into shorter numbers. For example "foo" -> 12 and "bar" -> 47. So the tag "foo"/"bar" would be 12/47. We then translate this number into a numeral system of base 62 (a-zA-Z0-9), so it can be used for file names and it is shorter. That way we only have to store the mapping of string to int. 3. We do that in a simple tab separated file.	2017-04-16 09:07:28 +02:00
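
The base 62 step described in commit ac1ee20046 can be illustrated with a minimal sketch. This is not the data-store code: the class name `Base62`, its methods, and the exact digit order (a-z, then A-Z, then 0-9, read off the "(a-zA-Z0-9)" in the commit message) are assumptions made for illustration. How the key and value numbers are combined into a file name is not stated in the log, so the sketch only encodes single non-negative numbers.

```java
// Hypothetical helper, not taken from the perfdb sources.
final class Base62 {
    // Assumed digit order: 'a' = 0 ... 'z' = 25, 'A' = 26 ... 'Z' = 51, '0' = 52 ... '9' = 61.
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final int BASE = ALPHABET.length(); // 62

    private Base62() {
    }

    /** Encodes a non-negative number as a base 62 string, most significant digit first. */
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        if (value == 0) {
            return String.valueOf(ALPHABET.charAt(0));
        }
        final StringBuilder digits = new StringBuilder();
        while (value > 0) {
            // Collect the least significant digit first, reverse at the end.
            digits.append(ALPHABET.charAt((int) (value % BASE)));
            value /= BASE;
        }
        return digits.reverse().toString();
    }

    /** Decodes a string produced by encode back into the number. */
    static long decode(String text) {
        if (text.isEmpty()) {
            throw new IllegalArgumentException("text must not be empty");
        }
        long result = 0;
        for (int i = 0; i < text.length(); i++) {
            final int digit = ALPHABET.indexOf(text.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException("not a base 62 digit: " + text.charAt(i));
            }
            result = result * BASE + digit;
        }
        return result;
    }
}
```

Under the assumed digit order, the numbers from the commit message become single characters: `Base62.encode(12)` gives `"m"` and `Base62.encode(47)` gives `"V"`. Only the string-to-number mapping ("foo" -> 12, "bar" -> 47) would then need to be stored separately, as the commit message describes.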

141 Commits