Calling Instant.now() several hundred thousand times per
second can be expensive. In my measurements, more than 10% of the
time spent loading new data was spent calling
Instant.now().
Fixed this by storing an Instant as a static member and
updating it periodically in a separate thread.
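A minimal sketch of this pattern, assuming a dedicated daemon thread
and a 10 ms refresh interval (the class name and the interval are
illustrative choices, not the actual implementation):

    import java.time.Instant;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public final class CoarseClock {

        // volatile so every reader sees the latest published Instant
        private static volatile Instant now = Instant.now();

        static {
            ScheduledExecutorService scheduler =
                    Executors.newSingleThreadScheduledExecutor(r -> {
                        Thread t = new Thread(r, "coarse-clock");
                        t.setDaemon(true); // do not keep the JVM alive
                        return t;
                    });
            // refresh every 10 ms; the interval trades precision against cost
            scheduler.scheduleAtFixedRate(
                    () -> now = Instant.now(), 10, 10, TimeUnit.MILLISECONDS);
        }

        public static Instant now() {
            return now; // a plain volatile read, much cheaper than Instant.now()
        }

        private CoarseClock() {}
    }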
Replaced Tags.filenameBytes with a SortedSet<Tag>. Tags are now
stored as longs (variable-length encoded) in the PersistentMap.
Tags.filenameBytes was introduced to reduce memory consumption when
all tags were held in memory. Tags are now stored in a PersistentMap
and only read when needed.
Moved the VariableByteEncoder into its own project, because it was
needed by pdb-api.
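For reference, this is the usual shape of variable-byte encoding for
longs: 7 payload bits per byte, with the high bit marking that more
bytes follow. The class and method names are illustrative; the real
VariableByteEncoder may differ:

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public final class VarLong {

        public static void write(long value, OutputStream out) throws IOException {
            // emit 7 bits at a time, high bit set while more bytes follow
            while ((value & ~0x7FL) != 0) {
                out.write((int) ((value & 0x7F) | 0x80));
                value >>>= 7;
            }
            out.write((int) value); // last byte, high bit clear
        }

        public static long read(InputStream in) throws IOException {
            long value = 0;
            int shift = 0;
            int b;
            while (((b = in.read()) & 0x80) != 0) {
                value |= (long) (b & 0x7F) << shift;
                shift += 7;
            }
            return value | ((long) b << shift);
        }

        public static void main(String[] args) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            write(300, out);
            System.out.println(out.size()); // 2 bytes instead of 8
            System.out.println(read(new ByteArrayInputStream(out.toByteArray()))); // 300
        }
    }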
Replaces the use of in-memory data structures with the PersistentMap.
This is the crucial step in reducing the space used in both persistent
storage and main memory.
Match property values exactly, not by prefix, when matching property
values to the query. This is important when one property value is a
prefix of another property value,
e.g., AuditService.logEvent and AuditService.logEvents.
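A hypothetical illustration of the pitfall (the values come from the
example above; the matching code itself is assumed):

    public class MatchDemo {
        public static void main(String[] args) {
            String query = "AuditService.logEvent";
            // prefix matching accepts a false positive
            System.out.println("AuditService.logEvents".startsWith(query)); // true
            // exact matching keeps the two values apart
            System.out.println("AuditService.logEvents".equals(query));     // false
        }
    }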
- The DiskStorage uses only one file instead of millions.
  Also, the block size is only 512 bytes instead of 4 KB, which
  helps to reduce the memory usage for short sequences.
- Update primitiveCollections to get the new LongList.range
and LongList.rangeClosed methods.
- BSFile now stores Time&Value sequences and knows how to
  encode the time values with delta encoding (see the sketch
  after this list).
- Doc had to do some magic tricks to save memory. The path
  was initialized lazily and stored as a byte array. This is no
  longer necessary: the path was replaced by the
  rootBlockNumber of the BSFile.
- Had to temporarily disable the 'in' queries.
- The stored values are now processed as a stream of LongLists
  instead of Entries. The overhead of creating Entries is
  gone, and so is the memory overhead: Entry was an
  object and held a reference to the tags, which is
  unnecessary.
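The delta-encoding idea from the BSFile bullet above, as a minimal
sketch (the names are illustrative, not the actual BSFile API).
Timestamps in a sequence are usually close together, so the deltas are
small and fit into one or two bytes with the variable-byte encoding:

    public final class DeltaEncoding {

        // store the first timestamp as-is, then only the differences
        public static long[] encode(long[] times) {
            long[] deltas = new long[times.length];
            long previous = 0;
            for (int i = 0; i < times.length; i++) {
                deltas[i] = times[i] - previous;
                previous = times[i];
            }
            return deltas;
        }

        // a running sum restores the original timestamps
        public static long[] decode(long[] deltas) {
            long[] times = new long[deltas.length];
            long previous = 0;
            for (int i = 0; i < deltas.length; i++) {
                previous += deltas[i];
                times[i] = previous;
            }
            return times;
        }
    }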
Unfortunately the datetime picker does not support seconds. But it is
one of the few that support both date and time and are flexible enough
to be used with VueJS.
Ingestion speed dropped drastically with the old implementation,
in some situations to 7 entries per second over a 10-second period
(sic!). When using the already opened RandomAccessFile, the speed
is back to previous values of 40k-50k entries per second on my 10-year-old
machine with an encrypted spinning disk.
The old implementation opened a new buffered reader every time
getPathByOffset was called. This took 1/20th of a second or
longer; for queries that visited thousands of files this could
take a long time.
We are now using a RandomAccessFile that is opened once. The
average time spent in getPathByOffset is now down to 0.11 ms.
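A sketch of the reuse pattern, assuming a simple length-prefixed record
layout in the listing file (the layout and the class name are
assumptions; only getPathByOffset and the use of RandomAccessFile come
from the change itself):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.charset.StandardCharsets;

    public final class PathListing implements AutoCloseable {

        private final RandomAccessFile file; // opened once, reused for every lookup

        public PathListing(String listingFile) throws IOException {
            this.file = new RandomAccessFile(listingFile, "r");
        }

        // synchronized because a RandomAccessFile has a single file pointer
        public synchronized String getPathByOffset(long offset) throws IOException {
            file.seek(offset);                     // cheap compared to open/close
            int length = file.readUnsignedShort(); // assumed: 2-byte length prefix
            byte[] bytes = new byte[length];
            file.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }

        @Override
        public void close() throws IOException {
            file.close();
        }
    }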
The path in Doc is now optional. This reduces memory consumption,
because we only have to store a long (the offset in the listing file).
This assumes that only a small percentage of Docs is requested.
Sometimes it is useful to specify a fixed y-axis range, for example
when you are only interested in the values that take longer than a
threshold, or when you want to exclude some outliers. When you want to
compare plots in a gallery, it is very handy when all plots have the
same data area.
Counting the available values is quite expensive, and there are only a
few corner cases where this makes sense. One of them is when the query
is for a method that is not project-specific and therefore no project
values can be found.
Now we can find the writer much faster, because we don't have to
execute a query for documents that match the tags. We can just look up
the documents in the map.
Speedup: 2-4 ms -> 0.002-0.01 ms
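A minimal sketch of the lookup, with assumed types (the real key and
value types are not shown in the change; String keys and doc ids stand
in here to keep the example self-contained):

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class WriterLookup {

        // assumed structure: tag key -> ids of documents carrying that tag
        private final Map<String, List<Long>> tagsToDocIds = new HashMap<>();

        List<Long> findDocs(String tagKey) {
            // O(1) hash lookup instead of executing a query over the documents
            return tagsToDocIds.getOrDefault(tagKey, List.of());
        }
    }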
I checked whether computing the proposals with a parallel stream would
be beneficial. It turns out the stream uses several threads, but the
overall computation is not faster, because each individual computation
is slower.
The IntLists were no longer sorted after we made the initialization run
in parallel. Therefore a much slower implementation of
intersection/union was used.
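Why sortedness matters: on two sorted lists the intersection is a
single linear merge, while unsorted input forces a much slower
fallback. A sketch with plain int arrays (the project uses IntList;
this is illustrative only):

    import java.util.Arrays;

    public final class SortedOps {

        // O(n + m) merge-style intersection; only valid for sorted input
        public static int[] intersect(int[] a, int[] b) {
            int[] result = new int[Math.min(a.length, b.length)];
            int i = 0, j = 0, k = 0;
            while (i < a.length && j < b.length) {
                if (a[i] < b[j]) {
                    i++;
                } else if (a[i] > b[j]) {
                    j++;
                } else {
                    result[k++] = a[i];
                    i++;
                    j++;
                }
            }
            return Arrays.copyOf(result, k);
        }
    }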
When we parallelized the initialization we forgot to
synchronize the docIdToDoc list.
Luckily, there is a high probability that queries return
results that are obviously wrong, so the bug was easy to spot.
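A minimal sketch of the fix, assuming the list is appended to from
several threads during initialization (Object stands in for the Doc
type to keep the sketch self-contained). An unsynchronized ArrayList
can lose elements or throw under concurrent add():

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class DocIndex {

        // every access goes through the wrapper's internal lock
        private final List<Object> docIdToDoc =
                Collections.synchronizedList(new ArrayList<>());

        void register(Object doc) {
            docIdToDoc.add(doc); // now safe to call from parallel initialization
        }
    }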
This reduces memory usage by 1 or 2 MB.
Up to 33% of an ArrayList's backing array can be unused, because the
capacity grows in 1.5x steps. If the list is 1 million entries long,
it wastes about 2.6 MB (roughly 330,000 unused slots at 8 bytes per
reference).
The Doc objects in the list are much bigger than these references.
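A minimal sketch, assuming the saving comes from trimming the spare
capacity once the list is fully built (the change does not name the
mechanism; ArrayList.trimToSize() is one way to do it):

    import java.util.ArrayList;

    public class TrimDemo {
        public static void main(String[] args) {
            ArrayList<Integer> docs = new ArrayList<>();
            for (int i = 0; i < 1_000_000; i++) {
                docs.add(i); // capacity grows in 1.5x steps, leaving spare slots
            }
            docs.trimToSize(); // shrink the backing array to the actual size
        }
    }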