perfdb

Author	SHA1	Message	Date
Andreas Huber	bc520f97ed	move DateIndexExtension to DataStore	2021-05-12 19:00:25 +02:00
Andreas Huber	7adfc7029f	Revert "introduce indexes" This reverts commit `36ccc57db6`.	2021-05-12 18:18:57 +02:00
Andreas Huber	75857b553e	do tag to string conversion in StringCompressor instead of Tag	2021-05-09 10:44:24 +02:00
Andreas Huber	6dc335600e	do string compression in StringCompressor instead of Tag	2021-05-09 10:37:35 +02:00
Andreas Huber	36ccc57db6	introduce indexes	2021-05-09 10:33:28 +02:00
Andreas Huber	6dc0e3c250	performance improvement for queries with wildcards Computing the union of many LongLists was inefficient, because we were using a trivial algorithm. I replaced the algorithm with a multi way merge. The old algorithm had a runtime of O(n!m) where n is the number of lists and m the length or the longest list. The new algorithm has a runtime of O(log(n) n*m).	2020-11-15 13:02:15 +01:00
Andreas Huber	46070a31b9	add block size to the header of a PersistentMap and optimize storage usage for monotonically incrementing keys.	2020-10-17 10:13:46 +02:00
Andreas Huber	10155f9cdb	use special enum for DateBucket units Preparation step for having custom intervals.	2020-09-27 17:06:27 +02:00
Andreas Huber	6bb6cdaea7	count disk reads	2020-09-20 19:51:47 +02:00
Andreas Huber	50f555d23c	add interval splitting for bar charts	2020-04-05 08:14:09 +02:00
Andreas Huber	75391f21ff	extract code from DateIndexExtension to LongToDateBucket Making it possible to reuse the code to sort timestamps into date based buckets.	2020-04-03 19:46:08 +02:00
Andreas Huber	9a311313ec	use US locale to format strings This is especially important for all strings that are passed to gnuplot. Because gnuplot uses the US locale during parsing.	2020-03-12 19:40:20 +01:00
Andreas Huber	ffc3832bfa	fix: events are added to wrong partition The writerCache in DataStore did not use the partitionId in its cache key. Therefore the cache could return the wrong writer and events were written to the wrong partition. Fixed by changing the cache key.	2019-12-23 18:42:54 +01:00
Andreas Huber	5d8df6888d	move Entry and Entries to data-store	2019-12-13 18:15:10 +01:00
Andreas Huber	07ad62ddd9	use Junit5 instead of TestNG We want to be able to use @SpringBootTest tests that fully initialize the Spring application. This is much easier done with Junit than TestNG. Gradle does not support (at least not easily) to run Junit and TestNG tests. Therefore we switch to Junit with all tests. The original reason for using TestNG was that Junit didn't support data providers. But that finally changed in Junit5 with ParameterizedTest.	2019-12-13 14:33:20 +01:00
Andreas Huber	e931856041	merge projects file-utils, byte-utils and pdb-utils It turned out that most projects needed at least two of the utils projects. file-utils and byte-utils had only one class. Merging them made sense.	2019-12-08 18:47:54 +01:00
Andreas Huber	06b379494f	apply new code formatter and save action	2019-11-24 10:20:43 +01:00
Andreas Huber	4367323fcd	replace deprecated dependency configurations Using api and implementation instead of the deprecated compile configuration. Update to Gradle 6.0.	2019-11-10 11:08:50 +01:00
Andreas Huber	7636781315	fix StringIndexOutOfBounds when caret is in position 0	2019-10-26 10:30:02 +02:00
Andreas Huber	57ad6a1cee	update SpringBoot to 2.1.9 Also remove direct dependencies to log4j-api and log4j-core where possible. log4j-slf4j-impl is enough in many cases.	2019-10-04 20:15:09 +02:00
Andreas Huber	0e9e2cd53a	remove dependency to Guava	2019-09-01 15:44:36 +02:00
Andreas Huber	8579974051	performance improvement Queries like "firstname=John and lastname=???" were slightly inefficient. They fetched all firstnames, filtered to those that matched the prefix (e.g. John or Jonathan is this example) and then iterated over all those values and return the lastnames. Fixed by having two implementations. One for the case that only a few of the values in fieldA match and one for the case that many match.	2019-08-31 19:30:54 +02:00
Andreas Huber	d8a114dbaf	handle globlike patterns in in-expressions	2019-08-31 17:34:17 +02:00
Andreas Huber	f8e859fb6d	cleanup and javadoc	2019-08-31 16:52:13 +02:00
Andreas Huber	0eee012798	allow 'not' for negation in addition to '!'	2019-08-31 08:30:13 +02:00
Andreas Huber	4161cd7f98	only field prefixes returned instead of full values When using autocomplete to return field values I missed, that autocomplete had the feature that cut values at dots. So instead of returning full field values only the prefix up to the first dot was returned. Fixed by making the cut-at-dot feature optional.	2019-08-27 20:37:07 +02:00
Andreas Huber	2f35978184	fetch available values for gallery via autocomplete method We had a method that returned the values of a field with respect to a query. That method was inefficient, because it executed the query, fetched all Docs and collected the values. The autocomplete method we introduced a while back can answer the same question but much more efficiently.	2019-08-25 18:52:05 +02:00
Andreas Huber	6eaf4e10fc	add maxSize parameter to HotEntryCache	2019-08-24 19:24:20 +02:00
Andreas Huber	feda901f6d	remove event types We only have removal events. The additional complexity of having a generic interface for many different event types does not pay off.	2019-08-18 20:30:25 +02:00
Andreas Huber	4d9ea6d2a8	switch back to my own HotEntryCache implementation Guava's cache does not evict elements reliably by time. Configure a cache to have a lifetime of n seconds, then you cannot expect that an element is actually evicted after n seconds with Guava.	2019-08-18 20:14:14 +02:00
Andreas Huber	3252fcf42d	improve trace logging - Add filename for trace logs for read/write operations.	2019-08-18 09:25:49 +02:00
Andreas Huber	0b3eb97b96	Fix to string for maps with values of type Empty The MAX_KEY inserted into the tree had a value of one byte. This triggered an assertion for maps with values of type Empty, because they expected values to be empty. Fixed by using an empty array for the value of the MAX_KEY.	2019-08-12 08:35:40 +02:00
Andreas Huber	9fb1a136c8	cache last used date prefix The 99.9999% use case is to ingest data from the same month.	2019-04-22 09:51:44 +02:00
Andreas Huber	56085061ed	do not return anything if the field/value does not exist The computation of proposals is done by searching for values in a combined index. If one of the values didn't exist, then the algorithm returned all values. Fixed by checking that we query only existing field/values from the combined index.	2019-04-20 19:48:51 +02:00
Andreas Huber	dbe0e02517	rename cluster to partition We are not clustering the indices, we are partitioning them.	2019-04-14 10:10:16 +02:00
Andreas Huber	2a1885a77f	cluster the indices	2019-03-31 09:01:55 +02:00
Andreas Huber	95f2f26966	handle IOExceptions earlier	2019-03-17 11:13:46 +01:00
Andreas Huber	5d0ceb112e	add clustering for DiskStore	2019-03-17 10:53:02 +01:00
Andreas Huber	b5e2d0a217	introduce clustering for query completion indices	2019-03-16 10:19:28 +01:00
Andreas Huber	fb9f8592ac	make ClusteredPersistentMap easier to use	2019-02-24 19:20:44 +01:00
Andreas Huber	59aea1a15f	introduce index clustering (part 1) In order to prevent files from getting too big and make it easier to implement retention policies, we are splitting all files into chunks. Each chunk contains the data for a time interval (1 month per default). This first changeset introduces the ClusteredPersistentMap that implements this for PersistentMap. It is used for a couple (not all) of indices.	2019-02-24 16:50:57 +01:00
Andreas Huber	372a073b6d	PdbWriter is no longer in the API of DataStore	2019-02-16 16:24:14 +01:00
Andreas Huber	92a47d9b56	remove TagsToFile Remove one layer of abstraction by moving the code into the DataStore.	2019-02-16 16:06:46 +01:00
Andreas Huber	117ef4ea34	use guava's cache as implementation for the HotEntryCache My own implementation was faster, but was not able to implement a size limitation.	2019-02-16 10:23:52 +01:00
Andreas Huber	7b00eede86	refactoring: extract EncoderDecoders from DataStore	2019-02-16 09:16:15 +01:00
Andreas Huber	cbcb7714bb	split BSFile into a TimeSeries and a LongStream file BSFile was used to store two types of data. This makes the API complex. I split the API into two files with easier and more clear APIs. Interestingly the API of BSFile is still rather complex and has to consider both use cases.	2019-02-10 09:59:16 +01:00
Andreas Huber	27b83234cc	group proposal as if they were hierarchical We interpret dots ('.') as hierarchy delimiter in. That way we can reduce the number of proposed values and show only those for the next level.	2019-02-09 15:21:35 +01:00
Andreas Huber	493971bcf3	values used in queries were added to the keys.csv Due to a mistake in Tag which added all strings used by Tag into the String dictionary, the dictionary did contain all values that were used in queries.	2019-02-09 08:28:23 +01:00
Andreas Huber	ea5884a5e6	move creation of PdbWriter to the DataStore	2019-02-07 18:06:41 +01:00
Andreas Huber	99cdf557b3	add metric logger for query completion evaluation	2019-02-06 15:51:41 +00:00

1 2 3

119 Commits