perfdb

Author	SHA1	Message	Date
Andreas Huber	6d5cdbafca	FastTime a faster alternative to System.currentTimeMillis FastTime is 100 times faster (according to my primitive benchmark) than System.currentTimeMillis. It is less accurate.	2021-07-30 19:45:44 +02:00
Andreas Huber	ee79cb0022	cleanup after revert	2021-05-12 18:20:34 +02:00
Andreas Huber	7adfc7029f	Revert "introduce indexes" This reverts commit `36ccc57db6`.	2021-05-12 18:18:57 +02:00
Andreas Huber	36ccc57db6	introduce indexes	2021-05-09 10:33:28 +02:00
Andreas Huber	11beda5432	make logging of insertion speed a little nicer	2020-11-24 10:00:53 +01:00
Andreas Huber	3e77c2a103	various fixes	2020-08-11 16:12:18 +02:00
Andreas Huber	9a311313ec	use US locale to format strings This is especially important for all strings that are passed to gnuplot. Because gnuplot uses the US locale during parsing.	2020-03-12 19:40:20 +01:00
Andreas Huber	5d8df6888d	move Entry and Entries to data-store	2019-12-13 18:15:10 +01:00
Andreas Huber	550d7ba44e	add flag to make CSV upload wait until entries are flushed To make it easier/possible to write stable unit tests, the CSV upload can optionally wait until all entries have been flushed to disk. This is necessary for tests that ingest data and then read the data.	2019-12-13 18:05:20 +01:00
Andreas Huber	07ad62ddd9	use JUnit 5 instead of TestNG We want to be able to use @SpringBootTest tests that fully initialize the Spring application. This is much easier to do with JUnit than with TestNG. Gradle does not support (at least not easily) running both JUnit and TestNG tests. Therefore we switch to JUnit for all tests. The original reason for using TestNG was that JUnit didn't support data providers. But that finally changed in JUnit 5 with ParameterizedTest.	2019-12-13 14:33:20 +01:00
Andreas Huber	e931856041	merge projects file-utils, byte-utils and pdb-utils It turned out that most projects needed at least two of the utils projects. file-utils and byte-utils had only one class. Merging them made sense.	2019-12-08 18:47:54 +01:00
Andreas Huber	85679ca0c8	send CSV file via REST	2019-12-08 18:39:43 +01:00
Andreas Huber	06b379494f	apply new code formatter and save action	2019-11-24 10:20:43 +01:00
Andreas Huber	4367323fcd	replace deprecated dependency configurations Using api and implementation instead of the deprecated compile configuration. Update to Gradle 6.0.	2019-11-10 11:08:50 +01:00
Andreas Huber	57ad6a1cee	update SpringBoot to 2.1.9 Also remove direct dependencies to log4j-api and log4j-core where possible. log4j-slf4j-impl is enough in many cases.	2019-10-04 20:15:09 +02:00
Andreas Huber	2f35978184	fetch available values for gallery via autocomplete method We had a method that returned the values of a field with respect to a query. That method was inefficient, because it executed the query, fetched all Docs and collected the values. The autocomplete method we introduced a while back can answer the same question but much more efficiently.	2019-08-25 18:52:05 +02:00
Andreas Huber	dfe9579726	use DateTimeRange.max() instead of arbitrary relative range	2019-04-20 20:36:26 +02:00
Andreas Huber	dbe0e02517	rename cluster to partition We are not clustering the indices, we are partitioning them.	2019-04-14 10:10:16 +02:00
Andreas Huber	5d0ceb112e	add clustering for DiskStore	2019-03-17 10:53:02 +01:00
Andreas Huber	b5e2d0a217	introduce clustering for query completion indices	2019-03-16 10:19:28 +01:00
Andreas Huber	59aea1a15f	introduce index clustering (part 1) In order to prevent files from getting too big and make it easier to implement retention policies, we are splitting all files into chunks. Each chunk contains the data for a time interval (1 month per default). This first changeset introduces the ClusteredPersistentMap that implements this for PersistentMap. It is used for a couple (not all) of indices.	2019-02-24 16:50:57 +01:00
Andreas Huber	372a073b6d	PdbWriter is no longer in the API of DataStore	2019-02-16 16:24:14 +01:00
Andreas Huber	92a47d9b56	remove TagsToFile Remove one layer of abstraction by moving the code into the DataStore.	2019-02-16 16:06:46 +01:00
Andreas Huber	117ef4ea34	use guava's cache as implementation for the HotEntryCache My own implementation was faster, but was not able to implement a size limitation.	2019-02-16 10:23:52 +01:00
Andreas Huber	493971bcf3	values used in queries were added to the keys.csv Due to a mistake in Tag which added all strings used by Tag into the String dictionary, the dictionary did contain all values that were used in queries.	2019-02-09 08:28:23 +01:00
Andreas Huber	ea5884a5e6	move creation of PdbWriter to the DataStore	2019-02-07 18:06:41 +01:00
Andreas Huber	58bfba23bb	reset lastEpochMilli when opening a new export file	2019-02-06 15:52:37 +00:00
Andreas Huber	668d73c926	introduced a new custom file format used for backup and ingestion The new file format reduces repetition, is easy to parse, easy to generate in any language and is human readable.	2019-02-03 15:44:35 +01:00
Andreas Huber	f2d16b6758	make CacheKey comparable The CacheKey is used as a key in a HashMap. Lookup can be faster if the CacheKey is comparable when there are hash collisions. In this case I was not able to measure any effect. I am keeping the comparables nonetheless, because they can only have a positive effect.	2019-01-01 08:47:48 +01:00
Andreas Huber	e537e94d39	HotEntryCache will update Instants only once per second Calling Instant.now() several hundred thousand times per second can be expensive. In my measurements >10% of the time spent when loading new data was spent calling Instant.now(). Fixed this by storing an Instant as static member and updating it periodically in a separate thread.	2018-12-21 19:16:55 +01:00
Andreas Huber	d95a71e32e	batch entries between TcpIngestor and PerformanceDB One bottleneck was the blocking queue used to transport entries from the listener thread to the ingestor thread. Reduced the bottleneck by batching entries. Interestingly the batch size of 100 was better than batch size of 1000 and better than 10.	2018-12-21 13:11:35 +01:00
Andreas Huber	40f4506e13	use FastISODateParser.parseAsEpochMilli Compared to FastISODateParser.parse, which returns an OffsetDateTime object, parseAsEpochMilli returns the epoch time millis. The performance improvement for date parsing alone is roughly 100% (8m dates/s to 18m dates/s). Insertion speed improved from 13-14s for 1.6m entries to 11.5-12.5s.	2018-12-16 19:24:47 +01:00
Andreas Huber	f78f69328b	add cache for docId to Doc mapping A Doc does not change once it is created, so it is easy to cache. Speedup was from 1ms per Doc to 3ms for 444 Docs (0.00675ms/Doc).	2018-11-22 19:51:07 +01:00
Andreas Huber	cc0157fe0b	update java 3rd-party libs	2018-11-20 19:13:59 +01:00
Andreas Huber	eaa234bfa5	rename put to putEntries The method name put is used too often so that eclipse has a hard time finding references.	2018-10-11 19:25:01 +02:00
Andreas Huber	979e001efd	TcpIngestor can handle csv files	2018-10-11 18:56:16 +02:00
Andreas Huber	979d3269fa	remove obsolete classes and methods	2018-10-04 18:46:51 +02:00
Andreas Huber	8939332004	remove the wrapper class PdbDB It did not serve any purpose and could be replaced by DataStore.	2018-10-04 18:43:27 +02:00
Andreas Huber	01b93e32ca	replace EhCache with a custom implementation The cache must remove/evict writers after a few seconds, but EhCache only evicts entries when a new entry is added. That is not acceptable for us, because that would leave lots of files open and we would need a second mechanism to close them. Therefore I wrote a simple wrapper for a ConcurrentHashMap that evicts entries after timeToLive+5s.	2018-10-03 20:22:45 +02:00
Andreas Huber	c9dcc77b53	reuse existing PdbFiles	2018-10-03 16:49:46 +02:00
Andreas Huber	60578b45ec	PdbWriters are now closed by the cache TagsToFile we do not have to close the files when the input streams are idle.	2018-10-03 16:47:29 +02:00
Andreas Huber	ad630fc6b2	simplify caching in TagsToFile - PdbFiles no longer require dates to be monotonically increasing. Therefore TagsToFile does not have to ensure this. => We only have one file per Tags. - Use EhCache instead of HashMap.	2018-09-30 10:38:25 +02:00
Andreas Huber	f07977c27a	update java, gradle and third party libs	2018-09-29 09:08:29 +02:00
Andreas Huber	24fcfd7763	prepare the addition of a date index	2018-09-28 19:07:01 +02:00
Andreas Huber	84350c4dfb	move TimeStampDeltaDecoder to BSFile Now the encoding and decoding code is in the same class.	2018-09-13 13:08:45 +02:00
Andreas Huber	a2e63cca44	cleanup	2018-09-13 08:11:15 +02:00
Andreas Huber	1182d76205	replace the FolderStorage with DiskStorage - The DiskStorage uses only one file instead of millions. Also the block size is only 512 bytes instead of 4kb, which helps to reduce the memory usage for short sequences. - Update primitiveCollections to get the new LongList.range and LongList.rangeClosed methods. - BSFile now stores Time&Value sequences and knows how to encode the time values with delta encoding. - Doc had to do some magic tricks to save memory. The path was initialized lazily and stored as byte array. This is no longer necessary. The path was replaced by the rootBlockNumber of the BSFile. - Had to temporarily disable the 'in' queries. - The stored values are now processed as stream of LongLists instead of Entry. The overhead for creating Entries is gone, so is the memory overhead, because Entry was an object and had a reference to the tags, which is unnecessary.	2018-09-12 09:35:07 +02:00
Andreas Huber	89840cf9e9	update dependencies	2018-07-28 08:50:42 +02:00
Andreas Huber	daaa0e6907	update dependencies gradle to 4.8 jackson to 2.9.6 spring-boot to 2.0.3 guava to 25.1-jre gradle-versions-plugin to 0.19.0	2018-06-17 08:59:48 +02:00
Andreas Huber	911062e26b	use RandomAccessFile in FolderStorage.getPathByOffset() The old implementation opened a new buffered reader every time getPathByOffset was called. This took 1/20th of a second or longer. For queries that visited thousands of files this could take a long time. We are now using a RandomAccessFile, that is opened once. The average time spent in getPathByOffset is now down to 0.11ms.	2018-05-10 10:22:25 +02:00
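
Commit 1182d76205 mentions that BSFile encodes time values with delta encoding. The log does not show the actual on-disk format, so the following is only a minimal sketch of the general technique: each timestamp is replaced by its difference to the previous one, which keeps the stored numbers small for densely sampled series. The class and method names (`TimestampDeltaSketch`, `encode`, `decode`) are hypothetical and not taken from the repository.

```java
import java.util.Arrays;

// Hypothetical illustration of delta encoding; not the BSFile implementation.
public class TimestampDeltaSketch {

    /** Replaces each timestamp by its difference to the previous one (the first is relative to 0). */
    public static long[] encode(long[] epochMillis) {
        long[] deltas = new long[epochMillis.length];
        long previous = 0;
        for (int i = 0; i < epochMillis.length; i++) {
            deltas[i] = epochMillis[i] - previous;
            previous = epochMillis[i];
        }
        return deltas;
    }

    /** Restores the original timestamps by summing up the deltas. */
    public static long[] decode(long[] deltas) {
        long[] epochMillis = new long[deltas.length];
        long previous = 0;
        for (int i = 0; i < deltas.length; i++) {
            previous += deltas[i];
            epochMillis[i] = previous;
        }
        return epochMillis;
    }

    public static void main(String[] args) {
        long[] timestamps = {1536737707000L, 1536737707250L, 1536737708000L};
        long[] deltas = encode(timestamps);
        System.out.println(Arrays.toString(deltas));                    // [1536737707000, 250, 750]
        System.out.println(Arrays.equals(timestamps, decode(deltas))); // true
    }
}
```

In a real store the small deltas would typically be written with a variable-length integer encoding, which is where the space saving comes from; that step is omitted here.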

130 Commits