Commit Graph

131 Commits

Author SHA1 Message Date
19e6dd1102 add histogram plots 2019-12-27 12:25:25 +01:00
e931856041 merge projects file-utils, byte-utils and pdb-utils
It turned out that most projects needed at least
two of the utils projects. file-utils and byte-utils
had only one class. Merging them made sense.
2019-12-08 18:47:54 +01:00
4e1b7a46d8 fix x-axis labels overlap when zooming out
The x-axis labels overlapped when zooming out too far.
Fixed by increasing the step size and reducing the labels to
year only.
2019-11-30 08:44:29 +01:00
2dd513c380 Fix build by adding dependency to commons-lang3.
Classpaths in Eclipse differ from classpaths in Gradle because
Gradle's 'implementation' configuration does not expose
dependencies transitively.
2019-11-29 20:07:03 +01:00
06b379494f apply new code formatter and save action 2019-11-24 10:20:43 +01:00
e2a33ac6e2 make the code that determines which axis to use explicit
In the previous changeset the code that determined
which axis the plots used was implemented as a
side effect of getting the Gnuplot definition of
an axis.
Changed that to an explicit update call with simpler
logic.
2019-11-24 09:08:36 +01:00
892d5a6d08 automatically determine which axis a plot needs 2019-11-24 08:18:52 +01:00
3048f67e9a cleanup 2019-11-23 19:12:23 +01:00
1cc39e3962 create y-axis settings in aggregate handlers 2019-11-23 18:40:31 +01:00
8d55ef4e5f make sure labels for cumulative distribution don't overlap 2019-11-23 14:36:32 +01:00
82a961dbaf move definition of x-axis to the aggregate handlers 2019-11-23 14:28:18 +01:00
5341b5e307 change color palette so that the first color is blue
The color scheme for the page is blue/grey, so it makes
sense to use blue as the main color for plots.
2019-11-17 19:36:48 +01:00
57c5cca688 fetch possible values for gallery view 2019-11-14 18:45:32 +01:00
0fb7a0aaf6 fix image creation for automatic y-range 2019-11-10 14:22:22 +01:00
4367323fcd replace deprecated dependency configurations
Using api and implementation instead of the
deprecated compile configuration.

Update to Gradle 6.0.
2019-11-10 11:08:50 +01:00
c83d0a3e1e fix overlapping x axis ticks
Gnuplot does not handle long x-axis ticks very well.
It should know how wide the labels are and could adapt
the increment size accordingly, but it doesn't.
Fixed by explicitly defining the increment for x-axis
labels.
2019-11-01 19:13:11 +01:00
7c122b5753 cleanup 2019-11-01 17:45:19 +01:00
78f9f3fe16 fix IndexArrayOutOfBounds in parallel request aggregator
The range fromEpochMilli to toEpochMilli contains
(toEpochMilli - fromEpochMilli + 1) milliseconds.
2019-11-01 09:37:03 +01:00
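
The off-by-one described in this commit can be illustrated with a minimal sketch. The class and method names below are hypothetical and not taken from the repository; the point is only that an array with one slot per millisecond of a closed range needs the extra element.

```java
final class InclusiveRange {
    // Number of milliseconds in the closed interval [fromEpochMilli, toEpochMilli].
    // Both end points are included, hence the "+ 1".
    static int length(long fromEpochMilli, long toEpochMilli) {
        return Math.toIntExact(toEpochMilli - fromEpochMilli + 1);
    }

    public static void main(String[] args) {
        long from = 1_000L;
        long to = 1_004L;
        int[] counters = new int[length(from, to)];
        // The slot for 'to' is the last valid index. Without the "+ 1"
        // this access would throw an ArrayIndexOutOfBoundsException.
        counters[(int) (to - from)]++;
        System.out.println(counters.length);
    }
}
```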
b734894253 fix parallel aggregate on non-English locales
We generate CSV files with comma as separator.
When we write times with milliseconds, then
we use floating point numbers. Depending
on the locale those floating point numbers
may be written with comma instead of point.
If that happens, then the plots are messed up.
Fixed by enforcing the locale when formatting floats.
2019-11-01 09:04:26 +01:00
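
A minimal sketch of the fix described above, assuming the CSV line is built with String.format. The class name, method name and column layout are hypothetical; only the use of an explicit Locale reflects the commit message.

```java
import java.util.Locale;

final class LocaleSafeCsv {
    // Formats one CSV line with a fixed locale, so the decimal separator
    // is always a point, no matter what the JVM default locale is.
    static String csvLine(long epochMilli, double value) {
        return String.format(Locale.US, "%.3f,%.1f", epochMilli / 1000.0, value);
    }

    public static void main(String[] args) {
        // With a German locale the same number is written with a decimal comma,
        // which collides with the comma used as the column separator.
        System.out.println(String.format(Locale.GERMANY, "%.1f", 2.5));
        System.out.println(csvLine(1500L, 2.5));
    }
}
```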
3303afd115 render images without any data on them instead of throwing errors
This makes it easier to use the mouse wheel
to zoom in. Without it you could zoom into
a region that had no data and then had to
use the date picker to change the date.
2019-10-31 19:30:29 +01:00
f28a67a5c1 make it possible to render any combination of plots 2019-10-20 10:16:25 +02:00
b7c4fe4c1f move scatter plot creation into an AggregateHandler 2019-10-20 08:11:09 +02:00
7c61686808 merge ScatterPlot into Plotter 2019-10-19 19:03:33 +02:00
d895eba47c remove duplicate values when rendering
Rendering plots with millions of values is expensive. Before this fix
we wrote all values into CSV files. The CSV files were then read by
Gnuplot that did the rendering. But in an image with n×m pixels there
can only be nm different values. In most realistic scenarios we will
have many values that will be drawn to the same pixels. So we are
wasting time by first generating the CSV for too many values and then
by parsing that CSV again.
Fixed by using a sparse 2D array to de-duplicate many values before
they get written to the CSV. The additional time we spend de-duplicating
is often smaller than the time saved when writing the CSV, so that the
total CSV writing is about as 'fast' as before (sometimes a little
faster, sometimes a little slower). But the time Gnuplot needs for
rendering drastically reduces. The factor depends on the data, of
course. We have seen factor 50 for realistic examples. Making a 15s
job run in 300ms.
2019-09-29 18:57:57 +02:00
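
The idea behind this commit can be sketched as follows. This is not the repository's implementation: it uses a dense BitSet with one bit per pixel instead of the sparse 2D array the commit mentions, and all names are hypothetical. A value is written to the CSV only if no earlier value landed on the same pixel.

```java
import java.util.BitSet;

final class PixelDeduplicator {
    private final long minX, maxX, minY, maxY;
    private final int width, height;
    private final BitSet seen; // one bit per pixel, set once a value was drawn there

    PixelDeduplicator(long minX, long maxX, long minY, long maxY, int width, int height) {
        this.minX = minX;
        this.maxX = maxX;
        this.minY = minY;
        this.maxY = maxY;
        this.width = width;
        this.height = height;
        this.seen = new BitSet(width * height);
    }

    // Returns true only for the first value that maps to a given pixel.
    boolean isNew(long x, long y) {
        int px = scale(x, minX, maxX, width);
        int py = scale(y, minY, maxY, height);
        int index = py * width + px;
        if (seen.get(index)) {
            return false;
        }
        seen.set(index);
        return true;
    }

    // Maps a value in [min, max] linearly to a pixel in [0, pixels - 1].
    private static int scale(long value, long min, long max, int pixels) {
        if (max == min) {
            return 0;
        }
        long p = (value - min) * (pixels - 1) / (max - min);
        return (int) Math.max(0, Math.min(pixels - 1, p));
    }
}
```

A caller would check isNew(time, duration) before appending a line to the CSV, so Gnuplot receives at most width × height points.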
242c83e590 extract constants for gnuplot margins in px 2019-09-08 08:34:14 +02:00
162ef1626c reduce memory usage for computation of cumulative distribution
Before: To compute the cumulative distribution we added every duration
into a LongList. This requires O(n) memory, where n is the number of
values.

Now: We store the durations + the number of occurrences in a
LongLongHashMap. This has the potential to reduce the memory
requirements if durations occur multiple times. There are a lot of
durations with 0, 1, 2 milliseconds. In the worst case every duration
is different. In that case the memory usage doubled with this solution.

Future: We are currently storing durations with millisecond precision.
We don't have to do that. We cannot draw 100 million different values
on the y-axis in an image with only 1000px.
2019-09-07 18:31:18 +02:00
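
A minimal sketch of the "Now" approach, storing each distinct duration together with its number of occurrences. It uses a TreeMap from the JDK as a stand-in for the LongLongHashMap named in the commit (boxed keys and values cost more memory than a primitive map would), and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class CumulativeDistribution {
    // duration -> number of occurrences, sorted by duration
    private final TreeMap<Long, Long> counts = new TreeMap<>();
    private long total;

    void add(long duration) {
        counts.merge(duration, 1L, Long::sum);
        total++;
    }

    // Smallest duration d such that at least 'percent' percent of all values are <= d.
    long percentile(double percent) {
        if (total == 0) {
            throw new IllegalStateException("no values");
        }
        long needed = Math.max(1, (long) Math.ceil(total * percent / 100.0));
        long seen = 0;
        for (Map.Entry<Long, Long> e : counts.entrySet()) {
            seen += e.getValue();
            if (seen >= needed) {
                return e.getKey();
            }
        }
        return counts.lastKey();
    }

    // Memory grows with the number of distinct durations, not with the number of values.
    int distinctValues() {
        return counts.size();
    }
}
```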
0e9e2cd53a remove dependency to Guava 2019-09-01 15:44:36 +02:00
a174ec21ad increase contrast between scatter and cumulative distribution 2019-08-26 20:38:57 +02:00
79c28f2f9e hide tics in gallery view 2019-08-25 18:56:08 +02:00
4f61d91c79 draw tic for max value on y-axis only if it makes sense 2019-08-25 15:40:20 +02:00
a905c608aa increase maximal allowed duration for 'parallel requests' plot 2019-08-25 10:47:23 +02:00
5b57417f75 make the tics on the y-axis easier to read
People have trouble understanding durations like
100000 or 2.7E+6 milliseconds. Therefore we are
changing the labels on the y-axis to include the unit
in the tic's label. We also use multiples of seconds,
minutes, hours and days instead of multiples of 10.
2019-08-25 10:25:47 +02:00
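
One way such labels could be produced is sketched below. The unit abbreviations and the rule "use the largest unit that divides the value evenly" are assumptions made for illustration; the commit does not say how the labels are actually formatted.

```java
final class DurationLabel {
    private static final long[] SIZES = {86_400_000L, 3_600_000L, 60_000L, 1_000L};
    private static final String[] UNITS = {"d", "h", "min", "s"};

    // Formats a duration in milliseconds with the largest unit that divides it evenly.
    static String format(long millis) {
        for (int i = 0; i < SIZES.length; i++) {
            if (millis >= SIZES[i] && millis % SIZES[i] == 0) {
                return (millis / SIZES[i]) + UNITS[i];
            }
        }
        return millis + "ms";
    }

    public static void main(String[] args) {
        // The two examples from the commit message: 100000 ms and 2.7E+6 ms.
        System.out.println(format(100_000L));
        System.out.println(format(2_700_000L));
    }
}
```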
4f57a29c3b increase cache of stringified longs
This should improve the csv generation a little bit.
2019-05-25 17:55:14 +02:00
2eb2a69c17 rename 'percentile' plots to 'cumulative distribution' 2019-05-12 14:30:16 +02:00
59aea1a15f introduce index clustering (part 1)
In order to prevent files from getting too big and
make it easier to implement retention policies, we
are splitting all files into chunks. Each chunk
contains the data for a time interval (1 month per
default).
This first changeset introduces the ClusteredPersistentMap
that implements this for PersistentMap. It is used
for a couple (not all) of indices.
2019-02-24 16:50:57 +01:00
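
The chunking by time interval could look roughly like this. The sketch assumes that a chunk is identified by the calendar month (in UTC) of a timestamp; the class name, the method name and the id format are hypothetical and not taken from ClusteredPersistentMap.

```java
import java.time.Instant;
import java.time.YearMonth;
import java.time.ZoneOffset;

final class ClusterIds {
    // Maps a timestamp to the id of the monthly chunk that holds its data, e.g. "2019-02".
    static String clusterId(long epochMilli) {
        YearMonth month = YearMonth.from(Instant.ofEpochMilli(epochMilli).atZone(ZoneOffset.UTC));
        return month.toString();
    }

    public static void main(String[] args) {
        System.out.println(clusterId(Instant.parse("2019-02-24T15:50:57Z").toEpochMilli()));
    }
}
```

With one file per chunk, a retention policy can drop whole files whose id is older than a cut-off month.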
cc0157fe0b update java 3rd-party libs 2018-11-20 19:13:59 +01:00
f07977c27a update java, gradle and third party libs 2018-09-29 09:08:29 +02:00
24fcfd7763 prepare the addition of a date index 2018-09-28 19:07:01 +02:00
a2e63cca44 cleanup 2018-09-13 08:11:15 +02:00
2e433ba969 cleanup 2018-09-13 07:52:14 +02:00
1182d76205 replace the FolderStorage with DiskStorage
- The DiskStorage uses only one file instead of millions.
  Also the block size is only 512 byte instead of 4kb, which
  helps to reduce the memory usage for short sequences.
- Update primitiveCollections to get the new LongList.range
  and LongList.rangeClosed methods.
- BSFile now stores Time&Value sequences and knows how to
  encode the time values with delta encoding.
- Doc had to do some magic tricks to save memory. The path
  was initialized lazily and stored as a byte array. This is no
  longer necessary. The path was replaced by the
  rootBlockNumber of the BSFile.
- Had to temporarily disable the 'in' queries.
- The stored values are now processed as stream of LongLists
  instead of Entry. The overhead for creating Entries is
  gone, so is the memory overhead, because Entry was an
  object and had a reference to the tags, which is
  unnecessary.
2018-09-12 09:35:07 +02:00
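
Delta encoding of time values, as mentioned for BSFile above, can be sketched in a few lines. This shows only the general technique on a plain long array; the names are hypothetical and the actual on-disk format of BSFile is not described in the commit.

```java
final class DeltaEncoding {
    // Replaces each value by its difference to the previous value (the first one is kept as is).
    // Sorted timestamps turn into small numbers that need fewer bytes when stored.
    static long[] encode(long[] values) {
        long[] deltas = new long[values.length];
        long previous = 0;
        for (int i = 0; i < values.length; i++) {
            deltas[i] = values[i] - previous;
            previous = values[i];
        }
        return deltas;
    }

    // Restores the original values by summing up the deltas.
    static long[] decode(long[] deltas) {
        long[] values = new long[deltas.length];
        long previous = 0;
        for (int i = 0; i < deltas.length; i++) {
            previous += deltas[i];
            values[i] = previous;
        }
        return values;
    }
}
```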
2a68fd72da fix: diagonal line in parallelRequests plot 2018-08-18 12:31:11 +02:00
b01d267300 update primitiveCollections
The new version of primitiveCollections requires Java 10.
2018-08-18 08:32:27 +02:00
c1974d21b2 replace startDate + dateRange with start and end date
The new datetimepicker can be used to specify date ranges. We no longer
need to define a start date and a range. This simplifies the code
for zooming and shifting considerably.
2018-08-11 17:45:20 +02:00
58623c480f switch date picker to http://www.daterangepicker.com version 3.0.3 2018-08-11 09:09:53 +02:00
14d9216e40 create margins with constant size
With this we will be able to zoom in by selecting a region. The constant
margins allow us to determine the exact timestamp for a pixel position.
2018-08-10 09:56:57 +02:00
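
With margins of known, constant size, the mapping from a pixel position to a timestamp is a linear interpolation. The sketch below assumes a linear time axis; the parameter names and the clamping to the plot area are illustrative choices, not taken from the repository.

```java
final class PixelToTime {
    // Maps the x position of a pixel in the image to a timestamp on the x-axis.
    static long timestampAt(int pixelX, int imageWidth, int leftMargin, int rightMargin,
            long fromEpochMilli, long toEpochMilli) {
        int plotWidth = imageWidth - leftMargin - rightMargin;
        // Positions inside the margins are clamped to the border of the plot area.
        int offset = Math.max(0, Math.min(plotWidth, pixelX - leftMargin));
        double fraction = (double) offset / plotWidth;
        return fromEpochMilli + Math.round(fraction * (toEpochMilli - fromEpochMilli));
    }
}
```

Zooming in by selecting a region then amounts to calling this once for each edge of the selection.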
786570503a make plot for parallel requests easier to digest
1. draw it below the scatter plot, so that you can see both
2. make the color lighter, so that you can see both
2018-08-10 09:27:19 +02:00
2fae877444 plot parallel requests with style filledcurve
this makes the duration of requests more obvious
2018-08-09 07:51:56 +02:00
7ece779469 do not draw diagonal lines in ParallelRequestsAggregator 2018-08-09 07:37:29 +02:00
f30a8a26d9 add aggregator for parallel requests
ParallelRequestsAggregator generates a line plot that shows the number
of parallel requests among the plotted events.
This plot has two issues:
1. It only considers events that are plotted. Events that occur later,
   but were started within the plotted time frame are not considered.
2. For performance reasons we are only plotting points when a value
   changed. This leads to diagonal lines.
2018-08-09 07:24:51 +02:00
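
Counting parallel requests can be done with a sweep over start and end times. The sketch below is an assumption about how such an aggregator might work, not the code of ParallelRequestsAggregator: each request is a hypothetical {start, end} pair in epoch milliseconds, and the result contains one point per time at which a request starts or ends.

```java
import java.util.Map;
import java.util.TreeMap;

final class ParallelRequests {
    // Returns, for every time at which a request starts or ends,
    // the number of requests running from that time on.
    static TreeMap<Long, Integer> countOverTime(long[][] requests) {
        // +1 at every start, -1 at every end, summed per timestamp
        TreeMap<Long, Integer> deltas = new TreeMap<>();
        for (long[] request : requests) {
            deltas.merge(request[0], 1, Integer::sum);
            deltas.merge(request[1], -1, Integer::sum);
        }
        TreeMap<Long, Integer> result = new TreeMap<>();
        int running = 0;
        for (Map.Entry<Long, Integer> e : deltas.entrySet()) {
            running += e.getValue();
            result.put(e.getKey(), running);
        }
        return result;
    }
}
```

Because the result only holds points where the count changes, connecting them with straight lines produces the diagonal lines named in issue 2; drawing them as steps would avoid that.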