We generate CSV files with a comma as separator.
When we write times with milliseconds, we use
floating point numbers. Depending on the locale,
those floating point numbers may be written with
a comma instead of a point. If that happens, the
plots are messed up.
Fixed by enforcing the locale when formatting floats.
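A minimal sketch of the idea, assuming plain String.format (the actual
formatting code may look different):

    import java.util.Locale;

    // Format seconds with millisecond precision. Locale.ROOT guarantees
    // a decimal point, independent of the user's default locale.
    static String formatSeconds(double seconds) {
        return String.format(Locale.ROOT, "%.3f", seconds);
    }

    // formatSeconds(1.5) -> "1.500" even in a locale like de_DE, where
    // the default formatting would produce "1,500".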
This makes it easier to use the mouse wheel
to zoom in. Without it you could zoom into
a region that had no data and then you had to
use the date picker to change the date.
Rendering plots with millions of values is expensive. Before this fix
we wrote all values into CSV files. The CSV files were then read by
Gnuplot, which did the rendering. But in an image with n×m pixels there
can only be n·m different values. In most realistic scenarios many
values will be drawn onto the same pixels. So we are wasting time by
first generating the CSV for far too many values and then by parsing
that CSV again.
Fixed by using a sparse 2D array to de-duplicate the values before
they get written to the CSV. The additional time we spend de-duplicating
is often smaller than the time saved when writing the CSV, so the
total CSV writing is about as 'fast' as before (sometimes a little
faster, sometimes a little slower). But the time Gnuplot needs for
rendering is reduced drastically. The factor depends on the data, of
course. We have seen a factor of 50 for realistic examples, making a
15 s job run in 300 ms.
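A minimal sketch of the de-duplication idea (hypothetical names; the
sparse 2D array is sketched here as a map from x-pixel to a lazily
allocated BitSet of y-pixels):

    import java.util.BitSet;
    import java.util.HashMap;
    import java.util.Map;

    final class PixelDeduplicator {
        private final Map<Integer, BitSet> columns = new HashMap<>();

        /** Returns true only for the first value that maps to this pixel. */
        boolean firstHit(int pixelX, int pixelY) {
            BitSet column = columns.computeIfAbsent(pixelX, x -> new BitSet());
            if (column.get(pixelY)) {
                return false;  // an earlier value already covers this pixel
            }
            column.set(pixelY);
            return true;       // this value gets written to the CSV
        }
    }

Only values for which firstHit returns true are written to the CSV, so
Gnuplot never sees more points than there are pixels.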
Before: To compute the cumulative distribution we added every duration
into a LongList. This requires O(n) memory, where n is the number of
values.
Now: We store the durations and their number of occurrences in a
LongLongHashMap. This has the potential to reduce the memory
requirements when durations occur multiple times, and there are a lot
of durations of 0, 1, or 2 milliseconds. In the worst case every
duration is different; in that case the memory usage doubles with this
solution.
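A minimal sketch of the counting approach, assuming an Eclipse
Collections-style LongLongHashMap (the primitiveCollections API used
here may differ slightly):

    import org.eclipse.collections.impl.map.mutable.primitive.LongLongHashMap;

    // Count how often each duration (in ms) occurs instead of appending
    // every single value to a LongList. Memory grows with the number of
    // distinct durations, not with the number of events.
    final class DurationHistogram {
        private final LongLongHashMap counts = new LongLongHashMap();

        void record(long durationMillis) {
            counts.addToValue(durationMillis, 1L);  // missing keys start at 0
        }

        long occurrencesOf(long durationMillis) {
            return counts.get(durationMillis);
        }
    }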
Future: We are currently storing durations with millisecond precision.
We don't have to do that. We cannot draw 100 million different values
on the y-axis in an image with only 1000 px anyway.
People have trouble understanding durations like
100000 or 2.7E+6 milliseconds. Therefore we are
changing the labels on the y-axis to include the unit
in the tic labels. We also use multiples of seconds,
minutes, hours, and days instead of multiples of 10.
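A minimal sketch of how such a tic label could be produced
(hypothetical helper, not the actual implementation):

    import java.util.Locale;

    // Turn a duration in milliseconds into a readable tic label, using
    // the largest unit that fits.
    static String durationLabel(long millis) {
        if (millis >= 86_400_000L) {
            return String.format(Locale.ROOT, "%dd", millis / 86_400_000L);
        }
        if (millis >= 3_600_000L) {
            return String.format(Locale.ROOT, "%dh", millis / 3_600_000L);
        }
        if (millis >= 60_000L) {
            return String.format(Locale.ROOT, "%dmin", millis / 60_000L);
        }
        if (millis >= 1_000L) {
            return String.format(Locale.ROOT, "%ds", millis / 1_000L);
        }
        return millis + "ms";
    }

    // durationLabel(2_700_000L) -> "45min" instead of "2.7E+6".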
In order to prevent files from getting too big and
to make it easier to implement retention policies, we
are splitting all files into chunks. Each chunk
contains the data for a time interval (one month by
default).
This first changeset introduces the ClusteredPersistentMap,
which implements this for PersistentMap. It is used
for some (but not all) of the indices.
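A minimal sketch of the chunking idea (hypothetical helper; the actual
ClusteredPersistentMap API may look different):

    import java.time.Instant;
    import java.time.YearMonth;
    import java.time.ZoneOffset;

    // Map a timestamp to the chunk that stores it. With monthly chunks
    // the chunk key is simply the year and month of the timestamp, and a
    // retention policy can delete whole chunk files.
    static YearMonth chunkKeyFor(long epochMillis) {
        return YearMonth.from(
                Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC));
    }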
- The DiskStorage uses only one file instead of millions.
Also the block size is only 512 bytes instead of 4 KB, which
helps to reduce the memory usage for short sequences.
- Update primitiveCollections to get the new LongList.range
and LongList.rangeClosed methods.
- BSFile now stores Time&Value sequences and knows how to
encode the time values with delta encoding (see the sketch
after this list).
- Doc had to do some magic tricks to save memory. The path
was initialized lazily and stored as a byte array. This is no
longer necessary. The path was replaced by the
rootBlockNumber of the BSFile.
- Had to temporarily disable the 'in' queries.
- The stored values are now processed as a stream of LongLists
instead of Entries. The overhead for creating Entries is
gone, and so is the memory overhead, because Entry was an
object and had a reference to the tags, which is
unnecessary.
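A minimal sketch of the delta encoding of time values (illustrative
only; the actual BSFile encoding may differ):

    // Timestamps in a sequence are mostly increasing and close together,
    // so storing the difference to the previous value yields much
    // smaller numbers than storing absolute epoch millis.
    static long[] deltaEncode(long[] times) {
        long[] encoded = new long[times.length];
        long previous = 0L;
        for (int i = 0; i < times.length; i++) {
            encoded[i] = times[i] - previous;  // first value stays absolute
            previous = times[i];
        }
        return encoded;
    }

    static long[] deltaDecode(long[] encoded) {
        long[] times = new long[encoded.length];
        long previous = 0L;
        for (int i = 0; i < encoded.length; i++) {
            previous += encoded[i];
            times[i] = previous;
        }
        return times;
    }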
The new datetimepicker can be used to specify date ranges. We no longer
need to define a start date and a range. This simplifies the code
for zooming and shifting considerably.
ParallelRequestsAggregator generates a line plot that shows the number
of parallel requests among the plotted events.
This plot has two issues:
1. It only considers events that are plotted. Events that occur later
but were started within the plotted time frame are not considered.
2. For performance reasons we only plot points when the value
changes. This leads to diagonal lines.
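A minimal sketch of how the number of parallel requests can be derived
from start and end times (hypothetical helper; not necessarily how
ParallelRequestsAggregator computes it):

    import java.util.Map;
    import java.util.TreeMap;

    // Sweep over all start/end timestamps: +1 when a request starts,
    // -1 when it ends. The running sum is the number of parallel
    // requests at each point in time.
    static TreeMap<Long, Integer> parallelRequests(long[] starts, long[] ends) {
        TreeMap<Long, Integer> deltas = new TreeMap<>();
        for (long start : starts) {
            deltas.merge(start, 1, Integer::sum);
        }
        for (long end : ends) {
            deltas.merge(end, -1, Integer::sum);
        }
        TreeMap<Long, Integer> parallel = new TreeMap<>();
        int running = 0;
        for (Map.Entry<Long, Integer> e : deltas.entrySet()) {
            running += e.getValue();
            parallel.put(e.getKey(), running);  // value only changes here
        }
        return parallel;
    }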
Unfortunately the datetime picker does not support seconds. But it is
one of the few that support date and time and are flexible enough to
be used with VueJS.
- split the 'sortby' select field into two fields
- sort by average
- legend shows plotted and total values in the date range
- removed InlineDataSeries, because it was not used anymore
In logarithmic plots it is not easy to spot which events are longer than
60 seconds. A thin grey line helps with this.
Also: fixed the marker lines (zoom area), which broke when y-ranges were
introduced. The lines were not drawn correctly when the y-axis offset
was greater than 0.
Gnuplot supports different coordinate systems. 'graph' is relative to the
area within the axes: 0,0 is bottom left and 1,1 is top right. And
'first' is based on the values of the x1 axis.
By using "graph 0.25,0" we say that the starting point is at 25% of the
x-axis and 0% of the y-axis without having to compute the exact values.
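A minimal sketch of a marker line expressed in 'graph' coordinates
(illustrative only; the generated script may differ):

    import java.util.Locale;

    // Draw a vertical marker line at a fraction of the plot width. With
    // the 'graph' coordinate system we never need to know the actual
    // axis values.
    static String markerLine(double fractionOfWidth) {
        return String.format(Locale.ROOT,
                "set arrow from graph %.2f, graph 0 to graph %.2f, graph 1"
                + " nohead lc rgb 'grey'",
                fractionOfWidth, fractionOfWidth);
    }

    // markerLine(0.25) ->
    // set arrow from graph 0.25, graph 0 to graph 0.25, graph 1 nohead lc rgb 'grey'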
Sometimes it is useful to specify a certain y-axis range. For example
when you are only interested in the values that take longer than a
threshold. Or when you want to exclude some outliers. When you want to
compare plots in a gallery, it is very handy if all plots have the
same data area.
Eventually we want to support only what is now called an aggregate and
not have to implement different plot types. So instead of supporting
percentile plots for dashboards, I removed them. You can still get
percentile plots together with the scatter plot.
Scaling big plots to small thumbnails results in bad images that barely
show any details.
We solve this by calling gnuplot a second time to generate the
thumbnails. They don't have any labels and are rendered at the required
size, so that no scaling is necessary.
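A minimal sketch of the second gnuplot run (hypothetical helper; the
real script generation is more involved):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.List;
    import java.util.Locale;

    // Render the thumbnail at its final pixel size and without tics or
    // legend, so no scaling step is needed afterwards.
    static void renderThumbnail(String plotCommands, Path png,
                                int width, int height)
            throws IOException, InterruptedException {
        String script = String.format(Locale.ROOT,
                "set terminal pngcairo size %d,%d%n"
                + "set output '%s'%n"
                + "unset xtics%n"
                + "unset ytics%n"
                + "unset key%n"
                + "%s%n",
                width, height, png, plotCommands);
        Path scriptFile = Files.createTempFile("thumbnail", ".gp");
        Files.writeString(scriptFile, script);
        new ProcessBuilder(List.of("gnuplot", scriptFile.toString()))
                .inheritIO()
                .start()
                .waitFor();
    }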
Thumbnails have to be requested explicitly, because it can be expensive
to compute them.