the heap refill code was recursively implemented with two methods.
I merged both methods.
replace recursion in heap refill method with iterative approach
use array for list of LongQueue
This way there is no precondition when accessing the elements
In many use cases it is more efficient to have a getter that returns
a default value instead of throwing an exception. The exception forces
every consumer to call containsKey() before calling get(). In cases
where the consumer wants to use a default value this is unnecessary.
Using the unsafe versions of add/get to improve performance.
Here are some numbers for 2^15 intersections of two random
sorted lists with 16k elements. The numbers were gathered with
perf stat -d -d -d --delay 2000 java -ea -cp bin/test:bin/main
org.lucares.collections.Test 16000 15 (the test class is not committed)
Duration: 9059ms -> 7293ms (80.5%)
Cycles: 26.084.812.051 -> 21.616.608.207 (82.9%)
Instructions: 68.045.848.666 -> 52.306.000.150 (76.9%)
Instructions per Cycle: 2,61 -> 2.42
Branches: 15.007.093.940 -> 9.839.481.658 (65.6%)
Branch Misses: 2.285.461 -> 1.551.906
Cycles per element: 24.87 -> 20.61 (82.9%)
Added getUnsafe and addUnsafe, two methods that skip checks.
The checks are not needed in unionSorted, because we made
sure the code works correctly. getUnsafe still has the checks,
but as assertions.
The speedup is roughly 20%. Here are some results for calling
union 2^16 times for two random sorted lists with 16k
elements:
Duration: 20170 -> 16857 (83.57%)
Instructions: 138.277.743.679 -> 108.474.027.213 (78.4%)
Instructions per cycle: 2,33 -> 2,18
Branches: 30.248.330.644 -> 19.908.207.057 (65.8%)
Branch Misses: 3.012.427 -> 3.477.433
Cycles per list element: 65.93 -> 51.72 (78.4%)
Created with (the test program is not committed) on a Core2Duo 8600:
perf stat -d -d -d --delay 2000 java -cp bin/test:bin/main
org.lucares.collections.Test 16000 16
Use case: The list is used as a buffer, that is re-used and in
each iteration the list is cleared.
Now that clear() does not replace the data array there is no
garbage collection and we do not have to allocated one or
several new arrays in the next iteration.
You can still free the associated memory by calling clear() + trim().