Numbers are scores for specific handler/warning level cases.
Scores are derived from warning counts per warning level,
mapped onto a 0..100 range. Higher is better.
Numbers are scores for specific handler/test cases.
Scores are derived from benchmark speed and memory usage data,
mapped onto a 0..100 range. Higher is better.
Higher numbers are better on both axes. The "good" zone is the upper right and the "bad" zone is the lower left.
The top is fast, the bottom is slow. Left means more warnings, right means fewer.
Scoring Algorithms
The algorithms behind the scores shown on this page are somewhat arbitrary.
The original scoring algorithm (Default) was deemed "good enough",
but later work has focused on enabling multiple scoring algorithms.
These can be found on the Home page or
in the Scoring drop-down in the upper right section of every page.
Algorithms are implemented by "scorekeepers".
Each scorekeeper is specified by the two axes shown in the scoring chart.
Each axis interprets test data according to its own algorithm.
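As a rough picture of that structure, here is a minimal Go sketch; the names
Axis, Score, and ScoreKeeper are assumptions for illustration, not the actual
go-slog types.

    package score

    // Hypothetical sketch of the scorekeeper concept described above.
    // Each axis turns raw test data for a handler into a 0..100 score.
    type Axis interface {
        // Score returns a value in 0..100 for the named handler; higher is better.
        Score(handler string) float64
    }

    // ScoreKeeper pairs the two axes plotted on the scoring chart,
    // e.g. warnings (X) versus benchmarks (Y) for the Default scorekeeper.
    type ScoreKeeper struct {
        X, Y Axis
    }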
The current scorekeeper and axis algorithms are described below:
Score Keeper: Default
The Default scoring algorithm is the original (and initially the only) scoring algorithm.
This algorithm uses benchmark results and verification warnings to generate scores.
The Default score chart graphs various slog handlers by speed versus functionality.
This concept was the impetus behind creating scoring algorithms and charts.
On this chart the X axis is a warning score and
the Y axis is a benchmark (performance) score for each handler.
Going by the "score" values can be misleading,
as they roll up a lot of different data items, hiding the detail.
Use the checkboxes on the Scores table at the top to make detail tables visible.
Further buttons above the detail tables show different classes of data.
X Axis: Warnings
The X-axis for the Default scoring chart shows the score derived from verification warnings.
The score is calculated using the score weights shown to the right,
which are applied to the warning levels.
Handlers are scored based on how few warnings are generated.
Warnings are worth different amounts depending on their warning level.
The weights applied during this process are shown on the right.
Source Data
Each handler results in a lot of verification test output:
Warnings for slog/JSONHandler:
  Suggested
     2 [Duplicates] Duplicate field(s) found
         TestAttributeDuplicate: map[alpha:2 charlie:3]
         TestAttributeWithDuplicate: map[alpha:2 charlie:3]
The Suggested line is an example of a warning "level"
(warnings are grouped into levels on the warning page).
In this example there are two instances of the Duplicates warning.
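The parsed warning data could be modeled along these lines; this is a minimal
sketch with assumed names, not the actual structures used by the verification
harness.

    package score

    // Hypothetical model of one parsed verification warning; field names are
    // assumptions based on the example output above.
    type Warning struct {
        Level     string   // warning level, e.g. "Suggested"
        Name      string   // warning name, e.g. "Duplicates"
        Summary   string   // e.g. "Duplicate field(s) found"
        Instances []string // one entry per test that triggered the warning
    }

    // Count reports how many instances of the warning were seen
    // (2 in the Duplicates example above).
    func (w Warning) Count() int {
        return len(w.Instances)
    }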
Warnings Algorithm
Scoring is done for all handlers at the same time:
for each handler:
    score starts at zero
    for each warning level:
        for each warning in the level:
            if the warning shows up during testing:
                score = score + weight(level) * number of instances of the warning
    adjust the score to the range from zero to
        the maximum possible number of warnings
Where the weight(level) comes from the predefined table shown above and to the right.
The weighted total for each handler is then divided by the maximum possible
total that any handler might receive (if it were really awful),
and the resulting fraction, expressed as a percentage, is subtracted from 100.0.
This results in a number from 0.0 (awful, all warnings logged) to 100.0 (no warnings at all).
That number is stored for use and displayed on this page and on each handler page.
Note that most scores are above ~40, as it is difficult for a handler to trigger every possible warning.
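As a concrete illustration, here is a minimal Go sketch of the warnings
calculation, using the level weights from the weight table below; the names are
illustrative rather than the go-slog implementation, and the maximum possible
total is assumed to be computed elsewhere.

    package score

    // Hypothetical representation of the warning level weights from the table
    // below (Required 8, Implied 4, Suggested 2, Administrative 1).
    var levelWeight = map[string]float64{
        "Required":       8,
        "Implied":        4,
        "Suggested":      2,
        "Administrative": 1,
    }

    // warningScore sketches the algorithm above for a single handler.
    // counts maps each warning level to the number of warning instances seen;
    // maxPossible is the weighted total a handler would accumulate if it
    // triggered every possible warning.
    func warningScore(counts map[string]int, maxPossible float64) float64 {
        var total float64
        for level, n := range counts {
            // Each warning instance contributes the weight of its level.
            total += levelWeight[level] * float64(n)
        }
        // Express the total as a percentage of the worst case and invert it:
        // 100 means no warnings at all, 0 means every possible warning fired.
        return 100.0 - 100.0*total/maxPossible
    }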
Scores
Multiple scores are generated for each handler.
The main (or "default") score is shown in the data tables
in the column headed Score, which has an associated checkbox.
The checkbox can be used to show several other "score" columns, as follows:
Default (Score)
This is the score that is shown in the overall chart
at the top of the page in the column labeled Warnings.
The default score is the same as the By Data score.
By Data
This score is calculated by rolling up scores calculated per warning level.
Original
This is the "original" score, which has since been superseded by newer code.
The Original score is within 5% of the By Data value.
Level            Weight
Required              8
Implied               4
Suggested             2
Administrative        1
Y Axis: Benchmarks
The Y-axis for the Default scoring chart shows the score derived from running benchmarks.
The score is calculated using the score weights shown to the right,
which are applied to specific benchmark result values.
Handler benchmarks are scored on several metrics for each of the various tests.
Metrics are worth different amounts depending on what they are.
The weights applied during this process are shown on the right.
Source Data
Each combination of handler and test results in a single line of test output,
from which several data items are parsed, including
separate memory allocations per operation (1 allocs/op) and
estimated logging throughput per second (284.33 MB/s).
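Such a line could be parsed roughly as follows; this sketch assumes standard
go test benchmark output (with -benchmem) and is not the parser actually used
to produce these pages.

    package score

    import (
        "strconv"
        "strings"
    )

    // BenchResult holds the data items parsed from one benchmark output line;
    // the type and field names here are illustrative only.
    type BenchResult struct {
        NsPerOp     float64 // nanoseconds per operation
        BytesPerOp  float64 // memory bytes allocated per operation
        AllocsPerOp float64 // separate memory allocations per operation
        MBPerSec    float64 // estimated logging throughput per second
    }

    // parseBenchLine scans value/unit pairs such as "1 allocs/op" and
    // "284.33 MB/s" from a single benchmark output line.
    func parseBenchLine(line string) BenchResult {
        var r BenchResult
        fields := strings.Fields(line)
        for i := 1; i < len(fields); i++ {
            value, err := strconv.ParseFloat(fields[i-1], 64)
            if err != nil {
                continue // the previous field was not a number
            }
            switch fields[i] {
            case "ns/op":
                r.NsPerOp = value
            case "B/op":
                r.BytesPerOp = value
            case "allocs/op":
                r.AllocsPerOp = value
            case "MB/s":
                r.MBPerSec = value
            }
        }
        return r
    }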
Benchmark Algorithm
For each handler/test combination (a single line of test results)
we use one or more of the following three data items:
nanoseconds per operation,
memory bytes allocated per operation, and
separate memory allocations per operation.
These three items are combined over two steps.
First the test value ranges are acquired:
for each test:
    for each handler:
        for each of the three results described above:
            track the highest and lowest value for the test over all handlers
Then the test scores are calculated (this is the Original calculation):
for each handler:
    for each test:
        scorePerTest starts at zero
        for each of the three results described above:
            convert the value to a fraction of the range of values
                for the test from the previous step
            scorePerTest = scorePerTest + weight(result) * 100.0 * the fraction
        scorePerTest /= sum of weight(result)
    scorePerHandler = average of scorePerTest for handler
Where the weight(result) comes from the predefined table shown above and to the right.
There is currently no weighting by test; all tests are considered equal.
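The two steps above might be sketched in Go as follows. This is an
illustrative outline rather than the actual scoring code: the weight values
are placeholders for the table referenced above, and it assumes each value's
range fraction is inverted so that lower (better) raw values yield higher
fractions.

    package score

    // resultWeight stands in for the predefined weight table referenced above;
    // the values here are placeholders, not the actual weights shown on the page.
    var resultWeight = map[string]float64{
        "ns/op":     1,
        "B/op":      1,
        "allocs/op": 1,
    }

    // valueRange tracks the lowest and highest value seen for one data item
    // of one test across all handlers (step one above).
    type valueRange struct{ low, high float64 }

    // testScore sketches step two for a single handler/test combination:
    // each data item is converted to a fraction of its range (inverted so the
    // best, i.e. lowest, raw value yields the highest fraction), weighted,
    // summed, and divided by the sum of the weights.
    func testScore(values map[string]float64, ranges map[string]valueRange) float64 {
        var score, weightSum float64
        for item, value := range values {
            r := ranges[item]
            fraction := 0.0
            if r.high > r.low {
                fraction = (r.high - value) / (r.high - r.low)
            }
            score += resultWeight[item] * 100.0 * fraction
            weightSum += resultWeight[item]
        }
        return score / weightSum
    }

    // handlerScore averages the per-test scores for one handler;
    // there is no weighting by test, so all tests count equally.
    func handlerScore(perTest []float64) float64 {
        var sum float64
        for _, s := range perTest {
            sum += s
        }
        return sum / float64(len(perTest))
    }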
Scores
Multiple scores are generated for each handler.
The main (or "default") score is shown in the data tables
in the column headed Score, which has an associated checkbox.
The checkbox can be used to show several other "score" columns, as follows:
Default (Score)
This is the score that is shown in the overall chart
at the top of the page in the column labeled Benchmarks.
The default score is the average of the By Test and By Data scores.
By Test
This score is calculated by rolling up scores calculated per test.
By Data
This score is calculated by rolling up scores calculated per data item.
Original
This is the "original" score, which has since been superseded by newer code.
The Original score is within 5% of the By Data value.