Error and Cost Definitions
Error Definitions
When a reliable reference solution exists, errors in the submitted quantities of interest are calculated relative to that reference. The relative error of the tissue (and total) absorbed power is [1]
\[
\mathrm{err}_{\bar{P}_{\mathrm{tissue}}} = \frac{\left|\bar{P}^{\mathrm{num}}_{\mathrm{tissue}} - \bar{P}^{\mathrm{ref}}_{\mathrm{tissue}}\right|}{\bar{P}^{\mathrm{ref}}_{\mathrm{tissue}}}.
\]
This error measure reflects only the accuracy of the tissue (total) absorbed power, which is an important quantity for calculating the tissue-specific (whole-body) specific absorption rate, but it does not reflect errors in the distribution of the absorbed power [1].
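As a concrete illustration, a minimal Python sketch of this relative-error computation follows (the function and variable names are illustrative, not part of the benchmark):

def relative_power_error(p_num: float, p_ref: float) -> float:
    """Relative error of the tissue (or total) absorbed power."""
    return abs(p_num - p_ref) / p_ref

# Example: a solver reports 1.02 W of tissue absorbed power against a 1.00 W reference.
print(relative_power_error(1.02, 1.00))  # ~0.02, i.e., a 2% relative error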
The $L_1$-norm error of the cell-averaged, time-averaged absorbed power density $\bar{\bar{P}}$, defined as
\[
\mathrm{err}_{\bar{\bar{P}}}^{L_1} = \frac{\sum_{\forall k}\left|\bar{\bar{P}}^{\mathrm{num}}_{\mathrm{cell}\,k} - \bar{\bar{P}}^{\mathrm{ref}}_{\mathrm{cell}\,k}\right|}{\sum_{\forall k}\bar{\bar{P}}^{\mathrm{ref}}_{\mathrm{cell}\,k}},
\]
quantifies the accuracy of the solution over the entire domain and is the preferred measure of error in the benchmark [1].
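A minimal NumPy sketch of this cell-wise $L_1$ error, assuming the power densities are stored as one value per cell (array names are illustrative):

import numpy as np

def l1_power_density_error(p_num: np.ndarray, p_ref: np.ndarray) -> float:
    # p_num, p_ref: cell-averaged, time-averaged absorbed power densities,
    # one entry per cell k, for the numerical and reference solutions.
    return float(np.sum(np.abs(p_num - p_ref)) / np.sum(p_ref))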
The bistatic radar cross section accuracy is measured in the benchmark using the $L_2$-norm [9], [10]:
\[
\mathrm{err}_{\sigma_{\theta\theta}}^{L_2} = \sqrt{\frac{\displaystyle\int_0^{2\pi}\!\!\int_0^{\pi} \left|\sigma^{\mathrm{num}}_{\theta\theta}(\theta,\phi) - \sigma^{\mathrm{ref}}_{\theta\theta}(\theta,\phi)\right|^2 \sin\theta\, d\theta\, d\phi}{\displaystyle\int_0^{2\pi}\!\!\int_0^{\pi} \left|\sigma^{\mathrm{ref}}_{\theta\theta}(\theta,\phi)\right|^2 \sin\theta\, d\theta\, d\phi}}.
\]
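The two spherical integrals can be approximated numerically once the RCS is sampled on a $(\theta, \phi)$ grid; the sketch below assumes uniform sampling and a simple rectangle-rule quadrature (the function name and grid layout are illustrative):

import numpy as np

def l2_rcs_error(sigma_num: np.ndarray, sigma_ref: np.ndarray,
                 theta: np.ndarray, phi: np.ndarray) -> float:
    """L2-norm error of the bistatic RCS sigma_theta_theta.

    sigma_num, sigma_ref: samples on a (len(theta), len(phi)) grid,
    with theta uniformly spaced in [0, pi] and phi in [0, 2*pi).
    """
    w = np.sin(theta)[:, None]                      # sin(theta) Jacobian of the sphere
    num = np.sum(np.abs(sigma_num - sigma_ref) ** 2 * w)
    den = np.sum(np.abs(sigma_ref) ** 2 * w)
    # The uniform cell area dtheta*dphi cancels in the ratio.
    return float(np.sqrt(num / den))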
Cost Definitions
The following costs should be reported for each benchmark problem:
(i) Run time
This is the wall-clock time required to solve the problem and should be divided into three parts: preprocess, solve, and postprocess times. The preprocess time is the time required to set up the problem (e.g., reading geometry data or filling a matrix); this work is performed only once even when multiple excitations are solved. The solve time is the time required to solve the problem, and the postprocess time is the time required to derive the quantities of interest after the solution is completed. Of these, the solve time is often the most important, since the preprocessing cost can be amortized over multiple excitations and postprocessing typically outputs only a few quantities.
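One way to record the three times is with wall-clock timers around each phase, as in this self-contained Python sketch (the three phase functions are placeholders standing in for an actual solver):

import time

def preprocess():   time.sleep(0.10)  # placeholder: read geometry, fill the matrix
def solve():        time.sleep(0.20)  # placeholder: solve for one excitation
def postprocess():  time.sleep(0.05)  # placeholder: derive quantities of interest

t0 = time.perf_counter(); preprocess()
t1 = time.perf_counter(); solve()
t2 = time.perf_counter(); postprocess()
t3 = time.perf_counter()

print(f"preprocess : {t1 - t0:.3f} s")
print(f"solve      : {t2 - t1:.3f} s")
print(f"postprocess: {t3 - t2:.3f} s")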
(ii) Peak memory requirement
The peak memory requirement is the maximum, taken across all nodes, of the peak memory used on any single node.
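In a distributed run this reduces to taking the maximum of the per-node peaks, e.g. (the values below are purely illustrative):

# Peak memory in MB reported by each of four nodes (illustrative values).
per_node_peak_mb = [14200.0, 15890.0, 15310.0, 14975.0]
peak_memory_mb = max(per_node_peak_mb)  # reported peak memory requirement: 15890.0 MB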
(iii) Normalization
Of course, run times depend on the computing environment: using a laptop vs. a supercomputer, running sequentially vs. in parallel with 1000 processes, or compiling with different compiler/library options can produce run times that differ by many orders of magnitude. Moreover, most computational methods, especially parallel ones, can be operated anywhere between two extremes: fastest run time and highest efficiency (cheapest computational cost/power use). When the fastest run time is of interest, the method’s efficiency should be expected to be rather low; when the most efficient solution is of interest, the run times should be expected to be rather high [11], [12].
To compare numerical methods across different systems, a normalization stage is required to translate run time into a “work” quantity. There are multiple ways to do this, such as reporting the actual number of flops used by each method, the theoretical number of flops available over the run (i.e., run time $\times$ theoretical performance in flops/s), the physical energy (in joules) required to run the equipment while solving the benchmark problem, or even the actual ownership cost of the hardware used to carry out the computations [13]. We normalize by the theoretical peak performance of the system, which is a straightforward number to report for each system:
\[
\text{normalized run cost (Flop)} = \text{observed run time (s)} \times \text{theoretical peak available compute power (Flop/s)},
\]
\[
\text{normalized memory (MB)} = \text{observed peak memory per node (MB/node)} \times \text{number of nodes used (node)}.
\]
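A worked sketch of both normalizations (all input values are made up for illustration):

# Illustrative normalization of one benchmark run.
run_time_s = 250.0               # observed wall-clock solve time (s)
peak_compute_flops = 3.0e12      # theoretical peak of the system (Flop/s)
peak_mem_per_node_mb = 15890.0   # observed peak memory per node (MB/node)
num_nodes = 4

normalized_run_cost_flop = run_time_s * peak_compute_flops  # 7.5e14 Flop
normalized_memory_mb = peak_mem_per_node_mb * num_nodes     # 63560.0 MB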