In certain situations, a novel but simple technique can be used to dramatically speed up floating-point to integer conversions. In particular, given a double-precision source and integer destination array, we may use either the first or the second two C statements below:
dest[i] = (int)floor(src[i]); /* Conventional version*/
float f = (float)(src[i] + convert_offset);
dest[i] = *(int32 *)&f; /* Optimized version*/
The reason is that an IEEE single precision floating-point number's
mantissa field overlaps the low bits of an integer,
as shown in Figure
.
If the grid size is a power of two, we can convert a
coordinate into a grid location by shifting and extracting
out the appropriate bits of the coordinate.
Floating-point numbers are always normalized
into the form
, so we can shift
the mantissa of a floating point number by simply
adding a constant. For example, given a
floating-point coordinate with binary representation
, after adding
, we get
By adding a constant, we have shifted
the grid location
into the low bits, and by rounding
to single precision we have removed the excess bits
that represent the sub-gridcell location.
The resulting floating-point number can be compared,
incremented, and decremented as if an integer.
The IEEE exponent field corrupts the high bits of this
integer, but these can easily
be subtracted off. Since the origin of the grid coordinates
is arbitrary anyway, we leave the exponent.
The mantissa is only 23 bits, but this is still enough to
represent 8 million grid cells along each axis, or
cells total. Converting a double-precision
floating point number to a 64-bit integer in the same way
would yield a 52-bit grid index.
By adding a carry-under protection bit to the constant, this method can accept both positive and negative input coordinates. By subtracting the coordinate origin, we can accomplish a shifting as well as scaling. We can also scale the constant to extract out a different set of bits, simulating any power-of-two grid size. Finally, we must compensate for IEEE rounding, which rounds to the nearest bitpattern rather than truncating. The ideal constant for this method thus has the form:
Thus given a power-of-two grid size and origin, the optimized C code to compute the conversion constant and then convert a floating point value src to an integer grid cell index dest is as follows.
double convert_offset = (1.5*(1<<23)-0.5)*gridsize-origin;
float f = (float)(src + convert_offset);
int dest = *(int32 *)&f;
Of course, the same method can be applied in any language that allows us to quickly interpret the bits of one data type as bits of another data type.
The table on the next page shows the time per floating point to integer conversion for both the conventional and optimized approaches. The performance difference between the two is dramatic- a factor of 2 to 40, depending on the architecture. Because mapping one bounding box onto a grid requires six such conversions, the total savings by using the optimized method is a microsecond or two per object- a significant speedup.