Document: WG14 N1317
Submitter: Fred Tydeman (USA)
Submission Date: 2008-07-13
Previous version of paper: N1303
Related WG14 documents: N1151, N1171
Subject: New macros for <float.h>
Existing practice: Many implementations have macros (with various spellings) for the minimum subnormal numbers. C99 has DECIMAL_DIG, whose meaning is similar to that of LDBL_MAXDIG10.
The committee rejected the idea of having the xxx_SUBNORMAL_MIN macros be conditionally defined, e.g., only if subnormals are supported.
The committee asked that they be changed to the smallest positive number (either smallest subnormal or smallest normal). In making the requested change, the author renamed the macros to something more meaningful.
A problem with having the macros always defined is that there is no way, at translation time, to determine whether xxx_TRUE_MIN is subnormal or normal. This is because #if (xxx_MIN != xxx_TRUE_MIN) is a constraint violation (a #if expression may only use integer values). So the test for subnormal support must happen at runtime, which incurs a performance penalty.
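The runtime test mentioned above could look something like the following. This is only a sketch: the function name float_has_subnormals is made up for illustration, and the volatile qualifiers are there to discourage the compiler from folding the division at translation time.

```c
#include <float.h>

/* Hypothetical helper, not part of any standard header.
   Divides the smallest normal float by the radix. If subnormals are
   supported (and not flushed to zero), the quotient is a nonzero
   subnormal; otherwise it underflows to zero. */
int float_has_subnormals(void)
{
    volatile float smallest_normal = FLT_MIN;
    volatile float below_normal = smallest_normal / FLT_RADIX;
    return below_normal != 0.0f;
}
```

A caller would have to run this check (or cache its result) before choosing a code path, which is the cost the paragraph above refers to.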
One solution to this problem is a macro, made up of a sum of powers of 2, where each power of 2 represents a different floating type. Something like:
[new paragraph before paragraph 9]: The set of floating types that support subnormal numbers is characterized by the implementation-defined value of SUBNORMAL_FLOAT_TYPES (which is the sum of the values, which are powers of 2, given for each type):
0x00  no floating types
0x01  float
0x02  double
0x04  long double
0x10  _Decimal32
0x20  _Decimal64
0x40  _Decimal128
Of course, we could create names for all those powers of two (similar to MATH_ERR* for math_errhandling in 7.12 <math.h>).
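A translation-time test against such a macro might be sketched as follows. The SUBNORMAL_xxx flag names are hypothetical (the text above only proposes the numeric values), and the fallback definition of SUBNORMAL_FLOAT_TYPES is a stand-in for what an implementation's <float.h> would supply.

```c
/* Hypothetical names for the proposed powers of 2. */
#define SUBNORMAL_FLT     0x01
#define SUBNORMAL_DBL     0x02
#define SUBNORMAL_LDBL    0x04
#define SUBNORMAL_DEC32   0x10
#define SUBNORMAL_DEC64   0x20
#define SUBNORMAL_DEC128  0x40

/* Stand-in value for illustration only: an implementation whose three
   binary floating types all support subnormals. */
#ifndef SUBNORMAL_FLOAT_TYPES
#define SUBNORMAL_FLOAT_TYPES (SUBNORMAL_FLT | SUBNORMAL_DBL | SUBNORMAL_LDBL)
#endif

/* Integer-only expression, so it is valid in #if. */
#if SUBNORMAL_FLOAT_TYPES & SUBNORMAL_DBL
#define DBL_SUBNORMALS_SUPPORTED 1
#else
#define DBL_SUBNORMALS_SUPPORTED 0
#endif

/* Hypothetical accessor so that ordinary code can see the result. */
int dbl_subnormals_supported(void)
{
    return DBL_SUBNORMALS_SUPPORTED;
}
```

Because every operand is an integer constant, the preprocessor can select the code path and no runtime check is needed.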
Another solution is to add: [new bullet after xxx_MIN_EXP]: minimum negative integer such that FLT_RADIX raised to one less than that power is a floating-point number,
FLT_TRUE_MIN_EXP
DBL_TRUE_MIN_EXP
LDBL_TRUE_MIN_EXP
With this solution, do we also need the base 10 exponent macros?
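With the exponent macros, the translation-time test reduces to an integer comparison. The sketch below assumes an IEC 60559 double, for which the proposed DBL_TRUE_MIN_EXP would be -1073 (since 2 raised to -1074 is the smallest subnormal); that value is a stand-in, as no existing <float.h> defines the macro. DBL_TRUE_MIN_IS_SUBNORMAL is likewise a made-up name.

```c
#include <float.h>

/* Stand-in for the proposed macro, assuming IEC 60559 double:
   FLT_RADIX raised to (-1073 - 1) is 2^-1074. */
#ifndef DBL_TRUE_MIN_EXP
#define DBL_TRUE_MIN_EXP (-1073)
#endif

/* Both macros are integer constants, so the comparison is valid in #if. */
#if DBL_TRUE_MIN_EXP != DBL_MIN_EXP
#define DBL_TRUE_MIN_IS_SUBNORMAL 1
#else
#define DBL_TRUE_MIN_IS_SUBNORMAL 0
#endif

/* Hypothetical accessor exposing the translation-time result. */
int dbl_true_min_is_subnormal(void)
{
    return DBL_TRUE_MIN_IS_SUBNORMAL;
}
```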
Changes to C1x
Add new bullets to 5.2.4.2.2 Characteristics of floating types <float.h>
[bullet near DECIMAL_DIG] The number of base 10 digits required to ensure that floating-point numbers with /p/ radix /b/ digits which differ by only one unit in the last place (ulp) are always differentiated,
p log10 b              if b is a power of 10
ceil(1 + p log10 b)    otherwise
[Note to editor: WG14 paper N1290 on printed page 9 has the correct symbols/fonts for the above two math expressions; it is also very similar to the existing math expressions for DECIMAL_DIG in C99.]
FLT_MAXDIG10 6
DBL_MAXDIG10 10
LDBL_MAXDIG10 10
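To illustrate what the xxx_MAXDIG10 values guarantee, the sketch below prints a float with a chosen number of significant decimal digits and reads it back. It assumes an IEC 60559 single-precision float (p = 24, b = 2), for which the formula gives ceil(1 + 24 log10 2) = 9. The helper names are made up for the example.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert x to a decimal string with `digits`
   significant digits, then convert the string back to float. */
float roundtrip_float(float x, int digits)
{
    char buf[64];
    /* %e prints one digit before the point, so precision is digits - 1. */
    snprintf(buf, sizeof buf, "%.*e", digits - 1, (double)x);
    return strtof(buf, NULL);
}

/* The float exactly one ulp above 1.0f. */
float one_ulp_above_one(void)
{
    return 1.0f + FLT_EPSILON;
}
```

Under the stated assumption, roundtrip_float(one_ulp_above_one(), 9) recovers the original value, while asking for only 6 digits collapses it to 1.0f, so the two neighbouring floats are no longer differentiated.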
[bullet after FLT_MIN] minimum positive floating-point number. If subnormal numbers are supported [footnote], their value is the minimum subnormal (also known as denormal) floating-point number, otherwise the minimum normal floating-point number, of the respective types.
FLT_TRUE_MIN 1E-37
DBL_TRUE_MIN 1E-37
LDBL_TRUE_MIN 1E-37
[footnote]: Support means that subnormal numbers are not flushed to zero, either when used as operands or when produced by an arithmetic operation.
[paragraph 13, example 1] Add
FLT_MAXDIG10 9
after FLT_RADIX
[paragraph 14, example 2] Remove "normalized" from just before IEC 60559.
Add
FLT_MAXDIG10 9
DBL_MAXDIG10 17
after DECIMAL_DIG
Add
FLT_TRUE_MIN 1.40129846E-45 // decimal constant
FLT_TRUE_MIN 0X1P-149F // hex constant
DBL_TRUE_MIN 4.9406564584124654E-324 // decimal constant
DBL_TRUE_MIN 0X1P-1074 // hex constant
after FLT_MIN and DBL_MIN.
Words for Rationale:
[add to 5.2.4.2.2 section] Applications that need to check, at translation time, whether subnormal floating-point numbers are supported can do: {depends upon approach we settle on}
The values of the smallest subnormal floating-point numbers (if supported) are typically, but not always, FLT_MIN*FLT_EPSILON, DBL_MIN*DBL_EPSILON, LDBL_MIN*LDBL_EPSILON.
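The relationship can be checked for a given implementation along the following lines. This sketch assumes IEC 60559 single and double formats with subnormal support, and compares the products against the hex constants from example 2; the function names are made up for illustration.

```c
#include <float.h>

/* Hypothetical check: is FLT_MIN * FLT_EPSILON the smallest subnormal
   float? volatile keeps the multiplication from being folded away. */
int flt_product_is_true_min(void)
{
    volatile float product = FLT_MIN * FLT_EPSILON;
    return product == 0X1P-149F;
}

/* Same check for double: 2^-1022 * 2^-52 should equal 2^-1074. */
int dbl_product_is_true_min(void)
{
    volatile double product = DBL_MIN * DBL_EPSILON;
    return product == 0X1P-1074;
}
```

On an implementation that flushes subnormals to zero, or whose formats differ from the assumed ones, these functions would be expected to return 0, which matches the "typically, but not always" caveat.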