Document: N1092
Date: 29-Nov-04
The issue can be illustrated by the following example (_Decimal64 is a decimal floating type proposed in N1077):
_Decimal64 rate = 0.1;
The constant 0.1 has type double. In an implementation where a binary representation is used for the floating types, and FLT_EVAL_METHOD is not -1, the internal representation of 0.1 cannot be exact, so the variable 'rate' gets a value slightly different from 0.1. This defeats the purpose of decimal floating types. On the other hand, requiring programmers to write:
_Decimal64 rate = 0.1dd;
is inconvenient.
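The inexactness described above can be observed on a current implementation. The following is a minimal sketch, assuming double is IEEE 754 binary64 and FLT_EVAL_METHOD is 0; the function name sum_tenths is illustrative and not part of any proposal.

```c
/* Add the double constant 0.1 to a running total n times.
   'volatile' forces each partial sum to be stored as a double,
   so extra intermediate precision does not hide the effect. */
double sum_tenths(int n)
{
    volatile double total = 0.0;
    for (int i = 0; i < n; ++i)
        total += 0.1;   /* 0.1 is already rounded to binary here */
    return total;
}
```

Under the stated assumptions, sum_tenths(10) compares unequal to 1.0, because each 0.1 carries a small binary conversion error.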
1/ The implementation is allowed to use a type other than double or long double as the type of an unsuffixed floating constant. This implementation defined type is called the translation time data type (TTDT). The intention is that this type can represent the floating constant exactly. (A possible choice is a decimal floating type.)
2/ The range and precision of this type are implementation defined and are fixed throughout the program.
3/ TTDT is an arithmetic type. All arithmetic operations are defined for this type.
4/ The usual arithmetic conversions are extended to handle mixed operations between TTDT and other types. Roughly speaking, if an operation involves both TTDT and an actual type, the TTDT operand is converted to an actual type before the operation. This way, no "top-down" context information is required when processing unsuffixed floating constants. For example:
double f;
f = 0.1;
Suppose the implementation uses _Decimal128 (a decimal floating type defined in N1077) as the TTDT. 0.1 is represented exactly after the constant is scanned. It is then converted to double in the assignment operator.
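How a constant can be held exactly after scanning may be sketched as follows. This is only an illustration under stated assumptions: the type scanned_const and the function scan_decimal_constant are hypothetical, the value is modelled as an integer coefficient times a power of ten rather than as _Decimal128, and exponent parts, suffixes and overflow are not handled.

```c
#include <ctype.h>

/* Hypothetical stand-in for a decimal TTDT: value = coeff * 10^exp10. */
typedef struct {
    long long coeff;
    int exp10;
} scanned_const;

/* Scan digits with at most one decimal point, e.g. "0.1".
   Returns 1 on success and 0 on malformed input. */
int scan_decimal_constant(const char *s, scanned_const *out)
{
    long long coeff = 0;
    int exp10 = 0, seen_point = 0, digits = 0;

    for (; *s != '\0'; ++s) {
        if (*s == '.' && !seen_point) {
            seen_point = 1;
        } else if (isdigit((unsigned char)*s)) {
            coeff = coeff * 10 + (*s - '0');
            if (seen_point)
                --exp10;        /* each fractional digit scales by 1/10 */
            ++digits;
        } else {
            return 0;
        }
    }
    if (digits == 0)
        return 0;
    out->coeff = coeff;
    out->exp10 = exp10;
    return 1;
}
```

Scanning "0.1" in this model yields coefficient 1 and exponent -1, which represents one tenth with no rounding; rounding would occur only at the later conversion to double.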
f = 0.1 * 0.3;
Here, both 0.1 and 0.3 are represented in TTDT. The result of the multiply operator also has type TTDT. It is then converted to double before the assignment. The result can differ from that of a current implementation, where the constants have type double with a binary representation: there, a conversion error is incurred in each constant and then propagated through the multiplication. The TTDT provides a more accurate result.
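The difference between the two evaluation orders can be sketched on a current implementation. The code below is an illustration, not the proposed mechanism: dec_value, dec_mul and dec_to_double are hypothetical names, the decimal type is modelled as an integer coefficient times a power of ten, and the conversion to double goes through strtod, which is assumed to round correctly. The constants 0.1 and 3.0 are used instead of the pair above because their product is known to show the effect in IEEE 754 binary64, which is also assumed, together with FLT_EVAL_METHOD being 0.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical decimal stand-in for TTDT: value = coeff * 10^exp10. */
typedef struct {
    long long coeff;
    int exp10;
} dec_value;

/* Exact decimal multiply (no overflow handling in this sketch). */
static dec_value dec_mul(dec_value a, dec_value b)
{
    dec_value r = { a.coeff * b.coeff, a.exp10 + b.exp10 };
    return r;
}

/* Convert once to double by formatting as text and calling strtod. */
static double dec_to_double(dec_value v)
{
    char buf[64];
    snprintf(buf, sizeof buf, "%llde%d", v.coeff, v.exp10);
    return strtod(buf, NULL);
}

/* 0.1 * 3.0 evaluated in the decimal model, rounded to double at the end. */
double product_in_ttdt(void)
{
    dec_value a = { 1, -1 };    /* 0.1 */
    dec_value b = { 30, -1 };   /* 3.0 */
    return dec_to_double(dec_mul(a, b));
}

/* 0.1 * 3.0 evaluated as in a current implementation: each constant is
   rounded to binary first, then the product is rounded again. */
double product_in_double(void)
{
    volatile double a = 0.1, b = 3.0;
    return a * b;
}
```

Under these assumptions product_in_ttdt() equals the double nearest to 0.3, while product_in_double() is one unit in the last place larger.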
float g = 0.3f;
f = 0.1 * g;
When one operand is a TTDT and the other is one of float, double or long double, the TTDT operand is converted to double, with an internal representation following the specification of FLT_EVAL_METHOD for constants of type double. The usual arithmetic conversions are then applied to the resulting operands.
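For this mixed case the type of the expression would be the same as in current C, where 0.1 already has type double. The check below runs on a current compiler and does not exercise TTDT itself; it is a sketch that assumes float and double have different sizes, and the function name is illustrative.

```c
/* Returns 1 if the product of an unsuffixed constant and a float
   operand has the size of double. sizeof does not evaluate its
   operand; only the type of the expression matters. */
int mixed_product_is_double(void)
{
    float g = 0.3f;
    return sizeof(0.1 * g) == sizeof(double);
}
```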
_Decimal32 h = 0.1;
If one operand is a TTDT and the other a decimal floating type, the TTDT is converted to _Decimal64 with an internal representation specified by DEC_EVAL_METHOD (as specified in N1077). Usual arithmetic conversion is then applied.
If one operand is a TTDT and the other a fixed point type, the TTDT is converted to the fixed point type. If the implementation supports fixed point types, it should choose a representation for TTDT that can represent floating and fixed point constants exactly.
In 6.2.5 after paragraph 28, add a paragraph:
[28a] There is an implementation defined data type called the translation time data type, or TTDT. TTDT is an arithmetic type and is used as the type for unsuffixed floating constants. There is no type specifier for TTDT.
Replace 6.4.4.2 paragraph 4 with the following:
[4] An unsuffixed floating constant has type TTDT. If suffixed by the letter f or F, it has type float. If suffixed by the letter l or L, it has type long double.
Add the following paragraphs after 6.3.1.7:
6.3.1.7a Translation Time Data Type
When a TTDT is converted to double, it is converted to the internal representation specified by FLT_EVAL_METHOD.
Recommended practice
The conversion of TTDT to double should match the execution-time conversion of character strings by library functions, such as strtod, given matching inputs suitable for both conversions, the same format and default execution-time rounding.
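The recommended practice can be checked on a current implementation, where the translation-time conversion is that of an ordinary double constant. A minimal sketch, assuming the default rounding mode is in effect and FLT_EVAL_METHOD is 0; the function name is illustrative.

```c
#include <stdlib.h>

/* Returns 1 if the constant 0.1, converted at translation time,
   equals the value produced at execution time by strtod. */
int constant_matches_strtod(void)
{
    double from_constant = 0.1;
    double from_string = strtod("0.1", NULL);
    return from_constant == from_string;
}
```

An implementation that follows the recommended practice returns 1 here; a mismatch would indicate that the two conversions round differently.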
6.3.1.7b
Before the usual arithmetic conversions are carried out,
if one operand is TTDT and the other is not, the TTDT operand is converted
to double.
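The rule in 6.3.1.7b can be sketched as a small type-resolution function. This is an illustration only: the enumeration and the function common_type are hypothetical, only the real floating types are modelled, and the case where both operands are TTDT follows the earlier statement that such a result also has type TTDT.

```c
/* Operand types modelled in this sketch. The order of the first three
   matters: float < double < long double. */
typedef enum {
    OP_FLOAT,
    OP_DOUBLE,
    OP_LONG_DOUBLE,
    OP_TTDT
} operand_type;

/* Type of the result of a binary arithmetic operator. */
operand_type common_type(operand_type a, operand_type b)
{
    if (a == OP_TTDT && b == OP_TTDT)
        return OP_TTDT;          /* both operands stay in TTDT */

    /* Exactly one TTDT operand: convert it to double first. */
    if (a == OP_TTDT)
        a = OP_DOUBLE;
    if (b == OP_TTDT)
        b = OP_DOUBLE;

    /* Usual arithmetic conversions for real floating types:
       the wider of the two types is the common type. */
    return a > b ? a : b;
}
```

For example, common_type(OP_TTDT, OP_FLOAT) gives OP_DOUBLE, and common_type(OP_TTDT, OP_LONG_DOUBLE) gives OP_LONG_DOUBLE.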