Recommendations for math library implementors

Document: WG14 N1630

Math library recommendations

Submitter: Fred J. Tydeman (USA)
Submission Date: 2012-9-1
Subject: Math library recommendations

Most C implementations run on hardware that uses IEEE-754 floating-point. However, only a few implementations claim conformance to Annex F. Therefore, there are many implementations that have no guidance as to what should happen with the math functions for things other than normalized floating-point numbers. This document has suggestions, done as recommended practice, for what should and should not happen for quiet NaNs, infinities, subnormal numbers, negative zero and a few specific error cases.

Also, there are a few places where the main body of the standard gives the implementor choices (not related to IEEE-754) and this document picks the recommended choice.

In addition, there are a few places where Annex F gives the implementor choices and this document picks the recommended choice.

This is not a repeat of Annex F, because Annex F talks about floating-point exceptions (and not domain, range, or pole errors).

If and when support for signaling NaNs is added to C, this can be updated.

Add after 7.12.1 paragraph 2 (domain errors):

Recommended practice
The implementation-defined value should be a quiet NaN.

Add after 7.12.1 paragraph 3 (pole errors):

Recommended practice
The implementation-defined value should be an infinity with the correct sign.

Add Recommended Practice for each specific function in 7.12.* as follows:

acos
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be pi/2 (in round to nearest) and should not be an error condition.

f(-0.0) should be pi/2 and should not be an error condition.
asin
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
atan
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-pi/2 and should not be an error condition.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
atan2
If both arguments (y, x) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (y, x) is a quiet NaN, the result should be that argument and should not be an error condition.

f(+/-infinity, -infinity) should be +/-3pi/4 and should not be an error condition.

f(+/-infinity, +infinity) should be +/-pi/4 and should not be an error condition.

f(+/-y, -infinity) for finite y > 0.0, should be +/-pi and should not be an error condition.

f(+/-y, +infinity) for finite y > 0.0, should be +/-0.0 and should not be an error condition.

f(+/-infinity, x) for finite x, should be +/-pi/2 and should not be an error condition.

If y/x is subnormal, a range error should occur and the result should be y/x (in round to nearest).

A domain error should not occur if both arguments are zero.

f(+/-0.0, -0.0) should be +/-pi and should not be an error condition.

f(+/-0.0, +0.0) should be +/-0.0 and should not be an error condition.

f(+/-0.0, x) for finite x < 0.0, should be +/-pi and should not be an error condition.

f(+/-0.0, x) for finite x > 0.0, should be +/-0.0 and should not be an error condition.

f(y, +/-0.0) for finite y < 0.0, should be -pi/2 and should not be an error condition.

f(y, +/-0.0) for finite y > 0.0, should be +pi/2 and should not be an error condition.

If y/x would be an overflow range error, a range error should not occur.

If y/x would be a pole error, a pole error should not occur.
cos
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be +1.0 (in round to nearest) and should not be an error condition.

f(-0.0) should be +1.0 and should not be an error condition.
sin
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
tan
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
acosh
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(x), where x is -infinity, subnormal or -0.0, should be a domain error and should be a quiet NaN.
asinh
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
atanh
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.

f(+/-1.0) should be a pole error and should be +/-infinity.
cosh
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +infinity and should not be an error condition.

f(subnormal) should be +1.0 (in round to nearest) and should not be an error condition.

f(-0.0) should be +1.0 and should not be an error condition.
sinh
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
tanh
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-1.0 and should not be an error condition.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
exp
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be +0.0 and should not be an error condition.

f(subnormal) should be 1.0 (in round to nearest) and should not be an error condition.

f(-0.0) should be 1.0 and should not be an error condition.
exp2
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be +0.0 and should not be an error condition.

f(subnormal) should be 1.0 (in round to nearest) and should not be an error condition.

f(-0.0) should be 1.0 and should not be an error condition.
expm1
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be -1.0 and should not be an error condition.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
frexp
f(x, *exp), where x is a quiet NaN, +/-infinity, or -0.0, should be that value x, *exp should be zero, and there should not be an error condition.

f(subnormal) should not be an error condition and should be treated as if x were normalized.

If the integral power of 2 is outside the range of int, a domain error should occur, a quiet NaN should be the result and *exp should be the closest integer to the correct value.
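
Treating a subnormal "as if x were normalized" means that frexp still returns a fraction in [0.5, 1) and an exponent below the normal minimum. A minimal sketch, assuming IEEE-754 double with subnormal support; frexp_parts, split, rejoin and SUBNORMAL_SAMPLE are hypothetical names used only for illustration.

```c
#include <float.h>
#include <math.h>

/* Assumed sample value: 2^-1024 for IEEE-754 double, which is subnormal. */
static const double SUBNORMAL_SAMPLE = DBL_MIN / 4;

/* Hypothetical helper type: both results of frexp. */
typedef struct {
    double frac;  /* returned fraction */
    int exp;      /* stored exponent */
} frexp_parts;

static frexp_parts split(double x)
{
    volatile double arg = x;  /* discourage compile-time evaluation */
    frexp_parts p;
    p.exp = 0;  /* defined even if the library does not store to *exp */
    p.frac = frexp(arg, &p.exp);
    return p;
}

/* Rebuild the value from its parts with ldexp. */
static double rejoin(frexp_parts p)
{
    return ldexp(p.frac, p.exp);
}
```

With these helpers, split(SUBNORMAL_SAMPLE) would be expected to give a fraction of 0.5 and an exponent of DBL_MIN_EXP - 2, and rejoin would restore the original subnormal.
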
ilogb
f(x), where x is a quiet NaN, infinite, or zero, should be a domain error.

If the correct value is outside the range of the return type, a domain error should occur and the result should be the closest representable integer to the correct value.
ldexp
f(x, exp), where x is a quiet NaN, +/-infinity, or -0.0, should be that value x and there should not be an error condition.

f(x, exp), where x is a subnormal should treat x as if it were normalized.
log
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be a domain error and should be a quiet NaN.

f(+subnormal) should treat x as if it were normalized and should not be an error condition.

f(-subnormal) should be a domain error and should be a quiet NaN.

f(+/-0.0) should be a pole error and should be -infinity.
log10
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be a domain error and should be a quiet NaN.

f(+subnormal) should treat x as if it were normalized and should not be an error condition.

f(-subnormal) should be a domain error and should be a quiet NaN.

f(+/-0.0) should be a pole error and should be -infinity.
log1p
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should be a range error. f(subnormal) should be that same subnormal (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.

f(-1.0) should be a pole error and should be -infinity.
log2
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be a domain error and should be a quiet NaN.

f(+subnormal) should treat x as if it were normalized and should not be an error condition.

f(-subnormal) should be a domain error and should be a quiet NaN.

f(+/-0.0) should be a pole error and should be -infinity.
logb
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +infinity and should not be an error condition.

f(subnormal) should treat x as if it were normalized and should not be an error condition.

f(+/-0.0) should be a pole error and should be -infinity.
modf
f(x, *iptr), where x is a quiet NaN or +/-0.0, should be that value x, *iptr should be x, and there should not be an error condition.

f(x, *iptr), where x is +/-infinity should be +/-0.0, *iptr should be x, and there should not be an error condition.

f(x, *iptr), where x is subnormal should be x, *iptr should be zero, and there should not be an error condition.
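
The modf cases above can be summarised as a predicate over both results. This is a sketch only: modf_parts, split_modf and modf_special_ok are hypothetical names, and values of magnitude 1.0 or more are deliberately not examined.

```c
#include <math.h>

/* Hypothetical helper type: both results of modf. */
typedef struct {
    double frac;   /* returned fractional part */
    double whole;  /* value stored through iptr */
} modf_parts;

static modf_parts split_modf(double x)
{
    volatile double arg = x;  /* discourage compile-time evaluation */
    modf_parts p;
    p.frac = modf(arg, &p.whole);
    return p;
}

/* 1 if modf(x) follows the special-case rules listed above. */
static int modf_special_ok(double x)
{
    modf_parts p = split_modf(x);
    if (isnan(x))
        return isnan(p.frac) && isnan(p.whole);
    if (isinf(x))  /* fraction is a zero with the sign of x */
        return p.frac == 0.0 && !signbit(p.frac) == !signbit(x) && p.whole == x;
    if (x == 0.0)  /* both parts are zeros with the sign of x */
        return p.frac == 0.0 && p.whole == 0.0 &&
               !signbit(p.frac) == !signbit(x) &&
               !signbit(p.whole) == !signbit(x);
    if (fabs(x) < 1.0)  /* includes subnormals: x is all fraction */
        return p.frac == x && p.whole == 0.0;
    return 1;  /* larger magnitudes are not covered by this sketch */
}
```
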
scalbn, scalbln
f(x, n), where x is a quiet NaN, +/-infinity, or -0.0, should be that value x and there should not be an error condition.

f(x, n), where x is a subnormal should treat x as if it were normalized.
cbrt
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(subnormal) should treat x as if it were normalized and should not be an error condition.

f(-0.0) should be -0.0 and should not be an error condition.
fabs
f(quiet NaN) should be that same quiet NaN (with the sign set positive) and should not be an error condition.

f(+/-infinity) should be +infinity and should not be an error condition.

f(-0.0) should be +0.0 and should not be an error condition.
hypot
If either x or y or both is an infinity (even if the other is a quiet NaN), f(x, y) should be +infinity and there should not be an error condition.

If both arguments are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments is a quiet NaN (and the other is not an infinity), the result should be that argument and should not be an error condition.

If either x or y is zero, f(x, y) should be the absolute value of the other argument and there should not be an error condition.

If both x and y are zero, f(x, y) should be +0.0 and there should not be an error condition.

If either x*x or y*y would be a range error, but sqrt(x*x+y*y) is not a range error, then no range error should happen.
pow
If both arguments are quiet NaNs, the result should be one of those arguments and should not be an error condition.

f(+1.0, y) for any y, even a quiet NaN, should return 1.0 and should not be an error condition.

f(x, +/-0.0) for any x, even a quiet NaN, should return 1.0 and should not be an error condition.

A domain error should not occur if x is zero and y is zero.

If just one of the arguments is a quiet NaN, the result should be that argument and should not be an error condition.

f(-1.0, +/-infinity) should return 1.0 and should not be an error condition.

f(x, -infinity) for |x| < 1.0, should return +infinity and should not be an error condition.

f(x, -infinity) for |x| > 1.0, should return +0.0 and should not be an error condition.

f(x, +infinity) for |x| < 1.0, should return +0.0 and should not be an error condition.

f(x, +infinity) for |x| > 1.0, should return +infinity and should not be an error condition.

f(-infinity, y) for y an odd integer < 0.0, should return -0.0 and should not be an error condition.

f(-infinity, y) for y < 0.0 and not an odd integer, should return +0.0 and should not be an error condition.

f(-infinity, y) for y an odd integer > 0.0, should return -infinity and should not be an error condition.

f(-infinity, y) for y > 0.0 and not an odd integer, should return +infinity and should not be an error condition.

f(+infinity, y) for y < 0.0, should return +0.0 and should not be an error condition.

f(+infinity, y) for y > 0.0, should return +infinity and should not be an error condition.

f(+/-0.0, -infinity) should return +infinity and should be a pole error.

f(+/-0.0, y) for y an odd integer < 0.0, should return +/-infinity and should be a pole error.

f(+/-0.0, y) for y < 0.0, finite, and not an odd integer, should return +infinity and should be a pole error.

A pole error should occur if x is zero and y is less than zero.

f(+/-0.0, y) for y an odd integer > 0.0, should return +/-0.0 and should not be an error condition.

f(+/-0.0, y) for finite y > 0.0 and not an odd integer, should return +0.0 and should not be an error condition.

f(x, y) for finite x < 0.0 and finite non-integer y, should return a quiet NaN and should be a domain error.
sqrt
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be a domain error and should be a quiet NaN.

f(subnormal) should treat x as if it were normalized and should not be an error condition.

f(-0.0) should be -0.0 and should not be an error condition.
erf
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-1.0 and should not be an error condition.

If 2*x/sqrt(pi) is subnormal, f(x) should be a range error and f(x) should be 2*x/sqrt(pi) (in round to nearest).

f(-0.0) should be that same -0.0 and should not be an error condition.
erfc
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +0.0 and should not be an error condition.

f(-infinity) should be +2.0 and should not be an error condition.

f(subnormal) should be +1.0 (in round to nearest) and should not be an error condition.

f(-0.0) should be +1.0 and should not be an error condition.
lgamma
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +infinity and should not be an error condition.

f(subnormal) should not be an error condition and should be ln(|(1/x)|) [even though 1/x might be a range error].

f(x), where x is -0.0 or a negative integer, should be a pole error and should be +infinity.
tgamma
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+infinity) should be +infinity and should not be an error condition.

f(-infinity) should be a quiet NaN and should be a domain error.

f(subnormal) should be 1/x (in round to nearest), which should be a range error in most cases.

f(+/-0.0) should be a pole error and should be +/-infinity.

If x is a negative integer, a domain error should occur and the result should be a NaN.
ceil
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(+subnormal) should be 1.0 and should not be an error condition.

f(-subnormal) should be -0.0 and should not be an error condition.

f(-0.0) should be that same -0.0 and should not be an error condition.

f(x) for x a finite non-integer should not raise inexact.
floor
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(+subnormal) should be +0.0 and should not be an error condition.

f(-subnormal) should be -1.0 and should not be an error condition.

f(-0.0) should be that same -0.0 and should not be an error condition.

f(x) for x a finite non-integer should not raise inexact.
nearbyint
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(subnormal) should not be an error condition.

f(-0.0) should be that same -0.0 and should not be an error condition.
rint
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(-0.0) should be that same -0.0 and should not be an error condition.

They should raise the "inexact" floating-point exception if the result differs from the argument.
lrint, llrint
f(quiet NaN) should be a domain error and be an unspecified value.

f(+/-infinity) should be a domain error and should be the largest integer in the return type with the same sign as the argument.

f(-0.0) should be 0 and should not be an error condition.

If the correct value is outside the range of the return type, a domain error should occur and the result should be the closest integer to the correct value.

They should raise the "inexact" floating-point exception if the result differs from the argument and no other error condition happened.
round
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(+/-subnormal) should be +/-0.0 and should not be an error condition.

f(-0.0) should be that same -0.0 and should not be an error condition.

f(x) for x a finite non-integer should not raise inexact.
lround, llround
f(quiet NaN) should be a domain error and be an unspecified value.

f(+/-infinity) should be a domain error and be the largest integer in the return type with the same sign as the argument.

f(subnormal) should be 0 and should not be an error condition.

f(-0.0) should be 0 and should not be an error condition.

If the correct value is outside the range of the return type, a domain error should occur and the result should be the closest integer to the correct value.

f(x) for x a finite non-integer should not raise inexact.
trunc
f(quiet NaN) should be that same quiet NaN and should not be an error condition.

f(+/-infinity) should be +/-infinity and should not be an error condition.

f(+/-subnormal) should be +/-0.0 and should not be an error condition.

f(-0.0) should be that same -0.0 and should not be an error condition.

f(x) for x a finite non-integer should not raise inexact.
fmod
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition.

If x is a quiet NaN and y is zero, f(x, y) should be that same quiet NaN, and should not be an error condition.

If x is an infinity or y is zero (and the other is not a quiet NaN), f(x, y) should be a domain error and should be a quiet NaN.

If x is finite and y is infinity, f(x, y) should be x and there should not be an error condition.

When subnormal results are supported, the returned value is exact, is independent of the current rounding direction mode, and should not be an error condition. Otherwise (subnormal results are not supported), if f(x, y) would be subnormal, this should be a range error.

f(-0.0, y) should return -0.0 for y not zero nor NaN and this should not be an error condition.
remainder
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition.

If x is a quiet NaN and y is zero, f(x, y) should be that same quiet NaN, and should not be an error condition.

If x is an infinity or y is zero (and the other is not a quiet NaN), f(x, y) should be a quiet NaN and this should be a domain error.

If x is finite and y is infinity, f(x, y) should be x and there should not be an error condition.

When subnormal results are supported, the returned value is exact, is independent of the current rounding direction mode, and should not be an error condition. Otherwise (subnormal results are not supported), if f(x, y) would be subnormal, this should be a range error.

f(-0.0, y) should return -0.0 for y not zero nor NaN and this should not be an error condition.
remquo
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition. *quo is unspecified.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition. *quo is unspecified.

If x is a quiet NaN and y is zero, f(x, y) should be that same quiet NaN, and should not be an error condition. *quo is unspecified.

If x is an infinity or y is zero (and the other is not a quiet NaN), f(x, y) should be a quiet NaN and this should be a domain error. *quo is unspecified.

If x is finite and y is infinity, f(x, y) should be x and there should not be an error condition. *quo should be zero.

When subnormal results are supported, the returned value is exact, is independent of the current rounding direction mode, and should not be an error condition. Otherwise (subnormal results are not supported), if f(x, y) would be subnormal, this should be a range error and *quo is unspecified.

f(-0.0, y) should return -0.0 for y not zero nor NaN and this should not be an error condition. *quo should be zero.
copysign
For f(x, y), where x and/or y is a quiet NaN, +/-infinity, or -0.0, there should not be an error condition.
nan
f(x) for any x, should be a quiet NaN and should not be an error condition.
nextafter
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition.

f(x, y), where x is maximum magnitude finite and |x| < |y| (including infinite y), and f(x, y) is infinite, should be a range error.

If f(x, y) is subnormal or zero and x != y, there should be a range error.

f(+/-0.0, -0.0) should be -0.0 and should not be an error condition.
nexttoward
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition.

f(x, y), where x is maximum magnitude finite and |x| < |y| (including infinite y), and f(x, y) is infinite, should be a range error.

If f(x, y) is subnormal or zero and x != y, there should be a range error.

f(+/-0.0, -0.0) should be -0.0 and should not be an error condition.
fdim
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition.

f(+infinity, +infinity), f(-infinity, +infinity), f(any finite, +infinity), f(-infinity, -infinity), f(-infinity, any finite), should be +0.0 and should not be an error condition.

f(+infinity, -infinity), f(+infinity, any finite), f(any finite, -infinity), should be +infinity and should not be an error condition.

When subnormal results are supported, the returned value is exact, is independent of the current rounding direction mode, and should not be an error condition. Otherwise (subnormal results are not supported), if f(x, y) would be subnormal, this should be a range error.

f(+/-0.0, +/-0.0) should be +0.0.
fmax
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be the other argument and should not be an error condition.

If one of the arguments (x, y) is +infinity, the result should be +infinity and should not be an error condition.

If both of the arguments (x, y) are -infinity, the result should be -infinity and should not be an error condition.

If one of the arguments (x, y) is -infinity, the result should be the other argument and should not be an error condition.

f(+0.0, +0.0), f(+0.0, -0.0), and f(-0.0, +0.0) should be +0.0.

f(-0.0, -0.0) should be -0.0.
fmin
If both arguments (x, y) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be the other argument and should not be an error condition.

If one of the arguments (x, y) is -infinity, the result should be -infinity and should not be an error condition.

If both of the arguments (x, y) are +infinity, the result should be +infinity and should not be an error condition.

If one of the arguments (x, y) is +infinity, the result should be the other argument and should not be an error condition.

f(-0.0, -0.0), f(+0.0, -0.0), and f(-0.0, +0.0) should be -0.0.

f(+0.0, +0.0) should be +0.0.
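
Existing fmax and fmin implementations are not required to distinguish +0.0 from -0.0, so the recommended behaviour is sketched here as two stand-alone functions rather than demonstrated with the library. The names fmax_rec and fmin_rec are hypothetical, and the code is one possible reading of the lists above, not a reference implementation.

```c
#include <math.h>

/* Sketch of fmax as recommended above. */
static double fmax_rec(double x, double y)
{
    if (isnan(x)) return y;  /* one NaN: the other argument; two NaNs: y */
    if (isnan(y)) return x;
    if (x == 0.0 && y == 0.0)      /* zeros compare equal: prefer +0.0 */
        return signbit(x) ? y : x;
    return x > y ? x : y;          /* also handles infinities */
}

/* Sketch of fmin as recommended above. */
static double fmin_rec(double x, double y)
{
    if (isnan(x)) return y;
    if (isnan(y)) return x;
    if (x == 0.0 && y == 0.0)      /* zeros compare equal: prefer -0.0 */
        return signbit(x) ? x : y;
    return x < y ? x : y;
}
```

With these definitions fmax_rec(-0.0, +0.0) is +0.0 and fmin_rec(+0.0, -0.0) is -0.0, regardless of argument order.
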
fma
If all three arguments (x, y, z) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If both arguments (x, y), (x, z), or (y, z) are quiet NaNs, the result should be one of those arguments and should not be an error condition.

If just one of the arguments (x, y) is a quiet NaN, the result should be that argument and should not be an error condition.

If the z argument is a quiet NaN, one of x and y is infinite and the other is zero, the result should be a NaN and it is implementation defined if a domain error happens.

If the z argument is not a quiet NaN, one of x and y is infinite and the other is zero, the result should be a NaN and a domain error should happen.

If x times y is an exact infinity and the z argument is an infinity of the opposite sign, the result should be a NaN and a domain error should happen.