JTC1/SC22/WG14
N698 J11/97-061
Implementation Defined Integral Types
Randy Meyers and Doug Gwyn
23 June 1997
1 Introduction
Doug Gwyn distributed via the reflector a proposal (N713) to allow
implementation defined integral types to be used in the standard
headers. Doug and I discussed the proposed wording changes in N713
and produced this updated version.
Early versions of this paper were also distributed to Clive Feather,
Frank Farance, and Douglas Walls. Clive provided particularly
valuable feedback about issues with the representation of unsigned
integers, and issues raised in his paper N691.
This paper contains no new issues that have not been in previous
proposals before the committee. This version of the proposal
incorporates some ideas from N606 by Frank Farance and N669 by Clive
Feather.
2 Overview of Proposal
Implementation defined integral types are incorporated into the
Standard by allowing implementations to add additional types to the
set of "signed integer types." By existing wording in the Standard,
the implementation must supply corresponding unsigned integer types.
By definition, the implementation defined signed and unsigned integer
types are integral types, basic types, scalar types, and arithmetic
types. All of the statements made in the Standard about those type
classes automatically apply to the implementation defined integer
types. The same wording in the Standard that defines the properties
of the standard integer types defines the properties of the
implementation defined integer types as well.
For convenience, the terms "extended signed integer types", "extended
unsigned integer types", and "extended integer types" are defined.
The term "precision" is defined to solve an existing problem with the
Standard confusing "size" with an integer type's ability to represent
values. Two integer types of the same size might have different
padding, and thus not be able to represent the same values.
The integral promotions and usual arithmetic conversions have been
made less implementation defined than in Doug's original proposal.
The new usual arithmetic conversions have the following properties:
1. The results for the standard types do not change.
2. When a standard type and an implementation defined type meet, if
the signed or unsigned version of the standard type can
represent the values of the implementation defined type, then
the result is the (signed or unsigned) standard type.
3. The new rules are a generalization of the old rules, and
retain their spirit.
4. The new rules behave like the old rules even for unusual
implementations that use the same representation for all the
standard types, or have unsigned types that are just the
signed types with the sign bit ignored, or have unsigned
representations that are much "bigger" than their signed
counterparts.
This paper actually contains two equivalent alternative wordings for
the integral promotions and usual arithmetic conversions for the
committee to choose between. Sections 4.1, 5.1, and 6.1 of this paper
make up the first alternative wording. The second alternative
consists of either Section 4.1 or 4.2, plus Sections 5.2 and 6.2.
The paper contains some optional sections on constants (Section 7),
uniqueness of types (Section 8), preprocessor arithmetic (Section 9),
and the grammar (Section 10). These sections may be voted in or out
without hurting the integrity of the proposal.
Note: Text surrounded by *asterisks* should be italicized, while text
surrounded by {braces} should be set in Courier font.
3 Allow Implementation Defined Integral Types
Replace Section 6.1.2.5 (Types), paragraph 3:
There are five *signed integer types*, designated as {signed
char}, {short int}, {int}, {long int}, and {long long int}. (The
signed integer and other types may be designated in several
additional ways, as described in 6.5.2.)
with:
There are five *standard signed integer types*, designated as
{signed char}, {short int}, {int}, {long int}, and {long long
int}. (These and other types may be designated in several
additional ways, as described in 6.5.2.) There may also be
implementation-defined *extended signed integer types*.
[reference first new footnote] The standard and extended signed
integer types are collectively called just *signed integer
types*. [reference second new footnote]
Add first new footnote:
Implementation defined keywords must have the form of an
identifier reserved for any use as described in 7.1.3.
Add second new footnote:
Therefore, any statement in this Standard about the signed
integer types also applies to the extended signed integer types.
After the following in Section 6.1.2.5 (Types), paragraph 5:
For each of the signed integer types, there is a corresponding
(but different) *unsigned integer type* (designated with the
keyword {unsigned}) that uses the same amount of storage
(including sign information) and has the same alignment
requirements.
add:
The unsigned integer types that correspond to the standard signed
integer types are the *standard unsigned integer types*. The
unsigned integer types that correspond to the extended signed
integer types are the *extended unsigned integer types*. The
extended signed integer types and extended unsigned integer
types are collectively called the *extended integer types*.
4 Define Precision For Integer Types
Existing wording in the Standard refers to the "size" of integer types
in a problematical fashion. From Section 6.2.1.2 (Signed and unsigned
integers), paragraph 2, defining integer conversions:
When a signed integer is converted to an unsigned integer with
equal or greater size, if the value of the signed integer is
nonnegative, its value is unchanged.
If integers are allowed to have padding (bits in their representation
that do not participate in the value stored in the integer), then the
above section fails to consider the case of two integers that are the
same size, but use a different number of bits to store the value.
Frank Farance suggested in N606 that a new term, precision, be defined
for integer types. This proposal contains two alternative definitions
from which the committee can choose.
4.1 Precision Definition 1
This definition of "precision" special cases the definition for the
unsigned types in order to make the first of the alternative
wordings below for the integral promotions and usual arithmetic
conversions work (this definition also works for the second
alternative for the promotions and conversions).
After the following in Section 6.1.2.5 (Types), paragraph 16:
The representations of integral types shall define values by use
of a pure binary numeration system.25
Add:
The *precision* of a signed integer type is the number of bits it
uses to represent values excluding the sign bit and any padding.
The precision of an unsigned integer type is considered to be the
same as the corresponding signed integer type, although the
number of bits used to represent values may be greater. The
precision of an enumerated type is the precision of the
compatible integral type. Regardless of its representation, the
precision of {char} is considered to be the precision of {signed
char} and {unsigned char}.
4.2 Precision Definition 2
This definition of precision contains no special cases. It only works
with the second alternative wording for the integral promotions and
usual arithmetic conversions.
After the following in Section 6.1.2.5 (Types), paragraph 16:
The representations of integral types shall define values by use
of a pure binary numeration system.25
Add:
The *precision* of an integral type is the number of bits it uses
to represent values excluding the sign bit (if any) and any
padding.
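As an illustration of the definition above, the following minimal sketch derives a precision from a type's maximum value and compares it with the type's storage size. The helper names {precision_from_max} and {non_value_bits_of_int} are hypothetical and are not part of the proposed wording.

```c
#include <limits.h>

/* Count the value bits needed to hold `max`, the largest value of some
   integer type. Under Definition 2 this count is the type's precision. */
static int precision_from_max(unsigned long long max)
{
    int bits = 0;
    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* Storage bits of int that are not value bits: the sign bit plus any
   padding bits. Two types of equal size may differ in this count. */
static int non_value_bits_of_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT) - precision_from_max(INT_MAX);
}
```

On an implementation without padding in {int}, {non_value_bits_of_int} reports only the sign bit; a larger result indicates padding, which is exactly the case that "size" fails to describe.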
5 Integral Promotions
This section gives two alternative wordings for the integral
promotions. The first alternative is based exclusively on precision.
The second alternative is based on a new concept called the integral
conversion rank of types. This ranking, once defined, allows the
promotions and conversions to be expressed more succinctly.
5.1 Integral Promotions Alternative 1
Change Section 6.2.1.1 (Characters and integers), paragraph 1:
A {char}, a {short int}, or an {int} bit-field, or their signed
or unsigned varieties, or an enumeration type, may be used in an
expression wherever an {int} or {unsigned int} may be used. If
an {int} can represent all values of the original type, the value
is converted to an {int}; otherwise, it is converted to an
{unsigned int}. These are called the *integral promotions*.37
All other arithmetic types are unchanged by the integral
promotions.
to:
The following may be used in an expression wherever an {int} or
{unsigned int} may be used:
-- An integral type whose precision is less than or equal to
the precision of {int} and {unsigned int}
-- A bit-field of type {int}, {signed int}, or {unsigned
int}
If an {int} can represent all values of the original type, the
value is converted to an {int}; otherwise, it is converted to an
{unsigned int}. These are called the *integral promotions*.37
All other types are unchanged by the integral promotions.
Note that Section 6.1.2.5 paragraph 16 defines integral types as char,
the signed and unsigned integer types, and the enumerated types.
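The effect of the promotions on the standard types can be observed directly. The sketch below relies on {_Generic}, a C11 feature that postdates this paper, and the macro and function names are hypothetical. The result for {unsigned char} depends on whether {int} can represent all of its values, which the helper {uchar_fits_in_int} reports.

```c
#include <limits.h>

/* Report the type of a promoted operand: 1 for int, 2 for unsigned int,
   0 for any other type. Unary plus applies the integral promotions. */
#define PROMOTED_KIND(x) _Generic(+(x), int: 1, unsigned int: 2, default: 0)

/* Nonzero when int can represent every value of unsigned char. */
static int uchar_fits_in_int(void)
{
    return UCHAR_MAX <= INT_MAX;
}

static int kind_of_unsigned_char(void)
{
    unsigned char c = 200;
    return PROMOTED_KIND(c);   /* int if it fits, otherwise unsigned int */
}

static int kind_of_short(void)
{
    short s = -5;
    return PROMOTED_KIND(s);   /* short always promotes to int */
}

static int kind_of_long(void)
{
    long l = 1;
    return PROMOTED_KIND(l);   /* long is unchanged by the promotions */
}
```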
5.2 Integral Promotions Alternative 2
Replace Section 6.2.1.1 (Characters and integers), paragraph 1:
A {char}, a {short int}, or an {int} bit-field, or their signed
or unsigned varieties, or an enumeration type, may be used in an
expression wherever an {int} or {unsigned int} may be used. If
an {int} can represent all values of the original type, the value
is converted to an {int}; otherwise, it is converted to an
{unsigned int}. These are called the *integral promotions*.37
All other arithmetic types are unchanged by the integral
promotions.
with the following paragraphs:
Every integral type has an *integral conversion rank* defined as
follows:
-- No two signed integer types shall have the same rank, even
if they have the same representation.
-- The rank of a signed integer type shall be greater than the
rank of any signed integer type with less precision.
-- The rank of any standard signed integer type shall be
greater than the rank of any extended signed integer type
with the same precision.
-- The rank of {long long int} shall be greater than the rank
of {long int}, which shall be greater than the rank of
{int}, which shall be greater than the rank of {short int},
which shall be greater than the rank of {signed char}.
-- The rank of any unsigned integer type shall equal the rank
of the corresponding signed integer type.
-- The rank of {char} shall equal the rank of {signed char} and
{unsigned char}.
-- The rank of any enumerated type shall equal the rank of the
compatible integer type.
-- The rank of any extended signed integer type relative to
another extended signed integer type with the same precision
is implementation-defined, but still subject to the other
rules for determining the integral conversion rank.
-- For all integral types *T1*, *T2*, and *T3*, if *T1* has
greater rank than *T2* and *T2* has greater rank than *T3*
then *T1* has greater rank than *T3*.
The following may be used in an expression wherever an {int} or
{unsigned int} may be used:
-- An object or expression with an integral type whose
integral conversion rank is less than the rank of {int}
and {unsigned int}.
-- A bit-field of type {int}, {signed int}, or {unsigned
int}.
If an {int} can represent all values of the original type, the
value is converted to an {int}; otherwise, it is converted to an
{unsigned int}. These are called the *integral promotions*.37
All other types are unchanged by the integral promotions.
Note that Section 6.1.2.5 paragraph 16 defines integral types as char,
the signed and unsigned integer types, and the enumerated types.
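One way to picture the ranking rules for signed integer types is as a lexicographic comparison: first by precision, then standard before extended, then by a tie-breaker. The sketch below is a model under that assumption, not an implementation requirement; the descriptor layout, the sample precisions, and the extended type names are all hypothetical.

```c
/* Hypothetical descriptor for a signed integer type. */
struct signed_type {
    const char *name;
    int precision;    /* value bits, excluding the sign bit and padding */
    int is_standard;  /* 1 for signed char .. long long int, 0 for extended */
    int order;        /* tie-breaker among types of equal precision and kind;
                         fixed for standard types, implementation-defined
                         for extended types */
};

/* Returns > 0 if a outranks b, < 0 if b outranks a. Distinct types must be
   given distinct descriptors so that no two types compare equal. */
static int rank_compare(const struct signed_type *a, const struct signed_type *b)
{
    if (a->precision != b->precision)
        return a->precision - b->precision;      /* more precision, higher rank */
    if (a->is_standard != b->is_standard)
        return a->is_standard - b->is_standard;  /* standard outranks extended */
    return a->order - b->order;
}

/* Sample types for an assumed implementation where int and long int both
   have 31 value bits. The extended type names are made up. */
static const struct signed_type t_int   = { "int",      31, 1, 2 };
static const struct signed_type t_long  = { "long int", 31, 1, 3 };
static const struct signed_type t_ext31 = { "__ext32",  31, 0, 0 };
static const struct signed_type t_ext23 = { "__ext24",  23, 0, 0 };
```

Because the comparison is lexicographic over three integers, the transitivity rule in the last list item holds automatically in this model.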
6 Usual Arithmetic Conversions
This section gives two alternative wordings for the usual arithmetic
conversions. The first is based on precision. The second is based on
integral conversion rank.
6.1 Usual Arithmetic Conversions Alternative 1
Starting with the following text in Section 6.2.1.7 (Usual arithmetic
conversions), paragraph 1:
Otherwise, the integral promotions are performed on both
operands. Then the following rules are applied:
delete to the end of paragraph 1 and replace with:
Otherwise, the integral promotions are performed on both
operands. Then the following rules are applied to the promoted
operands:
If the operands have different precisions, the operand with
less precision is converted to the type of the other
operand.
Otherwise, the operands have the same precision:
If either operand has type {long long int} or {unsigned
long long int}, then both operands are converted to
{unsigned long long int} if either operand has an
unsigned integer type. Otherwise, both operands are
converted to {long long int}.
Otherwise, if one operand has type {long int} or
{unsigned long int}, then both operands are converted
to {unsigned long int} if either operand has an
unsigned integer type. Otherwise, both operands are
converted to {long int}.
Otherwise, if one operand has type {int} or {unsigned
int}, then both operands are converted to {unsigned
int} if either operand has an unsigned integer type.
Otherwise, both operands are converted to {int}.
Otherwise, if both operands have the same type, then no
further conversion is needed.
Otherwise, if one operand has signed integer type and
the other operand has the corresponding unsigned
integer type, then the operand with the signed integer
type is converted to the type of the operand that has
unsigned integer type.
Otherwise, both operands are extended integer types
with the same precision. There shall be an
implementation defined ranking of all extended signed
integer types that have the same precision. No two
extended signed integer types shall have the same rank,
even if they have the same representation. The
unsigned integer type that corresponds to an extended
signed integer type shall have the same rank as that
signed integer type.
Then, if either operand has an unsigned integer
type, both operands are converted to the unsigned
integer type that is or corresponds to the operand
type with greater rank.
Otherwise, the operand with the type of lesser
rank is converted to the type of the operand whose
type has greater rank.
Take care in reading the above. Remember, the case where the types
have different precisions is handled before all of the conditional
clauses. This removes the need in the Standard's present wording for
discussing what happens when long long, long, and/or int have the same
versus different "sizes".
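A familiar consequence of these rules for the standard types, which the new wording leaves unchanged, is shown in the following small sketch. The function names are hypothetical.

```c
/* int and unsigned int are treated as having the same precision, so the
   int operand is converted to unsigned int before the comparison. */
static int minus_one_is_less_than_one_u(void)
{
    return -1 < 1u;   /* -1 becomes UINT_MAX, so the comparison yields 0 */
}

/* Both operands already have type int, so no conversion takes place. */
static int minus_one_is_less_than_one(void)
{
    return -1 < 1;    /* yields 1 */
}
```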
6.2 Usual Arithmetic Conversions Alternative 2
Starting with the following text in Section 6.2.1.7 (Usual arithmetic
conversions), paragraph 1:
Otherwise, the integral promotions are performed on both
operands. Then the following rules are applied:
delete to the end of paragraph 1 and replace with:
Otherwise, the integral promotions are performed on both
operands. Then the following rules are applied to the promoted
operands:
If both operands have the same type, then no further
conversion is needed.
Otherwise, if both operands have signed integer types or
both have unsigned integer types, the operand with the type
of lesser integral conversion rank is converted to the type
of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has
rank greater than or equal to the rank of the type of the other
operand, then the operand with signed integer type is converted
to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer
type can represent all of the values of the type of the
operand with unsigned integer type, then the operand with
unsigned integer type is converted to the type of the operand
with signed integer type.
Otherwise, both operands are converted to the unsigned
integer type corresponding to the type of the operand with
signed integer type.
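The five rules above can be modeled as a small decision procedure over type descriptors. This is only a sketch: the descriptor layout, the function name {common_type}, and the sample ranks and precisions are assumptions made for illustration, and "can represent all of the values" is approximated by comparing precisions as defined in Section 4.2.

```c
/* Hypothetical descriptor for a promoted signed or unsigned integer type. */
struct int_type {
    int rank;         /* integral conversion rank, shared by a signed type
                         and its corresponding unsigned type */
    int is_unsigned;  /* 1 for an unsigned integer type, 0 for signed */
    int precision;    /* value bits, excluding any sign bit and padding */
};

static struct int_type common_type(struct int_type a, struct int_type b)
{
    struct int_type u, s;

    if (a.rank == b.rank && a.is_unsigned == b.is_unsigned)
        return a;                        /* same type: nothing to do */
    if (a.is_unsigned == b.is_unsigned)
        return a.rank > b.rank ? a : b;  /* same signedness: greater rank */

    u = a.is_unsigned ? a : b;           /* the unsigned operand */
    s = a.is_unsigned ? b : a;           /* the signed operand */

    if (u.rank >= s.rank)
        return u;                        /* unsigned wins on rank */
    if (s.precision >= u.precision)
        return s;                        /* signed can hold all unsigned values */

    s.is_unsigned = 1;                   /* unsigned counterpart of s */
    s.precision = -1;                    /* its precision is unknown to this model */
    return s;
}

/* Sample descriptors: unsigned int with 32 value bits, long int with
   63 value bits (as on many 64-bit systems) or 31 (as on many 32-bit ones). */
static const struct int_type T_INT    = { 3, 0, 31 };
static const struct int_type T_UINT   = { 3, 1, 32 };
static const struct int_type T_LONG63 = { 4, 0, 63 };
static const struct int_type T_LONG31 = { 4, 0, 31 };
```

With these samples, {unsigned int} combined with a 63-bit {long int} yields {long int}, while the same operands with a 31-bit {long int} yield {unsigned long int}, matching the behavior of the existing rules.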
7 Allow "Big" Constants To Have Extended Integral Type
The wording proposed in this section is optional. The rest of the
proposal is consistent if this section is not voted in.
The existing wording in the Standard permits an implementation to give
a constant that is too big for {long long} or {unsigned long long} an
extended integer type. No diagnostic is required.
Section 6.1.3.2 (Integer constants), paragraph 5, in Semantics says:
The type of an integer constant is the first of the corresponding
list in which its value can be represented. Unsuffixed decimal:
{int}, {long int}, {long long int}, {int}; unsuffixed octal or
hexadecimal: {int}, {unsigned int}, {long int}, {unsigned long
int}, {long long int}, {unsigned long long int}; suffixed by the
letter {u} or {U}: {unsigned int}, {unsigned long int},
{unsigned long long int}; suffixed by the letter {l} or {L}:
{long int}, {unsigned long int}, {long long int}, {unsigned long
long int}; suffixed by both the letters {u} or {U} and {l} or
{L}: {unsigned long int}, {unsigned long long int}; suffixed by
{ll} or {LL}: {long long int}, {unsigned long long int};
suffixed by both {u} or {U} and {ll} or {LL}: {unsigned long
long int}.
Section 6.1.3 (Constants), paragraph 2, is the only constraint:
The value of a constant shall be in the range of representable
values for its type.
If the constant is too big for the types in its list, then the program
violates a semantics rule and is not strictly conforming. An
implementation is allowed to extend the language to give meaning to
any program that is not strictly conforming. In this case, the
extension is to give the constant an extended integer type. As long
as the extended integer type can represent the value of the constant,
the constraint is not violated, and no diagnostic is required.
The Standard would benefit if it provided more direction to
implementations in which extended integer types are appropriate for
the different forms of constants.
At the end of Section 6.1.3.2 (Integer constants), paragraph 5 add:
If an integer constant cannot be represented by a type in its
list, it may have an extended integer type, if the extended
integer type can represent its value. If all of the types in the
list for the constant are signed, the extended integer type shall
be signed. If all of the types in the list for the constant are
unsigned, the extended integer type shall be unsigned. If the
list contains both signed and unsigned types, the extended
integer type may be signed or unsigned.
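The "first type in the list that can represent the value" rule, and the point at which an extended integer type would come into play, can be sketched as follows. The function name {first_fit} and the sample maxima (a 16-bit {int}, 32-bit {long int}, and 64-bit {long long int}) are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Return the index of the first entry in `max_values` that can hold
   `value`, or -1 if none can. A result of -1 marks the case in which the
   proposed wording permits an extended integer type. */
static int first_fit(const unsigned long long *max_values, size_t count,
                     unsigned long long value)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (value <= max_values[i])
            return (int)i;
    }
    return -1;
}

/* Assumed maxima for the unsuffixed decimal list:
   int, long int, long long int. */
static const unsigned long long decimal_list[3] = {
    32767ULL, 2147483647ULL, 9223372036854775807ULL
};
```

Since every type in this particular list is signed, a constant for which {first_fit} returns -1 would, under the proposed wording, have to be given a signed extended integer type if it is given one at all.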
Note: Draft 10 erroneously has an extra {int} at the end of the list
for Unsuffixed decimal.
Also, most people who have reviewed the list object to the fact that
decimal constants suffixed by L or LL are allowed to be unsigned.
Perhaps the committee voted in undesirable wording.
8 Uniqueness of types
The wording proposed in this section is optional. The rest of the
proposal is consistent if this section is not voted in.
Section 6.1.2.5 (Types), paragraph 10 says:
The type {char}, the signed and unsigned integer types, and the
floating types are collectively called the *basic types*. Even
if the implementation defines two or more basic types to have the
same representation, they are nevertheless different types.
Microsoft has keywords that are synonyms for standard types. For
example, __int16 is a synonym for short and unsigned __int16 is a
synonym for unsigned short int. Such synonyms are not distinct types,
merely funny names for existing types, similar in some ways to a
typedef, and so the above paragraph does not apply to them. A
footnote would clarify this.
Add a new footnote to the end of Section 6.1.2.5 (Types), paragraph
10:
An implementation may define new keywords that provide alternative
ways to designate a basic (or any other) type. An alternate way
to designate a basic type does not violate the requirement that
all basic types be different. Implementation defined keywords
must have the form of an identifier reserved for any use as
described in 7.1.3.
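The distinction between a synonym and a distinct type can be illustrated with a typedef standing in for an implementation keyword. The sketch uses {_Generic}, a C11 feature that postdates this paper; the typedef name {int16_synonym} and the helper names are hypothetical.

```c
/* A typedef plays the role of a synonym keyword: it names an existing
   type rather than creating a new one. */
typedef short int16_synonym;

/* This selection is only valid because short, int, and long are distinct
   types, even on implementations where two of them share a representation. */
#define TYPE_CODE(x) _Generic((x), short: 1, int: 2, long: 3, default: 0)

static int code_of_synonym(void)
{
    int16_synonym v = 0;
    return TYPE_CODE(v);   /* selects the short association */
}

static int code_of_int(void)
{
    int v = 0;
    return TYPE_CODE(v);
}

static int code_of_long(void)
{
    long v = 0;
    return TYPE_CODE(v);
}
```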
9 Preprocessor arithmetic
The wording proposed in this section is optional. The rest of the
proposal is consistent if this section is not voted in.
It seems wise to require preprocessing arithmetic to be performed in
the largest integral type that the implementation supports.
Replace the following sentences from Section 6.8.1 (Conditional
inclusion), paragraph 4:
The resulting tokens comprise the controlling constant expression
which is evaluated according to the rules of 6.4 using arithmetic
that has at least the ranges specified in 5.2.4.2, except that
{int} and {long}, and {unsigned int} and {unsigned long}, act as
if they have the same representation as, respectively, {long
long} and {unsigned long long}.
with:
The resulting tokens comprise the controlling constant expression
which is evaluated according to the rules of 6.4 using arithmetic
that has at least the ranges specified in 5.2.4.2, except that
the signed integer types and the unsigned integer types act as
if they have the same representation as, respectively, {intmax_t}
and {uintmax_t} defined in the <inttypes.h> header.
Add Forward reference:
Largest integral types (7.4.3)
Note, the above forward reference may have to be adjusted to reflect
the rewrite of the section on <inttypes.h>.
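The practical effect of evaluating controlling expressions in the widest integer types can be seen in the sketch below. It assumes a compiler that performs {#if} arithmetic in types at least as wide as {long long}, as both the existing and the proposed wording require; the macro and function names are hypothetical.

```c
/* In #if, int operands act as if they had the widest signed type, so this
   sum does not overflow even where int is only 32 bits wide. */
#if (2147483647 + 1) > 0
#define PP_SUM_IS_POSITIVE 1
#else
#define PP_SUM_IS_POSITIVE 0
#endif

/* Mixing signed and unsigned operands converts -1 to the widest unsigned
   type, so the comparison is false. */
#if -1 < 0u
#define PP_MINUS_ONE_IS_LESS 1
#else
#define PP_MINUS_ONE_IS_LESS 0
#endif

static int pp_sum_is_positive(void)
{
    return PP_SUM_IS_POSITIVE;
}

static int pp_minus_one_is_less(void)
{
    return PP_MINUS_ONE_IS_LESS;
}
```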
10 Syntax for Declarations
The wording proposed in this section is optional. The rest of the
proposal is consistent if this section is not voted in.
The Standard requires that a violation of a syntax rule cause an
implementation to issue a diagnostic. This section proposes
extending the grammar to permit implementation defined keywords to be
type specifiers. This change is only needed if the committee wishes
to remove the requirement that an implementation issue a diagnostic
when user code (as opposed to headers) uses implementation defined
keywords as type specifiers.
Note that the standard headers are not files (Section 7.1.2 footnote
112), and the committee has always held that the headers may be
implemented as a binary representation of the specified contents of
the header. Issues of syntax do not really apply to headers, and so,
implementations are free to use extended syntax in the standard
headers without issuing a diagnostic (an implementation may use a
pragma to suppress such diagnostics while in the header). Thus,
implementations may use extended integer types in the implementations'
headers without this proposed change to the grammar.
Gwyn, Meyers, and Feather do not feel the wording change in this
section is necessary.
After the following line in Section 6.5.2 (Type specifiers), paragraph
1:
{long}
add new line:
*extended-signed-integer-type*
Add new Syntax rule:
*extended-signed-integer-type*:
*identifier*
Corresponding changes should be made in Section B.2.2, page 352.
After the following in Section 6.5.2 (Type specifiers), paragraph 2:
-- {unsigned long long}, or {unsigned long long int}
add two new list items:
-- an identifier reserved for any use in 7.1.3 that designates an
implementation-defined extended signed integer type, or the
same identifier preceded by {signed}
-- the same identifier preceded by {unsigned}
Add the following Forward reference:
reserved identifiers (7.1.3)
11 Index Entries
New entries in the index should be made for the following terms:
1. standard signed integer types
2. extended signed integer types
3. standard unsigned integer types
4. extended unsigned integer types
5. extended integer types
6. precision
7. integral conversion rank (if the corresponding change is
made)