Doc. no. 02-0014=N1356
Date: 12 March 2002
Project: Programming Language C++
Reply to: RWGrosse-Kunstleve@lbl.gov
It would be nice if every kind of numeric software could be written in C++ without loss of efficiency, but unless something can be found that achieves this without compromising the C++ type system it may be preferable to rely on Fortran, assembler or architecture-specific extensions (Bjarne Stroustrup).
Example 1: std::complex<T>
1(a): Language Interoperability
Since std::complex<T> includes several
constructors, it is not POD and the standard doesn't define how its
real and imaginary parts are stored. This inhibits portable use of a
large number of FORTRAN and C libraries which manipulate complex
numbers (e.g. FFTW, FFTPACK, BLAS), both as object libraries and as
source code ported to C++. In the latter case a complete rewrite would
be required for many libraries to achieve full portability
(e.g. FFTPACK), and in some cases the C++ version could not portably
achieve similar performance. Internally to these libraries, arrays of
complex numbers are commonly treated as arrays of real numbers
(reference: cctbx.sf.net, module fftbx).
The C99 standard includes a reserved keyword _Complex for
a family of types implementing complex numbers. The data layout is
precisely defined in 6.2.5/13 of ISO/IEC 9899:1999:
Each complex type has the same representation and alignment requirements as an array type containing exactly two elements of the corresponding real type; the first element is equal to the real part, and the second element to the imaginary part, of the complex number.
The FORTRAN standard defines the same data layout for the
COMPLEX type.
In contrast, the definition of ISO/IEC 14882:1998 is much
less specific (26.2.2/1):
The class complex describes an object that can store the Cartesian components, real() and imag(), of a complex number.
The current standard does not define storage and
alignment requirements. Some have claimed that
an implementation may therefore choose the internal
representation of complex values freely. For example, some
people interpret the standard as permitting a polar
internal representation.
To our knowledge, the data layout of every current
implementation of std::complex<T> is actually
compatible with C99 and FORTRAN. However, as it stands, C++
and C99 or C++ and FORTRAN programs cannot be interfaced
portably because of the liberal definition 26.2.2/1 in
ISO/IEC 14882:1998.
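The observation about current implementations can be probed at run time. The following is only a sketch: the function name complex_matches_c99_layout is our own, and a true result describes the implementation under test, not a guarantee made by ISO/IEC 14882:1998.

```cpp
#include <complex>
#include <cstddef>
#include <cstring>

// Returns true if, on this implementation, std::complex<double> is stored
// as two adjacent doubles: real part first, imaginary part second
// (the layout C99 6.2.5/13 prescribes for its complex types).
bool complex_matches_c99_layout()
{
  // Any padding or extra members would change the size.
  if (sizeof(std::complex<double>) != 2 * sizeof(double)) return false;

  const std::complex<double> z(1.5, -2.25);
  double parts[2];
  // Copy the object representation into a plain array of two doubles.
  std::memcpy(parts, &z, sizeof parts);

  return parts[0] == z.real() && parts[1] == z.imag();
}
```

A program that hands complex arrays to a C or FORTRAN library could call such a probe once at start-up and refuse to continue if it fails.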
1(b): Optimization considerations
To facilitate the discussion, we will use a highly simplified outline of one of the most important algorithms in numerical applications: an in-place real-to-complex Fast Fourier Transform (FFT).
std::vector<double> vec;
// fill vec
std::complex<double>*
result = fft_real_to_complex(&*vec.begin(), vec.size());
std::complex<double>*
fft_real_to_complex(double* seq, std::size_t n)
{
  std::complex<double>*
  result = reinterpret_cast<std::complex<double>*>(seq);
  // Do the transform. In the process the array of real
  // values will become an array of complex values.
  return result;
}
To be able to do the transform truly in place (i.e.,
without copying an entire array at some point in the
algorithm) it is essential that either (a) the data layout
of std::complex<T> is predictable or (b) the real
and imaginary parts of the complex values are directly
accessible, such as through references. ISO/IEC 14882:1998
provides neither of these prerequisites.
A predictable data layout or direct access through references is also a prerequisite to enabling essential speed optimizations, even for complex-to-complex transforms. Example: Any of the automatically generated codelets in FFTW, such as ftw_4.c.
For the algorithm above to work it is essential that
(a) std::complex<T> has a trivial assignment operator, to
avoid undefined behavior when the complex values are replaced by the
real values; and
(b) std::complex<T> has a trivial destructor, to avoid
undefined behavior when vec in the example goes out of scope.
Example 2: Interfacing Python and C++
Python is a dynamically typed, object-oriented, interpreted language and therefore a powerful complement for the statically-typed, compiled C++ language. The most popular implementation of the Python programming language is written in ANSI C89. David Abrahams has been implementing a system for the integration of C-Python and C++ (reference: www.boost.org, module Boost.Python).
In the Python 'C' API, all objects are manipulated through pointers to a
"base" struct PyObject. The layout of every Python object which
participates in Python's cyclic garbage collection begins with the layout of a
PyObject. The PyObject contains a reference count and what is for all
intents and purposes a vtable. This arrangement provides a crude form of
object-orientation in 'C' and the basic idioms have been repeated in the
implementations of countless languages and systems.
The 'C' programmer wishing to implement a new object type in Python has the opportunity to employ two of the language's most-beloved features, macros and 'C'-style casts:
struct MyObject
{
    PyObject_HEAD   // MACRO providing the members of PyObject
    T1 additional_data_1;
    T2 additional_data_2;
};
// Return a Python string representing MyObject
PyObject* MyObject_print(PyObject* o)
{
    MyObject* x = (MyObject*)o; // downcast
    ...
}
// "vtbl"
PyTypeObject MyType = {
    ...
    MyObject_print,
    ...
};
// Creation function
PyObject* MyObject_new()
{
    // MACRO invocation which allocates memory and initializes
    MyObject* result = PyObject_New(MyObject, &MyType);
    ...more initialization...
    return (PyObject*)result;
}
In keeping with the design intention that C++ is "a better C", consider
how we might solve this problem in C++. Obviously, we'd use inheritance
to eliminate macros and casting as much as possible. We'd add
constructors for MyObject and PyObject to eliminate the need for
initialization in MyObject_new(). We'd use real virtual functions
instead of an ad-hoc PyTypeObject filled with functions using the 'C'
calling convention.
Unfortunately, the rest of Python is still written in 'C', so we really
can't expect to replace the PyTypeObject with real virtual functions
here. However, we are tantalizingly close to being able to do very much
better than shown above in C++:
// Base object for all Python extension types
struct PyBaseObject : PyObject
{
    // initializes refcount and vtbl
    PyBaseObject(PyTypeObject const&);
    // allocates in Python's special GC area
    void* operator new(std::size_t n);
};
extern "C" PyObject* MyObject_print(PyObject* o) {
    MyObject* x = static_cast<MyObject*>(o);
}
PyTypeObject MyType = {
    ...
    MyObject_print,
    ...
};
struct MyObject : PyBaseObject
{
    MyObject() : PyBaseObject(MyType) {...}
};
// Just use operator new for allocation
Though the above works on every C++ implementation we know
of, it relies on an assumption which is technically
non-portable: that base classes in non-virtual
inheritance hierarchies are laid out as though they were the
first data members of a class. The assumption is invalid
because the classes involved are non-POD: they have both
base classes and constructors. In the absence of such a
guarantee, or a way to achieve it, the C++ programmer is
exposed to most of the same dangers as the 'C' programmer
when interfacing to many 'C' systems, and to Python in
particular.
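The layout assumption can be made concrete without the Python headers. In the sketch below, FakePyObject is a hypothetical stand-in for PyObject, and BaseObject, MyObject, base_is_at_offset_zero and read_additional_data are names of our own. The static_cast downcast is well-defined regardless of layout; what is not guaranteed is that the base subobject starts at the same address as the complete object, which is what a 'C' system holding only a pointer to the header implicitly relies on.

```cpp
#include <cstddef>

// Hypothetical stand-in for the 'C' struct PyObject (not the real header).
struct FakePyObject
{
  long refcount;
  const void* type;
};

// Non-POD in the sense of ISO/IEC 14882:1998: a base class and a constructor.
struct BaseObject : FakePyObject
{
  explicit BaseObject(const void* t) { refcount = 1; type = t; }
};

struct MyObject : BaseObject
{
  MyObject() : BaseObject(0), additional_data(42) {}
  int additional_data;
};

// True if the FakePyObject subobject sits at offset zero of MyObject on
// this implementation. The standard does not promise this for non-POD types.
bool base_is_at_offset_zero()
{
  MyObject obj;
  FakePyObject* as_base = &obj;  // implicit upcast, always valid
  return static_cast<void*>(as_base) == static_cast<void*>(&obj);
}

// Downcast in the style of MyObject_print; valid whatever the layout is.
int read_additional_data(FakePyObject* o)
{
  return static_cast<MyObject*>(o)->additional_data;
}
```

If base_is_at_offset_zero returned false, memory obtained from a 'C' allocator and handed back as a header pointer would no longer coincide with the start of the C++ object.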
The original considerations about POD focused strictly on being able to interoperate with types defined in 'C', but not on being able to leverage the power of C++ for interfacing with 'C' systems. The examples above illustrate the importance of a predictable data layout for this and other purposes. Therefore:
std::complex<T>.
std::complex<T> as an array of T and
vice versa, we propose that the member functions for data access
return references instead of copies.
By allowing constructors, and thus ensuring initialization to a valid state, the Enhanced POD concept encourages safer programming practices. Right now certain classes (endian arithmetic, for example) are often designed without constructors so that they can be used in contexts requiring POD types. This is neither as safe nor as convenient as it would be if these classes had constructors.
Presumably many of the contexts now requiring POD types will be relaxed to require only Enhanced POD types. In particular, it would be very helpful if the requirements on implementations for POD types in 3.9 paragraphs 2-4 could also apply to Enhanced POD types.
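To illustrate the kind of class meant here, the following sketch shows an endian-arithmetic type with a constructor whose bytes are copied with memcpy in the manner described by 3.9 paragraphs 2-3. The class big_endian_u32 and the function round_trip_through_bytes are hypothetical names of our own. The sketch assumes a C++11 or later compiler and uses its trivially copyable category only as a stand-in for the proposed Enhanced POD concept; the two are not claimed to be identical.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical 32-bit value stored most significant byte first.
// The constructor guarantees that every object starts in a valid state.
class big_endian_u32
{
public:
  explicit big_endian_u32(unsigned long v = 0)
  {
    bytes_[0] = static_cast<unsigned char>((v >> 24) & 0xFF);
    bytes_[1] = static_cast<unsigned char>((v >> 16) & 0xFF);
    bytes_[2] = static_cast<unsigned char>((v >> 8) & 0xFF);
    bytes_[3] = static_cast<unsigned char>(v & 0xFF);
  }

  unsigned long value() const
  {
    return (static_cast<unsigned long>(bytes_[0]) << 24) |
           (static_cast<unsigned long>(bytes_[1]) << 16) |
           (static_cast<unsigned long>(bytes_[2]) << 8) |
            static_cast<unsigned long>(bytes_[3]);
  }

private:
  unsigned char bytes_[4];
};

// Stand-in check: a user-provided constructor does not prevent byte copies.
static_assert(std::is_trivially_copyable<big_endian_u32>::value,
              "big_endian_u32 must be copyable with memcpy");

// Copies the object into a byte buffer and back, as in 3.9 paragraph 2.
big_endian_u32 round_trip_through_bytes(const big_endian_u32& src)
{
  unsigned char buffer[sizeof(big_endian_u32)];
  std::memcpy(buffer, &src, sizeof src);

  big_endian_u32 dst;
  std::memcpy(static_cast<void*>(&dst), buffer, sizeof dst);
  return dst;
}
```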
We encourage the committee to consider the Enhanced POD proposal separately from the others.
The proposals will make it possible to (a) build arrays of std::complex<T> with a predictable data layout and (b) portably pass T* pointers to these arrays to other languages, e.g.:
void foo(double *data, long n); // C library function
std::vector<std::complex<double> > vec;
// fill vec
foo(&vec[0].real(), vec.size());
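Since real() in ISO/IEC 14882:1998 returns a copy, the call above cannot be tried with std::complex<T> itself. The sketch below therefore uses a hypothetical class, layout_complex, that has the proposed properties: members stored in real, imaginary order and accessors returning references. The functions sum_doubles (a stand-in for a C library routine) and sum_all_parts are likewise our own, and the size check assumes a C++11 or later compiler. Walking across element boundaries through a double* is exactly the behavior the proposals would have to sanction; here it is an assumption about the implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical complex type (not std::complex) with the proposed interface.
template <typename T>
class layout_complex
{
public:
  layout_complex(const T& re = T(), const T& im = T()) : re_(re), im_(im) {}

  // Accessors return references instead of copies.
  T& real() { return re_; }
  T& imag() { return im_; }
  const T& real() const { return re_; }
  const T& imag() const { return im_; }

private:
  T re_;  // real part first
  T im_;  // imaginary part second
};

// Assumed here, not guaranteed in general: no padding between or after parts.
static_assert(sizeof(layout_complex<double>) == 2 * sizeof(double),
              "layout_complex<double> must be exactly two doubles");

// Stand-in for a C library routine that sees only a flat array of doubles.
double sum_doubles(const double* data, long n)
{
  double total = 0.0;
  for (long i = 0; i < n; ++i) total += data[i];
  return total;
}

// Passes the complex array to the flat-array routine without copying.
double sum_all_parts(std::vector<layout_complex<double> >& vec)
{
  if (vec.empty()) return 0.0;
  // Two doubles per complex element.
  return sum_doubles(&vec[0].real(), static_cast<long>(2 * vec.size()));
}
```

Whether the length argument counts complex elements or individual T values depends on the library being called; the sketch passes the number of doubles.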