ISO/IEC JTC1 SC22 WG21 N3533 - 2013-03-12
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
Chris Mysen, mysen@google.com, ccmysen@gmail.com
Introduction
Conceptual Interface
    Basic Operations
    Non-Waiting Operations
    Non-Blocking Operations
    Push Front Operations
    Closed Queues
    Empty and Full Queues
    Queue Names
    Element Type Requirements
    Exception Handling
    Queue Ordering
    Lock-Free Implementations
Concrete Queues
    Locking Buffer Queue
    Lock-Free Buffer Queue
Additional Conceptual Tools
    Fronts and Backs
    Streaming Iterators
    Storage Iterators
    Binary Interfaces
    Managed Indirection
Implementation
Revision History
Queues provide a mechanism for communicating data between components of a system.
The existing deque in the standard library
is an inherently sequential data structure.
Its reference-returning element access operations
cannot synchronize access to those elements
with other queue operations.
So, concurrent pushes and pops on queues
require a different interface to the queue structure.
Moreover, concurrency adds a new dimension for performance and semantics. Different queue implementations must trade off uncontended operation cost, contended operation cost, and element order guarantees. Some of these trade-offs will necessarily result in semantics weaker than a serial queue.
We provide basic queue operations, and then extend those operations to cover other important issues.
The essential solution to the problem of concurrent queuing is to shift to value-based operations, rather than reference-based operations.
The basic operations are:
void
queue::push(const Element&);
void
queue::push(Element&&);
Push the Element onto the queue.
Element queue::value_pop();
Pop an Element from the queue.
The element will be moved out of the queue in preference to being copied.
These operations will wait when the queue is full or empty. (Not all queue implementations can actually be full.) These operations may block for mutual exclusion as well.
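As a minimal sketch of the basic operations, assuming a bounded buffer guarded by one mutex and two condition variables: the class name sketch_queue, its capacity parameter, and its internal layout are illustrative assumptions, not part of the proposal.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Illustrative only: sketch_queue is a hypothetical name, not a proposed class.
template <typename Element>
class sketch_queue {
public:
    explicit sketch_queue(std::size_t capacity) : capacity_(capacity) {}

    // Waits while the queue is full, then copies the element in.
    void push(const Element& e) {
        std::unique_lock<std::mutex> lock(mtx_);
        not_full_.wait(lock, [this] { return buf_.size() < capacity_; });
        buf_.push_back(e);
        not_empty_.notify_one();
    }

    // Waits while the queue is full, then moves the element in.
    void push(Element&& e) {
        std::unique_lock<std::mutex> lock(mtx_);
        not_full_.wait(lock, [this] { return buf_.size() < capacity_; });
        buf_.push_back(std::move(e));
        not_empty_.notify_one();
    }

    // Waits while the queue is empty, then moves the oldest element out.
    Element value_pop() {
        std::unique_lock<std::mutex> lock(mtx_);
        not_empty_.wait(lock, [this] { return !buf_.empty(); });
        Element e(std::move(buf_.front()));
        buf_.pop_front();
        not_full_.notify_one();
        return e;
    }

private:
    std::mutex mtx_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::deque<Element> buf_;
    std::size_t capacity_;
};
```

Because elements are passed and returned by value, no reference into the buffer escapes the lock, which is the point of the value-based interface.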
Waiting on a full or empty queue can take a while, which has an opportunity cost. Avoiding that wait enables algorithms to avoid queuing speculative work when a queue is full, to do other work rather than wait for a push on a full queue, and to do other work rather than wait for a pop on an empty queue.
queue_op_status
queue::try_push(const Element&);
queue_op_status
queue::try_push(Element&&);
If the queue is full, return queue_op_status::full.
Otherwise, push the Element onto the queue.
Return queue_op_status::success.
queue_op_status
queue::try_pop(Element&);
If the queue is empty, return queue_op_status::empty.
Otherwise, pop the Element from the queue.
The element will be moved out of the queue in preference to being copied.
Return queue_op_status::success.
These operations will not wait when the queue is full or empty. They may block for mutual exclusion.
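The non-waiting operations might be sketched as follows. The enumerator names come from the text, but the exact definition of queue_op_status, the class name try_sketch_queue, and the capacity parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Hypothetical class name, used only for this sketch.
template <typename Element>
class try_sketch_queue {
public:
    explicit try_sketch_queue(std::size_t capacity) : capacity_(capacity) {}

    // Never waits for space, but may block briefly to acquire the mutex.
    queue_op_status try_push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.size() >= capacity_) return queue_op_status::full;
        buf_.push_back(e);
        return queue_op_status::success;
    }

    queue_op_status try_push(Element&& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.size() >= capacity_) return queue_op_status::full;
        buf_.push_back(std::move(e));
        return queue_op_status::success;
    }

    // Never waits for an element, but may block briefly to acquire the mutex.
    queue_op_status try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.empty()) return queue_op_status::empty;
        out = std::move(buf_.front());
        buf_.pop_front();
        return queue_op_status::success;
    }

private:
    std::mutex mtx_;
    std::deque<Element> buf_;
    std::size_t capacity_;
};
```

A caller that receives full or empty is free to do other work and retry later, which is the opportunity-cost argument made above.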
For cases when blocking for mutual exclusion is undesirable,
we have non-blocking operations.
The interface is the same as the try operations
but is allowed to also return queue_op_status::busy
in case the operation is unable to complete without blocking.
queue_op_status
queue::nonblocking_push(const Element&);
queue_op_status
queue::nonblocking_push(Element&&);
If the operation would block, return queue_op_status::busy. 
Otherwise, if the queue is full, return queue_op_status::full.
Otherwise, push the Element onto the queue.
Return queue_op_status::success.
queue_op_status
queue::nonblocking_pop(Element&);
If the operation would block, return queue_op_status::busy. 
Otherwise, if the queue is empty, return queue_op_status::empty.
Otherwise, pop the Element from the queue.
The element will be moved out of the queue in preference to being copied.
Return queue_op_status::success.
These operations will neither wait nor block. They may, however, not do anything.
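One possible way to obtain the busy result on a lock-based queue is to attempt the lock rather than wait for it. The sketch below assumes std::unique_lock with std::try_to_lock; the class name nb_sketch_queue and the definition of queue_op_status are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Hypothetical class name, used only for this sketch.
template <typename Element>
class nb_sketch_queue {
public:
    explicit nb_sketch_queue(std::size_t capacity) : capacity_(capacity) {}

    queue_op_status nonblocking_push(const Element& e) {
        // try_to_lock returns immediately instead of blocking on the mutex.
        std::unique_lock<std::mutex> lock(mtx_, std::try_to_lock);
        if (!lock.owns_lock()) return queue_op_status::busy;
        if (buf_.size() >= capacity_) return queue_op_status::full;
        buf_.push_back(e);
        return queue_op_status::success;
    }

    queue_op_status nonblocking_push(Element&& e) {
        std::unique_lock<std::mutex> lock(mtx_, std::try_to_lock);
        if (!lock.owns_lock()) return queue_op_status::busy;
        if (buf_.size() >= capacity_) return queue_op_status::full;
        buf_.push_back(std::move(e));
        return queue_op_status::success;
    }

    queue_op_status nonblocking_pop(Element& out) {
        std::unique_lock<std::mutex> lock(mtx_, std::try_to_lock);
        if (!lock.owns_lock()) return queue_op_status::busy;
        if (buf_.empty()) return queue_op_status::empty;
        out = std::move(buf_.front());
        buf_.pop_front();
        return queue_op_status::success;
    }

private:
    std::mutex mtx_;
    std::deque<Element> buf_;
    std::size_t capacity_;
};
```

The busy check comes first, matching the order of the rules above: a caller learns nothing about fullness or emptiness when the lock could not be taken.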
The non-blocking operations highlight a terminology problem.
In terms of synchronization effects,
nonblocking_push on queues
is equivalent to try_lock on mutexes.
And so one could conclude that
the existing try_push
should be renamed nonwaiting_push
and nonblocking_push
should be renamed try_push.
However, at least Thread Building Blocks uses the existing terminology.
Perhaps better is to not use try_push
and instead use nonwaiting_push and nonblocking_push.
Occasionally, one may wish to return a popped item to the queue.
We can provide for this with push_front operations.
void
queue::push_front(const Element&);
void
queue::push_front(Element&&);
Push the Element onto the front of the queue,
i.e. at the end of the queue that is normally popped.
queue_op_status
queue::try_push_front(const Element&);
queue_op_status
queue::try_push_front(Element&&);
If the queue is full, return queue_op_status::full.
Otherwise, push the Element onto the front of the queue,
i.e. at the end of the queue that is normally popped.
Return queue_op_status::success.
queue_op_status
queue::nonblocking_push_front(const Element&);
queue_op_status
queue::nonblocking_push_front(Element&&);
If the operation would block, return queue_op_status::busy.
Otherwise, if the queue is full, return queue_op_status::full.
Otherwise, push the Element onto the front of the queue,
i.e. at the end of the queue that is normally popped.
Return queue_op_status::success.
This feature was requested at the Spring 2012 meeting. However, we do not think the feature works.
The name push_front is inconsistent
with existing "push back" nomenclature.
The effects of push_front
are only distinguishable from a regular push
when there is a strong ordering of elements.
Highly concurrent queues will likely have no strong ordering.
The push_front call may fail
due to full queues, closed queues, etc.
In which case the operation will suffer contention,
and may succeed only after interposing push and pop operations.
The consequence is that
the original push order is not preserved in the final pop order.
So, push_front cannot be directly used as an 'undo'.
The operation implies an ability to reverse internal changes at the front of the queue. This ability implies a loss of efficiency in some implementations.
In short, we do not think that in a concurrent environment
push_front provides sufficient semantic value
to justify its cost.
Threads using a queue for communication need some mechanism to signal when the queue is no longer needed. The usual approach is to add an additional out-of-band signal. However, this approach suffers from the flaw that threads waiting on either full or empty queues need to be woken up when the queue is no longer needed. To do that, you need access to the condition variables used for full/empty blocking, which considerably increases the complexity and fragility of the interface. It also leads to performance implications with additional mutexes or atomics. Rather than require an out-of-band signal, we chose to directly support such a signal in the queue itself, which considerably simplifies coding.
To achieve this signal, a thread may close a queue.
Once closed, no new elements may be pushed onto the queue.
Push operations on a closed queue
will either return queue_op_status::closed
(when they have a status return)
or throw queue_op_status::closed
(when they do not).
Elements already on the queue may be popped off.
When a queue is empty and closed,
pop operations will either
return queue_op_status::closed
(when they have a status return)
or throw queue_op_status::closed
(when they do not).
The additional operations are as follows. They are essentially equivalent to the basic operations except that they return a status, avoiding an exception when queues are closed.
void queue::close();
Close the queue.
bool queue::is_closed();
Return true iff the queue is closed.
queue_op_status
queue::wait_push(const Element&);
queue_op_status
queue::wait_push(Element&&);
If the queue was closed, return queue_op_status::closed.
Otherwise, push the Element onto the queue.
Return queue_op_status::success.
queue_op_status
queue::wait_push_front(const Element&);
queue_op_status
queue::wait_push_front(Element&&);
If the queue was closed, return queue_op_status::closed.
Otherwise, push the Element onto the front of the queue,
i.e. at the end of the queue that is normally popped.
Return queue_op_status::success.
queue_op_status
queue::wait_pop(Element&);
If the queue is empty and closed, return queue_op_status::closed.
Otherwise, if the queue is empty, return queue_op_status::empty.
Otherwise, pop the Element from the queue.
The element will be moved out of the queue in preference to being copied.
Return queue_op_status::success.
The push and pop operations will wait when the queue is full or empty. All these operations may block for mutual exclusion as well.
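The closing protocol might be sketched as below: close wakes every waiter, and a consumer drains the queue until it observes closed. The class name closable_sketch_queue, the function drain_demo, and the definition of queue_op_status are illustrative assumptions. In this sketch wait_pop waits while the queue is empty and still open, following the statement that pop operations wait on an empty queue; that reading is also an assumption.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Hypothetical class name, used only for this sketch.
template <typename Element>
class closable_sketch_queue {
public:
    explicit closable_sketch_queue(std::size_t capacity) : capacity_(capacity) {}

    // Marks the queue closed and wakes all waiting pushers and poppers.
    void close() {
        std::lock_guard<std::mutex> lock(mtx_);
        closed_ = true;
        not_full_.notify_all();
        not_empty_.notify_all();
    }

    bool is_closed() {
        std::lock_guard<std::mutex> lock(mtx_);
        return closed_;
    }

    // Waits for space or for closure; reports closed instead of throwing.
    queue_op_status wait_push(const Element& e) {
        std::unique_lock<std::mutex> lock(mtx_);
        not_full_.wait(lock, [this] { return closed_ || buf_.size() < capacity_; });
        if (closed_) return queue_op_status::closed;
        buf_.push_back(e);
        not_empty_.notify_one();
        return queue_op_status::success;
    }

    // Waits for an element or for closure; remaining elements can still be popped.
    queue_op_status wait_pop(Element& out) {
        std::unique_lock<std::mutex> lock(mtx_);
        not_empty_.wait(lock, [this] { return closed_ || !buf_.empty(); });
        if (buf_.empty()) return queue_op_status::closed;  // empty and closed
        out = std::move(buf_.front());
        buf_.pop_front();
        not_full_.notify_one();
        return queue_op_status::success;
    }

private:
    std::mutex mtx_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    std::deque<Element> buf_;
    std::size_t capacity_;
    bool closed_ = false;
};

// A producer pushes 1..n and closes; the caller drains until it sees closed.
inline int drain_demo(int n) {
    closable_sketch_queue<int> q(2);
    std::thread producer([&q, n] {
        for (int i = 1; i <= n; ++i) q.wait_push(i);
        q.close();
    });
    int sum = 0;
    int v = 0;
    while (q.wait_pop(v) == queue_op_status::success) sum += v;
    producer.join();
    return sum;
}
```

The consumer loop needs no out-of-band flag: the status returned by wait_pop is the termination signal.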
There are use cases for opening a queue that is closed. While we are not aware of an implementation in which the ability to reopen a queue would be a hardship, we also imagine that such an implementation could exist. Open should generally only be called if the queue is closed and empty, providing a clean synchronization point, though it is possible to call open on a non-empty queue. An open operation following a close operation is guaranteed to be visible after the close operation and the queue is guaranteed to be open upon completion of the open call. (But of course, another close call could occur immediately thereafter.)
void queue::open();
Open the queue.
Note that when is_closed() returns false,
there is no assurance that
any subsequent operation finds the queue still open
because some other thread may close it concurrently.
If an open operation is not available, there is an assurance that once closed, a queue stays closed. So, unless the programmer takes care to ensure that all other threads will not close the queue, only a return value of true has any meaning.
It is sometimes desirable to know if a queue is empty.
bool queue::is_empty();
Return true iff the queue is empty.
This operation is useful only during intervals when the queue is known to not be subject to pushes and pops from other threads. Its primary use case is assertions on the state of the queue at the end of its lifetime, or when the system is in a quiescent state (where there are no outstanding pushes).
We can imagine occasional use for knowing when a queue is full, for instance in system performance polling. The motivation is significantly weaker though.
bool queue::is_full();
Return true iff the queue is full.
Not all queues will have a full state, and these would always return false.
It is sometimes desirable for queues to be able to identify themselves. This feature is particularly helpful for run-time diagnostics, particularly when 'ends' become dynamically passed around between threads. See Managed Indirection below. There is some debate on this facility, but we see no way to effectively replicate the facility.
const char* queue::name();
Return the name string provided as a parameter to queue construction.
The above operations require element types with a default constructor, copy/move constructors, copy/move assignment operators, and destructor. These operations may be trivial. The default constructor and destructor shall not throw. The copy/move constructors and copy/move assignment operators may throw, but must leave the objects in a valid state for subsequent operations.
Concurrent queues cannot completely hide the effect of exceptions, in part because changes cannot be transparently undone when other threads are observing the queue.
Queues may rethrow exceptions from storage allocation, mutexes, or condition variables.
If the element type operations required do not throw exceptions, then only the exceptions above are rethrown.
When an element copy/move may throw, some queue operations have additional behavior.
Construction shall rethrow, destroying any elements allocated.
A push operation shall rethrow and the state of the queue is unaffected.
A pop operation shall rethrow and the element is popped from the queue. The value popped is effectively lost. (Doing otherwise would likely clog the queue with a bad element.)
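The push rule can be illustrated with a small sketch, assuming the queue stores elements in a std::deque, whose push_back has no effect if it throws. The element type flaky, the class rethrow_sketch_queue, its size helper, and the function push_is_unaffected_by_throw are all hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <stdexcept>

// Hypothetical element type whose copy constructor throws on demand.
struct flaky {
    bool bomb = false;
    flaky() = default;
    explicit flaky(bool b) : bomb(b) {}
    flaky(const flaky& other) : bomb(other.bomb) {
        if (bomb) throw std::runtime_error("copy failed");
    }
    flaky& operator=(const flaky&) = default;
};

// Hypothetical class name, used only for this sketch.
template <typename Element>
class rethrow_sketch_queue {
public:
    // If copying the element throws, std::deque::push_back leaves the buffer
    // unchanged and the exception propagates to the caller.
    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        buf_.push_back(e);
    }

    // Helper for the demonstration only; not one of the proposed operations.
    std::size_t size() {
        std::lock_guard<std::mutex> lock(mtx_);
        return buf_.size();
    }

private:
    std::mutex mtx_;
    std::deque<Element> buf_;
};

// Returns true when a throwing push leaves the queue contents untouched.
inline bool push_is_unaffected_by_throw() {
    rethrow_sketch_queue<flaky> q;
    q.push(flaky(false));      // copies cleanly, queue now holds one element
    try {
        q.push(flaky(true));   // the copy into the buffer throws
    } catch (const std::runtime_error&) {
        return q.size() == 1;  // still exactly the one earlier element
    }
    return false;
}
```

A pop cannot offer the same guarantee, because by the time the move or copy out of the queue throws, other threads may already have observed the element as removed.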
The conceptual queue interface makes minimal guarantees.
The queue is not empty if there is an element that has been pushed but not popped.
A push operation synchronizes with the pop operation that obtains that element.
A close operation synchronizes with an operation that observes that the queue is closed.
There is a sequentially consistent order of operations.
In particular, the conceptual interface does not guarantee that the sequentially consistent order of element pushes matches the sequentially consistent order of pops. Concrete queues could specify more specific ordering guarantees.
Lock-free queues will have some trouble
waiting for the queue to become non-empty or non-full.
Therefore, we propose two closely related concepts:
a full concurrent queue concept as described above,
and a non-waiting concurrent queue concept
that has all the operations except
push, push_front,
wait_push, wait_push_front,
value_pop and wait_pop.
That is, it has blocking operations (presumably emulated with busy wait)
but not waiting operations.
We propose naming these WaitingConcurrentQueue
and NonWaitingConcurrentQueue,
respectively.
Note: Adopting this conceptual split requires splitting some of the facilities defined later.
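A client that needs waiting behavior on top of a non-waiting queue could layer a polling loop over try_push, as in the following sketch. The helper spin_push, the stand-in class polling_queue, and the definition of queue_op_status are assumptions for illustration and are not proposed facilities.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Stand-in for a queue that offers only non-waiting operations.
template <typename Element>
class polling_queue {
public:
    explicit polling_queue(std::size_t capacity) : capacity_(capacity) {}

    queue_op_status try_push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.size() >= capacity_) return queue_op_status::full;
        buf_.push_back(e);
        return queue_op_status::success;
    }

    queue_op_status try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.empty()) return queue_op_status::empty;
        out = std::move(buf_.front());
        buf_.pop_front();
        return queue_op_status::success;
    }

private:
    std::mutex mtx_;
    std::deque<Element> buf_;
    std::size_t capacity_;
};

// Hypothetical helper: retries until the push stops reporting full or busy.
// Any other status, such as success or closed, is returned to the caller.
template <typename Queue, typename Element>
queue_op_status spin_push(Queue& q, const Element& e) {
    for (;;) {
        queue_op_status s = q.try_push(e);
        if (s != queue_op_status::full && s != queue_op_status::busy) return s;
        std::this_thread::yield();  // let other threads make progress
    }
}
```

Such a loop burns processor time while the queue stays full, which is one reason the waiting operations are kept out of the non-waiting concept rather than emulated inside it.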
It may be helpful to know if a concurrent queue has a lock free implementation.
bool queue::is_lock_free();
Return true iff the queue has a lock-free implementation.
In addition to the concept, the standard needs at least one concrete queue. We describe two concrete queues.
We provide a concrete concurrent queue
in the form of a fixed-size buffer_queue.
It meets the WaitingConcurrentQueue concept.
It provides for construction of an empty queue,
and construction of a queue from a pair of iterators.
Constructors take a parameter
specifying the maximum number of elements in the buffer.
Constructors may also take a parameter
specifying the name of the queue.
If the name is not present, it defaults to the empty string.
The buffer_queue
deletes the default constructor, the copy constructor,
and the copy assignment operator.
We feel that their benefit might not justify their potential confusion.
We provide a concrete concurrent queue
in the form of a fixed-size lock_free_buffer_queue.
It meets the NonWaitingConcurrentQueue concept.
The queue is still under development,
so details may change.
There are a number of tools that support use of the conceptual interface. These tools are not part of the queue interface, but provide restricted views or adapters on top of the queue useful in implementing concurrent algorithms.
Restricting an interface to one side of a queue
is a valuable code structuring tool.
This restriction is accomplished with
the classes generic_queue_front
and generic_queue_back
parameterized on the concrete queue implementation.
These act as pointers
with access to only the front or the back of a queue.
The front of the queue is where elements are popped.
The back of the queue is where elements are pushed.
void send( int number, generic_queue_back<buffer_queue<int>> arv );
These fronts and backs
are also able to provide begin and end operations
that unambiguously stream data into or out of a queue.
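A rough sketch of the idea follows: each end is a pointer-like handle that forwards only the operations of its side. The constructors taking a raw pointer, the stand-in class tiny_queue, and the reduced set of forwarded operations are assumptions made for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Stand-in for a concrete queue; any type with push and value_pop would do.
template <typename Element>
class tiny_queue {
public:
    typedef Element value_type;

    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        buf_.push_back(e);
        not_empty_.notify_one();
    }

    Element value_pop() {
        std::unique_lock<std::mutex> lock(mtx_);
        not_empty_.wait(lock, [this] { return !buf_.empty(); });
        Element e(std::move(buf_.front()));
        buf_.pop_front();
        return e;
    }

private:
    std::mutex mtx_;
    std::condition_variable not_empty_;
    std::deque<Element> buf_;
};

// Sketch of a back handle: exposes pushing only.
template <typename Queue>
class generic_queue_back {
public:
    typedef typename Queue::value_type value_type;
    explicit generic_queue_back(Queue* q) : queue_(q) {}  // assumed constructor
    void push(const value_type& e) { queue_->push(e); }
private:
    Queue* queue_;
};

// Sketch of a front handle: exposes popping only.
template <typename Queue>
class generic_queue_front {
public:
    typedef typename Queue::value_type value_type;
    explicit generic_queue_front(Queue* q) : queue_(q) {}  // assumed constructor
    value_type value_pop() { return queue_->value_pop(); }
private:
    Queue* queue_;
};

// Mirrors the send signature from the text, using the stand-in queue.
inline void send(int number, generic_queue_back<tiny_queue<int>> back) {
    back.push(number);
}
```

A function that receives only a back handle cannot pop by accident, so the direction of data flow is visible in its signature.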
In order to enable the use of existing algorithms streaming through concurrent queues, they need to support iterators. Output iterators will push to a queue and input iterators will pop from a queue. Stronger forms of iterators are in general not possible with concurrent queues.
Iterators implicitly require waiting for the advance,
so iterators are only supportable
with the WaitingConcurrentQueue concept.
void iterate(
    generic_queue_back<buffer_queue<int>>::iterator bitr,
    generic_queue_back<buffer_queue<int>>::iterator bend,
    generic_queue_front<buffer_queue<int>>::iterator fitr,
    generic_queue_front<buffer_queue<int>>::iterator fend,
    int (*compute)( int ) )
{
    while ( fitr != fend && bitr != bend )
        *bitr++ = compute(*fitr++);
}
Note that contrary to existing iterator algorithms, we check both iterators for reaching their end, as either may be closed at any time.
Note that with suitable renaming, the existing standard front insert and back insert iterators could work as is. However, there is nothing like a pop iterator adapter.
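A pop iterator adapter might look like the following sketch: an input iterator that pops on each advance and turns into the end iterator once the queue is empty and closed. The names pop_iterator and stream_sketch_queue, and the definition of queue_op_status, are hypothetical and introduced only for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <iterator>
#include <mutex>
#include <numeric>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Stand-in unbounded queue with just enough operations for the iterator.
template <typename Element>
class stream_sketch_queue {
public:
    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        buf_.push_back(e);
        not_empty_.notify_one();
    }
    void close() {
        std::lock_guard<std::mutex> lock(mtx_);
        closed_ = true;
        not_empty_.notify_all();
    }
    queue_op_status wait_pop(Element& out) {
        std::unique_lock<std::mutex> lock(mtx_);
        not_empty_.wait(lock, [this] { return closed_ || !buf_.empty(); });
        if (buf_.empty()) return queue_op_status::closed;
        out = std::move(buf_.front());
        buf_.pop_front();
        return queue_op_status::success;
    }
private:
    std::mutex mtx_;
    std::condition_variable not_empty_;
    std::deque<Element> buf_;
    bool closed_ = false;
};

// Hypothetical adapter: holds the most recently popped value.
template <typename Element>
class pop_iterator {
public:
    typedef std::input_iterator_tag iterator_category;
    typedef Element value_type;
    typedef std::ptrdiff_t difference_type;
    typedef const Element* pointer;
    typedef const Element& reference;

    pop_iterator() : queue_(nullptr) {}  // the end iterator
    explicit pop_iterator(stream_sketch_queue<Element>* q) : queue_(q) { advance(); }

    reference operator*() const { return value_; }
    pop_iterator& operator++() { advance(); return *this; }
    pop_iterator operator++(int) { pop_iterator old(*this); advance(); return old; }
    bool operator==(const pop_iterator& o) const { return queue_ == o.queue_; }
    bool operator!=(const pop_iterator& o) const { return queue_ != o.queue_; }

private:
    // Pops the next value, or becomes the end iterator when none will come.
    void advance() {
        if (queue_ && queue_->wait_pop(value_) != queue_op_status::success)
            queue_ = nullptr;
    }
    stream_sketch_queue<Element>* queue_;
    Element value_{};
};
```

With such an adapter, a standard algorithm such as std::accumulate can consume a queue until it is closed, for example std::accumulate(pop_iterator<int>(&q), pop_iterator<int>(), 0).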
In addition to iterators that stream data into and out of a queue, we could provide an iterator over the storage contents of a queue. Such an iterator, even when implementable, would most likely be valid only when the queue is otherwise quiescent. We believe such an iterator would be most useful for debugging, which may well require knowledge of the concrete class.
The standard library is template based,
but it is often desirable to have a binary interface
that shields clients from the concrete implementations.
For example, std::function is a binary interface
to callable objects (of a given signature).
We achieve this capability in queues with type erasure.
We provide a queue_base class template
parameterized by the value type.
Its operations are virtual.
This class provides the essential independence
from the queue representation.
We also provide queue_front and queue_back
class templates parameterized by the value types.
These are essentially
generic_queue_front<queue_base<Value>> and
generic_queue_back<queue_base<Value>>,
respectively.
Indeed, they probably could be template aliases.
To obtain a pointer to queue_base
from a non-virtual concurrent queue,
construct an instance of the queue_wrapper class template,
which is parameterized on the queue
and derived from queue_base.
Upcasting a pointer to the queue_wrapper instance
to a queue_base instance
thus erases the concrete queue type.
extern void seq_fill( int count, queue_back<int> b );
buffer_queue<int> body( 10 /*elements*/, /*named*/ "body" );
queue_wrapper<buffer_queue<int>> wrap( body );
seq_fill( 10, wrap.back() );
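The type-erasure mechanism itself can be sketched with a reduced interface. Here queue_base carries only two operations, queue_wrapper is constructed from a reference, and seq_fill takes a queue_base reference instead of a queue_back; these simplifications, the stand-in plain_queue, and the definition of queue_op_status are assumptions for illustration.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Sketch of the virtual interface, parameterized only by the value type.
template <typename Value>
class queue_base {
public:
    virtual ~queue_base() {}
    virtual void push(const Value& e) = 0;
    virtual queue_op_status try_pop(Value& out) = 0;
};

// Stand-in concrete queue with non-virtual operations.
template <typename Element>
class plain_queue {
public:
    typedef Element value_type;
    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        buf_.push_back(e);
    }
    queue_op_status try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.empty()) return queue_op_status::empty;
        out = std::move(buf_.front());
        buf_.pop_front();
        return queue_op_status::success;
    }
private:
    std::mutex mtx_;
    std::deque<Element> buf_;
};

// Sketch of the wrapper: derives from queue_base and forwards each virtual
// call to the wrapped concrete queue.
template <typename Queue>
class queue_wrapper : public queue_base<typename Queue::value_type> {
public:
    typedef typename Queue::value_type value_type;
    explicit queue_wrapper(Queue& q) : queue_(&q) {}
    void push(const value_type& e) override { queue_->push(e); }
    queue_op_status try_pop(value_type& out) override { return queue_->try_pop(out); }
private:
    Queue* queue_;
};

// Depends only on queue_base<int>; knows nothing of the concrete queue type.
inline void seq_fill(int count, queue_base<int>& q) {
    for (int i = 0; i < count; ++i) q.push(i);
}
```

Code such as seq_fill can live in a separately compiled library, at the cost of one virtual call per queue operation.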
Long running servers may have the need to reconfigure the relationship between queues and threads. The ability to pass 'ends' of queues between threads with automatic memory management eases programming.
To this end, we provide
shared_queue_front and
shared_queue_back template classes.
These act as reference-counted versions
of the queue_front and
queue_back template classes.
These shared versions
act on a queue_counted class template,
which is a counted version of queue_base.
The concrete counted queues are
the queue_owner class template,
which takes ownership of a raw pointer to a queue,
and the queue_object class template,
which constructs an instance of the implementation queue within itself.
Both class templates are derived from queue_counted.
queue_owner<buffer_queue<int>> own( new buffer_queue<int>(10, "own") );
seq_fill( 10, own.back() );
queue_object<buffer_queue<int>> obj( 10, "own" );
seq_fill( 10, obj.back() );
The share_queue_ends(Args ... args) template function
will provide a pair of
shared_queue_front and shared_queue_back
to a dynamically allocated queue_object instance
containing an instance of the specified implementation queue.
When the last of these fronts and backs are deleted,
the queue itself will be deleted.
Also, when the last of the fronts or the last of the backs is deleted,
the queue will be closed.
auto x = share_queue_ends<buffer_queue<int>>( 10, "shared" );
shared_queue_front<int> f(x.first);
shared_queue_back<int> b(x.second);
b.push(3);
assert(3 == f.value_pop());
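The closing rule for shared ends could be approximated with std::shared_ptr, as in the sketch below: each side shares one token whose destructor closes the queue, and the queue itself is freed when both tokens are gone. The names side_token, counted_front, counted_back, share_ends_sketch, mini_queue, and closes_when_backs_die are hypothetical, and this is not how queue_counted is specified.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <utility>

// Assumed shape of the status type; only the enumerator names are from the text.
enum class queue_op_status { success, empty, full, closed, busy };

// Stand-in queue with the few operations the handles need.
template <typename Element>
class mini_queue {
public:
    typedef Element value_type;
    void push(const Element& e) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (closed_) throw queue_op_status::closed;
        buf_.push_back(e);
    }
    queue_op_status try_pop(Element& out) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (buf_.empty())
            return closed_ ? queue_op_status::closed : queue_op_status::empty;
        out = std::move(buf_.front());
        buf_.pop_front();
        return queue_op_status::success;
    }
    void close() { std::lock_guard<std::mutex> lock(mtx_); closed_ = true; }
    bool is_closed() { std::lock_guard<std::mutex> lock(mtx_); return closed_; }
private:
    std::mutex mtx_;
    std::deque<Element> buf_;
    bool closed_ = false;
};

// One token per side; destroying the last handle of a side closes the queue.
template <typename Queue>
struct side_token {
    explicit side_token(std::shared_ptr<Queue> q) : queue(std::move(q)) {}
    ~side_token() { queue->close(); }
    std::shared_ptr<Queue> queue;
};

template <typename Queue>
class counted_back {
public:
    explicit counted_back(std::shared_ptr<side_token<Queue>> t) : token_(std::move(t)) {}
    void push(const typename Queue::value_type& e) { token_->queue->push(e); }
private:
    std::shared_ptr<side_token<Queue>> token_;
};

template <typename Queue>
class counted_front {
public:
    explicit counted_front(std::shared_ptr<side_token<Queue>> t) : token_(std::move(t)) {}
    queue_op_status try_pop(typename Queue::value_type& out) {
        return token_->queue->try_pop(out);
    }
    bool is_closed() { return token_->queue->is_closed(); }
private:
    std::shared_ptr<side_token<Queue>> token_;
};

// Builds the queue and returns one front handle and one back handle.
template <typename Queue, typename... Args>
std::pair<counted_front<Queue>, counted_back<Queue>> share_ends_sketch(Args&&... args) {
    auto q = std::make_shared<Queue>(std::forward<Args>(args)...);
    return std::make_pair(
        counted_front<Queue>(std::make_shared<side_token<Queue>>(q)),
        counted_back<Queue>(std::make_shared<side_token<Queue>>(q)));
}

// Returns true if dropping the last back closes the queue but keeps its data.
inline bool closes_when_backs_die() {
    auto ends = share_ends_sketch<mini_queue<int>>();
    counted_front<mini_queue<int>> front = ends.first;
    {
        counted_back<mini_queue<int>> back = std::move(ends.second);
        back.push(3);
        if (front.is_closed()) return false;
    }  // the only live back handle is destroyed here, closing the queue
    int v = 0;
    return front.is_closed()
        && front.try_pop(v) == queue_op_status::success && v == 3;
}
```

Consumers holding a front therefore learn that no producer remains without any explicit close call, and elements pushed before the close can still be popped.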
A free, open-source implementation of these interfaces
is available at the Google Concurrency Library project
at 
http://code.google.com/p/google-concurrency-library/.
The concrete buffer_queue is in
..../source/browse/include/buffer_queue.h.
The concrete lock_free_buffer_queue is in
..../source/browse/include/lock_free_buffer_queue.h.
The corresponding implementation of the conceptual tools is in
..../source/browse/include/queue_base.h.
This paper revises N3434 = 12-0043 - 2012-01-14 as follows.
Add more exposition.
Provide separate non-blocking operations.
Add a section on the lock-free queues.
Argue against push-back operations.
Add a cautionary note
on the usefulness of is_closed().
Expand the cautionary note
on the usefulness of is_empty().
Add is_full().
Add a subsection on element type requirements.
Add a subsection on exception handling.
Clarify ordering constraints on the interface.
Add a subsection on a lock-free concrete queue.
Add a section on content iterators, distinct from the existing streaming iterators section.
Swap front and back names, as requested.
General expository cleanup.
Add a 'Revision History' section.
N3434 revised N3353 = 12-0043 - 2012-01-14 as follows.
Change the inheritance-based interface to a pure conceptual interface.
Put 'try' operations into a separate subsection.
Add a subsection on non-blocking operations.
Add a subsection on push-back operations.
Add a subsection on queue ordering.
Merge the 'Binary Interface' and 'Managed Indirection' sections into a new 'Conceptual Tools' section. Expand on the topics and their rationale.
Add a subsection to 'Conceptual Tools' that provides for type erasure.
Remove the 'Synopsis' section.
Add an 'Implementation' section.