An Introduction to Libaio
William K. Josephson
wkj@CS.Princeton.EDU
Princeton University
Princeton, NJ 08544
ABSTRACT
This document is a short introduction to
the libaio event-driven programming libraries, the
9unix libraries upon which they depend, and config,
the basis of the libaio build system. The reader is
expected to be reasonably well versed with the UNIX operating
system, make, C and C++, and concurrent programming.
1.
Introduction
This document is a short introduction to the libaio event-driven
programming libraries, the 9unix libraries upon which they depend,
and config, the basis of the libaio build system. The
9unix libraries provide a consistent, portable, Unix-like programming
interface inspired by Plan 9, while config provides a flexible,
configurable build system, and libaio provides a C++-based event-driven
infrastructure intended for modern chip multiprocessors (CMPs). The interface to libaio, although
similar to some existing event-driven libraries, is new, relatively
untried, and still a work in progress. The author is less than fond of
continuation-passing style (aka: event-driven style) as a programming
model for end users and hopes eventually to develop a source-to-source
compiler in the spirit of Tame [10] to ease the task of programming with
libaio. Unlike the SMP version of libasync [6], libaio makes concurrency
explicit through barriers and locks and sports a variety of more
sophisticated schedulers.
Documentation and the source for the most recently released versions of
9unix, libaio, and config may be obtained from the author’s web
page:
http://www.morphisms.net/~wkj/software
2.
The 9unix libraries
The 9unix libraries provide a consistent, system independent, Unix-like
interface across multiple platforms. The basic interface was inspired
by Plan 9 and much of the code has either come directly from Plan 9 or
come via Plan 9 Ports [5], although early work on 9unix significantly
predated Plan 9 Ports.
9unix also provides a set of standard make files for program maintenance
and scripts to support automated linking on a variety of platforms.
2.1.
Building Programs with 9unix
9unix uses BSD make for program maintenance. The Makefiles distributed in the
mk subdirectory are responsible for computing dependencies
and providing the rules for transforming program and documentation
source into executable code and formatted text.
An internal 9unix make file must, at a minimum, specify the root
of the build directory tree as a relative path in the root
variable and include Makefile.inc from the root. A Makefile
that is not part of 9unix itself sets p9root to point to
the root of the 9unix installation tree and includes the copy of
Makefile.inc residing in
$(p9root)/mk
instead. By convention, the path to the root of a 9unix installation
is exported to the environment via the P9UNIX environment variable.
Most users will want to set this variable in their shell’s initialization
scripts (~/.profile for Bourne shell users).
Most Makefiles will need to set the OFILES, TARG, and
LIB variables. The Makefiles in src/cmd and
src/libbio provide good working examples. The intrepid may
venture into mk/Makefile.inc for further details.
The 9unix makefiles set the variables ARCH, OS,
PICEXT, and SOEXT for the user. Operating system
specific configuration resides in
mk/Makefile.${OS},
architecture specific configuration in
mk/Makefile.${ARCH},
and general configuration, particularly for the compiler, resides in
mk/Makefile.cfg. Note that what some systems call x86-64
(i.e. IA-32 with 64-bit extensions), 9unix consistently calls
amd64 on all platforms; similarly, what some systems call
i386 or i686, 9unix consistently calls 386.
Furthermore, the MacOS X port on 386 and amd64 defaults
to amd64 mode even though current versions of the system do not
support 64-bit mode as fully as 32-bit mode.
The default user-callable make targets are:
∙
all: Build all programs, libraries, shared libraries, and mandatory
documentation. Optional documentation may require additional programs such
as groff or Plan 9 Ports and is not built by default.
∙
install: Execute the all target if necessary,
construct the necessary installation directory tree, and copy
programs, scripts, libraries, shared libraries, headers, macros,
and installed documentation into the installation tree.
∙
clean: Remove most intermediate files.
∙
nuke: Remove all generated files.
2.2.
Linking
9unix exports a C preprocessor macro AUTOLIB which inserts
a weak symbol into any source file containing this macro. This weak
symbol has a special form that can be extracted with the nm(1)
program at link time. The 9l shell script is responsible for
using these extracted symbols to build a list of library dependencies
and topologically sort it for linking. For the most part, 9l
knows enough about various systems to build both static and dynamic
shared libraries and to insert library search paths into binaries on
each of the supported systems. It is also responsible for enabling
any other platform-specific linker flags and linking against auxiliary
libraries as necessary (some platforms require programs using floating
point mathematics routines link against the libm.a math
library while others do not, for instance). For the gory details,
read the 9l source. Originally 9l was developed
simultaneously for 9unix and Plan 9 Ports; with the advent of
AUTOLIB and dynamic linking support, 9unix switched to a lightly
modified version of the Plan 9 Ports implementation.
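The ordering step described above can be illustrated with a depth-first topological sort. The following is a sketch only and is not taken from 9l, which is a shell script; the names DepGraph, visit, and link_order are hypothetical, as are any dependency edges fed to it. It assumes the traditional static-linking convention that a library must appear before the libraries it depends on.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: library name -> libraries it needs.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first visit: record a library only after everything it depends on.
static void visit(const DepGraph &g, const std::string &lib,
                  std::set<std::string> &done, std::set<std::string> &active,
                  std::vector<std::string> &post)
{
    if (done.count(lib))
        return;
    // Meeting a library that is still being visited means there is a cycle.
    assert(!active.count(lib) && "dependency cycle");
    active.insert(lib);
    auto it = g.find(lib);
    if (it != g.end())
        for (const std::string &dep : it->second)
            visit(g, dep, done, active, post);
    active.erase(lib);
    done.insert(lib);
    post.push_back(lib);
}

// Return the libraries in link order: each library precedes the libraries
// it depends on.
std::vector<std::string> link_order(const DepGraph &g,
                                    const std::vector<std::string> &roots)
{
    std::set<std::string> done, active;
    std::vector<std::string> post;
    for (const std::string &r : roots)
        visit(g, r, done, active, post);
    // Reverse post-order puts dependents before their dependencies.
    return std::vector<std::string>(post.rbegin(), post.rend());
}
```

Given a made-up graph in which libbio needs libfmt, libfmt needs libutf, and all three need lib9c, link_order(g, {"libbio"}) yields libbio, libfmt, libutf, lib9c.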
2.3.
lib9c
Lib9c provides the core portability layer. It is similar in
spirit to Plan 9 Ports’ lib9 but eschews emulation of
many Plan 9 specific features in favor of a more Unix-like interface.
Whereas Plan 9 Ports is primarily for porting Plan 9 software to Unix,
9unix has been used primarily for new software.
The following is a list of noteworthy functions and macros available
in 9unix:
queue(3):
queue(3)
contains a recent version of the BSD list macros from FreeBSD.
These macros implement various singly- and doubly-linked lists.
tree(3):
tree(3)
contains a recent version of the BSD tree macros from FreeBSD.
These macros implement red-black and splay trees.
ARGBEGIN, ARGEND: Macros for argument parsing; see
arg(3)
for details.
dial, announce, listen:
dial(3)
provides a sane interface to the Berkeley socket API; should be used
instead of the raw socket API in new programs. libaio provides an
asynchronous implementation1.
open and opentemp: 9unix overrides the open(2) function
with
open(3)
and implements the
opentemp(3)
function for safely creating temporary files.
errstr and werrstr: Retrieve and set, respectively, the
current error string. This is the textual equivalent of Unix’s
errno variable, but is human readable and may be set to an
arbitrary string by applications. The %r format verb takes
no arguments and formats the current value of the error string.
sysfatal: signal abnormal program termination,
printing an error message.
wait, waitnohang, waitfor, etc.:
wait(3)
is the Plan 9 equivalent of the Unix wait(2) family of functions,
but with a more pleasant interface.
quote*, unquote*, etc.:
quote(3)
provides routines for producing quoted strings which are
often convenient in configuration files, environment variables, and
simple text-based network protocols. The unquote* routines
evaluate the quotes.
tokenize and getfields:
getfields(3)
parses simply delimited strings, including quote(3) delimited
strings.
gmtime, localtime, etc.: 9unix overrides the standard
Unix time related functions. See
ctime(3)
and
time(3)
for details.
nsec: return the number of nanoseconds since the Unix epoch.
dirread, dirreadall:
dirread(3)
provides a simplified, portable interface for reading directory entries.
sendfd, recvfd:
sendfd(3)
provides a sane interface to file descriptor passing over Unix domain
sockets.
USED: A macro that when applied to a variable indicates to the
compiler that it should be treated as if it were used.
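The error-string convention in the errstr and werrstr entry above can be illustrated in a few lines of C++. This is a sketch of the idea only, not the 9unix interface: the namespace sketch and the functions set_error, last_error, and parse_port are hypothetical, and keeping one string per thread is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// One error string per thread, playing the role that errno plays in Unix.
inline std::string &slot()
{
    thread_local std::string err;
    return err;
}

// Set the current error string (the role the text assigns to werrstr).
inline void set_error(const std::string &msg) { slot() = msg; }

// Retrieve the current error string (the role the text assigns to errstr).
inline const std::string &last_error() { return slot(); }

// A hypothetical operation that reports failure through the error string.
inline int parse_port(const std::string &s)
{
    if (s.empty() || s.size() > 5) {
        set_error("bad port '" + s + "': wrong length");
        return -1;
    }
    int port = 0;
    for (char c : s) {
        if (c < '0' || c > '9') {
            set_error("bad port '" + s + "': not a number");
            return -1;
        }
        port = port * 10 + (c - '0');
    }
    if (port > 65535) {
        set_error("bad port '" + s + "': out of range");
        return -1;
    }
    return port;
}

} // namespace sketch
```

A caller checks the return value first and consults last_error() only on failure, much as one would consult errno. The string carries the detail that a bare error number cannot.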
2.4.
libutf, libfmt, and libbio
9unix uses the Plan 9 formatted printing and Unicode
libraries. The interfaces to these libraries are
documented in
print(3),
fmtinstall(3),
utf(7),
and
rune(3).
These interfaces support Unicode (UTF-8) natively
and provide for user-specified printing verbs, unlike
stdio(3); new software using 9unix must not use
stdio. For buffered I/O, including formatted,
buffered printing, use
bio(3).
2.5.
A Potpourri of Further Resources
∙
libbin: A binned allocator; useful for small objects.
See
bin(3)
for details.
∙
libxbio: An extensible version of
bio(3)
that allows the programmer to install function pointers for read,
write, and seek operations.
∙
libencode: simple-minded endian-aware serialization primitives;
often used for wire protocols or stable storage (e.g. Berkeley DB
key/value pairs).
∙
libip: Various useful network routines, particularly a portable
readipifc, udpread/udpwrite, and formatting verbs for IP
and Ethernet addresses and functions to parse them. See
ip(3)
for details.
∙
libflate: The deflate algorithm used by gzip, among others;
see
flate(3)
for details.
∙
libString: Reference counted strings for C; probably not
so useful to libaio users (see below).
∙
libmangle: A library interface to the standard C++ demangling
algorithms; see also the 9unix demangle command.
∙
libmempool: Arena and pool allocators that use sbrk and
mmap; useful if you want a file-backed heap.
∙
libmux: A protocol message multiplexor; see
mux(3)
for details.
∙
libobjload: Some basic portable run-time linker routines;
incomplete, but still useful if you need to run on a number of Unix-like
systems.
∙
libregexp9:
regexp(3)
is an efficient implementation of
regexp(7)
regular expressions. The Plan 9 implementation is typically far
more robust than Perl, Python, Ruby, and PCRE implementations. See
Russ Cox’s recent essay on the subject [4].
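As an illustration of the kind of endian-aware serialization primitive attributed to libencode in the list above, the following sketch stores and loads a 32-bit value in big-endian (network) byte order regardless of the host's byte order. The names put_be32 and get_be32 are hypothetical and are not claimed to match the libencode interface.

```cpp
#include <cassert>
#include <cstdint>

// Store a 32-bit value most-significant byte first, whatever the host order.
inline void put_be32(unsigned char *buf, std::uint32_t v)
{
    buf[0] = static_cast<unsigned char>(v >> 24);
    buf[1] = static_cast<unsigned char>(v >> 16);
    buf[2] = static_cast<unsigned char>(v >> 8);
    buf[3] = static_cast<unsigned char>(v);
}

// Load a 32-bit value that was stored most-significant byte first.
inline std::uint32_t get_be32(const unsigned char *buf)
{
    return (static_cast<std::uint32_t>(buf[0]) << 24) |
           (static_cast<std::uint32_t>(buf[1]) << 16) |
           (static_cast<std::uint32_t>(buf[2]) << 8) |
            static_cast<std::uint32_t>(buf[3]);
}
```

Because the routines shift rather than cast pointers, they behave identically on little- and big-endian hosts and impose no alignment requirement on the buffer.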
There are a few libraries in 9unix that remain mostly for backwards
compatibility and which should not be used in new software. These
include:
∙
tls: A simple wrapper around OpenSSL; not as portable as
one might like as recent MacOS X versions don’t ship with 64-bit
OpenSSL. It also includes a select(2) state machine and
therefore is not suitable for use with libaio.
∙
misc: A collection of useful, but now-obsolete routines;
fair game for poaching.
Some useful programs:
∙
lex: A port of Plan 9 lex; useful for legacy
applications, but it is often better to write your own lexical
analyser.
∙
yacc: A port of Plan 9 yacc; used by rpcc,
among others.
∙
demangle: A shell utility that identifies mangled C++
symbols in its input and demangles them.
∙
fn.awk and cfn.pl: Two utilities for extracting
prototypes from properly formatted C code.
3.
Config
config is a program for software configuration and build
management inspired by the BSD kernel configuration program of the
same name [12].
It takes a project description, or configuration file, and a number
of additional files describing individual sources, modules, libraries,
and programs, and automatically constructs a build environment and
Makefile tailored for the target architecture, operating system, and
configuration.
To build config, you will need to have a copy of a BSD-compatible
make and 9unix installed. With these prerequisites, all that should be
necessary is to unpack the source and type make; make install;
don’t forget to have the P9UNIX environment variable set
appropriately.
To construct a default configuration for libaio, unpack the source
distribution and type make config. The build environment
will be constructed in a sub-directory of the compile
directory named after the configuration, operating system, and
target architecture. A default build may be initiated by walking
into this directory and invoking make.
3.1.
Configuration Specifications
At the top level, the following directives are permitted:
ident:
A string name identifying the configuration.
arch:
Explicitly set the target architecture.
includeconfig:
Causes config to embed a copy of the root configuration
(but not any portions incorporated via the include mechanism) in
config.c, which is linked with every program. The configuration
may be retrieved from the binary with the command:
strings -n 4 program | grep '____' | sed -e 's;^____;;'
makeoption:
Passes a variable definition through to the generated Makefile.
A typical use is to set the value of p9root or to turn
on additional compiler or debugging flags:
makeoption p9root ${P9UNIX}
param:
Takes one or two arguments. The first argument is the name of a
macro to define in the generated config.h. The second,
optional argument, is the value of the macro. If no value is
given, it defaults to 1.
option:
Takes one or two arguments. The first argument is the name of an
option to enable. The second, optional argument, is the value of
the option. If no value is given, it defaults to 1. When applied
at the top level, the directive defines and enables an option. When
attached to another directive, it enables that directive if and
only if the option is enabled and its value matches the second
argument.
include:
The include directive takes a single argument for inclusion
in the configuration and further processing. It is subject to variable
and automatic expansion; see below for details.
The following directives are permitted in included files:
include:
See above.
option:
May be attached to a source, module, library, or program, which is
then enabled and compiled only if the option is enabled.
module:
Declares the sources and headers belonging to a module.
library:
Declares the sources and modules required to build a library.
program:
Declares the sources, modules, and libraries required to build a program.
Other options, which are roughly analogous to those in BSD’s
config(8), are listed below. See config’s source
and the libaio configuration directory for documentation and
examples.
no-obj:
No object files are generated for this source.
no-implicit-rule:
Do not emit implicit rules for this target.
before-dep:
Indicates that a target must be built before the dependency phase.
depend:
Explicit list of files upon which the source depends.
compile-with:
The command to use to generate the target from the source.
link-with:
Use specified command for linking; applies to programs only.
clean:
List of files for the clean target to remove.
nuke:
List of files for the nuke target to remove.
The ARCH and OS make variables are filled in
automatically from the values returned by uname unless
overridden by the configuration itself or by the appropriate command
line switches.
Some directives can occur only at the top level. These directives
lack the IncOK flag in parse.c. The utility of
this is unclear and the number of such directives is shrinking.
Include file names are subject to variable expansion. Variable
expansion occurs in expandvars() in misc.c.
The option directive tags a source, module, or library. If
the option is enabled, any source, module, or library it tags is
built; otherwise it is not. The value of the option is made available
via config.h.
The module directive applied to a library indicates that the
library depends upon the specified module. It acts in this context
as a sort of ‘‘include’’ directive that includes the sources
(including headers) contained in the module. If the module is
optional and not enabled, the directive is ignored.
The module directive applied to a source (including headers)
indicates that the source is a member of the module. A source may
belong to at most one module. A source is optional either if it
is explicitly marked as such or if it is a member of a module marked
as such. An optional source is built only if all the corresponding
options (source and module) are enabled.
Include directives invoke autoinc(). If ‘‘file’’ is
included, then ‘‘file.${OS}’’ is attempted as well,
followed by ‘‘file.${ARCH}’’. If a file has already been
included, either explicitly or implicitly, it is not included again.
Explicit inclusion of a file multiple times is permitted, but is
almost certainly an error. More confusing still, the key used to
prevent multiple inclusion is the base name of the file, not any
prefix introduced by the include path. The alternative seems more
surprising still.
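The inclusion rule just described (try the file, then its OS-specific and architecture-specific variants, and suppress repeats by base name) can be sketched as follows. This is not the code of autoinc(); the functions base_name and expand_include are hypothetical, and the sketch does not check whether a candidate file exists, which a real implementation presumably would.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Strip any directory prefix: "conf/files" -> "files".
inline std::string base_name(const std::string &path)
{
    std::string::size_type slash = path.rfind('/');
    return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Return the files a single include directive would try, in order,
// skipping any whose base name has been seen before.
inline std::vector<std::string> expand_include(const std::string &file,
                                               const std::string &os,
                                               const std::string &arch,
                                               std::set<std::string> &seen)
{
    assert(!file.empty());
    std::vector<std::string> out;
    const std::string candidates[] = {file, file + "." + os, file + "." + arch};
    for (const std::string &c : candidates)
        if (seen.insert(base_name(c)).second)  // true only on first sighting
            out.push_back(c);
    return out;
}
```

With a made-up OS name of linux and the architecture amd64, including conf/files yields conf/files, conf/files.linux, and conf/files.amd64. A later include of other/files yields nothing, because the key is the base name rather than the full path. That is the surprising behavior the text warns about.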
3.2.
Config and Makefiles
Config does not use the 9unix build system’s makefiles once
it has been built and installed. Rather, it has a separate set in
the 9unix installation tree under the conf subdirectory.
Makefile.${ARCH} contains the per-architecture
makefile template used by config. The remaining makefiles,
config.*, are used by the makefiles generated from the
templates to drive the build. You may need to edit some of these
files if you want to change the compiler options. Note that exporting
DEBUG=1 into the environment is sufficient to get a debug
build. Further changes will require editing, but beware that the
installed makefiles are shared by all configurations and builds.
4.
The libaio libraries
Unlike 9unix, libaio is written in C++ as opposed to (almost)
pure C99. The two primary motivations for this are described
further below. Briefly, heavy use is made of templates to
construct reference-counted closures. The closures and the
automatic reference counting both depend upon sugar provided
by C++. The libraries also make some use of templates and
classes to provide polymorphic implementations of a number of
standard data-structures as well as an asynchronous DNS resolver.
The public libaio header files may be found in include/aio.
The private libaio header files in include/aioimpl may be
included indirectly via the public ones but should not be
included by user applications. In general, there is a separate
header file for each ‘‘module’’ and the interface to the module
is documented by a comment at the top of the file. When in doubt,
look there for documentation.
4.1.
A Note on C++
Why not C++?
∙
Performance: BWK’s cenk (or: I’m OK, STL’s not so hot),
exceptions, others?
∙
Code bloat: huge binaries are the norm and there is a
tendency to over-inline with static inline member functions,
causing unnecessary I-cache pressure.
∙
Language features:
In no particular order:
1. Operator new is a botch -- even malloc and free are
better [13].
2. Exceptions are expensive -- really expensive.
3. Constructors and destructors can’t return errors without exceptions.
4. Copy constructor and assignment operator problems are frequent and slicing
is usually detected only at run time and often manifests itself
as bizarre behavior.
5. There are often problems forcing initialization/call order for
constructors and destructors of static instances.
6. No modules whatever, which is particularly bad in the presence of
inline functions and templates.
7. Template instantiation is a mess.
8. Inscrutable syntax, especially operator overloading and templates.
9. The language changes far too quickly and so the half-life for new
software is short.
4.2.
C++ Utilities
libaio provides a number of additions to the standard C++ environment.
1. An array type that wraps C-style arrays is provided since
C-style arrays interact poorly with C++ templates. See the comments
in <aioimpl/array.h> for details.
2. C++ templates can be abused to implement compile-time optimizations.
The <aioimpl/type.h> header contains a number of templates for
reflection and type manipulation at compile time. Some of the libaio
data structures use these facilities to improve performance.
3. The Equals and Compare templates in <aioimpl/equals.h>
provide standard interfaces for comparison objects used by the hash
table and tree implementations as well as implementations for
primitive types. Several other classes, including the String
class extend these interfaces.
4.2.1.
Reference Counting
XXX: somewhat complicated; will document implementation later.
Given a type T, ref<T> is a non-nullable reference
counted T. Similarly, ptr<T> is a nullable reference
counted T. Reference counted objects are allocated like
so:
ref<T> poot = New refcounted<T>();
Note that a refcounted<T> can only take a limited number of
constructor arguments since it uses automatically generated template
code. The template code is generated with bin/va.awk and
by default supports up to eight arguments to the constructor.
Passing a ref or ptr as a function argument appears
to be roughly comparable in cost to passing an ordinary argument
(within ten to twenty percent), assuming exceptions are not enabled.
If a class has a finalize method and has refcount
as a virtual base class, then each time the reference count on the
object reaches zero, the finalization method will be called instead
of deleting the object and calling its destructor. You can also
manipulate the reference counts on an object directly if refcount
is a public virtual base class. To prevent this and the use of
mkref to create refs or ptrs from vanilla
pointers, make the virtual base class private.
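The general mechanism behind reference-counted handles can be sketched independently of libaio. The code below is an illustration only and is not the libaio implementation: the names Counted, Handle, and Tracked are hypothetical. The sketch models a nullable handle (the role the text gives to ptr<T>), but it does not model the non-nullable ref<T>, the finalize hook, or the New refcounted<T>() allocation syntax, and the count is not atomic.

```cpp
#include <cassert>
#include <utility>

namespace sketch {

// Base class holding the count; objects start with no references.
class Counted {
public:
    Counted() : refs_(0) {}
    virtual ~Counted() {}
    void incref() { ++refs_; }
    // Returns true when the last reference has just been dropped.
    bool decref() { assert(refs_ > 0); return --refs_ == 0; }
    long refs() const { return refs_; }
private:
    long refs_;   // not atomic: this sketch is single-threaded
};

// Nullable counted handle: copying adds a reference, destruction drops one.
template <typename T>
class Handle {
public:
    Handle() : p_(nullptr) {}
    explicit Handle(T *p) : p_(p) { if (p_) p_->incref(); }
    Handle(const Handle &o) : p_(o.p_) { if (p_) p_->incref(); }
    Handle &operator=(Handle o) { std::swap(p_, o.p_); return *this; }
    ~Handle() { if (p_ && p_->decref()) delete p_; }
    T *operator->() const { assert(p_); return p_; }
    T &operator*() const { assert(p_); return *p_; }
    explicit operator bool() const { return p_ != nullptr; }
private:
    T *p_;
};

// Hypothetical payload that records when it is destroyed.
struct Tracked : Counted {
    explicit Tracked(int *destroyed) : destroyed_(destroyed) {}
    ~Tracked() { ++*destroyed_; }
    int *destroyed_;
};

} // namespace sketch
```

Copying a Handle raises the count and destroying one lowers it; the Tracked object is deleted exactly once, when the last Handle referring to it goes out of scope.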
4.2.2.
Closures (Curry)
The curry function and Curry template classes provide
syntax and semantics analogous to function currying in modern
programming languages. A Curry<R, T1, T2, ..., Tn> object
represents a closure for a function of n arguments of types
T1 through Tn returning a value of type R.
Unused right-most arguments may be omitted from the template. As
with reference counted objects, the number of arguments is fixed
at configuration time in bin/curry.awk.
Suppose we have a function R fn(T1, T2, T3). Then:
Curry objects have operator() and
apply() functions that take any remaining arguments
and evaluate the closure with the given arguments.
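The general idea of binding some arguments now and supplying the rest later can be sketched with a C++14 generic lambda. This is not libaio's curry, whose template machinery is generated by bin/curry.awk and whose argument-binding order may differ; the names bind_first and volume are hypothetical.

```cpp
#include <cassert>

// Bind the first argument of f, returning a callable that accepts the rest.
// Both f and a are captured by value, so the closure owns its own copies.
template <typename F, typename A>
auto bind_first(F f, A a)
{
    return [f, a](auto... rest) { return f(a, rest...); };
}

// A hypothetical three-argument function to close over.
inline int volume(int w, int h, int d) { return w * h * d; }
```

With these definitions, bind_first(volume, 2) is a closure awaiting two arguments, and applying bind_first to that closure again leaves one argument outstanding. Capturing by value is what makes repeated evaluation safe here. A closure holding a raw reference or pointer to a heap object would face exactly the lifetime hazard described in the paragraph that follows.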
Beware passing references to heap allocated objects to curry
since the resulting Curry object may be evaluated multiple
times in the future. In typical usage, heap allocated objects that
are passed to curry are reference counted to avoid the
potential for memory corruption.
Some additional noteworthy features of the currying implementation are:
1. Notice that currying is a right-to-left operation.
2. Unfortunately, in the current implementation, repeated application of
curry may require the definition of trivial functions to forward
arguments.
3. curry_func is similar to curry, but the function
to apply is the last argument.
4. curry_func_placement is to curry what placement new
is to ordinary new. A Curry object is recycled rather than
freshly allocated. This may be beneficial in performance critical
applications but should be avoided in general as it is not type safe: the
programmer asserts that the target object is of the correct type and size.
5. curry_func_placement_whack takes curry_func_placement
a step further, allowing the programmer to change only the function
inside a Curry object. Note that the implementation requires
variadic macros, which require a C99 preprocessor. In general,
whacking closures is a bad idea and should be treated as a mere
novelty.
4.2.3.
Memory Allocation
All C++-style memory allocation should use the macro New
and not ordinary new. New expands to code
that permits limited memory debugging using libdmalloc (the
design of C++ is fundamentally crippled when it comes to memory
allocation).
All C-style memory allocation should use aio_malloc,
aio_free, and aio_realloc to facilitate debugging
and statistics gathering.
For pooled allocation, consider the allocators in <aioimpl/pool.h>.
1. SimplePool: A fixed-size pool of bytes. Explicit deallocation
is not supported and neither constructors nor destructors for individual
objects are run, even when the entire pool is destroyed.
2. ExtensibleSimplePool: A set of SimplePools that expands on demand.
3. ChunkPool: A fixed-size pool of objects of a single type, T.
ChunkPools support explicit deallocation.
4. ImmutablePool: A fixed-size pool of objects of a single type,
T. Explicit deallocation of individual objects is not possible,
but destructors are run when the entire pool is destroyed.
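The behavior described for the simplest of these pools (a fixed-size region of bytes, no per-object deallocation, no constructors or destructors) corresponds to what is commonly called a bump allocator. The sketch below illustrates that technique under those assumptions; the class BumpPool and its members are hypothetical and are not the interface of <aioimpl/pool.h>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// A fixed-size pool of bytes: allocation bumps an offset, nothing is freed
// individually, and no constructors or destructors are ever run.
class BumpPool {
public:
    explicit BumpPool(std::size_t size)
        : base_(static_cast<unsigned char *>(std::malloc(size))),
          size_(base_ ? size : 0), used_(0) {}
    ~BumpPool() { std::free(base_); }   // releases the bytes only

    BumpPool(const BumpPool &) = delete;
    BumpPool &operator=(const BumpPool &) = delete;

    // Return n bytes aligned to align, or nullptr if the pool is exhausted.
    // align must be a power of two no larger than alignof(std::max_align_t),
    // since that is all malloc guarantees for the base address.
    void *alloc(std::size_t n, std::size_t align = alignof(std::max_align_t))
    {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > size_ || n > size_ - start)
            return nullptr;
        used_ = start + n;
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    unsigned char *base_;
    std::size_t size_;
    std::size_t used_;
};
```

Allocation costs a few arithmetic operations and the whole pool is released at once. The price is that individual objects can never be returned, which is the trade-off the list above draws between the pool variants.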
4.3.
Algorithms and Data Structures
libaio provides a variety of standard algorithms and data structures.
In general, we eschew the STL for a variety of reasons, including
portability and reproducible, consistent performance.
The standard data structures that are available are:
1. Bit sets: Bit sets with platform-specific optimizations.
See the comments at the top of <aioimpl/bitset.h> for details.
2. Heaps: A polymorphic heap implementation is provided. See
the comments at the top of <aioimpl/heap.h> for details.
3. Priority Queues: PriorityQueue extends Heap
with push, top, and pop operations. See
<aioimpl/queue.h> for details.
4. Red-Black Trees: A polymorphic red-black tree implementation
very similar to that of
tree(3).
See <aioimpl/rbtree.h> for details.
5. Hash Tables and Maps: <aioimpl/hash.h> contains
implementations of polymorphic, transparently resizing, chaining
hash tables: HashTables, HashMaps, and BoolMaps
(a specialization of HashMap for boolean valued entries).
The comments at the top of the header file describe the interfaces
(notice that all implementations inherit HashTableCore’s
interface). The chain links are embedded in values in HashTables
whereas a separate bucket is allocated for HashMaps.
6. Linked Lists and Tail-Queues: Similar to
queue(3)
but polymorphism is achieved using C++ templates.
See the comments at the top of <aioimpl/list.h> for details.
7. Vectors and stacks: More or less what one would expect;
see the comments at the top of <aioimpl/vector.h> for details.
libaio provides several other data structures useful for systems
programming in particular:
8. Strings: Strings are reference counted and immutable. Both
Equals and Compare are defined for Strings.
See <aioimpl/string.h> for details. A mutable version and
an implementation of ropes are planned.
9. Retry: Retry is a class that implements a generic
interface for periodic ‘‘retry’’ (e.g. for retransmission).
The Retry class implements start, retry,
and timeout, which add a new RetryEntry, force an
entry to ‘‘keep trying’’, and remove an entry, respectively.
RetryEntry is a template for a field embedded in another
class that stores the state necessary for Retry. Implementation
details may be found in <aio/retry.h> and an example usage
in src/test/retry.C and the asynchronous DNS library.
10. IOVec: IOVec is the libaio equivalent of Unix’s struct
iovec for scatter-gather I/O. Unlike the Unix variant, however, it
provides some automatic memory management facilities. IOVec is
used by the RPC implementation and some user applications. The interface
is documented in the comments at the top of <aioimpl/iovec.h>.
We expect to continue adding to the repertoire of algorithms and
data structures. Although the efficient low-level primitives
already exist for various types of Bloom Filters, we have not yet
implemented the higher level abstraction. We also expect to implement
Radix trees (aka: Patricia trees or crit-bit trees) as replacements for
hash tables in situations where randomization is unacceptable, for
instance where hash table attacks are a legitimate worry. We also
plan to implement ARC [14] and possibly radix sort, suffix trees, and
Google-style sparse hashes.
Finally, note that
queue(3)
and
tree(3)
are still available to programs written using libaio and may be better
alternatives in some cases than the equivalent C++ implementations,
particularly when constructors for statically initialized objects
are involved.
4.3.1.
Cryptography
XXX: not documented yet.
1. Cryptographic hashes (aka: Sechash): MD5, SHA1, SHA256, SHA384, SHA512
2. Random numbers: ARC4 PRG; TODO: SHA PRGs, AES PRG, non-uniform variates.
3. Symmetric encryption: AES in CBC and CTR modes; probably also want DES/3DES.
4. Secret sharing.
5. Possibly RSA/DH, but padding/timing attacks.
6. Possibly homomorphic encryption such as Paillier; private matching
(Freedman, EUROCRYPT’04); maybe Schnorr; some of the EC variants?
7. No SSL/TLS
4.4.
Threads
The following routines are used to synchronize threads of control
running in shared memory. Locks are typically spin locks, although
depending upon the platform they may be blocking locks. QLocks are
queueing locks and RWLocks are queueing locks that support multiple
readers.
Lock blocks until the lock has been obtained. Canlock is non-blocking.
It tries to obtain a lock and returns a non-zero value if it was
successful and zero otherwise. Unlock releases a lock.
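The lock, canlock, and unlock semantics just described can be illustrated with a spin lock built on the standard C++ atomics. This is a sketch of those semantics only, not the libaio implementation: everything lives in a hypothetical namespace sketch, and the lockinit/lockfree registration with the performance monitoring subsystem is not modeled.

```cpp
#include <atomic>
#include <cassert>

namespace sketch {

struct Lock {
    std::atomic<bool> held{false};
};

// Try once: non-zero if the lock was obtained, zero if it is already held.
inline int canlock(Lock *l)
{
    assert(l != nullptr);
    return !l->held.exchange(true, std::memory_order_acquire);
}

// Spin until the lock is obtained.
inline void lock(Lock *l)
{
    while (!canlock(l))
        ;  // busy-wait; suitable only for very short critical sections
}

// Release the lock so that another caller may obtain it.
inline void unlock(Lock *l)
{
    assert(l != nullptr);
    l->held.store(false, std::memory_order_release);
}

} // namespace sketch
```

The acquire ordering on the exchange and the release ordering on the store ensure that writes made inside the critical section are visible to the next holder of the lock.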
QLocks have the same interface but are not spin locks; if the lock
cannot be acquired, the caller is suspended until the lock is
released.
RWLocks manage access to a data structure that has distinct readers
and writers. There may be any number of simultaneous readers, but
only one writer. Moreover, if write access is granted no one may
have read access until write access is released.
All types of locks must be initialized with the appropriate *lockinit
function and should be deallocated with the corresponding *lockfree
function. Failure to free a lock results in resource leaks.
Rendezes are rendezvous points. Each Rendez, r, is protected
by a QLock, r->l, which must be held by the callers of rsleep,
rwakeup, and rwakeupall. Rsleep atomically releases the lock and
suspends the caller. Upon resumption, the caller holds the lock.
Rwakeup wakes up a single thread sleeping on the Rendez, if there
are any. Rwakeupall wakes up all threads sleeping on a Rendez.
Neither form of wakeup releases the lock associated with the Rendez
nor do they block.
A Ref contains a long that can be incremented and decremented
atomically with incref and decref, respectively. Decref returns
zero if the resulting value is zero and non-zero otherwise.
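The incref and decref contract described above can be sketched with std::atomic. The code is an illustration under stated assumptions and is not the libaio definition: the namespace sketch, the field name ref, and the long return type of decref are all assumptions of the sketch.

```cpp
#include <atomic>
#include <cassert>

namespace sketch {

struct Ref {
    std::atomic<long> ref{0};
};

// Atomically add one reference.
inline void incref(Ref *r)
{
    r->ref.fetch_add(1, std::memory_order_relaxed);
}

// Atomically drop one reference and return the resulting count, which is
// zero exactly when the last reference has been dropped.
inline long decref(Ref *r)
{
    long old = r->ref.fetch_sub(1, std::memory_order_acq_rel);
    assert(old > 0);
    return old - 1;
}

} // namespace sketch
```

A typical caller frees the shared object only when decref returns zero. Because the decrement and the test are a single atomic operation, two threads dropping the last two references cannot both observe zero.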
1. Ref: Atomic reference counts.
i. incref(Ref *r): increment the reference count r.
ii. decref(Ref *r): decrement the reference count r.
2. Lock: Mutual exclusion.
i. void lockinit(Lock*): Initialize a lock (on heap, stack, or BSS)
and register it with the performance monitoring subsystem. Must be
called before the lock can be used.
ii. void lockfree(Lock*): Deallocate a lock and deregister it with
the performance monitoring subsystem. Failing to call lockfree will
leak resources.
iii. void lock(Lock*): Acquire a lock atomically. The Lock
variant may, but need not, be a spin lock; it is intended for very
short critical sections.
iv. void unlock(Lock*): Release a lock.
v. int canlock(Lock*): Attempt to acquire a lock atomically. If
the lock can be acquired, return non-zero; otherwise do not block, but
return zero immediately instead.
3. QLock: Supports the same operations as Locks. QLocks
are intended for longer critical sections; callers blocking on these locks
are guaranteed to sleep rather than spin.
qlockinit, qlockfree, qlock, qunlock, canqlock
4. RWLock: Reader-writer locks. r* variants lock for reading
while w* variants lock for writing. RWLocks locked for
reading must be unlocked with runlock and those locked for
writing must be unlocked with wunlock.
rwlockinit,
rwlockfree,
rlock,
runlock,
canrlock,
wlock,
wunlock,
canwlock
5. Rendez: Rendezvous points (condition variables).
rsleep, rwakeup, rwakeupall
6. Thread: Thread objects are private to the thread library,
but a number of related functions are exported to users. The thread
library provides a definition of main: users instead implement the
threadmain function, which has the same signature.
i. int threadcreate(void (*fn)(void*), void *arg, u32int stksize):
create a new thread that will execute the function fn with the
argument arg on a stack of size at least stksize bytes.
Returns zero on success and a negative integer on failure.
ii. uint threadid(void): return the opaque ID for the current thread.
iii. void threadyield(void): cause the current thread to yield the
processor.
iv. void threadexits(char*): cause the current thread to exit with the
given status string.
v. void threadexitsall(char*): cause all threads to exit with the
given status string.
vi. void **threaddata(void): return a pointer to a per-thread pointer.
Note that libaio allocates an array of void pointers and installs it
into the per-thread pointer. Indices in this array are reserved in
<aio.h>.
vii. void threadmain(int, char**): user-supplied entry point.
4.5.
Events
This section describes the event-driven core of libaio. The basic
unit of work is the Task, which can be thought of as either a ‘‘run
to completion thread’’ or ‘‘one-shot continuation’’.
4.5.1.
Types
The following is a list of a number of types that appear frequently
throughout the libaio interface. Additional types are described below
when the corresponding subsystems are introduced.
1. typedef Curry<void>::ptr CallBack: A convenient name for nullary