The posix_spawn() function and its close relation posix_spawnp()
have been introduced to overcome the following perceived
difficulties with fork(): the fork() function is difficult or
impossible to implement without swapping or dynamic address
translation.
* Swapping is generally too slow for a realtime environment.
* Dynamic address translation is not available everywhere that
POSIX might be useful.
* Processes are too useful to simply be made optional in POSIX
whenever it must run without address translation or other MMU
services.
Thus, POSIX needs process creation and file execution primitives
that can be efficiently implemented without address translation
or other MMU services.
The posix_spawn() function is implementable as a library routine,
but both posix_spawn() and posix_spawnp() are designed as kernel
operations. Also, although they may be an efficient replacement
for many fork()/exec pairs, their goal is to provide useful
process creation primitives for systems that have difficulty with
fork(), not to provide drop-in replacements for fork()/exec.
This view of the role of posix_spawn() and posix_spawnp()
influenced the design of their API. It does not attempt to
provide the full functionality of fork()/exec in which arbitrary
user-specified operations of any sort are permitted between the
creation of the child process and the execution of the new
process image; any attempt to reach that level would need to
provide a programming language as parameters. Instead,
posix_spawn() and posix_spawnp() are process creation primitives
like the Start_Process and Start_Process_Search Ada language
bindings package POSIX_Process_Primitives and also like those in
many operating systems that are not UNIX systems, but with some
POSIX-specific additions.
To achieve their coverage goals, posix_spawn() and posix_spawnp()
have control of six types of inheritance: file descriptors,
process group ID, user and group ID, signal mask, scheduling, and
whether each signal ignored in the parent will remain ignored in
the child, or be reset to its default action in the child.
Control of file descriptors is required to allow an independently
written child process image to access data streams opened by and
even generated or read by the parent process without being
specifically coded to know which parent files and file
descriptors are to be used. Control of the process group ID is
required to control how the job control of the child process
relates to that of the parent.
Control of the signal mask and signal defaulting is sufficient to
support the implementation of system(). Although support for
system() is not explicitly one of the goals for posix_spawn() and
posix_spawnp(), it is covered under the ``at least 50%'' coverage
goal.
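The way signal mask and signal defaulting control supports system() can be sketched as follows. This is only an illustration, not the standard's reference implementation: the name my_system() is hypothetical, the shell location /bin/sh is an assumption, and the special case of a null command pointer is omitted. The parent ignores SIGINT and SIGQUIT and blocks SIGCHLD while it waits. The child gets the caller's original signal mask, and only those of the two signals that the caller was not already ignoring are reset to their default action.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical sketch of system() on top of posix_spawn().
 * Returns the wait status of the shell, or -1 on failure. */
int my_system(const char *cmd)
{
    struct sigaction ign, oldint, oldquit;
    sigset_t block, oldmask, defaults;
    posix_spawnattr_t attr;
    pid_t pid;
    int status = -1;
    char *argv[] = { "sh", "-c", (char *)cmd, NULL };

    /* Parent: ignore SIGINT/SIGQUIT and block SIGCHLD while waiting. */
    ign.sa_handler = SIG_IGN;
    ign.sa_flags = 0;
    sigemptyset(&ign.sa_mask);
    sigaction(SIGINT, &ign, &oldint);
    sigaction(SIGQUIT, &ign, &oldquit);
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &oldmask);

    /* Child: signals the caller did not ignore go back to SIG_DFL. */
    sigemptyset(&defaults);
    if (oldint.sa_handler != SIG_IGN)
        sigaddset(&defaults, SIGINT);
    if (oldquit.sa_handler != SIG_IGN)
        sigaddset(&defaults, SIGQUIT);

    if (posix_spawnattr_init(&attr) == 0) {
        posix_spawnattr_setsigmask(&attr, &oldmask);
        posix_spawnattr_setsigdefault(&attr, &defaults);
        posix_spawnattr_setflags(&attr,
            POSIX_SPAWN_SETSIGMASK | POSIX_SPAWN_SETSIGDEF);
        /* "/bin/sh" is an assumed shell location. */
        if (posix_spawn(&pid, "/bin/sh", NULL, &attr, argv, environ) == 0) {
            while (waitpid(pid, &status, 0) == -1) {
                if (errno != EINTR) {
                    status = -1;
                    break;
                }
            }
        }
        posix_spawnattr_destroy(&attr);
    }

    /* Restore the caller's signal context. */
    sigaction(SIGINT, &oldint, NULL);
    sigaction(SIGQUIT, &oldquit, NULL);
    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    return status;
}
```

A caller would inspect the result with WIFEXITED and WEXITSTATUS, exactly as for system().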
The intention is that the normal file descriptor inheritance
across fork(), the subsequent effect of the specified spawn file
actions, and the normal file descriptor inheritance across one of
the exec family of functions should fully specify open file
inheritance. The implementation need make no decisions regarding
the set of open file descriptors when the child process image
begins execution, those decisions having already been made by the
caller and expressed as the set of open file descriptors and
their FD_CLOEXEC flags at the time of the call and the spawn file
actions object specified in the call. We have been assured that
in cases where the POSIX Start_Process Ada primitives have been
implemented in a library, this method of controlling file
descriptor inheritance may be implemented very easily.
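As an illustration of how the caller, rather than the implementation, decides the child's open files, the following sketch connects the child's standard output to a pipe by means of a spawn file actions object. The helper name spawn_capture() and its buffer-based interface are hypothetical. The sketch also assumes that the pipe's write end is not already descriptor 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `path` with stdout connected to a pipe and
 * copy up to buflen-1 bytes of its output into buf (NUL-terminated).
 * Returns the number of bytes stored, or -1 on failure. */
ssize_t spawn_capture(const char *path, char *const argv[],
                      char *buf, size_t buflen)
{
    int fds[2];
    pid_t pid;
    posix_spawn_file_actions_t fa;
    ssize_t total = 0, n;
    int rc;

    if (buflen == 0 || pipe(fds) != 0)
        return -1;
    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* In the child: the write end becomes fd 1, then both pipe
     * descriptors are closed so the new image sees only fd 1. */
    rc = posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO);
    if (rc == 0)
        rc = posix_spawn_file_actions_addclose(&fa, fds[0]);
    if (rc == 0)
        rc = posix_spawn_file_actions_addclose(&fa, fds[1]);
    if (rc == 0)
        rc = posix_spawn(&pid, path, &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);

    close(fds[1]);              /* parent keeps only the read end */
    if (rc != 0) {
        close(fds[0]);
        return -1;
    }

    while ((size_t)total < buflen - 1 &&
           (n = read(fds[0], buf + total,
                     buflen - 1 - (size_t)total)) > 0)
        total += n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```

The child image needs no knowledge of the pipe; it simply writes to its standard output.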
We can identify several problems with posix_spawn() and
posix_spawnp(), but there does not appear to be a solution that
introduces fewer problems. Environment modification for child
process attributes not specifiable via the attrp or file_actions
arguments must be done in the parent process, and since the
parent generally wants to save its context, it is more costly
than similar functionality with fork()/exec. It is also
complicated to modify the environment of a multi-threaded process
temporarily, since all threads must agree when it is safe for the
environment to be changed. However, this cost is only borne by
those invocations of posix_spawn() and posix_spawnp() that use
the additional functionality. Since extensive modifications are
not the usual case, and are particularly unlikely in
time-critical code, keeping much of the environment control out
of posix_spawn() and posix_spawnp() is an appropriate design.
The posix_spawn() and posix_spawnp() functions do not have all
the power of fork()/exec. This is to be expected. The fork()
function is a wonderfully powerful operation. We do not expect to
duplicate its functionality in a simple, fast function with no
special hardware requirements. It is worth noting that
posix_spawn() and posix_spawnp() are very similar to the process
creation operations on many operating systems that are not UNIX
systems.
Requirements
The requirements for posix_spawn() and posix_spawnp() are:
* They must be implementable without an MMU or unusual
hardware.
* They must be compatible with existing POSIX standards.
Additional goals are:
* They should be efficiently implementable.
* They should be able to replace at least 50% of typical
executions of fork().
* A system with posix_spawn() and posix_spawnp() and without
fork() should be useful, at least for realtime applications.
* A system with fork() and the exec family should be able to
implement posix_spawn() and posix_spawnp() as library
routines.
Two-Syntax
POSIX exec has several calling sequences with approximately the
same functionality. These appear to be required for compatibility
with existing practice. Since the existing practice for the
posix_spawn*() functions is otherwise substantially unlike POSIX,
we feel that simplicity outweighs compatibility. There are,
therefore, only two names for the posix_spawn*() functions.
The parameter list does not differ between posix_spawn() and
posix_spawnp(); posix_spawnp() interprets the second parameter
more elaborately than posix_spawn().
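The difference can be sketched with a small hypothetical wrapper, run_wait(), that calls either function with the same argument list. With posix_spawn() the second argument is used as a pathname as given; with posix_spawnp() a name without a slash is looked up along PATH.

```c
#define _POSIX_C_SOURCE 200809L
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawn `prog`, wait for it, and return its exit
 * status, or -1 if it could not be spawned or did not exit normally.
 * If search_path is nonzero, posix_spawnp() is used. */
int run_wait(int search_path, const char *prog, char *const argv[])
{
    pid_t pid;
    int status;
    int rc = search_path
        ? posix_spawnp(&pid, prog, NULL, NULL, argv, environ)
        : posix_spawn(&pid, prog, NULL, NULL, argv, environ);

    if (rc != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, run_wait(1, "sh", argv) would find the shell through PATH, whereas run_wait(0, "sh", argv) would look for a file named sh relative to the current directory.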
Compatibility with POSIX.5 (Ada)
The Start_Process and Start_Process_Search procedures from the
POSIX_Process_Primitives package from the Ada language binding to
POSIX.1 encapsulate fork() and exec functionality in a manner
similar to that of posix_spawn() and posix_spawnp(). Originally,
in keeping with our simplicity goal, the standard developers had
limited the capabilities of posix_spawn() and posix_spawnp() to a
subset of the capabilities of Start_Process and
Start_Process_Search; certain non-default capabilities were not
supported. However, based on suggestions by the ballot group to
improve file descriptor mapping or drop it, and on the advice of
an Ada Language Bindings working group member, the standard
developers decided that posix_spawn() and posix_spawnp() should
be sufficiently powerful to implement Start_Process and
Start_Process_Search. The rationale is that if the Ada language
binding to such a primitive had already been approved as an IEEE
standard, there can be little justification for not approving the
functionally-equivalent parts of a C binding. The only three
capabilities provided by posix_spawn() and posix_spawnp() that
are not provided by Start_Process and Start_Process_Search are
optionally specifying the child's process group ID, the set of
signals to be reset to default signal handling in the child
process, and the child's scheduling policy and parameters.
For the Ada language binding for Start_Process to be implemented
with posix_spawn(), that binding would need to explicitly pass an
empty signal mask and the parent's environment to posix_spawn()
whenever the caller of Start_Process allowed these arguments to
default, since posix_spawn() does not provide such defaults. The
ability of Start_Process to mask user-specified signals during
its execution is functionally unique to the Ada language binding
and must be dealt with in the binding separately from the call to
posix_spawn().
Process Group
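The process group inheritance field can be used to join the child
process with an existing process group. When the spawn-pgroup
attribute of the object referenced by attrp is instead set to
zero, the setpgid() mechanism places the child process in a new
process group whose ID is the child's process ID. In either case
the POSIX_SPAWN_SETPGROUP flag must be set in the spawn-flags
attribute for the spawn-pgroup attribute to take effect.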
Threads
Without the posix_spawn() and posix_spawnp() functions, systems
without address translation can still use threads to give an
abstraction of concurrency. In many cases, thread creation
suffices, but it is not always a good substitute. The
posix_spawn() and posix_spawnp() functions are considerably
``heavier'' than thread creation. Processes have several
important attributes that threads do not. Even without address
translation, a process may have base-and-bound memory protection.
Each process has a process environment including security
attributes and file capabilities, and powerful scheduling
attributes. Processes abstract the behavior of
non-uniform-memory-architecture multi-processors better than
threads, and they are more convenient to use for activities that
are not closely linked.
The posix_spawn() and posix_spawnp() functions may not bring
support for multiple processes to every configuration. Process
creation is not the only piece of operating system support
required to support multiple processes. The total cost of support
for multiple processes may be quite high in some circumstances.
Existing practice shows that support for multiple processes is
uncommon and threads are common among ``tiny kernels''. There
should, therefore, probably continue to be AEPs (Application
Environment Profiles) for operating systems with only one
process.
Asynchronous Error Notification
A library implementation of posix_spawn() or posix_spawnp() may
not be able to detect all possible errors before it forks the
child process. POSIX.1‐2008 provides for an error indication
returned from a child process which could not successfully
complete the spawn operation via a special exit status which may
be detected using the status value returned by wait(), waitid(),
and waitpid().
The stat_val interface and the macros used to interpret it are
not well suited to the purpose of returning API errors, but they
are the only path available to a library implementation. Thus, an
implementation may cause the child process to exit with exit
status 127 for any error detected during the spawn process after
the posix_spawn() or posix_spawnp() function has successfully
returned.
The standard developers had proposed using two additional macros
to interpret stat_val. The first, WIFSPAWNFAIL, would have
detected a status that indicated that the child exited because of
an error detected during the posix_spawn() or posix_spawnp()
operations rather than during actual execution of the child
process image; the second, WSPAWNERRNO, would have extracted the
error value if WIFSPAWNFAIL indicated a failure. Unfortunately,
the ballot group strongly opposed this because it would make a
library implementation of posix_spawn() or posix_spawnp()
dependent on kernel modifications to waitpid() to be able to
embed special information in stat_val to indicate a spawn
failure.
The 8 bits of child process exit status that are guaranteed by
POSIX.1‐2008 to be accessible to the waiting parent process are
insufficient to disambiguate a spawn error from any other kind of
error that may be returned by an arbitrary process image. No
other bits of the exit status are required to be visible in
stat_val, so these macros could not be strictly implemented at
the library level. Reserving an exit status of 127 for such
spawn errors is consistent with the use of this value by system()
and popen() to signal failures in these operations that occur
after the function has returned but before a shell is able to
execute. The exit status of 127 does not uniquely identify this
class of error, nor does it provide any detailed information on
the nature of the failure. Note that a kernel implementation of
posix_spawn() or posix_spawnp() is permitted (and encouraged) to
return any possible error as the function value, thus providing
more detailed failure information to the parent process.
Thus, no special macros are available to isolate asynchronous
posix_spawn() or posix_spawnp() errors. Instead, errors detected
by the posix_spawn() or posix_spawnp() operations in the context
of the child process before the new process image executes are
reported by setting the child's exit status to 127. The calling
process may use the WIFEXITED and WEXITSTATUS macros on the
stat_val stored by the wait() or waitpid() functions to detect
spawn failures to the extent that other status values with which
the child process image may exit (before the parent can
conclusively determine that the child process image has begun
execution) are distinct from exit status 127.
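The detection described above can be sketched as follows. The enumeration and the helper name run_and_classify() are hypothetical. The sketch first checks the function's return value, which is how a kernel implementation may report errors, and then checks for exit status 127, which is how a library implementation may report them.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical classification of a spawn attempt. */
enum spawn_outcome {
    SPAWN_SYNC_ERROR,   /* posix_spawn() returned an error number */
    SPAWN_WAIT_ERROR,   /* waitpid() failed */
    SPAWN_MAYBE_FAILED, /* exit status 127: spawn error, or the image's
                           own exit status; the two are ambiguous */
    SPAWN_RAN           /* any other termination */
};

/* Spawn `path`, wait for it, and classify the result. On return,
 * *detail holds an error number or the stat_val from waitpid(). */
enum spawn_outcome run_and_classify(const char *path,
                                    char *const argv[], int *detail)
{
    pid_t pid;
    int status;
    int rc = posix_spawn(&pid, path, NULL, NULL, argv, environ);

    if (rc != 0) {
        *detail = rc;
        return SPAWN_SYNC_ERROR;
    }
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR) {
            *detail = errno;
            return SPAWN_WAIT_ERROR;
        }
    }
    *detail = status;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 127)
        return SPAWN_MAYBE_FAILED;
    return SPAWN_RAN;
}
```

A child image that itself exits with status 127 is also reported as SPAWN_MAYBE_FAILED, which is precisely the ambiguity noted above. Whether a nonexistent path yields SPAWN_SYNC_ERROR or SPAWN_MAYBE_FAILED depends on the implementation.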