Like a kernel thread, a user-level thread includes a set of registers and a stack, and shares the entire address space with the other threads in the enclosing process. Unlike a kernel thread, however, a user-level thread is handled entirely in user code, usually by a special library that provides at least start, swap, and suspend calls. Because the OS is unaware of a user-level thread's existence, a user-level thread cannot separately receive signals or use operating system scheduling calls such as sleep(). Many implementations of user-level threads exist, including GNU Portable Threads (Pth) [1], FreeBSD's userland threads, QuickThreads [26], and those we developed for the Charm++ system [25].
The primary advantages of user-level threads are efficiency and flexibility. Because the operating system is not involved, user-level threads can be made to use very little memory, and can be created and scheduled very quickly. User-level threads are also more flexible because the thread scheduler is in user code, which makes it much easier to schedule threads in an intelligent fashion -- for example, the application's priority structure can be directly used by the thread scheduler [9].
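To make the flexibility argument concrete, the following is a minimal sketch of a scheduler that lives entirely in user code and consults application-defined priorities. It is not the Charm++ or Pth implementation: Python generators stand in for user-level threads, and the names `Scheduler`, `start`, `run`, and `worker` are hypothetical, chosen only for this illustration.

```python
import heapq
from itertools import count


class Scheduler:
    """Toy user-level thread scheduler (illustrative; all names are hypothetical)."""

    def __init__(self):
        self._ready = []      # min-heap of (priority, sequence number, thread)
        self._seq = count()   # tie-breaker: FIFO order among equal priorities

    def start(self, thread, priority=0):
        """Make a generator-based thread runnable; a lower number runs first."""
        heapq.heappush(self._ready, (priority, next(self._seq), thread))

    def run(self):
        """Run threads in priority order until all of them have finished."""
        while self._ready:
            priority, _, thread = heapq.heappop(self._ready)
            try:
                next(thread)                  # swap in: run until the next yield
            except StopIteration:
                continue                      # thread finished; drop it
            self.start(thread, priority)      # thread suspended itself; requeue


def worker(name, steps, log):
    """A thread body: record some work, then suspend after every step."""
    for i in range(steps):
        log.append((name, i))
        yield                                 # suspend: return to the scheduler


log = []
sched = Scheduler()
sched.start(worker("low", 2, log), priority=5)
sched.start(worker("high", 2, log), priority=1)
sched.run()
print(log)
```

Because the ready queue is an ordinary data structure owned by the application, replacing the heap with any other policy requires no kernel involvement. In this sketch the higher-priority thread runs to completion before the lower-priority one is ever swapped in.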
The primary disadvantage of user-level threads compared to kernel threads is
the lack of operating system support. For example, when a user-level thread
makes a blocking call, the kernel does not start running another user-level
thread. Instead, the kernel suspends the entire calling kernel thread or
process, even though another user-level thread might be ready to run.
To avoid this blocking problem, some systems such as AIX and Solaris support
``N:M'' thread scheduling, which maps some number of application threads
onto a (usually smaller) number of kernel entities.
N:M threading is complex, however, because every thread operation involves
two parties: the kernel and the user-level part of the thread system.
The blocking problem can also be avoided by building a smarter
runtime layer that intercepts blocking calls, replaces them with
non-blocking calls, and starts another user-level thread while each call
proceeds [1]. A third approach, often called ``scheduler activations''
[3,38], adds kernel support for notifying the user-level scheduler when a
thread blocks.
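The runtime-layer approach can be sketched as follows. This is only an illustration under stated assumptions, not how Pth or any other cited system is implemented: generators again stand in for user-level threads, a thread yields a file descriptor instead of calling a blocking read, and the hypothetical `run_threads` function uses the standard `select.select` call to poll without blocking while other threads are runnable. The sketch assumes a POSIX system (where `select` works on pipes) and at most one thread waiting per descriptor.

```python
import os
import select
from collections import deque


def run_threads(threads):
    """Round-robin scheduler; a thread yields an fd to wait until it is readable."""
    ready = deque(threads)
    waiting = {}                          # fd -> thread parked on that fd
    while ready or waiting:
        if waiting:
            # Poll (timeout 0) while other threads can run; block only if none can.
            timeout = 0 if ready else None
            readable, _, _ = select.select(list(waiting), [], [], timeout)
            for fd in readable:
                ready.append(waiting.pop(fd))
        if not ready:
            continue
        thread = ready.popleft()
        try:
            request = next(thread)        # run the thread until it yields
        except StopIteration:
            continue                      # thread finished
        if request is None:
            ready.append(thread)          # plain yield: still runnable
        else:
            waiting[request] = thread     # would block: park it, run others


def reader(r, log):
    yield r                               # stands in for an intercepted blocking read
    log.append("reader: got " + os.read(r, 5).decode())


def writer(w, log):
    log.append("writer: computing")
    yield                                 # voluntary suspend
    os.write(w, b"hello")
    log.append("writer: wrote")


io_log = []
r_fd, w_fd = os.pipe()
run_threads([reader(r_fd, io_log), writer(w_fd, io_log)])
os.close(r_fd)
os.close(w_fd)
print(io_log)
```

Although the reader is started first and has no data available, the writer still makes progress, because the scheduler parks the reader instead of letting the whole process block in the kernel.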
Because user-level threads are managed in user code, the number of threads is limited only by the available resources. In practice, one can easily create 50,000 user-level threads on a Linux machine (see Table 2).