How to get rid of threads
It appears that schools, tutorials, and the like are teaching
threads a whole lot these days, and most designs I see
from people learning the ropes involve a number of threads.
However, threads cause bugs, and they add synchronization
cost that's not easily visible and not easily recovered.
I worked on the BeOS, where we used threads out the wazoo.
We made threads very, very cheap, but in the end, we were
still actually too heavy on threads in our designs -- live
and learn.
A typical threaded server design proposal looks something
like:
- Task Scheduler Thread
- Socket Reader Thread
- Processing Thread
- Socket Writer Thread
Here is a proposal that implements the same functionality
in a single thread.
for(;;) {
  // run every scheduled event that is due; t ends up as the next due time
  while( (t = next_event_time()) <= now() ) {
    remove_and_run_one_event();
  }
  // this if() and the next while() can be combined with select()
  if( there_is_no_input_data() && outgoing_queue_is_empty() ) {
    sleep_until_there_is_data_or_time_happens( t );
  }
  while( there_is_more_input_data() ) {
    read_data_from_socket();
    put_data_in_todo_queue();
  }
  while( todo_queue_is_not_empty() ) {
    remove_and_run_one_todo_item();
  }
  while( outgoing_queue_is_not_empty() && outgoing_socket_is_writable() ) {
    remove_and_send_from_outgoing_queue();
  }
}
This approach accomplishes the same thing, but it does it
without the subtle race conditions and synchronization penalties
that come from a threaded design.
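As the comment in the loop notes, the sleep and the input check can collapse into a single select() call. Here is a minimal sketch of that step, assuming POSIX and one input and one output descriptor. The name wait_for_io_or_deadline() and the READY_* flags are hypothetical, made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical flags describing what the loop may do next. */
enum { READY_READ = 1, READY_WRITE = 2 };

/* Block until in_fd is readable, out_fd is writable (only checked when
 * want_write is nonzero, i.e. the outgoing queue is not empty), or
 * timeout_ms milliseconds have passed.
 * Returns a mask of READY_* flags, 0 on timeout, -1 on error. */
int wait_for_io_or_deadline(int in_fd, int out_fd, int want_write, long timeout_ms)
{
    fd_set rfds, wfds;
    struct timeval tv;
    int maxfd = in_fd;
    int n, ready = 0;

    assert(in_fd >= 0 && in_fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_SET(in_fd, &rfds);
    if (want_write) {
        assert(out_fd >= 0 && out_fd < FD_SETSIZE);
        FD_SET(out_fd, &wfds);
        if (out_fd > maxfd)
            maxfd = out_fd;
    }

    /* An overdue event means "do not sleep at all". */
    if (timeout_ms < 0)
        timeout_ms = 0;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    n = select(maxfd + 1, &rfds, want_write ? &wfds : NULL, NULL, &tv);
    if (n < 0)
        return -1;
    if (FD_ISSET(in_fd, &rfds))
        ready |= READY_READ;
    if (want_write && FD_ISSET(out_fd, &wfds))
        ready |= READY_WRITE;
    return ready;
}
```

In the loop, timeout_ms would be next_event_time() minus now(), so a scheduled event wakes the thread even when no data arrives.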
There is a danger that scheduled tasks will be started after
their scheduled time. The worst-case delay is the sum of the
time it takes to read the input, process it, and send the
output. However, this is a bounded amount of time,
and on a single-CPU system, this bound is lower than the
maximum latency you'd get with a four-thread program on a dual-CPU
system in the worst case. If you need near-real-time scheduled
tasks, then a general-purpose OS may not be your best choice. If
you still need this, then perhaps those special tasks should
actually go in their own thread, because that might be one of the
special cases...
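To make the delay concrete, here is a minimal sketch of the scheduled-event side of the loop, assuming a small fixed-size queue kept sorted by due time. All names (event_queue, schedule_event(), run_due_events(), ms_until_next_event(), count_call()) are hypothetical. An event that comes due while the loop is busy with I/O simply runs on the next pass, which is where the bounded delay comes from.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 16

typedef void (*event_fn)(void *arg);

struct event {
    long due_ms;      /* absolute time at which the event should run */
    event_fn fn;
    void *arg;
};

struct event_queue {
    struct event items[MAX_EVENTS];   /* sorted, earliest first */
    size_t count;
};

/* Insert an event, keeping the queue sorted. Returns 0, or -1 if full. */
int schedule_event(struct event_queue *q, long due_ms, event_fn fn, void *arg)
{
    size_t i;

    assert(fn != NULL);
    if (q->count == MAX_EVENTS)
        return -1;
    i = q->count;
    while (i > 0 && q->items[i - 1].due_ms > due_ms) {
        q->items[i] = q->items[i - 1];
        i--;
    }
    q->items[i].due_ms = due_ms;
    q->items[i].fn = fn;
    q->items[i].arg = arg;
    q->count++;
    return 0;
}

/* Run every event due at or before now_ms. Returns how many ran. */
int run_due_events(struct event_queue *q, long now_ms)
{
    int ran = 0;

    while (q->count > 0 && q->items[0].due_ms <= now_ms) {
        struct event ev = q->items[0];
        size_t i;

        for (i = 1; i < q->count; i++)
            q->items[i - 1] = q->items[i];
        q->count--;
        ev.fn(ev.arg);
        ran++;
    }
    return ran;
}

/* Milliseconds until the next event: 0 if overdue, -1 if the queue is empty. */
long ms_until_next_event(const struct event_queue *q, long now_ms)
{
    long d;

    if (q->count == 0)
        return -1;
    d = q->items[0].due_ms - now_ms;
    return d > 0 ? d : 0;
}

/* Example callback: counts how many times it was called. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

The value from ms_until_next_event() is what the loop would hand to its sleep or select() call, with -1 meaning "no deadline".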
When should you use threads?
I've seen threads really work well in the following cases:
- For scientific computing that's CPU bound, not memory bound,
on multi-CPU systems.
- For turning synchronous system APIs (gethostbyname(), file I/O,
etc.) into asynchronous APIs.
- For very-low-latency tasks that are device dependent, such as
feeding sound cards or tracking the mouse. However, this typically
happens in the OS, and the API at the user level is properly
asynchronous.
- When there are truly independent tasks in an application that
don't need to talk to each other; for example, a word processor
with two separate documents open.
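The second case in the list, wrapping a synchronous API, can be sketched roughly like this, assuming POSIX threads and a pipe for the completion notice. The names async_job, start_async() and slow_square() are hypothetical; slow_square() stands in for a blocking call such as gethostbyname().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

typedef int (*blocking_fn)(int input);

struct async_job {
    blocking_fn fn;   /* the synchronous call to hide */
    int input;
    int done_fd;      /* write end of a pipe the main loop watches */
};

/* Worker thread: make the blocking call, then post the result to the pipe. */
static void *job_thread(void *p)
{
    struct async_job *job = p;
    int result;

    assert(job->fn != NULL);
    result = job->fn(job->input);
    if (write(job->done_fd, &result, sizeof result) != (ssize_t)sizeof result) {
        /* A real program would report this; the sketch just gives up. */
    }
    return NULL;
}

/* Start the job. The job struct must stay alive until the thread finishes.
 * Returns 0 on success, or the pthread_create() error code. */
int start_async(struct async_job *job, pthread_t *tid)
{
    return pthread_create(tid, NULL, job_thread, job);
}

/* Stand-in for a slow synchronous API. */
int slow_square(int x)
{
    return x * x;
}
```

The main thread never shares data with the worker beyond the job struct and the pipe. It sees the read end become readable like any other input, so the single-threaded loop stays free of locks.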
It's clear that future CPUs will be heavily multi-core. However,
taking true advantage of that means a completely new
programming paradigm of some sort. Languages like Erlang, structures
like the Actor (which carries an operation between data instances), or
systems based on many independent state machines might be a better
approach than naively threading an application.
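As a rough illustration of the "many independent state machines" idea, here is a minimal sketch in which each connection is a tiny state machine that owns its own data and is advanced one step at a time. The states and the names conn, conn_step() and run_all() are hypothetical, and the steps do no real work.

```c
#include <assert.h>
#include <stddef.h>

enum conn_state { READ_REQUEST, PROCESS, WRITE_REPLY, DONE };

struct conn {
    enum conn_state state;
    int steps_taken;
};

/* Advance one machine by a single step.
 * Returns 1 if it still has work left, 0 if it is finished. */
int conn_step(struct conn *c)
{
    assert(c != NULL);
    switch (c->state) {
    case READ_REQUEST: c->state = PROCESS;     break;
    case PROCESS:      c->state = WRITE_REPLY; break;
    case WRITE_REPLY:  c->state = DONE;        break;
    case DONE:         return 0;
    }
    c->steps_taken++;
    return c->state != DONE;
}

/* Round-robin over all machines until every one is done.
 * Returns the total number of steps executed. */
size_t run_all(struct conn *conns, size_t n)
{
    size_t total = 0, i;
    int progressed;

    do {
        progressed = 0;
        for (i = 0; i < n; i++) {
            if (conns[i].state != DONE) {
                conn_step(&conns[i]);
                total++;
                progressed = 1;
            }
        }
    } while (progressed);
    return total;
}
```

Because no machine touches another machine's data, the array could in principle be split across cores without adding locks inside the step function.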