How to get rid of threads

It appears that schools, tutorials, or whatever are teaching threads a whole lot these days, and most designs I see from people learning the ropes involve a number of threads. However, threads cause bugs, and they add synchronization costs that aren't easily visible and can't easily be won back.

I worked on the BeOS, where we used threads out the wazoo. We made threads very, very cheap, but in the end, we were still actually too heavy on threads in our designs -- live and learn.

A typical threaded server design proposal looks something like:

  • Task Scheduler Thread
  • Socket Reader Thread
  • Processing Thread
  • Socket Writer Thread

Here is a proposal that implements the same functionality in a single thread.


  for(;;) {
    while( (t = next_event_time()) <= now() ) {
      remove_and_run_one_event();
    }
    // this if() and the next while() can be combined with select()
    if( there_is_no_input_data() && outgoing_queue_is_empty() ) {
      sleep_until_there_is_data_or_time_happens( t );
    }
    while( there_is_more_input_data() ) {
      read_data_from_socket();
      put_data_in_todo_queue();
    }
    while( todo_queue_is_not_empty() ) {
      remove_and_run_one_todo_item();
    }
    while( outgoing_queue_is_not_empty() && outgoing_socket_is_writable() ) {
      remove_and_send_from_outgoing_queue();
    }
  }

This approach accomplishes the same thing, but it does it without the subtle race conditions and synchronization penalties that come from a threaded design.
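
To make the select() hint in that comment concrete, here is a minimal sketch of what the loop can look like in C. It assumes a single socket in sock_fd and leans on made-up helpers (now(), next_event_time(), read_and_enqueue_input(), drain_todo_queue(), send_some_outgoing()) standing in for the pseudocode above; it illustrates the shape of the loop, not a drop-in implementation.

  #include <sys/types.h>
  #include <sys/select.h>
  #include <sys/time.h>

  /* Assumed helpers -- stand-ins for the pseudocode above. */
  extern int    sock_fd;                        /* the one socket we serve    */
  extern double now(void);                      /* current time, in seconds   */
  extern double next_event_time(void);          /* deadline of the next timer */
  extern void   remove_and_run_one_event(void);
  extern void   read_and_enqueue_input(int fd); /* read, fill the to-do queue */
  extern void   drain_todo_queue(void);         /* run queued work items      */
  extern int    outgoing_queue_is_empty(void);
  extern void   send_some_outgoing(int fd);     /* send what fits right now   */

  void serve_forever(void)
  {
    for(;;) {
      /* 1. Run every timer whose deadline has passed. */
      double t;
      while( (t = next_event_time()) <= now() ) {
        remove_and_run_one_event();
      }

      /* 2. Sleep in select() until the socket is readable, until it is
            writable (only if we have pending output), or until the next
            timer is due -- this replaces the if() and the input while(). */
      fd_set rfds, wfds;
      FD_ZERO(&rfds);
      FD_ZERO(&wfds);
      FD_SET(sock_fd, &rfds);
      if( !outgoing_queue_is_empty() ) {
        FD_SET(sock_fd, &wfds);
      }
      double wait = t - now();
      if( wait < 0 ) wait = 0;
      struct timeval tv;
      tv.tv_sec = (time_t)wait;
      tv.tv_usec = (long)((wait - (double)tv.tv_sec) * 1e6);
      if( select(sock_fd + 1, &rfds, &wfds, NULL, &tv) < 0 ) {
        continue;  /* e.g. EINTR; real code would inspect errno */
      }

      /* 3. Pull in whatever arrived, do the work, push out what we can. */
      if( FD_ISSET(sock_fd, &rfds) ) {
        read_and_enqueue_input(sock_fd);
      }
      drain_todo_queue();
      if( FD_ISSET(sock_fd, &wfds) ) {
        send_some_outgoing(sock_fd);
      }
    }
  }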

There is a danger that scheduled tasks will start later than their scheduled time. The most they can be delayed by is the sum of the time it takes to read things from input, process them, and send them to the output. However, this is a bounded amount of time, and on a single-CPU system, this bound is lower than the worst-case latency you'd get with a four-thread program on a dual-CPU system. If you need near-real-time scheduled tasks, then a general-purpose OS may not be your best choice in the first place. If you still need them, then perhaps those special tasks really should go in their own thread, because that might be one of the special cases...

When should you use threads?

I've seen threads really work well in the following cases:

  • For scientific computing that's CPU bound, not memory bound, on multi-CPU systems.
  • For turning synchronous system APIs (gethostbyname(), file I/O, etc.) into asynchronous APIs; see the sketch after this list.
  • For very-low-latency tasks that are device-dependent, such as feeding sound cards or tracking the mouse. However, this typically happens in the OS, and the API at the user level is properly asynchronous.
  • When there are truly independent tasks in an application that don't need to talk to each other; for example, a word processor with two separate documents open.
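
For the second bullet, the usual shape is a throwaway worker thread that makes the blocking call and hands the result back to the main loop. Below is a minimal sketch using POSIX threads; notify_fd (the write end of a pipe the main loop's select() already watches), the struct lookup type, and the use of getaddrinfo() in place of gethostbyname() are my own choices for illustration, not a prescribed API.

  #include <pthread.h>
  #include <sys/types.h>
  #include <sys/socket.h>
  #include <netdb.h>
  #include <stdlib.h>
  #include <string.h>
  #include <unistd.h>

  /* Results are delivered by writing a pointer down a pipe that the main
     loop's select() already watches; notify_fd is the write end. */
  extern int notify_fd;

  struct lookup {
    char             host[256];
    struct addrinfo *result;   /* NULL on failure */
  };

  static void *lookup_thread(void *arg)
  {
    struct lookup *lk = arg;
    /* The blocking call happens here, off the main thread. */
    if( getaddrinfo(lk->host, NULL, NULL, &lk->result) != 0 ) {
      lk->result = NULL;
    }
    /* Hand the finished request back to the event loop. */
    (void)write(notify_fd, &lk, sizeof lk);
    return NULL;
  }

  /* Returns immediately; the main loop later reads a struct lookup *
     from the pipe and picks up where it left off. */
  int lookup_host_async(const char *host)
  {
    struct lookup *lk = calloc(1, sizeof *lk);
    if( !lk ) return -1;
    strncpy(lk->host, host, sizeof lk->host - 1);

    pthread_t tid;
    if( pthread_create(&tid, NULL, lookup_thread, lk) != 0 ) {
      free(lk);
      return -1;
    }
    pthread_detach(tid);
    return 0;
  }

The main loop reads the struct lookup pointer off the pipe, uses the addresses, then calls freeaddrinfo() and free(); everything except the one blocking call stays single-threaded.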

It's clear that future CPUs will be heavily multi-cored. However, taking true advantage of that hardware means a completely new programming paradigm of some sort. Languages like Erlang, structures like the Actor (which carries an operation between data instances), or systems built from many independent state machines (sketched below) might be a better approach than naively threading an application.
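
As a rough idea of what "many independent state machines" can look like inside the single-threaded loop above, here is a skeleton where each connection is one small machine that the loop advances a step at a time. The state names and fields are invented for illustration, and the step bodies are left as comments.

  #include <stddef.h>

  /* One connection == one small state machine, advanced by the single
     event loop whenever its socket is ready.  Names are illustrative. */
  enum conn_state { READING_REQUEST, PROCESSING, WRITING_REPLY, DONE };

  struct connection {
    int             fd;
    enum conn_state state;
    char            buf[4096];
    size_t          used;
  };

  /* Called from the event loop; never blocks, just moves the machine
     at most one step and returns. */
  void advance(struct connection *c)
  {
    switch( c->state ) {
    case READING_REQUEST:
      /* here: read whatever is available into buf without blocking;
         once a full request has arrived, set c->state = PROCESSING */
      break;
    case PROCESSING:
      /* here: do one bounded chunk of work;
         once the reply is ready, set c->state = WRITING_REPLY */
      break;
    case WRITING_REPLY:
      /* here: write as much as the socket will take without blocking;
         once everything is sent, set c->state = DONE */
      break;
    case DONE:
      break;
    }
  }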