The scheduling success story that is perhaps the easiest to explain is that of web servers. If we consider a web server that serves primarily static requests, its operation is very simple at a high level. Static requests are typically of the form “get me a file.” To fulfill such a request, the web server must retrieve the file and then send it over the outgoing link. Typically the bandwidth of the outgoing link is the bottleneck resource, since purchasing more bandwidth is much more expensive than upgrading the disks or CPUs of the web server [143, 59]. Even a modest web server
can saturate a T3 or 100 Mbps Ethernet connection. Thus, much of the delay experienced by requests for files is a result of queueing for bandwidth.
In standard web server designs, such as the Apache [228] and Flash [169] servers, bandwidth is allocated by cycling through the queued files, giving each a small slice of service. Specifically, each connection between a client and the web server has a corresponding socket buffer into which the web server writes the contents of the requested file. The sockets are then drained in a cyclic manner, where a handful of packets from each socket are sent before moving to the next socket. This behavior is typically modeled using the Processor Sharing (PS) scheduling policy, which gives an equal share of the service capacity to each job in the queue at all times.
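To make the cyclic draining concrete, the following is a minimal sketch in Python. The function name, the packet-count abstraction, and the fixed per-socket quantum are illustrative assumptions, not the actual Apache or Flash implementation.

```python
from collections import deque

def drain_cyclic(pending, quantum=3):
    """Cyclic (round-robin) draining of per-connection socket buffers,
    the behavior modeled by Processor Sharing (PS).  `pending` maps a
    connection id to the number of packets left to send; `quantum` is
    the handful of packets sent per socket before moving on.
    All names here are illustrative."""
    sockets = deque(pending.items())
    finished = {}  # connection id -> time its last packet was sent
    clock = 0      # one time unit per packet on the outgoing link
    while sockets:
        conn, remaining = sockets.popleft()
        sent = min(quantum, remaining)
        clock += sent
        if remaining > sent:
            sockets.append((conn, remaining - sent))  # back of the cycle
        else:
            finished[conn] = clock
    return finished

# Three requests of 2, 10, and 5 packets respectively.
print(drain_cyclic({"a": 2, "b": 10, "c": 5}))  # {'a': 2, 'c': 13, 'b': 17}
```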
Now comes the success story. Harchol-Balter et al. [97] recently achieved dramatic reductions in user response times at static web servers simply by adjusting this scheduling policy. They modified the order in which the sockets are drained in order to implement a version of SRPT and found that not only were response times much smaller [97], but also that performance in periods of overload improved [204] and the response times of large files did not suffer as a result of the bias towards small files [26].
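Under the same illustrative packet-count abstraction as above, an SRPT-style drain replaces the cycle with a priority rule: always send from the connection with the least work remaining. This is only a sketch of the idea, not the kernel-level modification of [97].

```python
import heapq

def drain_srpt(pending, quantum=3):
    """SRPT-style draining: always send the next quantum from the
    connection with the fewest packets remaining.  Same illustrative
    abstraction as drain_cyclic above."""
    heap = [(remaining, conn) for conn, remaining in pending.items()]
    heapq.heapify(heap)
    finished = {}
    clock = 0
    while heap:
        remaining, conn = heapq.heappop(heap)
        sent = min(quantum, remaining)
        clock += sent
        if remaining > sent:
            heapq.heappush(heap, (remaining - sent, conn))
        else:
            finished[conn] = clock
    return finished

print(drain_srpt({"a": 2, "b": 10, "c": 5}))  # {'a': 2, 'c': 7, 'b': 17}
```

In this toy example the mid-sized request finishes at time 7 under SRPT rather than 13 under cyclic draining, while the largest request still completes at time 17, mirroring the finding that large files need not suffer from the bias towards small files.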