What makes queues appear, how do they behave, and when do they go away?
Note
This is applicable to many fields: industrial design, call centres, computers executing transactions
- Given a choice between a single machine with speed s, or n machines each with speed s/n, which should we choose?
- If the arrival rate and service rate both double, how does the mean response time change?
- Should we try to balance load, or is that a waste of time and effort?
- Can we give priority to certain operations without harming another category of job?
- How do job size variability and heavy-tailed workloads affect our choice of scheduling policy?
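For the simplest model (an M/M/1 queue: Poisson arrivals, exponential service, one server — an assumption of mine, not something these notes prescribe), mean response time has a closed form, which lets us answer the doubling question numerically:

```python
# M/M/1 mean response time: E[R] = 1 / (mu - lam), valid only when lam < mu.
def mm1_response_time(lam: float, mu: float) -> float:
    if lam >= mu:
        raise ValueError("unstable: arrival rate must be below service rate")
    return 1.0 / (mu - lam)

base = mm1_response_time(lam=3.0, mu=4.0)     # 1 / (4 - 3) = 1.0
doubled = mm1_response_time(lam=6.0, mu=8.0)  # 1 / (8 - 6) = 0.5
# Doubling both rates halves the mean response time in this model.
```

Under this model, scaling both rates by k divides the mean response time by k, since E[R] = 1/(kμ − kλ) = (1/k)·1/(μ − λ).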
Formal terms
- Server (S): The thing that fulfills requests
- Customer (V): The initiator of the service requests
- Wait time (W): The time a customer spends waiting in line
- Service time (s): The time from when a worker starts to serve a customer until the next customer is called forward
- Arrival rate (λ): The rate at which customers arrive
- Service rate (μ): The rate at which customers are served
- Utilization (U): The fraction of the worker's time spent actually handling requests (not idling)
- Queue length (N): The total number of customers waiting or currently with a worker
- Response time (R): The sum of the wait time and service time for a single visit
- Residence time (R'): The total response time when a customer visits several workers, or the same one multiple times
- Throughput (X): The rate at which customers get their requests serviced
- E[·]: The expected value of some other quantity in the system, e.g. E[S] is the mean service time
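These quantities are tied together by standard identities — U = λ/μ for a single server, and Little's Law, N = λ·R — which the following sketch applies to made-up numbers:

```python
lam = 8.0   # arrival rate, customers/sec (hypothetical)
mu = 10.0   # service rate, customers/sec (hypothetical)
R = 0.5     # measured mean response time, sec (hypothetical)

U = lam / mu   # utilization of a single server: fraction of time busy
N = lam * R    # Little's Law: mean number of customers in the system
print(U, N)    # 0.8 4.0
```

Little's Law holds for any stable system regardless of the arrival or service distribution, which makes it a handy sanity check on measurements.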
Computer Context
The CPU uses a time-sharing scheduler to run many programs concurrently. A router has a bounded queue for packets; if the bound is exceeded, packets are dropped. Queueing theory gives us a formal framework to analyze these problems instead of guessing. See Simple Scaling.
Closed vs Open Systems
- In an open system, customers arrive from the outside and depart permanently after being served. This is like requests hitting a web server
- In a closed system, there is a fixed population of N customers: there are always N in the system, and the same customers cycle through repeatedly. This is like a database connection pool.
We typically require that λ ≤ μ and assume that the system is stable. These values are averages, so we may temporarily fall behind a bit and make up for it later. We do need to make up for any shortfall, though, otherwise we become overloaded.
Let t denote time, N(t) the number of jobs in the system at time t, A(t) the number of arrivals by time t, and D(t) the number of departures by time t, so N(t) = A(t) − D(t).
If arrivals persistently exceed departures, A(t) − D(t) grows without bound and N(t) also goes to infinity.
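A minimal discrete-event sketch (my own illustration, assuming Poisson arrivals and exponential service at a single FIFO server) makes N(t) = A(t) − D(t) concrete: with λ < μ the backlog stays small, while λ > μ makes it grow without bound.

```python
import random

def simulate(lam: float, mu: float, horizon: float, seed: int = 0) -> int:
    """Single FIFO server; returns N(horizon) = arrivals - departures by horizon."""
    rng = random.Random(seed)
    t, arrivals, departures = 0.0, 0, 0
    server_free_at = 0.0  # time the server next becomes free
    while True:
        t += rng.expovariate(lam)          # next arrival (exponential gap)
        if t > horizon:
            break
        arrivals += 1
        start = max(t, server_free_at)     # wait if the server is busy
        server_free_at = start + rng.expovariate(mu)
        if server_free_at <= horizon:
            departures += 1                # job finished within the horizon
    return arrivals - departures

print(simulate(lam=5, mu=10, horizon=1000))   # stable: backlog stays small
print(simulate(lam=10, mu=5, horizon=1000))   # unstable: roughly (lam - mu) * horizon
```

The exact numbers depend on the random seed, but the stable/unstable contrast is robust.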
We typically want to raise μ. This does not necessarily improve X, since there may simply be no work to do; it only increases the maximum potential throughput.
Note that in a closed system, we are always at capacity, since there is always more work to do: as soon as one customer finishes, another enters the queue. In this case, μ is the controlling factor, and the throughput is exactly the service rate: X = μ.
It is advantageous to know, given the arrival rate λ, the mean job size E[S]. The open-system strategy is to ramp up λ; once the completion rate levels off, we have hit the limit and have X = μ = 1/E[S]. For the closed system, we keep it totally busy in a stress test and directly measure jobs completed per second, giving us μ directly.
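In an open system the steady-state completion rate is X = min(λ, μ), so the ramp-up strategy amounts to watching for the plateau. A toy sketch of that ramp (μ here is a made-up capacity we pretend not to know):

```python
mu = 50.0  # hypothetical unknown capacity we want to discover

def steady_state_throughput(lam: float) -> float:
    # Open system: completions track arrivals until capacity is hit.
    return min(lam, mu)

for lam in (10, 30, 50, 70, 90):
    print(lam, steady_state_throughput(lam))
# X climbs with lam, then levels off at 50: that plateau is mu,
# and the mean job size is E[S] = 1 / mu = 0.02.
```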
Problems when Queues Get Long
- Balking: Don’t get in line at all
- Reneging: Get in line, then leave
- Loss: No capacity, request to enqueue refused (a loss system)
Customers estimate their wait time and compare it against their willingness to wait.
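One way to model balking (a sketch of my own, not from these notes): each arriving customer estimates the wait as queue length divided by the service rate, draws a random patience, and joins only if the estimate is within it.

```python
import random

def balking_sim(lam: float, mu: float, mean_patience: float,
                horizon: float, seed: int = 0):
    """Single FIFO server with balking; returns (joined, balked) counts."""
    rng = random.Random(seed)
    t, joined, balked = 0.0, 0, 0
    completions = []      # scheduled departure times of jobs still in the system
    server_free_at = 0.0
    while True:
        t += rng.expovariate(lam)
        if t > horizon:
            break
        completions = [c for c in completions if c > t]  # drop finished jobs
        estimated_wait = len(completions) / mu           # naive wait estimate
        patience = rng.expovariate(1.0 / mean_patience)  # this customer's limit
        if estimated_wait > patience:
            balked += 1                                  # doesn't get in line
            continue
        joined += 1
        start = max(t, server_free_at)
        server_free_at = start + rng.expovariate(mu)
        completions.append(server_free_at)
    return joined, balked

print(balking_sim(lam=12, mu=10, mean_patience=0.3, horizon=500))
```

Note that balking self-regulates the queue: even with λ > μ the line stays finite, because the effective arrival rate falls as the queue grows.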
Note
What if people can queue ahead?
Priorities
A queueing system may grant some customers priority over others.
- How much if any, does giving priority to one group over another help the group being given priority?
- How much, if any, does giving priority to one group disadvantage the group not being given priority?
- Does this incentivize people to choose less popular options?
- If everyone has priority, no one does
Complexity
Queues become complex when we consider agents that operate independently.
- Different arrival and departure times
- Different willingness to wait
- Requests are time bounded
- Rate limiting
- Different services that have independent service rates
- Services have fixed maximum capacities
Simulation Configuration
Because of this complexity, we typically need to simulate queue behaviour. We can configure:
User Types
- How long users stay
- When they balk
- What they want to do
Systems
- No priority
- Priority
- Better priority? (Virtual queueing where you can do other stuff)
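Those knobs can be captured in a small configuration sketch (the field and policy names here are my own, purely illustrative):

```python
from dataclasses import dataclass

@dataclass
class UserType:
    name: str
    arrival_rate: float       # customers per unit time
    mean_patience: float      # how long they wait before reneging
    mean_service_time: float  # what they want to do, in time units

@dataclass
class SystemConfig:
    policy: str               # "fifo", "priority", or "virtual" (queue-while-away)
    servers: int
    queue_capacity: int       # requests beyond this are lost

config = SystemConfig(policy="priority", servers=2, queue_capacity=50)
users = [
    UserType("regular", arrival_rate=8.0, mean_patience=2.0, mean_service_time=0.2),
    UserType("vip",     arrival_rate=1.0, mean_patience=5.0, mean_service_time=0.2),
]
```

Sweeping a simulation over these parameters is how we would answer the priority questions above: run the same user mix under "fifo" and "priority" and compare each group's wait times.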
Summary
Knowledge leads to a huge inequality between participants, and complexity allows for exploitation.