Suppose you run out for lunch. One restaurant has 5 cashiers and 1 cook.[1] You place your order quickly, but then you wait almost your entire lunch break for your food. Another restaurant has 5 cooks and 1 cashier. Here you wait a long time to place your order, but then your food arrives very fast. This second restaurant is a single-queue system. It is stable before the lunch rush, when only one or two customers arrive at a time and the cashier can keep up. When 40 people arrive on a bus, in one large batch, the restaurant has a bottleneck, a traffic jam. Some customers leave rather than wait in a long queue because:

  1. The queue is visible to all involved.

  2. Waiting in a queue is not valuable to customers.

A queue is stable when it does not grow without bound over time. A single-server queue system is stable if the mean service time < mean inter-arrival time. On any road, the queue becomes unstable after a car crash: cars keep arriving faster than they can get through, so the queue of cars lengthens and lengthens.
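As a rough sketch of the stability rule above for a single cashier, the check below compares the two averages. The function names and the minute values are our own illustrative assumptions, not measurements.

```python
def is_stable(mean_service_time, mean_interarrival_time):
    """Return True when a single server keeps up with arrivals on average."""
    return mean_service_time < mean_interarrival_time


def utilization(mean_service_time, mean_interarrival_time):
    """Offered load on the server; values below 1.0 satisfy the stability rule."""
    return mean_service_time / mean_interarrival_time


# Hypothetical numbers: the cashier needs 2 minutes per order.
before_rush = is_stable(2.0, 5.0)  # one customer every 5 minutes
bus_arrives = is_stable(2.0, 0.5)  # one customer every 30 seconds
print(before_rush, bus_arrives)
```

With these assumed numbers the cashier keeps up before the rush and falls behind once the bus arrives.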

One store we went to had a long single queue with multiple cashiers, or servers. That went pretty fast. Wal-Mart stores tend to have multiple queues, each with a single server.

These are examples of queuing. On the highway we call it a traffic jam. In the restaurant we call it a frustratingly long line.

Queuing theory helps minimize customer time wasted or work item time wasted in queues or waiting lines, pending a server or a service.

Little’s Law [2]

Cycle time is work in process (WIP) divided by throughput.

Cycle Time = WIP / Throughput

Cycle time is an important metric in queuing theory. It is the sum of the queue time (spent waiting) and the service time (spent adding value by working on the item).
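To make the formula concrete, here is a small worked example. The board figures (12 learning objectives in process, 4 finished per week) are hypothetical, and `cycle_time` is just a name we picked for this sketch.

```python
def cycle_time(wip, throughput):
    """Average cycle time from Cycle Time = WIP / Throughput."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return wip / throughput


# Hypothetical board: 12 work items in process, 4 finished per week.
weeks_now = cycle_time(wip=12, throughput=4)

# Same throughput, but WIP capped at 8 items.
weeks_with_lower_wip = cycle_time(wip=8, throughput=4)

print(weeks_now, weeks_with_lower_wip)  # 3.0 2.0
```

At the same throughput, less work in process means each item spends less time in the system.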

For a given arrival rate, the time in the system is proportional to customer occupancy.

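Little’s Law tells us that the average number of customers [in line], L, is the effective arrival rate, λ, times the average time that a customer spends in the [line], T, or simply:[2]

L = λ T [2]

With L as the WIP, λ as the throughput (in a stable system, items leave at the same average rate they arrive), and T as the cycle time, this is the same relationship as Cycle Time = WIP / Throughput.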
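One way to convince yourself of Little’s Law is to check it on a small trace. The sketch below uses a made-up list of (arrival, departure) times for four customers, observed over a window that starts and ends with an empty line. It measures the average number present directly and compares it with the arrival rate times the average time spent. The function name and the data are assumptions for illustration only.

```python
import math


def average_in_system(visits, horizon):
    """Time-average number of customers present, found by sweeping events."""
    events = []
    for arrive, depart in visits:
        events.append((arrive, +1))
        events.append((depart, -1))
    events.sort()

    area = 0.0       # customer-time accumulated so far
    present = 0      # customers currently in the system
    last_time = 0.0
    for time, change in events:
        area += present * (time - last_time)
        present += change
        last_time = time
    return area / horizon


# Hypothetical trace: (arrival time, departure time) in minutes.
visits = [(0, 4), (1, 6), (3, 7), (8, 10)]
horizon = 10.0

measured_number = average_in_system(visits, horizon)
arrival_rate = len(visits) / horizon
average_time = sum(depart - arrive for arrive, depart in visits) / len(visits)

print(measured_number, arrival_rate * average_time)
print(math.isclose(measured_number, arrival_rate * average_time))
```

The two numbers agree because both are the same total customer-time, divided once by the length of the window and once by the number of customers.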

The ready queue contains a queue of work item cards ready to be started. It is a queue because the work item cards are "standing in line," with the first card at the top being the first serviced. When started, that card leaves the ready queue, and all those remaining in the queue will shift up.
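The ready queue described above can be sketched with a double-ended queue from the Python standard library. The card names are placeholders we made up.

```python
from collections import deque

# Hypothetical work item cards, first in line at the left.
ready = deque(["LO-1", "LO-2", "LO-3"])

# A new card joins at the back of the line.
ready.append("LO-4")

# Starting work takes the card at the front; the rest shift up.
started = ready.popleft()

print(started)      # LO-1
print(list(ready))  # ['LO-2', 'LO-3', 'LO-4']
```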

Bottlenecks happen at access points and result from overloads caused by high load. For learning experience development, the bottleneck of our system is the workstation or process step with the lowest throughput. The longest queue shows you your bottleneck. Read Goldratt’s Theory of Constraints for more detail on this. Because the other workstations or process steps complete their work at a higher rate, they push their completed items to that station, which cannot process them at the same speed. The bottleneck is the most important place to focus improvement attention because it dictates an upper limit for the throughput of the entire system.
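As a minimal sketch of that idea, the snippet below picks the process step with the lowest throughput and treats its rate as the ceiling for the whole system. The step names and rates (work items per week) are invented for the example.

```python
def find_bottleneck(throughput_by_step):
    """Return (step, rate) for the process step with the lowest throughput."""
    step = min(throughput_by_step, key=throughput_by_step.get)
    return step, throughput_by_step[step]


# Hypothetical process steps and their throughput in work items per week.
steps = {"Design": 6, "Develop": 3, "Review": 5}

bottleneck, system_limit = find_bottleneck(steps)
print(bottleneck, system_limit)  # Develop 3
```

Speeding up Design or Review in this made-up system would not raise the limit; only improving Develop would.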

Little’s Law tells us that measuring queue length is equivalent to measuring cycle time, as the two are proportional in a stable system.

Variability in rate of arrival or in service time increases the cycle time and the queue. Larger batches take longer to service, so moving to smaller, more similarly sized batches results in decreased cycle time. Large batches make queues grow, and by Little’s Law, increase cycle time.
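A small sketch can show the effect of batching on cycle time. It models one server with a fixed service time and compares eight items arriving evenly with the same eight items arriving in two batches of four. The arrival times and the service time are assumptions chosen to keep the arithmetic easy.

```python
def average_cycle_time(arrivals, service_time):
    """Average time from arrival to completion at a single server.

    Items are served one at a time in arrival order.
    """
    finish = 0.0
    total = 0.0
    for arrival in arrivals:
        start = max(arrival, finish)   # wait if the server is still busy
        finish = start + service_time
        total += finish - arrival      # queue time plus service time
    return total / len(arrivals)


# Same eight items over the same period, one time unit of service each.
even_arrivals = [0, 2, 4, 6, 8, 10, 12, 14]
batched_arrivals = [0, 0, 0, 0, 8, 8, 8, 8]

even_average = average_cycle_time(even_arrivals, 1.0)
batched_average = average_cycle_time(batched_arrivals, 1.0)
print(even_average, batched_average)  # 1.0 2.5
```

The server does the same amount of work in both cases; the extra time in the batched case is all queue time.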

To improve cycle times and reduce queues, target an even, stable rate of arrival, as restaurants do by offering discounts at certain times of the day. In LA, traffic lights on the on-ramps restrict how many cars can enter the highway at once, making the rate of arrival more stable.

Queuing service choices include:

  • FIFO: First In First Out (generally not used in Agile)

  • PQ: Priority Queuing (used in Agile)

  • Lean pull-based queue systems (used in Scrumban as a sub-method to priority queuing)

An example of priority queuing is airline check-in. Typically there is a first class queue and a standard queue. If you have a first class ticket, you are served ahead of the standard queue and wait little or not at all. So it is with queuing for Lean-Agile: you can give some work items higher priority than others.
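Priority queuing for work items can be sketched with Python’s `heapq` module. The class name, the card names, and the convention that a lower number means higher priority are our assumptions for this example. Cards with equal priority come out in the order they were added.

```python
import heapq
from itertools import count


class PriorityReadyQueue:
    """Ready queue that serves the highest-priority card first.

    Lower numbers mean higher priority; ties are served first in, first out.
    """

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves arrival order

    def add(self, priority, card):
        heapq.heappush(self._heap, (priority, next(self._order), card))

    def pull(self):
        _, _, card = heapq.heappop(self._heap)
        return card

    def __len__(self):
        return len(self._heap)


# Hypothetical cards: 1 is "first class", 2 is "standard".
ready = PriorityReadyQueue()
ready.add(2, "Standard LO A")
ready.add(1, "Urgent fix")
ready.add(2, "Standard LO B")

pulled = [ready.pull() for _ in range(len(ready))]
print(pulled)  # ['Urgent fix', 'Standard LO A', 'Standard LO B']
```

Note that this sketch only reorders cards that are still waiting; it does not interrupt a card that someone has already started.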

The throughput of a training courseware production system is the average number of finished products per unit time. Products here mean work items on the Kanban board, typically learning objectives.

The rate of arrival is the amount of work entering the system per unit of time, for example, the number of work items you accept for an iteration. To minimize cycle time, by Little’s Law, the rate of arrival should be as even as possible. This is why we reduce the batch size from Lessons to Learning Objectives.

How you choose to handle queues can have a large effect on the team’s performance. You may need to have service differentiation. Many Agile teams in software development use priority queuing, so the work item card at the top of the queue gets pulled or served first.

Queuing theory explains why WIP limits on our Kanban board help to reduce waiting waste and work item bottlenecks.

Handling and tracking partially finished larger batches costs quite a lot more than managing smaller batches of work items that are either not started (in a queue) or completely done. Leaning out these costs shortens development cycle time.

Reducing the batch size from one very large batch (course) to five or six iteration batches (increments of LOs) has large positive effects. This is why even early efforts at applying incremental methods usually cause significant improvement, despite not having the rest of Agile or Lean working yet.

Queue Summary

  • A queue or buffer between work stations (process steps) absorbs timing variation.

  • The queue itself should have a limit, so that if the queue fills up, the upstream producers will halt.

  • The queue allows for slack. Optimum flow means just enough slack.
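The limited queue in the summary above can be sketched with the standard library’s `queue.Queue`, which accepts a maximum size. The limit of 2 and the card names are hypothetical, and `try_hand_off` is a helper name we made up.

```python
import queue


def try_hand_off(buffer, item):
    """Try to place an item in the buffer; return False if it is full."""
    try:
        buffer.put_nowait(item)
        return True
    except queue.Full:
        return False


# Hypothetical buffer between two process steps, limited to 2 cards.
buffer = queue.Queue(maxsize=2)

results = [try_hand_off(buffer, card) for card in ["LO-1", "LO-2", "LO-3"]]
print(results)  # [True, True, False]

# Once the downstream step pulls a card, there is room again.
pulled = buffer.get_nowait()
accepted_after_pull = try_hand_off(buffer, "LO-3")
print(pulled, accepted_after_pull)  # LO-1 True
```

A `False` result is the signal for the upstream step to halt rather than pile up more work in front of the next station.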

1. This restaurant example is not our original idea, but we don’t remember who first mentioned it because we came across it early. The idea created a mental picture that stuck. We’re not sure who to attribute. If you know who said this first, let us know and we’ll attribute it appropriately.
2. Reproduced from Wikipedia article under a Creative Commons Attribution-ShareAlike 3.0 license.
