Ata, Baris and
Itai Gurvich. Forthcoming. On Optimality Gaps in the Halfin-Whitt Regime.
Annals of Applied Probability.
We consider optimal control of a multi-class queue in the Halfin-Whitt regime, and revisit the notion of asymptotic optimality and the associated optimality gaps. The existing results in the literature for such systems provide asymptotically optimal controls with optimality gaps of o(√n), where n is the system size, e.g., the number of servers. We construct a sequence of asymptotically optimal controls where the optimality gap grows logarithmically with the system size. Our analysis relies on a sequence of Brownian control problems, whose refined structure helps us achieve the improved optimality gaps.

Gurvich, Itai and Ohad Perry. Forthcoming. Overflow Networks: Approximations and Implications to Call Center Outsourcing.
Operations Research.
Motivated by call center co-sourcing problems, we consider a service network operated under an overflow mechanism.
Calls are first routed to an in-house (or dedicated) service station that has a finite waiting room. If the waiting room is
full, the call is overflowed to an outside provider (an overflow station) that might also be serving overflows from other
stations. We establish approximations for overflow networks with many servers under a resource-pooling assumption
which stipulates, in our context, that the fraction of overflowed calls is non-negligible. Our two main results are (i) an
approximation for the overflow processes via limit theorems and (ii) asymptotic independence between each of the in-house
stations and the overflow station. In particular, we show that, as the system becomes large, the dependency between
each in-house station and the overflow station becomes negligible. Independence between stations in overflow networks
is assumed in the literature on call centers, and we provide rigorous support for those useful heuristics.

Allon, Gad,
Achal Bassamboo and
Itai Gurvich. Forthcoming. "We Will be Right with You:" Managing Customer Expectations with Vague Promises and Cheap Talk.
Operations Research.
Delay announcements informing customers about anticipated service delays are prevalent in service-oriented systems. How delay announcements can influence customers in service systems is a complex problem which depends on both the dynamics of the underlying queueing system and on the customers’ strategic behavior. We examine this problem of information communication by considering a model in which both the firm and the customers act strategically: the firm in choosing its delay announcement while anticipating customer response, and the customers in interpreting these announcements and in making the decision about when to join the system and when to balk. We characterize the equilibrium language that emerges between the service provider and her customers. The analysis of the emerging equilibria provides new and interesting insights into customer-firm information sharing. We show that even though the information provided to customers is non-verifiable, it improves the profits of the firm and the expected utility of the customers. The robustness of the results is illustrated via various extensions of the model. In particular, studying models with incomplete information on the system parameters allows us also to highlight the role of information provision in managing customer expectations regarding the congestion in the system. Further, the information could be as simple as “High Congestion”/“Low Congestion” announcements, or could be as detailed as the true state of the system. We also show that firms may choose to shade some of the truth by using intentional vagueness to lure customers.

Deo, Sarang and
Itai Gurvich. 2011. Centralized vs. Decentralized Ambulance Diversion: A Network Perspective.
Management Science. 57(7): 1300-1319.
In recent years, growth in the demand for emergency medical services along with decline in the number of hospitals with emergency departments (EDs) has led to overcrowding. In periods of overcrowding, an ED can request the Emergency Medical Services (EMS) agency to divert incoming ambulances to neighboring hospitals, a phenomenon known as “ambulance diversion”. The EMS agency will accept this request provided that at least one of the neighboring EDs is not on diversion. From an operations perspective, properly executed ambulance diversion should result in resource pooling and reduce the overcrowding and delays in a network of EDs. Recent evidence indicates, however, that this potential benefit is not always realized. In this paper, we provide one potential explanation for this discrepancy and suggest potential remedies. Using a queueing game between two EDs that aim to minimize their own waiting time, we find that decentralized decisions regarding diversion explain the lack of pooling benefits. Specifically, we find the existence of a defensive equilibrium, wherein each ED does not accept diverted ambulances from the other ED. This defensiveness results in a de-pooling of the network and, in turn, in delays that are significantly higher than when a social planner coordinates diversion. The social optimum is, itself, difficult to characterize analytically and has limited practical appeal as it depends on problem parameters such as arrival rates and length of stay. Instead, we identify an alternative solution that is more amenable to implementation and can be used by the EMS agencies to coordinate diversion decisions even without the exact knowledge of these parameters. We show that this solution is approximately optimal for the social planner’s problem. Moreover, it is Pareto improving over the defensive equilibrium whereas the social optimum, in general, might not be.

Gurvich, Itai, James Luedtke and Tolga Tezcan. 2010. Staffing Call-Centers With Uncertain Demand Forecasts: A Chance-Constraints Approach.
Management Science. 56(7): 1093-1115.
We consider the problem of staffing call-centers with multiple customer classes and agent types operating under quality-of-service (QoS) constraints and demand rate uncertainty. We introduce a formulation of the staffing problem that requires that the QoS constraints are met with high probability with respect to the uncertainty in the demand rate. We contrast this chance-constrained formulation with the average-performance constraints that have been used so far in the literature. We then propose a two-step solution for the staffing problem under chance constraints. In the first step, we introduce a Random Static Planning Problem (RSPP) and discuss how it can be solved using two different methods. The RSPP provides us with a first-order (or fluid) approximation for the true optimal staffing levels and a staffing frontier. In the second step, we solve a finite number of staffing problems with known arrival rates–the arrival rates on the optimal staffing frontier. Hence, our formulation and solution approach has the important property that it translates the problem with uncertain demand rates to one with known arrival rates. The output of our procedure is a solution that is feasible with respect to the chance constraint and nearly optimal for large call centers.

Gurvich, Itai and Ward Whitt. 2010. Service-Level Differentiation in Many-Server Service System Via Queue-Ratio Routing.
Operations Research. 58(2): 316-328.
Motivated by telephone call centers, we study large-scale service systems with multiple customer classes and multiple agent pools, each with many agents. To minimize staffing costs subject to service-level constraints, where we delicately balance the service levels (SLs) of the different classes, we propose a family of routing rules called Fixed-Queue-Ratio (FQR) rules. With FQR, a newly available agent next serves the customer from the head of the queue of the class (from among those he is eligible to serve) whose queue length most exceeds a specified proportion of the total queue length. The proportions can be set to achieve desired SL targets. The FQR rule achieves an important state-space collapse (SSC) as the total arrival rate increases, in which the individual queue lengths evolve as fixed proportions of the total queue length. In the current paper we consider a variety of service-level types and exploit SSC to construct asymptotically optimal solutions for the staffing-and-routing problem. The key assumption in the current paper is that the service rates depend only on the agent pool.
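The FQR selection step described in this abstract can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the authors' implementation: the function name `fqr_next_class`, the restriction to non-empty queues, and the lowest-index tie-breaking are hypothetical choices that the abstract does not specify.

```python
from typing import Iterable, Optional, Sequence


def fqr_next_class(
    queue_lengths: Sequence[int],
    proportions: Sequence[float],
    eligible: Iterable[int],
) -> Optional[int]:
    """Pick the class a newly available agent serves next under an FQR-style rule.

    queue_lengths[i] is the current queue length of class i, proportions[i] is
    the target share of the total queue for class i, and eligible lists the
    classes this agent may serve. Returns None when no eligible class has a
    waiting customer (assumption: the agent then stays idle).
    """
    total = sum(queue_lengths)
    # Assumption: only classes with someone waiting are candidates.
    candidates = [i for i in eligible if queue_lengths[i] > 0]
    if not candidates:
        return None
    # Choose the class whose queue most exceeds its target share of the total.
    # Assumption: ties are broken in favor of the lowest class index.
    return max(
        candidates,
        key=lambda i: (queue_lengths[i] - proportions[i] * total, -i),
    )
```

For example, with queue lengths (5, 4, 1) and proportions (0.5, 0.3, 0.2), the excesses over the target shares are 0, 1, and -1, so an agent eligible for all three classes would serve class 1 in this sketch.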

Allon, Gad and
Itai Gurvich. 2010. Pricing and Dimensioning Competing Large-Scale Service Providers.
Manufacturing and Service Operations Management. 12(3): 449-469.
The literature on many-server approximations provides significant simplifications towards the optimal capacity sizing of large-scale monopolists but falls short of providing similar simplifications for a competitive setting in which each firm’s decision is affected by its competitors’ actions. In this paper, we introduce a framework that combines many-server heavy-traffic analysis with the notion of epsilon-Nash equilibrium and apply it to the study of equilibria in a market with multiple large-scale service providers that compete on both prices and response times. In an analogy to fluid and diffusion approximations for queueing systems, we introduce the notions of fluid game and diffusion game. The proposed framework allows us to provide first-order and second-order characterization results for the equilibria in these markets. We use our results to provide insights into the price and service-level choices in the market and, in particular, into the impact of the market scale on the interdependence between these two strategic decisions.

Gurvich, Itai, Mor Armony and Constantinos Maglaras. 2009. Cross-Selling in a Call Center with a Heterogeneous Customer Population.
Operations Research. 57(2): 299-313.
Cross-selling is becoming an increasingly prevalent practice in call centers, due, in part, to its unique capability to allow firms to dynamically segment their callers and customize their product offerings accordingly. This paper considers a call center with cross-selling capability that serves a pool of customers that are differentiated in terms of their revenue potential and delay sensitivity. It studies the operational decisions of staffing, call routing, and cross-selling under various forms of customer segmentation. It derives near-optimal controls in each of the settings analyzed, and characterizes the impact of a more refined customer segmentation on the structure of these policies and the center’s profitability.

Gurvich, Itai and Ward Whitt. 2009. Queue-and-Idleness-Ratio Controls in Many-Server Service Systems.
Mathematics of Operations Research. 34(2): 363-396.
Motivated by call centers, we study large-scale service systems with multiple customer classes and multiple agent pools, each with many agents. We propose a family of routing rules called Queue-and-Idleness-Ratio (QIR) rules. A newly available agent next serves the customer from the head of the queue of the class (from among those he is eligible to serve) whose queue length most exceeds a specified state-dependent proportion of the total queue length. An arriving customer is routed to the agent pool whose idleness most exceeds a specified state-dependent proportion of the total idleness. We identify regularity conditions on the network structure and system parameters under which QIR produces an important state-space collapse (SSC) result in the Quality-and-Efficiency-Driven (QED) many-server heavy-traffic limiting regime. The SSC result is applied here to prove stochastic-process limits and in subsequent papers to solve important staffing and control problems for large-scale service systems.

Gurvich, Itai and Ward Whitt. 2009. Scheduling Flexible Servers with Convex Delay Costs in Many-Server Service Systems.
Manufacturing and Service Operations Management. 11(2): 237-253.
In a recent paper we introduced the fixed-queue-ratio (FQR) family of routing rules for many-server service systems with multiple customer classes and server pools. A newly available server next serves the customer from the head of the queue of the class (from among those he is eligible to serve) whose queue length most exceeds a specified proportion of the total queue length. Under fairly general conditions, FQR produces an important state-space collapse as the total arrival rate and the numbers of servers increase in a coordinated way. That state-space collapse was previously used to delicately balance service levels for the different customer classes. In this sequel, we show that a special version of FQR stochastically minimizes convex holding costs in a finite-horizon setting when the service rates are restricted to be pool-dependent. Under additional regularity conditions, the special version of FQR reduces to a simple policy: Linear costs produce a priority-type rule, in which the least-cost customers are given low priority. Strictly convex costs (plus other regularity conditions) produce a many-server analogue of the generalized-cμ (Gcμ) rule, under which a newly available server selects a customer from the class experiencing the greatest marginal cost at that time.

Gurvich, Itai and Mor Armony. 2010. When Promotions Meet Operations: Cross Selling and Its Effect on Call-Center Performance.
Manufacturing and Service Operations Management. 12(3): 470-488.
We study cross-selling operations in call centers. The following question is addressed: How many customer-service representatives are required (staffing) and when should cross-selling opportunities be exercised (control) in a way that will maximize the expected profit of the center while maintaining a pre-specified service level target? We tackle this question by characterizing control and staffing schemes that are asymptotically optimal in the limit, as the system load grows large. Our main finding is that a threshold priority (TP) control, in which cross-selling is exercised only if the number of callers in the system is below a certain threshold, is asymptotically optimal in great generality. The asymptotic optimality of TP reduces the staffing problem to a solution of a simple deterministic problem, in one regime, and to a simple search procedure in another. We show that our joint staffing and control scheme is nearly optimal for large systems. Furthermore, it performs extremely well even for relatively small systems.

Armony, Mor,
Itai Gurvich and Avishai Mandelbaum. 2008. Service Level Differentiation in Call Centers with Fully Flexible Servers.
Management Science. 54(2): 279-294.
We study large-scale service systems with multiple customer classes and many statistically identical servers. The following question is addressed: How many servers are required (staffing) and how does one match them with customers (control) in order to minimize staffing cost, subject to class level quality of service constraints? We tackle this question by characterizing scheduling and staffing schemes that are asymptotically optimal in the limit, as system load grows to infinity. The asymptotic regimes considered are consistent with the Efficiency Driven (ED), Quality Driven (QD) and Quality and Efficiency Driven (QED) regimes, first introduced in the context of a single class service system.
Our main findings are: a) Decoupling of staffing and control, namely (i) Staffing disregards the multi-class nature of the system and is analogous to the staffing of a single class system with the same aggregate demand and a single global quality of service constraint, and (ii) Class level service differentiation is obtained by using a simple Idle server based Threshold-Priority (ITP) control (with state-independent thresholds), b) Robustness of the staffing and control rules: Our proposed Single-Class Staffing (SCS) rule and ITP control are approximately optimal under various problem formulations and model assumptions. Particularly, although our solution is shown to be asymptotically optimal for large systems, we numerically demonstrate that it performs well also for relatively small systems.
