INFORMS 2021 Program Book

First page Previous page 4 Next page Last page

INFORMS Anaheim 2021

SB10

2 - The Value of Knowing Drivers’ Opportunity Cost in Ride Sharing Systems

3 - Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems Yuwei Luo, Stanford University, Stanford, CA, United States, Varun Gupta, Mladen Kolar We consider the problem of controlling an LQR system over a finite horizon T with fixed and known cost matrices Q, R, but unknown and non- stationary dynamics {At, Bt}. The sequence of dynamics matrices can be arbitrary, but with a total variation, VT, assumed to be o(T) and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all t, we present an algorithm that achieves the optimal dynamic regret of O(VT2/5T3/5). With piece-wise constant dynamics, our algorithm achieves the optimal regret of O( √ ST) where S is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection and restart approach developed for contextual multi-armed bandit problems. We argue that non-adaptive restart or static window size based approaches may not be regret optimal for the LQR problem. SB11 CC Room 304C In Person: Matching Markets General Session Chair: Sasa Pekec, Duke University, Durham, NC, United States 1 - Rank Dominance of Tie-Breaking Rules Maxwell Allman, Stanford University, Stanford, CA, United States, Itai Ashlagi, Afshin Nikzad : In many settings where scarce resources must be rationed, agents have given priorities for the resources and lotteries are used to break ties amongst agents with equal priority. Two commonly used and simple tie-breaking rules are Single Tie-Breaking (STB), where a common lottery is used to break ties for all resources, and Multiple Tie-Breaking (MTB), where an independent lottery is used to break ties for each individual resource. We show that under a multinomial-logit (MNL) choice model, if the resources are sufficiently over- demanded then STB dominates MTB in the sense that agents with any preferences prefer STB ex-ante. Furthermore, we show that under a nested-MNL choice model with multiple resource types, a hybrid tie-breaking rule that uses a common lottery amongst over-demanded types will dominate MTB. 2 - Search Approximates Optimal Matching We consider matching settings where agents are long-lived, match repeatedly, and have heterogeneous, unknown, but persistent preferences. Match compatibility is probabilistic, is realized the first time agents are matched, and persists in the future. We show that a decentralized stable matching process gives a constant- factor approximation to the optimal online matching. Specifically, stable matching provides a 0.316-approximation to the optimal online algorithm for matching on general graphs, a $1/7$-approximation for many-to-one matching, a $1/11$- approximation for capacitated matching, and a $1/2k$ approximation for forming teams of size $k$. Our results rely on a novel coupling argument that decomposes the successful edges of the optimal online algorithm in terms of their round-by- round comparison with stable matching. 3 - Matching Costs in Centralized and Decentralized Markets Naomi Utgoff, USNA, Annapolis, MD, United States I explore the relationship between payments in a static matching mechanism and the opportunity cost of singlehood in a decentralized search and matching model. A number of auction-like matching mechanisms exist in which a central matchmaker announces a payment rule which incentivizes participants to reveal private information to the matchmaker, who in turn matches participants efficiently. (See Hoppe, Moldovanu and Sela, 2009; Johnson, 2013; Utgoff, 2020). A common criticism of these centralized markets is that matching outside the mechanism in a decentralized setting may be preferable to avoid paying the matchmaker. Existing results supporting this criticism disregard the cost of time and optimal stopping in a decentralized search and matching model. I offer a preliminary comparison of the two and suggest that the high cost of static mechanisms is not necessarily prohibitive. Mobin Y. Jeloudar, Stanford University, Stanford, CA, United States, Irene Y. Lo, Tristan Pollner, Amin Saberi

Ran I. Snitkovsky, Columbia Business School, New York, NY, United States, SRIBD, Shenzhen, China, Costis Maglaras, Jim Dai We consider a ride sharing platform, and a large population of strategic potential drivers, heterogeneous in terms of their opportunity costs, who choose whether or not to work for the platform. The platform is endowed with knowledge about the different drivers’ opportunity costs. How can the platform implement a matching policy that uses this knowledge in order to improve system efficiency? Can such improvement be quantified? We introduce an analytically-tractable mean field model and show that by integrating knowledge about drivers’ opportunity costs in its matching policy, the platform can perform up to two times more efficiently than when not doing so. 3 - Learning the Scheduling Policy in Time-varying Multiclass Many Server Queues Yueyang Zhong, The University of Chicago Booth School of Business, Chicago, IL, United States, John R. Birge, Amy R. Ward We consider a scheduling problem with minimizing the long-run average abandonment and holding costs as objective, in a time-varying multiclass Mt/M/N+M queueing system, when the model parameters (arrival, service and reneging rates) are a priori unknown. We evaluate the performance by means of regret against the benchmark asymptotically optimal c / rule with parameter knowledge.We propose a Learn-Then-Schedule algorithm over T periods, which is composed of a learning phase where maximum likelihood estimators of the parameters are formed, and an exploitation phase where an empirically learned c / rule is followed. We show that the smallest regret for static priority policies is O(log T), and that our algorithm achieves a regret upper bound of O(log T), which matches the lower bound. We extend the analysis to time-homogeneous multiclass GI/M/N+GI queues. SB10 CC Room 304B In Person: Stochastic Online Optimization General Session Chair: Jiashuo Jiang, New York University, New York, NY, 10012-1106, United States 1 - Dynamic Matching: Characterizing and Achieving Constant Regret Süleyman Kerimov, Stanford University, Stanford, CA, United States, Itai Ashlagi, Itai Gurvich We study how to optimally match agents in a dynamic market with heterogeneous match values. A network topology determines the feasible matches in the market. We consider networks that are two-sided when all matches include two agents, or acyclic otherwise. An inherent trade-off arises between generating short- and long-term value.We find that when the network satisfies a general position condition, this trade-off is limited, and a simple periodic clearing policy (nearly) maximizes the total value simultaneously at all times. Central to our results is the general position gap, , which quantifies the stability or the imbalance in the network. No policy can achieve a regret that is lower than the order of 1/ at all times. This lower bound is achieved by a policy, which periodically resolves a natural LP, provided that the delay between periods is of the order of 1/ . 2 - Online Stochastic Optimization under Wasserstein-based Non-stationarity Jiashuo Jiang, New York University We consider a general online stochastic optimization problem with multiple budget constraints. In each time period, a reward function and multiple cost functions are drawn from an non-stationary unknown distribution, and the decision maker needs to specify an action. The objective of the decision maker is to maximize the total reward subject to the budget constraints. In this paper, we consider a data-driven setting where the true distribution is unknown but a prior estimate (possibly inaccurate) is available. We propose a Wasserstein-distance based measure to quantify the inaccuracy of the prior estimate. We propose a new algorithm, which takes a primal-dual perspective and integrates the prior information of the underlying distributions into an online gradient descent procedure in the dual space. We show the corresponding algorithm achieves a regret of optimal order.

Made with FlippingBook Online newsletter creator