INFORMS 2021 Program Book
INFORMS Anaheim 2021
MB34
2 - Dynamic Exploration and Exploitation: The Case of Online Lending
MB31 CC Room 208A In Person: Machine Learning and Discrete Optimization General Session Chair: Bistra Dilkina, University of Southern California, Los Angeles, CA, 90089, United States 1 - Learning to Generate Graphs End-to-end with Combinatorial Objectives Haoming Li, University of Southern California, Los Angeles, CA, United States, Aaron Ferber, Bistra Dilkina With applications in drug discovery, social network modeling and algorithm benchmarking, graph generation considers learning and generating synthetic graphs that effectively mirror those of the real-world graphs in terms of graph statistics. In this work, we focus on graph generation models optimized to match specific combinatorial graph properties, e.g. to minimize the distance between the maximum modularity solutions in the given graphs and those in the generated graphs. We propose to train a graph generator that explicitly integrates the specific combinatorial metrics in a differentiable manner. Our end-to-end learning approach, combining convolutional neural network and differentiable optimization, demonstrates promising performance on generating graphs according to various combinatorial properties. 2 - Toward a Unified Framework for Branch and Bound and Reinforcement Learning Mohammad Hesam Shaelaie, Lehigh University, Bethlehem, PA, United States, Ted K. Ralphs, Lawrence V. Snyder In this research, we are aiming to develop a framework unifying Branch and Bound (BB) and Reinforcement Learning (RL). Each of these algorithmic approaches has its own strengths and weakness. Our effort is to design a unified framework to benefit from the strengths while addressing the weakness of each algorithm. BB searches the feasible region systematically and employs powerful pruning methods, but this comes at a high cost per iteration. In contrast, RL has a relatively low iteration cost but is also quite myopic. We explore the tradeoffs inherent in these two approaches and the possibilities for developing a unified framework. 3 - Efficient Active Search for Combinatorial Optimization Problems Kevin Tierney, Bielefeld University, Bielefeld, 33615, Germany, André Hottung, Yeong-Dae Kwon Deep learning-based machine learning approaches have recently seen increasing success at solving difficult combinatorial optimization problems. Combining deep networks with effective search procedures has proven critical to finding good solutions. We propose three generic search strategies that can be combined with many neural network models extending the active search approach proposed in Bello et al. (2016). Our search strategies involve updating network weights or output probabilities while searching for a solution, thus allowing deep learning approaches to better generalize to problems they have not been trained on. Our approach offers significant performance improvements on two routing problems and the job shop scheduling problem. MB32 CC Room 208B In Person: Learning and Data-driven Algorithms General Session Chair: Ningyuan Chen, University of Toronto, Mississauga, ON, L5L 1C6, Canada 1 - Corruption-robust Exploration in Episodic Reinforcement Learning Thodoris Lykouris, Assistant Professor, Massachusets Institute of Technology, Cambridge, MA, 10011-2014, United States, Max Simchowitz, Aleksandrs Slivkins, Wen Sun We initiate the study of episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system extending recent results for the special case of multi-armed bandits. We provide a framework which modifies the aggressive exploration enjoyed by existing reinforcement learning approaches based on “optimism in the face of uncertainty”, by complementing them with principles from “action elimination”. Importantly, our framework circumvents the major challenges posed by naively applying action elimination in the RL setting, as formalized by a lower bound we demonstrate. It yields efficient algorithms which (a) attain near-optimal regret in the absence of corruptions and (b) adapt to unknown levels of corruption, enjoying regret guarantees which degrade gracefully in the total corruption encountered.
Mingxi Zhu, Graduate School of Business, Stanford University, Stern School of Business, Stanford, CA, 10012, United States This paper studies exploration/exploitation tradeoffs in the context of online lending. In the case of unsecured online lending, the lender effectively gives away money in order to learn about the borrower’s ability to repay. In our model, the lender maximizes the expected net present value of the cash flow she receives by dynamically adjusting the loan amounts and the interest rate as she learns about the borrower’s unknown income. The lender has to carefully balance the trade- offs between earning more interest when she lends more and the risk of delinquency. We formulate the problem as an infinite-horizon dynamic program and establish the structure of optimal policy for a large class of income distributions. We analyze the relative regret compared with the full-information case and show there is a distribution under which it is unbounded. 3 - Model-free Assortment Pricing with Transaction Data Saman Lagzi, University of Toronto, Toronto, ON, Canada, Ningyuan Chen, Andre Augusto Cire, Ming Hu We study a problem in which a firm sets prices for products based on the transaction data, i.e., which product past customers chose from an assortment and what were the historical prices that they observed. Our approach does not impose a model on the distribution of the customers’ valuations and only assumes that purchase choices satisfy incentive-compatible constraints. The individual valuation of each past customer can then be encoded as a polyhedral set, and we maximize the worst-case revenue assuming that new customers’ valuations are drawn from the empirical distribution implied by the collection of such polyhedra. We show that the optimal prices in this setting can be approximated at any arbitrary precision by solving a compact mixed-integer linear program. Moreover, we design approximation strategies that are of low computational complexity and interpretable. MB34 CC Room 209B In Person: Service Science IBM Best Student Paper Competition (I) Award Session Chair: Jinsheng Chen, Columbia University, New York, NY, 10027- 6714, United States Nonprofit crowdsourcing platforms encourage volunteers to complete tasks by using nudging mechanisms to notify a subset of volunteers with the hope that at least one of them responds positively. However, since excessive notifications may reduce volunteer engagement, the platform faces a trade-off between notifying more volunteers for the current task and saving them for future ones. Motivated by these applications, we introduce the online volunteer notification problem. We develop an online randomized policy that achieve constant-factor guarantees, and we demonstrate the effectiveness of our policy by testing them on data from a volunteer-based food recovery platform. 2 - Can Autonomous Vehicles Solve the Commuter Parking Problem Neda Mirzaeian, Carnegie Mellon University, Pittsburgh, PA, 15217-1249, United States We investigate the effect of autonomous vehicles (AVs) on the morning commute, and characterize a user equilibrium for commuters by developing a continuous- time model that takes into account parking fees and traffic congestion as key economic deterrents to driving. We illustrate our results using data from Pittsburgh, and show that AVs result in a high total system cost. To reduce this cost, a social planner can regulate commuters’ decisions by adjusting parking fees and congestion tolls, and/or adjusting infrastructure (e.g., converting downtown parking spots to drop-off spots). Our results indicate that these measures can reduce the total system cost substantially (e.g., by 70% in Pittsburgh). 3 - Optimal Routing under Demand Surges Jinsheng Chen, Columbia University, New York, NY, 10027-6714, United States Many service systems employ dedicated staffing with cross-training to provide partial flexibility. Servers primarily serve specific classes of customers, but may serve other classes if necessary, at the cost of inefficiency. For example, during a pandemic, nurses trained in other specializations may be reassigned to take care of patients who have contracted the infectious disease. We consider a multi-class multi-pool parallel server system with partial flexibility, under general time- varying arrival rates. We derive near-optimal scheduling policies that minimize the sum of holding and “overflow” costs. Our policy is simple, intuitive, and makes use of future arrival rate information. 1 - Online Policies for Efficient Volunteer Crowdsourcing Scott Rodilitz, Yale University, New Haven, CT, 90278, United States
53
Made with FlippingBook Online newsletter creator