2016 INFORMS Annual Meeting Program

SD47

INFORMS Nashville – 2016

3 - Path-dependent And Randomized Strategies In Barberis’ Casino Gambling Model Xunyu Zhou, Columbia University, Manhattan, New York City, NY, United States, xz2574@columbia.edu We consider the dynamic casino gambling model initially proposed by Barberis (2012) and study the optimal stopping strategy of a pre-committing gambler with cumulative prospect theory (CPT) preferences. We illustrate how the strategies computed in Barberis (2012) can be strictly improved by reviewing the betting history or by tossing an independent coin, and we explain that the improvement generated by using randomized strategies results from the lack of quasi-convexity of CPT preferences. Moreover, we show that any path-dependent strategy is equivalent to a randomization of path-independent strategies. 4 - Optimal Exit Time From Casino Gambling: Strategies Of Pre-committed And Naive Gamblers Xuedong He, Chinese University of Hong Kong, xdhe@se.cuhk.edu.hk We study the strategies of a pre-committed gambler, who commits her future selves to the strategy she sets up today, and of a naive gambler, who fails to do so and thus keeps changing plans at every time. We identify conditions under which the pre-committed gambler, asymptotically, adopts a stop-loss strategy, exhibits the behavior of disposition effect, or does not exit. When the utility function is piece-wise power and the probability weighting functions are concave power, we derive the optimal strategy of the pre-committed gambler in closed-form whenever it exists. Finally, we study the actual behavior of the naive gambler and highlight its marked differences from that of the pre-committed gambler. SD46 209B-MCC Revenue Management: From Theory to Practice Sponsored: Revenue Management & Pricing Sponsored Session Chair: Georgia Perakis, Massachusetts Institute of Technology, Cambridge, MA, United States, georgiap@mit.edu 1 - The Role Of Vendor Funds In Promotion Planning Lennart Baardman, MIT, Cambridge, MA, United States, Georgia Perakis, Kiran Venkata Panchamgam Vendor funds are integral to promotion planning. Vendor funds are trade deals in which manufacturers offer discounts to retailers, encouraging them to promote their products, while demanding pass-through to the customers. We model the problem of selecting vendor funds to maximize profit while taking into account the impact on promotional pricing as a QIP. First, we use a strategic model to analyze the optimal strategies of both the manufacturer and the retailer. Additionally, to solve the tactical problem, we use Lagrangian relaxation to propose an approximation algorithm with analytical performance guarantees. Finally, computational results show near-optimal practical performance. 2 - Revenue Management For The Shipping Industry Under Limited Foresight Max Biggs, Massachusetts Institute of Technology, Cambridge, MA, United States, maxbiggs@mit.edu, Georgia Perakis We study a scenario where a shipping company has limited foresight into cargo availability in future periods. It is possible that a high valued cargo may discharge in unpromising region with scarce or low valued cargoes, thus jeopardizing profit in subsequent time periods. We model this situation using a finite horizon Markov Decision Process and present a ranking algorithm which solves optimally in polynomial time. We also explore fast heuristics, and test these algorithms on a simulation that uses real shipping data to evaluate their performance in a practical setting. We also show extensions for uncertain shipping rates. 3 - Submodular Batch Scheduling Consider an online retailer shipping out orders from a warehouse. Multi-item orders take up capacity in a holding area until all items are picked. This holding area is often the cause of bottlenecks, so we formulate the submodular batch scheduling problem to maximize throughput. We wish to schedule jobs in batches to minimize the sum of job completion times. The processing time of each batch is given by a submodular cost function, and the completion time of each job is given by the completion time of the batch. We show this problem is strongly NP-hard, but propose several practical methods, including a 4-approximation algorithm. 4 - The Periodic Joint Replenishment Problem Is Strongly NP-hard Tamar Cohen, MIT, tcohen@mit.edu, Liron Yedidsion The Joint Replenishment Problem (JRP) deals with the prospect of saving resources through coordinated replenishments in order to achieve substantial cost savings. In the JRP it is required to schedule the replenishment times of numerous commodities in order to supply a constant demand per commodity. In Daniel Chen, MIT, Cambridge, MA, United States, dcchen@mit.edu, Retsef Levi, Georgia Perakis

this research we answer a long-standing open question regarding the computational complexity the periodic JRP. This problem received a lot of attention over the years and many heuristic and approximation algorithms were suggested. However, in spite of the vast effort, the complexity of the problem remained unresolved. In this research, we provide a proof that the problem is indeed strongly NP-hard. SD47 209C-MCC Online Learning and Revenue Management Sponsored: Revenue Management & Pricing Sponsored Session Chair: Maxime Cohen, Google Research, Google, New York, NY, 10012, United States, maxccohen@google.com Co-Chair: Ilan Lobel, New York University, New York, NY, United States, ilobel@stern.nyu.edu 1 - Dynamic Pricing And Learning With Online Retail Rankings N. Bora Keskin, Duke University, Durham, NC, United States, bora.keskin@duke.edu, Arnoud V. den Boer In online market environments such as Amazon or Google Shopping, firms receive advertisement space if they satisfy certain conditions. Beforehand, it is not clear if the benefits of this increased exposure outweigh the potential costs. We investigate this question in a dynamic pricing-and-learning setting. 2 - Dynamic Pricing With Demand Covariates Sheng Qiang, Stanford University, sqiang@stanford.edu, Mohsen Bayati We consider a generic problem that a firm sells products over T periods without knowing the demand function. The firm sets prices to earn revenue and learn the demand function. In each period before setting the prices, the firm observes some demand covariates, which may be predictive of the demand. Demand covariates can include marketing expenditure, consumer’s attributes, weather, etc. We prove that in this setting the greedy policy achieves asymptotically optimal performance. We also show that inclusion of any set of demand covariates as potential variables for the demand estimation (even though they could be independent of the demand) would make greedy policy asymptotically optimal. 3 - Optimistic Gittins Indices Vivek Farias, MIT, vivekf@mit.edu, Eli Gutin We consider the multi-armed bandit problem in the Bayesian setting wherein one seeks to minimize expected regret. Gittins indices provide an optimal algorithm for the maximization of discounted, infinite horizon rewards (as opposed to regret minimization) in this very setting. So motivated, we propose an index rule we dub ‘optimistic’ Gittins indices. We show that this rule achieves logarithmic regret with an optimal constant, matching the Lai-Robbins lower bound for the problem — the strongest possible guarantee possible. The algorithm also offers excellent performance relative to recently studied approaches such as Thomspon and Information Directed Sampling. 4 - Feature-based Dynamic Pricing Ilan Lobel, NYU Stern, New York, NY, 10012, United States, ilobel@stern.nyu.edu, Maxime Cohen, Renato Paes Leme We consider the problem faced by a firm that receives highly differentiated products in an online fashion and needs to price them in order to sell them to its customer base. Products are described by vectors of features and the market value of each product is linear in the values of the features. The firm does not initially know the values of the different features, but it can learn the values of the features based on whether products were sold at the posted prices in the past. We propose an algorithm that combines ideas from contextual bandits and the ellipsoid method, and show that it has a worst-case regret that is quadratic in the dimensionality of the feature space and logarithmic in the time horizon.

109

Made with