INFORMS Annual Meeting 2017

TA55

INFORMS Houston – 2017

2 - New Genomic Selection Approaches

Lizhi Wang, Iowa State University, IMSE and ECPE, Ames, IA, 50011, United States, lzwang@iastate.edu

Genomic selection is among the most impactful decisions that plant and animal breeders make. It also presents an incredibly complex mathematical challenge: how should the right breeding parents be selected to produce the next generation of progeny with maximal genetic gains? We will review previously proposed genomic selection approaches, present a new one, and discuss future research directions.

3 - Mathematical Programming Based Training Set Optimization for Genomic Selection

Shiyang Huang, Iowa State University, 0076 Black Engineering, Ames, IA, 50011, United States, shuang@iastate.edu, Guiping Hu

A new method of optimized training set (TRS) selection is proposed for genomic selection. A quadratic optimization model with binary variables is formulated and solved with a specially designed algorithm to select the individuals forming the TRS from a calibration set. A ridge regression best linear unbiased predictor (rrBLUP) model is then trained on the selected TRS to calculate genomic-estimated breeding values (GEBV) and predict the corresponding trait. The proposed method significantly outperforms common existing TRS selection methods and suggests a criterion based on individuals' Identity-By-State (IBS) distances for evaluating training sets.

4 - Optimizing the Size of Training Data for Genomic Prediction

Reka Howard, University of Nebraska-Lincoln, 342B Hardin Hall, Lincoln, NE, 68583, United States, rekahoward@unl.edu

Genomic prediction (GP) is a technique in plant breeding in which genotypic and phenotypic information on a set of plants (the training set) is used to predict the phenotypic performance of plants for which only marker information is available (the testing set). In GP, the size, genetic architecture, and quality of the phenotypic data influence predictability.
Herein, an algorithm based on genetic similarities that avoids the use of phenotypic data in choosing the optimal training set for GP is presented, using a soybean data set containing information on 5,600 individuals.

362B Robust Portfolio Selection
Sponsored: Financial Services
Sponsored Session

Chair: Victor DeMiguel, London Business School, London, NW1 4SA, United Kingdom, avmiguel@london.edu

1 - Long-term Asset Allocation under Time-varying Investment Opportunities: Optimal Portfolios with Parameter and Model Uncertainty

Alex Weissensteiner, Free University of Bozen-Bolzano, Universitätsplatz 1, Bolzano, 39100, Italy, alex.weissensteiner@unibz.it, Thomas Dangl

We study the implications of predictability for the optimal asset allocation of ambiguity-averse long-term investors in a VAR model. While over short periods the model-implied conditional covariance structure of asset-class returns determines the optimal allocation, over longer horizons the optimal asset allocation is significantly influenced by the covariance structure induced by estimation errors. As a consequence, the ambiguity-averse long-term investor tilts her portfolio not simply toward the global minimum-variance portfolio but shrinks portfolio weights toward a seemingly inefficient portfolio that shows maximum robustness against estimation errors.

2 - A Portfolio Perspective on the Multitude of Firm Characteristics

Victor DeMiguel, London Business School, Management Science and Operations, Regents Park, London, NW1 4SA, United Kingdom, avmiguel@london.edu

We investigate which characteristics matter jointly for an investor who cares not only about average returns but also about portfolio risk, transaction costs, and out-of-sample performance. We find that only a small number of characteristics (six) are significant without transaction costs. With transaction costs, the number of significant characteristics increases to 15 because the trades in the underlying stocks required to rebalance different characteristics often net out. We show that investors can identify combinations of characteristics with abnormal out-of-sample returns net of transaction costs that are not fully explained by the Fama and French (2015) and Hou, Xue, and Zhang (2014) factors.

3 - Risk-adjusted Returns and Leverage Efficiency under Market Impact

Chanaka Edirisinghe, Rensselaer Polytechnic Institute, Lally School of Management, Pittsburgh 2118, Troy, NY, 12180-3590, United States, edirin@rpi.edu, Jingnan Chen, Jaehwan Jeong

We model the Pareto-efficient set between a portfolio's Sharpe ratio and leverage under risk aversion in continuous-time trading with market impact. We derive analytical properties of the Sharpe-leverage tradeoff and show how it departs from the standard theory that ignores trading costs. We also obtain the minimum-variance portfolio and provide empirical tests under market impact. Disregarding market impact leads to severe performance shortfalls.

4 - Portfolio Construction by Mitigating Error Amplification: The Bounded-noise Portfolio

Long Zhao, University of Texas at Austin, McCombs School of Business, 2110 Speedway Stop B6500, CBA 5.334B, Austin, TX, 78712-1277, United States, longzhao@utexas.edu

This paper focuses on the poor performance of minimum-variance portfolios constructed from sample estimates. Estimation errors are usually blamed for this problem. However, we argue that even small, unbiased estimation errors can lead to bad performance, because the optimization step amplifies those errors. Instead of trying to independently improve the estimation step or robustify the optimization step, we disentangle the well-estimated aspects of the covariance matrix from the poorly estimated ones and handle them differently and appropriately. Using a single constant parameter, our method achieves excellent performance on both simulated and real data.
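As a companion to the last abstract's point about error amplification, a minimal numerical sketch (the asset universe, sample size, and covariance values are invented for illustration; this is not the paper's bounded-noise construction). A minimum-variance portfolio built from a sample covariance can never beat, out of sample, the one built from the true covariance, and when the return history is short relative to the number of assets the gap can be substantial:

```python
import numpy as np

rng = np.random.default_rng(1)

def min_variance_weights(cov):
    """Fully invested minimum-variance weights: w proportional to inv(Sigma) @ 1."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

# Hypothetical "true" covariance: p assets with equicorrelation 0.3.
p = 20
true_cov = 0.04 * (0.3 * np.ones((p, p)) + 0.7 * np.eye(p))

# Sample covariance from a short return history (n not much larger than p).
n = 40
returns = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
sample_cov = np.cov(returns, rowvar=False)

w_true = min_variance_weights(true_cov)
w_hat = min_variance_weights(sample_cov)

# Evaluate both portfolios under the TRUE covariance (out-of-sample risk).
var_true = w_true @ true_cov @ w_true
var_hat = w_hat @ true_cov @ w_hat
# Small errors in Sigma are amplified by the inversion, so var_hat >= var_true.
```

The inequality at the end holds by definition of the minimum-variance portfolio; the point of the sketch is that with `n` close to `p` the excess risk of `w_hat` is typically far from negligible.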
362C Emerging Topics in Machine Learning: Methodology and Applications
Sponsored: Data Mining
Sponsored Session

Chair: Tong Wang, University of Iowa, Iowa City, IA, 52245, United States, tong-wang@uiowa.edu

1 - Big Data is Small Data

Jessica Clark, University of Maryland Smith School of Business, College Park, MD, 20742, United States, jclark@stern.nyu.edu, Foster Provost

The goal of this work is to explore the ways in which big data and small data are similar in a binary classification setting. We define big data sets as those that represent human behaviors selected from a large set of options, and that are therefore not only large in quantity but also sparse. Conversely, small data sets are dense and comprise few instances and few features. We propose a model for big (and small) data generation and use it to generate sample data sets. We find two key similarities between big and small data. First, models trained on big data generate learning curves that continue to improve as more instances are added. Second, models trained on sparse data tend not to overfit, even with many features.

2 - New Algorithms for Inference in Dynamic Stochastic Block Models

Theja Tulabandhula, UIC, Chicago, IL, United States, tt@theja.org, Mehrnaz Amjadi

Although the computational and statistical trade-offs for modeling single graphs are relatively well understood, extending such results to sequences of graphs is difficult. In this work, we propose two models for such sequences that capture: (a) link persistence between nodes across time, and (b) community persistence of each node across time. In the first model, we assume that the latent community of each node does not change, and in the second model we relax this assumption. For both models, we propose computationally efficient inference algorithms, which leverage community detection methods that work on single graphs. We provide simulation results validating their performance.
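As background for the single-graph primitive that the dynamic-model algorithms above build on, a small sketch of a two-block stochastic block model and spectral community detection (block sizes and edge probabilities are illustrative choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two-block stochastic block model: within-community edge probability p_in,
# between-community probability p_out.
n, p_in, p_out = 100, 0.3, 0.05
labels = np.array([0] * 50 + [1] * 50)
probs = np.where(labels[:, None] == labels[None, :], p_in, p_out)

# Sample an undirected simple graph: draw the upper triangle, then symmetrize.
upper = np.triu(rng.random((n, n)) < probs, k=1)
A = (upper | upper.T).astype(float)

# Spectral community detection: for an assortative two-block SBM, the
# eigenvector of the second-largest eigenvalue of A approximates the
# community indicator; split nodes by its sign.
vals, vecs = np.linalg.eigh(A)   # eigenvalues in ascending order
v2 = vecs[:, -2]                 # eigenvector of the 2nd-largest eigenvalue
pred = (v2 > 0).astype(int)

# Agreement with the planted labels, up to the arbitrary label swap.
acc = max(np.mean(pred == labels), np.mean(pred != labels))
```

With this separation between `p_in` and `p_out`, the sign split recovers the planted communities almost exactly; the dynamic models in the talk reuse such single-graph detectors across the graph sequence.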
3 - Causal Rule Sets for Identifying Subgroups with Enhanced Peer Influence

Tong Wang, University of Iowa, Pappajohn Business Building, 21 East Market Street, Iowa City, IA, 52245, United States, tong-wang@uiowa.edu, Cynthia Rudin, Shan Huang, Haojun Wu

Peer influence is of central importance in social science. The goal of this paper is to identify subgroups in which peer influence exerts an enhanced effect on users' responses to social advertisements. Our model characterizes such a subgroup using a set of interpretable rules, constructed from information about a user (susceptibility), his or her friend (influence), and their relationship. For this study, we partnered with a world-leading mobile social network platform, WeChat, and conducted a large-scale randomized controlled field experiment on WeChat Moments covering more than 28 million users across 99 ads. Our model showed competitive performance on different types of product ads.
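To make the notion of an interpretable rule set concrete, a minimal sketch of subgroup membership as a disjunction of conjunctive rules (the feature names below are hypothetical placeholders, not the rules learned in the paper):

```python
# A rule set is a list of rules; each rule is a conjunction of
# feature conditions. A user belongs to the enhanced-influence
# subgroup if ANY rule fires. Feature names are invented examples.
rule_set = [
    {"user_age_lt_30": True, "friend_is_influencer": True},
    {"interaction_freq_high": True},
]

def in_subgroup(features: dict, rules) -> bool:
    """True if all conditions of at least one rule match the feature dict."""
    return any(all(features.get(k) == v for k, v in r.items()) for r in rules)

u1 = {"user_age_lt_30": True, "friend_is_influencer": True,
      "interaction_freq_high": False}   # matches the first rule
u2 = {"user_age_lt_30": False, "friend_is_influencer": True,
      "interaction_freq_high": False}   # matches no rule
```

The appeal of this representation is that each rule is individually readable, so the subgroup definition can be inspected and audited directly.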

