INFORMS Annual Meeting Phoenix 2018


WA64 West Bldg 104A

2 - Matched Forest for High-dimensional Matched Case-control Studies

Nooshin Shomal Zadeh, PhD Student, Arizona State University, 699 S. Mill Ave, Tempe, AZ, 85281, United States, Sangdi Lin, George Runger

Matched case-control study designs are commonly used in clinical studies to identify exposure variables associated with a medical condition. Matching is a preprocessing approach used in case-control studies to improve efficiency by forcing the confounding variables to have the same distributions for cases and controls. Existing methods for analyzing matched case-control data sets have limitations in high-dimensional settings (for matching or exposure variables) and with interaction effects. This research proposes a new learning method that is not only flexible with respect to the number of matching and exposure variables but can also detect interaction effects.

3 - A Multi-model Approach for Sports Betting Recommendation

Ismail T. Yumru, Data Scientist, Algopoly, Istanbul, 34342, Turkey, Mustafa Gokce Baydogan, Berk Orbay

Statistical learning methods are increasingly used in sports result prediction, where estimating the probabilities of the possible outcomes (odds) of a game better than a bookmaker does is key to making a profit. In this study, several learning techniques, such as multinomial penalized regression, random forest, and gradient boosting, are applied first to estimate the odds and then to select the subsets of games that are most likely to be profitable.

4 - Learning Based Mission Planning for Solar Powered Multi Robot System

Di Wang, University of Illinois at Chicago, Chicago, IL, 60607, United States, Mengqi Hu, Yang Gao

Solar-powered robots have attracted growing attention as a way to extend the operating duration of multi-robot systems. In this research, we propose a Markov decision model for solar-powered multi-robot mission planning to co-optimize the task allocation and energy schedule.
A deep reinforcement learning algorithm is developed to solve the Markov model considering various objectives, such as minimizing traveling distance, traveling time, and net energy consumption. Without retraining for a new problem instance, the proposed algorithm can generate near-optimal mission planning and energy scheduling decisions.

WA66 West Bldg 105A

Practice – Data Mining & Control in Manufacturing
Contributed Session

Chair: Arun Chockalingam, Eindhoven University of Technology, Hoog Gagel 62, Eindhoven, 5611BG, Netherlands

1 - Machine Learning-based Quality Prediction for Smart Manufacturing in SMT Process

Dongil Kim, Assistant Professor, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon, Korea, Republic of, Hyein Kim, Jeongin Koo, Sungsoo Choi, Sang-Hyun Lee, Jeong Tae Kang

The Surface Mount Technology (SMT) process is one of the most important processes in the electronics industry. Through the SMT process, electronic chips are mounted on a Printed Circuit Board (PCB) and released as final products. Hence, the final quality of the product is determined by the SMT process. In this paper, we propose a machine learning-based quality prediction method for the SMT process. The input data were collected from equipment in the process, such as the reflow/soldering process. In addition, external variables, such as weather information, were also employed. We evaluate the performance of the proposed method on a real-world SMT process dataset and analyze the important factors.

2 - A Study on Data Preprocessing Method for Relationship and Characterization of Cutting Process Monitoring Signals

Hyein Kim, Korea Institute of Industrial Technology, Cheonan, Korea, Republic of, Dongil Kim, Jeongin Koo

The importance of monitoring, anomaly prediction, and autonomous control is increasing, both to prevent and respond to unusual situations and to maintain the accuracy and operation rate of CNC machines.
To propose optimal machining conditions, we measure the real-time process status through CNC communication. We also attached various sensors to the CNC machine to collect data on all situations that occur during machining and to understand the relationships among the monitoring signals and their characteristics. To clarify the relationships in the collected data, we defined machining sections based on the axis position and applied a variety of analytical methods for data preprocessing.

3 - Operationalizing the Offshoring Decision

Arun Chockalingam, Assistant Professor, Eindhoven University of Technology, Den Dolech 2, Eindhoven, 5612AZ, Netherlands, Haolin Feng

We consider a manufacturer that has both onshore and offshore production facilities. The manufacturer can dynamically allocate production between the two facilities. Adjusting the production allocation incurs both fixed and proportional costs. We formulate this allocation problem as a stochastic impulse control problem and derive the optimal allocation policy.

Joint Session DM/Practice Curated: Predictive and Prescriptive Analytics
Sponsored: Data Mining
Sponsored Session

Chair: Lianning Zhu, Texas Tech University, Lubbock, TX

1 - L1 Norm Based Major Component Detection Analysis

Qi An, North Carolina State University, Raleigh, NC, 27606, United States, Shu-Cherng Fang, Shan Jiang

l1 MCDA is a state-of-the-art tool to identify the major components of data with irregularly positioned "spokes". We develop an algorithmic framework of l1 MCDA for multivariate asymmetric radial data clouds. It consists of locating a central point and calculating the major directions and the median radii in those directions via a two-level median fitting process. The central point analysis features a pre-selection procedure to screen out candidate points with sufficient data points in the vicinity. Numerical experiments test the proposed analysis methods on high-dimensional data sets of various configurations and indicate that the proposed method achieves superior accuracy and robustness.

2 - A Novel Optimization Based Algorithm for Multi-class Data Classification Problem

Fatih Rahim, Koç University, Rumelifeneri Yolu, Sariyer, Istanbul, 34450, Turkey, Metin Türkay

Multi-class data classification is a supervised machine learning problem that involves assigning data to multiple groups. We present a novel MILP-based algorithm that splits each class's data set into subsets such that the subsets of different classes are linearly separable. At each iteration, we form a subset of samples from the set of unassigned samples using a MILP model that maximizes the cardinality of the new subset. The algorithm terminates when all the samples are assigned. For the testing phase, we build classifiers based on the convex hulls of the subsets and the polyhedral regions. We conclude that our optimization-based algorithm provides competitive results in terms of prediction accuracy.
3 - Heart Rate Estimation using Wrist-type Photoplethysmography Based on Neural Network

Lianning Zhu, Texas Tech University, Lubbock, TX, 79416, United States, Dongping Du

Photoplethysmography (PPG) signals from wearable devices are easily corrupted by motion artifacts (MA), which pose great difficulties for heart rate (HR) estimation. A moving time window was used to segment the PPG signals and ACC signals. A new HR tracking algorithm, denoted NN-Bayesian, was developed. HR prediction was performed by a neural network, and HR estimation was performed based on the predicted HR and Bayesian decision theory. To validate the proposed HR tracking framework, HR estimation was performed using both raw PPG signals and cleansed PPG signals obtained by spectrum subtraction (SS). The proposed framework demonstrates accuracy and robustness in tracking HR.

Joint Session DM/Practice Curated: Data Science and Deep Learning V
Sponsored: Data Mining
Sponsored Session

Chair: Di Wang, University of Illinois at Chicago, Chicago, IL, 60607, United States

1 - Active Batch Learning With Cluster-based Stochastic Query-by-forest (CSQBF)

Ghazal Shams, Arizona State University, Tempe, AZ, 85282, United States, Seho Kee, Enrique Del Castillo, Eugene Tuv, George Runger

Modern systems use automated sensing that can generate a large number of unlabeled instances at low cost, but obtaining labels may require human effort that is time-consuming and expensive. In this work, we propose the Cluster-based Stochastic Query-By-Forest (CSQBF) algorithm, which introduces an enhanced stochastic querying strategy. In a pool-based active learning scenario, it iteratively combines the supervised knowledge of a classifier trained on the labeled data with cluster information from the unlabeled data.

WA65 West Bldg 104B
