INFORMS Annual Meeting Phoenix 2018



4 - A Statistical Network Modeling Approach for Discriminative Brain Network Connectivity Analysis
Shouyi Wang
Many studies focus on network detection in multivariate (MV) time-series data, and a great deal of that work has addressed the estimation of brain networks from fMRI, fNIRS, and EEG. We propose a sparse weighted directed network (SWDN) estimation approach that detects the underlying minimum spanning network with maximum likelihood and estimates its weights based on linear Gaussian conditional relationships in the multivariate time series. Treating brain neuroimaging signals as the multivariate data, we evaluated the performance of the proposed approach using a publicly available fMRI data set and the results of a similar study that had evaluated popular network estimation approaches on simulated fMRI data. Moreover, we applied the proposed network construction method as a feature extraction technique on fMRI data to classify the patterns of Parkinson’s disease.

SB64
West Bldg 104A
Joint Session DM/AI/Practice Curated: Unstructured Data Mining: Methods and Applications
Sponsored: Data Mining
Sponsored Session
Chair: Chaojiang “CJ” Wu, Drexel University, 3141 Chestnut Street, Philadelphia

1 - Investor’s Opinion Divergence and Stock Return Volatility: Evidence from User-generated Content
Yang Li, Drexel University LeBow College of Business, PA, United States
This paper examines the relationship between investors’ opinion divergence and stock return volatility using daily user-generated content (UGC) data on the 66 most discussed stocks from a social media platform for investors in the U.S. Our research adds to the emerging body of literature on the impact of UGC on the stock market in two ways: (1) novel techniques for systematically measuring sentiment divergence in large-scale UGC data; and (2) uncovering the dynamic interdependence between investors’ opinion divergence and stock volatilities.
2 - Deep Learning for Predicting Research Topic Trends
Jiangen He, Drexel University College of Computing and Informatics, PA, United States
Text data can be useful for predicting future topic trends, and such predictions can benefit various applications. In this study, we represent the semantic and structural dynamics of research topics by embedding learning and propose a deep learning model that can capture discriminative features for predicting research topic trends (rising or falling). We train and test the model using scientific publications from PubMed. Our experimental results show that both the semantic and the structural dynamics of a research topic have predictive features for its future trend.

3 - Breadth, Depth, and Conformity: A Double-Hurdle Model for Review Helpfulness
Chaojiang Wu, Drexel University, 3141 Chestnut Street, Philadelphia, PA, 19107, United States, Feng Mai, Xiaolin Li
Using the multi-attribute attitude model as a framework, we show that the helpfulness of a review depends not only on its internal features but also on its breadth, depth, and content in relation to other reviews of the product. We construct two novel measures, content entropy and content deviation, and then propose a double-hurdle model to simultaneously estimate the probability of a review being voted on and its helpfulness. We show that content entropy is positively, and content deviation negatively, related to the helpfulness of a review. We further illustrate that content deviation moderates the relationship between numerical rating deviations and helpfulness.

4 - Person Name Disambiguation Based on Profession
Haimonti Dutta, Assistant Professor, University at Buffalo, 325P Jacobs Management Center, Buffalo, NY, 14260, United States
Named Entity Disambiguation (NED) is the task of disambiguating entity mentions in natural language text.
In this paper, we present a generative model based on Latent Dirichlet Allocation (LDA) to disambiguate entity mentions occurring in noisy text. Each entity is assumed to have a profession and the representative words are generated from the document in which it is mentioned; words representing a profession are further interpreted for the task of disambiguation. Empirical results presented on a corpus of 14020 historical newspaper articles show that this generative model can obtain reasonable accuracy in the task of person name disambiguation.

SB62
West Bldg 103A
Data Mining Theoretical Paper Finals
Sponsored: Data Mining
Sponsored Session
Chair: Tong Wang, University of Iowa, Iowa City, IA, 52245, United States
Chair: Ramin Moghaddass, University of Miami, Miami, FL
Abstract not available.

SB63
West Bldg 103B
Joint Session DM/Practice Curated: Data Analytics and Modeling for Health Informatics and Decision Making
Sponsored: Data Mining
Sponsored Session
Chair: Shouyi Wang, PhD, The University of Texas at Arlington, 500 West First Street, Arlington, TX, 76019, United States

1 - Information Loss from Parameter Estimation in Discrete Distributions with an Application in Genetics
Maryam Moghimi, PhD Candidate, The University of Texas at Arlington, Arlington, TX, 76019, United States, Herbert W. Corley
We consider the information loss due to the use of a statistic T(X) to characterize an n-dimensional vector X of random variables representing discrete data. In other words, we compare the probability that a random sample X of size n takes on the value x to the probability that X = x given that the statistic satisfies T(X) = T(x). We focus on sufficient statistics to develop a general formula for the Shannon information loss due to this data reduction. Applications of this approach will be discussed, including an example on gene sequencing.

2 - A Chronological Pharmacovigilance Network Analysis Approach for Predicting Adverse Drug Events
Behrooz Davazdahemami, PhD Candidate, Oklahoma State University, Stillwater, OK, 74075, United States, Dursun Delen
In this study, we extend prior research by proposing a chronological network analysis (NA) approach for predicting adverse drug events (ADEs) before the drugs’ approval. Combining known drug-ADE associations from the biomedical literature with information about drugs’ target proteins for pre-2001 approved drugs, a drug-ADE network was constructed and used as a basis to train machine-learning (ML) models.
The models were validated by predicting drug-ADE associations for post-2001 drugs. The promising results (92.9% accuracy, 72.8% sensitivity) achieved by the ensemble ML models show the power of such models, in combination with NA, for capturing sophisticated drug-ADE patterns.

3 - Hemodynamic Pattern Discovery and Classification with Children Who Stutter Using Functional Near-infrared Spectroscopy
Rahil Hosseini
In this paper, we developed a novel supervised sparse feature learning approach to discover discriminative biomarkers from functional near-infrared spectroscopy (fNIRS) brain imaging data recorded during a speech production experiment from 46 children in three groups: children who stutter (n = 16); children who do not stutter (n = 16); and children who recovered from stuttering (n = 14). We conducted an extensive feature analysis of the cerebral hemodynamics from fNIRS signals and selected a small number of important discriminative features using the proposed sparse feature-learning framework. The discovered set of cerebral hemodynamic features is presented as a set of promising biomarkers to elucidate the underlying neurophysiology in children who have recovered from or persisted in stuttering and to facilitate future data-driven diagnostics in these children.

