DPPred: An Effective Prediction Framework with Concise Discriminative Patterns
Abstract
In the literature, two families of models have been proposed to address prediction problems, including classification and regression. Simple models, such as generalized linear models, offer ordinary performance but strong interpretability on a set of simple features. The other family, which includes tree-based models, organizes numerical, categorical, and high-dimensional features into a comprehensive structure that carries rich interpretable information about the data. In this thesis, we propose a novel discriminative pattern-based prediction framework (DPPred) that accomplishes prediction tasks by combining the advantages of both families: effectiveness and interpretability. Specifically, DPPred adopts the concise discriminative patterns that lie on the prefix paths from the root to the leaf nodes of tree-based models. Moreover, DPPred selects a limited number of useful discriminative patterns by searching for the most effective pattern combination to fit generalized linear models. To validate the effectiveness of DPPred, we conduct experiments on both classification and regression tasks. Experimental results demonstrate that DPPred provides accuracy competitive with the state of the art as well as valuable interpretability for developers and experts. In particular, when studying the health status of cardiopulmonary patients, DPPred achieves acceptable prediction accuracy (more than 95%) and reveals the importance of demographic features; when studying amyotrophic lateral sclerosis (ALS), DPPred not only outperforms the baselines while using only 40 concise discriminative patterns out of a potentially exponentially large set of patterns, but also discovers novel markers.
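The idea of taking patterns from root-to-leaf prefix paths can be illustrated with a minimal sketch. It assumes binary threshold splits and a hand-built toy tree; the `Node` class and the function names are hypothetical and are not taken from DPPred itself. The pattern selection step and the generalized linear model fit described in the abstract are deliberately left out.

```python
from typing import List, Optional, Sequence, Tuple

# A condition is (feature_index, operator, threshold), operator in {"<=", ">"}.
Condition = Tuple[int, str, float]
# A pattern is a conjunction of conditions along a prefix path.
Pattern = Tuple[Condition, ...]


class Node:
    """Hypothetical binary tree node; a node without a feature is a leaf."""

    def __init__(self, feature: Optional[int] = None,
                 threshold: Optional[float] = None,
                 left: Optional["Node"] = None,
                 right: Optional["Node"] = None) -> None:
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right

    @property
    def is_leaf(self) -> bool:
        return self.feature is None


def prefix_path_patterns(root: Node) -> List[Pattern]:
    """Return one pattern per non-root node: the conjunction of split
    conditions on the path from the root down to that node."""
    patterns: List[Pattern] = []

    def walk(node: Node, prefix: Pattern) -> None:
        if prefix:  # the empty prefix at the root is not a pattern
            patterns.append(prefix)
        if node.is_leaf:
            return
        walk(node.left, prefix + ((node.feature, "<=", node.threshold),))
        walk(node.right, prefix + ((node.feature, ">", node.threshold),))

    walk(root, ())
    return patterns


def matches(pattern: Pattern, x: Sequence[float]) -> bool:
    """True if instance x satisfies every condition in the pattern."""
    return all((x[f] <= t) if op == "<=" else (x[f] > t)
               for f, op, t in pattern)


def to_pattern_space(patterns: List[Pattern], x: Sequence[float]) -> List[int]:
    """Map an instance to a binary vector with one entry per pattern."""
    return [1 if matches(p, x) else 0 for p in patterns]


# Toy tree (assumed for illustration): split on feature 0 at 50.0,
# then on feature 1 at 0.5 inside the left branch.
tree = Node(0, 50.0,
            left=Node(1, 0.5, left=Node(), right=Node()),
            right=Node())
patterns = prefix_path_patterns(tree)
vector = to_pattern_space(patterns, [42.0, 1.0])
```

In a pipeline of the kind the abstract describes, binary vectors like `vector` would be the inputs to the pattern selection step, and the selected columns would then be used to fit a generalized linear model.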
Similar resources
DPClass: An Effective but Concise Discriminative Patterns-Based Classification Framework
Pattern-based classification was originally proposed to improve accuracy using selected frequent patterns, and much effort was devoted to pruning a huge number of non-discriminative frequent patterns. On the other hand, tree-based models have shown strong abilities on many classification tasks since they can easily build high-order interactions between different features and also handle both...
Towards Explanation of DNN-based Prediction with Guided Feature Inversion
While deep neural networks (DNN) have become an effective computational tool, their prediction results are often criticized for a lack of interpretability, which is essential in many real-world applications such as health informatics. Existing attempts based on local interpretations aim to identify relevant features contributing the most to the prediction of DNN by monitoring the neighborhood of...
Discriminatively Activated Sparselets
Shared representations are highly appealing due to their potential for gains in computational and statistical efficiency. Compressing a shared representation leads to greater computational savings, but can also severely decrease performance on a target task. Recently, sparselets (Song et al., 2012) were introduced as a new shared intermediate representation for multiclass object detection with ...
Advances in discriminative dependency parsing
Achieving a greater understanding of natural language syntax and parsing is a critical step in producing useful natural language processing systems. In this thesis, we focus on the formalism of dependency grammar as it allows one to model important head-modifier relationships with a minimum of extraneous structure. Recent research in dependency parsing has highlighted the discriminative structur...