An Optimal Temporal and Feature Space Allocation in Supervised Data Mining
Abstract
This paper presents an expository study of temporal data mining for predicting a future response sequence by mining a large number of highly correlated concurrent time series. We investigate a two-dimensional search scheme over time-domain weighting and feature-space selection. Weighting the observation records over the time domain exploits the time-dependency structure, and feature-space selection is enforced to avoid over-fitting. For a specific temporal and spatial selection, the area under the ROC curve (AUC) is used to evaluate prediction performance on the training and testing data. By varying the weighting scheme and the feature selection, AUC contour maps are generated for both the training and testing data. The contour maps suggest the optimal allocations, those with the highest AUC, to apply when predicting future responses in training, testing, and possible validation data. Numerical results on two sets of temporal data from different applications show that the proposed scheme can significantly improve the prediction performance of conventional data mining methods.
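The two-dimensional search described in the abstract can be sketched as a grid over a time-weighting parameter and a feature-subset size, with one AUC value per grid cell for the training data and one for the testing data. The abstract does not specify the weighting scheme, the feature-selection rule, or the underlying classifier, so everything below is an assumption made for illustration: an exponential decay weight, features ranked by weighted class-mean difference, a simple linear scorer, and synthetic data. The names `fit_scorer` and `auc_grid` are hypothetical.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_scorer(X, y, decay, k):
    """Fit a linear scorer on the k features with the largest weighted mean gap.

    Assumption: record t gets weight decay ** age, so the newest record has
    weight 1 and older records count less when decay < 1.
    """
    T, d = len(X), len(X[0])
    w = [decay ** (T - 1 - t) for t in range(T)]

    def wmean(j, cls):
        num = sum(w[t] * X[t][j] for t in range(T) if y[t] == cls)
        den = sum(w[t] for t in range(T) if y[t] == cls)
        return num / den

    diff = [wmean(j, 1) - wmean(j, 0) for j in range(d)]
    top = sorted(range(d), key=lambda j: -abs(diff[j]))[:k]  # feature selection
    return lambda row: sum(diff[j] * row[j] for j in top)

def auc_grid(X_tr, y_tr, X_te, y_te, decays, ks):
    """Return training and testing AUC for every (decay, k) combination."""
    train, test = {}, {}
    for decay in decays:
        for k in ks:
            score = fit_scorer(X_tr, y_tr, decay, k)
            train[decay, k] = auc([score(r) for r in X_tr], y_tr)
            test[decay, k] = auc([score(r) for r in X_te], y_te)
    return train, test

# Synthetic data (illustrative only): 6 features, the first 2 are informative.
rng = random.Random(0)

def make(n, d=6):
    X, y = [], []
    for t in range(n):
        label = t % 2  # alternate labels so both classes are always present
        X.append([rng.gauss(label if j < 2 else 0.0, 1.0) for j in range(d)])
        y.append(label)
    return X, y

X_tr, y_tr = make(120)
X_te, y_te = make(60)
train_auc, test_auc = auc_grid(X_tr, y_tr, X_te, y_te,
                               decays=[0.95, 0.99, 1.0], ks=[1, 2, 4, 6])
best = max(test_auc, key=test_auc.get)  # grid cell with the highest testing AUC
print("best (decay, k):", best, "testing AUC:", round(test_auc[best], 3))
```

The two dictionaries hold the values that a contour plot would display, with the decay parameter on one axis and the number of selected features on the other. Comparing the training and testing grids side by side is one way to spot allocations that fit the training data well but generalize poorly.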
Similar resources
Supervised Feature Extraction of Face Images for Improvement of Recognition Accuracy
Dimensionality reduction methods transform or select a low-dimensional feature space to efficiently represent the original high-dimensional feature space of the data. Feature reduction techniques are an important step in many pattern recognition problems in different fields, especially in the analysis of high-dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...
Fisher Discriminant Analysis (FDA), a supervised feature reduction method in seismic object detection
Automatic processing of seismic data using pattern recognition is one of the interesting fields in geophysical data interpretation. One part is seismic object detection using different supervised classification methods, which finally produces a probability cube as output. The object detection process starts with generating a pickset of two classes labeled as object and non-object and then selecting ...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Data Stream Mining Algorithms in Big Data: A Survey
The infrastructure built in the big data platform is reliable to challenge the commercial and noncommercial IT development communities of data streams in high-dimensional data cluster modeling. APSO, i.e., Accelerated Particle Swarm Optimization, is a technique commonly known for data's are sourced to accumulate their continuation in the batch model induction algorithms which is not feas...
Overlap-based feature weighting: The feature extraction of Hyperspectral remote sensing imagery
Hyperspectral sensors provide a large number of spectral bands. This massive and complex data structure of hyperspectral images presents a challenge to traditional data processing techniques. Therefore, reducing the dimensionality of hyperspectral images without losing important information is a very important issue for the remote sensing community. We propose to use overlap-based feature weigh...