A Progressive Refinement Approach to Spatial Data
Abstract
Spatial data mining, i.e., mining knowledge from large amounts of spatial data, is a demanding field since huge amounts of spatial data have been collected in various applications, ranging from remote sensing to geographical information systems (GIS), computer cartography, environmental assessment and planning. The collected data far exceed people's ability to analyze them. Thus, new and efficient methods are needed to discover knowledge from large spatial databases. The goal of this thesis is to analyze methods for mining of spatial data, and to determine environments in which efficient spatial data mining methods can be implemented. In the spatial data mining process, we use (1) non-spatial properties of the spatial objects and (2) attributes, predicates and functions describing spatial relations between described objects and other features located in the spatial proximity of the described objects. The descriptions are generalized, transformed into predicates, and the discovered knowledge is presented using multiple levels of concepts. We introduce the concept of spatial association rules and present efficient algorithms for mining spatial associations and for the classification of objects stored in geographic information databases. A spatial association rule describes the implication of one or a set of features (or predicates) by another set of features in spatial databases. For example, the rule "80% of the large cities in Canada are close to the Canada-U.S. border" is a spatial association rule. A spatial classification process is a process that assigns a set of spatial objects into a number of given classes based on a set of spatial and non-spatial features (predicates). For example, the rule "if a store is inside a large mall and is close to highways, then it brings high profits" is a spatial classification rule. The developed algorithms are based on the progressive refinement approach. This approach allows for efficient discovery of knowledge in large spatial databases.
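To make the Canada-U.S. border example concrete, the following minimal sketch shows one way the strength of such a rule could be measured. It assumes the usual association-rule notions of support and confidence, which the abstract does not itself define; the toy records, the predicate names `large` and `close_to_border`, and the function `rule_stats` are hypothetical and not taken from the thesis.

```python
from typing import Dict, List, Tuple

Record = Dict[str, bool]  # one spatial object described by boolean predicates


def rule_stats(records: List[Record], antecedent: str, consequent: str) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    total = len(records)
    with_antecedent = [r for r in records if r[antecedent]]
    with_both = [r for r in with_antecedent if r[consequent]]
    # Support: fraction of all objects satisfying both sides of the rule.
    support = len(with_both) / total if total else 0.0
    # Confidence: fraction of antecedent objects that also satisfy the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical toy data: five large cities (four near the border), three small ones.
cities = [
    {"large": True, "close_to_border": True},
    {"large": True, "close_to_border": True},
    {"large": True, "close_to_border": True},
    {"large": True, "close_to_border": True},
    {"large": True, "close_to_border": False},
    {"large": False, "close_to_border": True},
    {"large": False, "close_to_border": False},
    {"large": False, "close_to_border": False},
]

support, confidence = rule_stats(cities, "large", "close_to_border")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

Under this reading, the "80%" in the example rule plays the role of the confidence value.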
A complete set of spatial association rules can be discovered, and accurate decision trees can be constructed, using the progressive refinement approach. Theoretical analysis and experimental results demonstrate the efficiency of the algorithms. The completeness of the set of discovered spatial association rules is shown through the theoretical analysis, and the experiments show that the proposed spatial classification algorithm allows for better accuracy of classification than the algorithm proposed in the previous work [37]. An important optimization technique used in the progressive refinement approach is that the search for patterns at high conceptual levels may apply efficient spatial computation algorithms at a relatively coarse resolution. Only the candidate spatial predicates that are worth detailed examination will be computed by refined spatial computation techniques. Such a multi-level approach saves computation effort because it is very expensive to perform detailed spatial computation for all the possible spatial relationships. The results of the research have been incorporated into the spatial data mining system prototype, GeoMiner. GeoMiner includes five spatial data mining modules: characterizer, comparator, associator, cluster analyzer, and classifier. The SAND (Spatial And Nonspatial Data) architecture has been applied in the modeling of spatial databases. The GeoMiner system includes the spatial data cube construction module, the spatial on-line analytical processing (OLAP) module, and spatial data mining modules. A spatial data mining language, GMQL (Geo-Mining Query Language), is designed and implemented as an extension to Spatial SQL, for spatial data mining. Moreover, an interactive, user-friendly data mining interface has been constructed and tools have been implemented for visualization of discovered spatial knowledge.
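The coarse-then-refined idea described above can be sketched as a two-step filter. This is only an illustration under stated assumptions: it uses minimum bounding rectangles (MBRs) as the coarse test and a vertex-to-vertex minimum distance as a simplified stand-in for the refined computation. The function names, the toy geometries, and the threshold are hypothetical, and the abstract does not say that the thesis algorithms take this exact form.

```python
from math import hypot
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def mbr(points: List[Point]) -> Box:
    """Minimum bounding rectangle of an object's vertices."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def mbr_distance(a: Box, b: Box) -> float:
    """Distance between two rectangles (0 if they overlap); cheap to compute."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)
    return hypot(dx, dy)


def exact_distance(p: List[Point], q: List[Point]) -> float:
    """Simplified 'refined' test: minimum distance between any two vertices."""
    return min(hypot(px - qx, py - qy) for px, py in p for qx, qy in q)


def close_to(target: List[Point], candidates: Dict[str, List[Point]],
             threshold: float) -> Tuple[List[str], List[str]]:
    """Return (coarse candidates, refined result) for a 'close to' predicate."""
    target_box = mbr(target)
    # Coarse step: the MBR distance never exceeds the vertex distance,
    # so objects rejected here cannot satisfy the refined test.
    coarse = [name for name, pts in candidates.items()
              if mbr_distance(target_box, mbr(pts)) <= threshold]
    # Refined step: the expensive computation runs only on the survivors.
    refined = [name for name in coarse
               if exact_distance(target, candidates[name]) <= threshold]
    return coarse, refined


# Hypothetical toy geometries.
town = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = {
    "park": [(2.0, 0.0), (3.0, 0.0), (3.0, 1.0)],
    "lake": [(1.5, 5.0), (6.0, 0.5)],   # MBR is near the town, geometry is not
    "mine": [(10.0, 10.0), (11.0, 10.0), (11.0, 11.0)],
}

coarse, refined = close_to(town, features, threshold=1.5)
print(coarse, refined)
```

In this toy run the coarse step discards "mine" without any detailed computation, and the refined step then rejects "lake", whose bounding rectangle is near the town although its geometry is not.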
make every effort to add to your faith goodness; and to goodness, knowledge; and to knowledge, self-control; and to self-control, perseverance; and to perseverance, godliness; and to godliness, brotherly kindness; and to brotherly kindness, love. For if you possess these qualities in increasing measure, they will keep you from being ineffective and unproductive in your knowledge of our Lord Jesus Christ.
St. Peter on progressive refinement in 2 Peter 1:5-8
Acknowledgments
I have not stopped giving thanks for you, remembering you in my prayers. Ephesians 1:16¹
First, I thank Him through Whom all things are possible, I thank our Lord Jesus Christ. My work would not be possible without the gifts God gave me. This work was not by might nor by power, but by His Spirit² and to Him be the glory.
I wish to express my gratitude for the continuing support and guidance of my senior supervisor Dr. Jiawei Han. I thank him for his encouragement, confidence and the many fruitful meetings we had. His advice and example of perseverance and hard work guided me through my research. He is a person who always finds time in his busy schedule for discussions with me, who has shown me the ropes of scientific work and whose directions I will use in my further career. I am thankful to my supervisor Dr. Tiko Kameda, for his comments and advice that enabled me to make improvements to this thesis. I would like to thank Dr. Max Egenhofer and Dr. Tom Poiker for agreeing to serve as external examiners and for their remarks that helped me to better present the results of my research and improve the quality of the thesis.
My gratitude and appreciation go to my friends from the Intelligent Database Systems Laboratory, for the many discussions, help and support I have received from them. In particular I would like to thank Nebojša Stefanović, with whom I worked on the GeoMiner project, who worked together with me on many issues of spatial data mining and spatial OLAP, who proofread this thesis giving great feedback, and who shared many joyful moments during my years at SFU. I am grateful to other members of the team who made the current implementation of GeoMiner possible: Jenny Y. Tam for the design and implementation of the DBMiner OLAP module, on which GeoMiner's OLAP engine is built; Qing Chen and Wan Gong for the development of the classification module; Yijun Lu for the development of the YACC parser of GeoMiner; and Yongjian Fu for the initial development of the multilevel association rule mining module. Many thanks go to Anthony K.H. Tung and Kim Haas for the proofreading of this thesis and their remarks that enhanced the presentation.
I am grateful to Dr. Sławomir Pilarski for his guidance in the early stages of my stay in Canada. I would like to thank the staff and faculty of the School of Computing Science for all the help they provided. I am very grateful to my family in Poland, for their support and love. I hope they will be proud of my achievements, as proud as I am of them. I thank my spiritual family, my brothers and sisters in Christ for their prayers, continuous encouragement, love and the patience they showed me.
This work was supported in part by the research grant NSERC-OGP003723 from the Natural Sciences and Engineering Research Council of Canada, NCE/IRIS research grant from the Networks of Centres of Excellence of Canada, SFU Graduate Fellowships, SFU President's Ph.D. Stipend, BC Science Council GREAT Award, SFU Faculty of Applied Science Graduate Fellowship, and by Ebco/Epich Scholarships in Expert Systems funded by OPUS Building Corporation, Global (West) Wholesalers Ltd., and A.B.C. Recycling Ltd.
¹ Quotations are from THE HOLY BIBLE: NEW INTERNATIONAL VERSION. Copyright © 1973, 1978, 1984 by International Bible Society. Used by permission of Zondervan Publishing House. The "NIV" and "New International Version" trademarks are registered in the United States Patent and Trademark Office by International Bible Society.
² Zechariah 4:6.
Similar Resources
Interactive Poster: Progressive Information Presentation
An important aim of information visualization is the communication of characteristics of the data. Beside the exploration of relevant aspects, presentation of the findings is crucial. Due to the increasingly large data volumes, however, new strategies to avoid cluttered displays are necessary. Our approach makes use of progressive refinement to deskew information temporally. Moreover, we also a...
Progressive refinement: more than a means to overcome limited bandwidth
Progressive refinement is commonly understood as a means to solve problems imposed by limited system resources. In this publication, we apply this technology as a novel approach for information presentation and device adaptation. The progressive refinement is able to handle different kinds of data and consists of innovative ideas to overcome the multiple issues imposed by large data volumes. Th...
Evaluation of progressive treemaps to convey tree and node properties
In this paper we evaluate progressive treemaps. Progressive refinement has a long tradition in image communication, but is a relatively new approach for information presentation. Besides technical benefits it also promises to provide advantages important for the conveyance of data properties. In this first user study in this domain, we focus on the additional value of progressive refinement for...
Progressive Presentation of Large Hierarchies Using Treemaps
The presentation of large hierarchies is still an open research question. Especially, the time-consuming calculation of the visualization and the cluttered display lead to serious usability issues on the viewer side. Existing solutions mainly address appropriate visual representation and usually neglect considering system resources. We propose a holistic approach for the presentation of large h...
A progressive refinement approach for the visualisation of implicit surfaces
Visualising implicit surfaces with the ray casting method is a slow procedure. The design cycle of a new implicit surface is, therefore, fraught with long latency times as a user must wait for the surface to be rendered before being able to decide what changes should be introduced in the next iteration. In this paper, we present an attempt at reducing the design cycle of an implicit surface mod...
On-the-fly device adaptation using progressive contents
In this publication we propose a device adaptation approach based on progressive contents. Such representations are inherently scalable, created once, and multiply used for different kinds of device. Computationally inexpensive on-the-fly adaptation is achieved by a previewwise progressive data refinement with fully client-based resource assessment and estimation continuously predicting whether...