Redundancy Detection in Semistructured Case Bases

Authors

  • Kirsti Racine
  • Qiang Yang
Abstract

With the dramatic proliferation of case-based reasoning systems in commercial applications, many case bases are now becoming legacy systems. They represent a significant portion of an organization's assets, but they are large and difficult to maintain. One of the contributing factors is that these case bases are often large and yet unstructured or semistructured; they are represented in natural language text. Adding to the complexity is the fact that the case bases are often authored and updated by different people from a variety of knowledge sources, making it highly likely for a case base to contain redundant and inconsistent knowledge. In this paper, we present methods and a system for maintaining large and semistructured case bases. We focus on a difficult problem in case base maintenance: redundancy detection. This problem is particularly pervasive when one deals with a semistructured case base. We will discuss an information-retrieval-based algorithm and an implemented system for solving this problem. As the ability to contain the knowledge acquisition problem is of paramount importance, our method allows one to express relevant domain expertise for detecting redundancy naturally and effortlessly. Empirical evaluations of the system demonstrate the effectiveness of the methods in several large

Similar resources

Redundancy and Inconsistency Detection in Large and Semi-structured Case Bases

With the dramatic proliferation of case based reasoning systems in commercial applications, many case bases are now becoming legacy systems. They represent a significant portion of an organization’s assets, but they are large and difficult to maintain. One of the contributing factors is that these case bases are often large and yet unstructured or semi-structured; they are represented in natura...


Maintaining Unstructured Case Bases

With the dramatic proliferation of case based reasoning systems in commercial applications, many case bases are now becoming legacy systems. They represent a significant portion of an organization's assets, but they are large and difficult to maintain. One of the contributing factors is that these case bases are often large and yet unstructured; they are represented in natural language text. Adding to...


NF-SS: A Normal Form for Semistructured Schema

Semistructured data is becoming increasingly important for web applications with the development of XML and related technologies. Designing a “good” semistructured database is crucial to prevent data redundancy, inconsistency and undesirable updating anomalies. However, unlike relational databases, there is no normalization theory to facilitate the design of good semistructured databases. In th...


Designing Semistructured Databases Using ORA-SS Model

Semistructured data has become prevalent with the growth of the Internet. The development of new web applications that require efficient design and maintenance of large amounts of data makes it increasingly important to design “good” semistructured databases to prevent data redundancy and updating anomalies. However, it is not easy, even impossible, for current semistructured data models to cap...


Analytical Review of Test Redundancy Detection Techniques

This paper presents an analytical review of approaches used by different authors. Coverage information is very important for finding redundancy in test cases. Test redundancy detection reduces the costs of testing and maintenance of software. A redundant test case is a useless part of test suite and it increases the testing cost and test suite size. There are a lot of works that proposed differ...



Journal title:
  • IEEE Trans. Knowl. Data Eng.

Volume 13


Publication date: 2001