Advanced Information Access to Parliamentary Debates

Author

  • Maarten Marx
Abstract

Parliamentary debates are highly structured transcripts of meetings of politicians in parliament. These debates are an important part of the cultural heritage of many countries; they are often free of copyright; citizens often have a legal right to inspect them; and several countries make a great effort to digitize their entire historical collections and make them available to the general public. This provides many opportunities for the Information Retrieval community. In this paper, we analyze the structure of parliamentary proceedings and sketch a widely applicable DTD. We show how proceedings in PDF format can be transformed into deeply nested XML. Having the proceedings in XML makes a wide range of applications possible. We elaborate on five applications: entry point retrieval; advanced content and structure search; automatic creation of tables of contents and hyperlinked navigation menus; graphical result aggregation; and large savings on storage space and bandwidth for scanned documents.

Similar resources

Exemelification of Parliamentary Debates

Parliamentary debates are an interesting domain in which to apply state-of-the-art information retrieval technology. Parliamentary debates are highly structured transcripts of meetings of politicians in parliament. These debates are an important part of the cultural heritage of countries; they are often free of copyright; citizens often have a legal right to inspect them; and several countries make gre...

Bringing parliamentary debates to the Semantic Web

An analysis of parliamentary debates and media resources that cover them can provide insight into the political climate of a country. Although debates are now regularly published on official government portals, their analysis remains a cumbersome and challenging task for historians and political scientists. One of the main tasks of the PoliMedia project is to allow easy crossmedia comparisons a...

Information Retrieval based on Explicit Knowledge Representation

The tool which we tested in the present monolingual retrieval task, Lexware, is based on explicit knowledge representation not on statistic language modeling. In the present task Lexware indexing seems to be satisfactory while its query builder is not. The system has been tested extensively on indexing of Swedish parliamentary debates with very good results. We are happy that Swedish is finally...

Structural elements in achieving legislative tobacco control in NSW, 1960-1995: implications for the future

Objective: To analyse structural factors revealed by politicians that shaped legislation on tobacco control in New South Wales, 1955-1995. Methods: Parliamentary debates and other records were collected. Open-ended interviews were conducted with 17 of the Members of Parliament (MPs) and health advocates who were significantly involved, and analysed for structural elements. Results: Tobacco indu...

WikiCat Browser: Exploring the Data from Wikipedia Categories’ Point of View

In loose terms, the goal of this project is to develop an interactive tool that uses data annotations, such as extracted entities, to enable users to explore the data based on its relation to the concepts of an external knowledge base, namely Wikipedia. This system helps users dig into the data from their own perspective and explore which aspects of the concepts they are interested in are rel...

Journal:
  • J. Digit. Inf.

Volume 10

Publication date: 2009