Paper to Screen: Processing Historical Scans in the ADS

Authors

  • Donna M. Thompson
  • Alberto Accomazzi
  • Günther Eichhorn
  • Carolyn Stern-Grant
  • Edwin A. Henneken
  • Michael J. Kurtz
  • Elizabeth Bohlen
  • Stephen S. Murray
Abstract

The NASA Astrophysics Data System (ADS), in conjunction with the Wolbach Library at the Harvard-Smithsonian Center for Astrophysics, is working on a project to microfilm historical observatory publications. The microfilm is then scanned for inclusion in the ADS. The ADS currently contains over 700,000 scanned pages of historical literature. Many of these volumes lack the clear pagination or other bibliographic data needed to take advantage of the searching capabilities of the ADS. This paper addresses some of the interesting challenges that had to be resolved during the processing of the Observatory Reports included in the ADS.

Brief overview of the process of metadata capture

To make the scanned publications usable with the sophisticated searching capabilities of the ADS, page numbers and article metadata (e.g. title, author, beginning and ending pages) must be generated. A software tool to aid in the capture of these metadata has been developed and is available on the ADS web site. A number of volunteers have worked with this interface and have generated the data for approximately 450 volumes of 44 different titles. The observatory publications that have been processed and are now searchable with the ADS include selections from the following: Annals of Harvard College Observatory; Astronomical and Meteorological Observations made at the U.S. Naval Observatory; Beobachtungs-Ergebnisse der Königlichen Sternwarte zu Berlin; and Contributions from the Rutherford Observatory of Columbia University New York.

The metadata must be captured in two stages. In the first stage, page numbering mode, the scans are viewed sequentially, a page number is assigned to each image, and duplicate scans are marked for removal. (See Figure 1.) Once this stage is complete, the second stage, article metadata mode, can begin. In this stage the images are shown with their assigned page numbers, and the article information (author, title, first/last page, and abstract) can be entered. (See Figure 2.) Once these data have been checked and processed further by the ADS staff, the articles in the collection become searchable using the ADS.

Many volunteers from around the world have participated in this project. As of June 2006, more than 120 people had signed up as contributors. A Smithsonian ...
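The two-stage capture described in the abstract can be pictured with a small data model. The following is only a minimal sketch in Python, not the ADS tool itself: the names `Scan`, `Article`, `number_pages`, and `article_pages`, as well as the file names, are hypothetical and chosen purely for illustration. It assumes page labels are free-form strings (so that roman numerals or unnumbered plates can be represented) and that stage two may only refer to pages that stage one has already labelled.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Scan:
    """One scanned image (hypothetical model, not the ADS schema)."""
    image: str                  # scan file name
    page: Optional[str] = None  # page label assigned in stage 1
    duplicate: bool = False     # marked for removal in stage 1


@dataclass
class Article:
    """Article metadata entered in stage 2."""
    title: str
    authors: List[str]
    first_page: str
    last_page: str
    abstract: str = ""


def number_pages(scans: List[Scan], labels: List[str],
                 duplicates: Iterable[str] = ()) -> List[Scan]:
    """Stage 1: walk the scans in order, mark duplicates, label the rest."""
    dup = set(duplicates)
    kept = [s for s in scans if s.image not in dup]
    if len(kept) != len(labels):
        raise ValueError("need exactly one label per non-duplicate scan")
    for s in scans:
        s.duplicate = s.image in dup
    for s, label in zip(kept, labels):
        s.page = label
    return kept


def article_pages(pages: List[Scan], article: Article) -> List[str]:
    """Stage 2 check: return the page labels an article spans."""
    labels = [s.page for s in pages]
    if article.first_page not in labels or article.last_page not in labels:
        raise ValueError("article refers to a page that was never numbered")
    start = labels.index(article.first_page)
    end = labels.index(article.last_page)
    if start > end:
        raise ValueError("first page comes after last page")
    return labels[start:end + 1]
```

Keeping the stages as separate functions mirrors the ordering constraint in the text: article metadata can only be validated against page labels that already exist.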




Journal:
  • CoRR

Volume: abs/cs/0610030

Publication date: 2006