Data Provenance and Management in Radio Astronomy: A Stream Computing Approach

Authors

  • Mahmoud S. Mahmoud
  • Andrew Ensor
  • Alain Biem
  • Bruce G. Elmegreen
  • Sergei Gulyaev
Abstract

New approaches for data provenance and data management (DPDM) are required for mega science projects like the Square Kilometer Array, which are characterized by extremely large data volumes and intense data rates and therefore demand innovative and highly efficient computational paradigms. In this context, we explore a stream-computing approach with an emphasis on the use of accelerators. In particular, we make use of a new generation of high-performance stream-based parallelization middleware known as InfoSphere Streams. Its viability for managing and ensuring interoperability and integrity of signal processing data pipelines is demonstrated in radio astronomy. IBM InfoSphere Streams embraces the stream-computing paradigm, a shift from conventional data mining techniques (involving analysis of existing data from databases) towards real-time analytic processing. We discuss using InfoSphere Streams for effective DPDM in radio astronomy and propose a way in which InfoSphere Streams can be utilized for large antenna arrays. We present a case study, the InfoSphere Streams implementation of an autocorrelating spectrometer, and using this example we discuss the advantages of the stream-computing approach and the utilization of hardware accelerators.

Author affiliations: Mahmoud S. Mahmoud, Andrew Ensor and Sergei Gulyaev, AUT Institute for Radio Astronomy & Space Research, Auckland, NZ; Alain Biem and Bruce G. Elmegreen, IBM T J Watson Research Center, Yorktown Heights, NY. arXiv:1112.2584v1 [cs.DC], 12 Dec 2011.

Similar resources

Real-Time Building Information Modeling (BIM) Synchronization Using Radio Frequency Identification Technology and Cloud Computing System

The online observation of a construction site and its processes offers significant advantages to all business sectors. BIM is the combination of a 3D model of the project and a project-planning program, which improves the project planning model by up to 6D (adding Time, Cost and Material Information dimensions to the model). RFID technology is an appropriate information synchronization tool between the...

Supporting On-the-fly Provenance Tracking in Stream Processing Systems

A new class of data management systems that operate on high-volume streaming data is becoming increasingly important. As this kind of system has to process unpredictable streaming data in real time and deliver instantaneous responses, it becomes very difficult to precisely validate stream processing results in a timely manner, verify stream computation that took place and investigate processing s...

Assessing the Trustworthiness of Streaming Data

The notion of confidence policy is a novel notion that exploits the trustworthiness of data items in data management and query processing. In this paper we address the problem of enforcing confidence policies in data stream management systems (DSMSs), which is crucial in supporting users with different access rights, processing confidence-aware continuous queries, and protecting the secure streami...

Efficient Stream Provenance via Operator Instrumentation

Managing fine-grained provenance is a critical requirement for data stream management systems (DSMS), not only to address complex applications that require diagnostic capabilities and assurance, but also to provide advanced functionality such as revision processing or query debugging. This paper introduces a novel approach that uses operator instrumentation, i.e., modifying the behavior of o...

The Case for Fine-Grained Stream Provenance

The current state of the art for provenance in data stream management systems (DSMS) is to provide provenance at a high level of abstraction (such as from which sensors in a sensor network an aggregated value is derived). This limitation was imposed by high-throughput requirements and an anticipated lack of application demand for more detailed provenance information. In this work, we firs...

Journal:
  • CoRR

Volume: abs/1112.2584

Publication date: 2011