Automating data-model workflows at a level 12 HUC scale: Watershed modeling in a distributed computing environment
Abstract
The prototype discussed in this article retrieves Essential Terrestrial Variable (ETV) web services and uses data-model workflows to transform ETV data for hydrological models in a distributed computing environment. The ETV workflow is a service layer to hundreds of terabytes of national datasets bundled for fast data access in support of watershed modeling at the United States Geological Survey (USGS) Hydrological Unit Code (HUC) level-12 scale. The ETV data have been proposed as the Essential Terrestrial Data necessary to construct watershed models anywhere in the continental USA (Leonard and Duffy, 2013). Here, we present the hardware and software system designs to support the ETV, data-model, and model workflows using High Performance Computing (HPC) and service-oriented architecture. This infrastructure design is an important contribution to both how and where the workflows operate. We describe details of how these workflow services operate in a distributed manner for modeling CONUS HUC-12 catchments using the Penn State Integrated Hydrological Model (PIHM) as an example. The prototype is evaluated by generating data-model workflows for every CONUS HUC-12 and creating a repository of workflow provenance for every HUC-12 (~100 km²) that researchers can use as a starting point for a new hydrological model study. The concept of provenance for data-model workflows developed here assures reproducibility of model simulations (e.g., reanalysis) from ETV datasets without storing model results, which we have shown would require many petabytes of storage.
© 2014 Elsevier Ltd. All rights reserved.
Software availability
Name: HydroTerre
Developer: Lorne Leonard, Department of Civil Engineering & Penn State Institutes of Energy and the Environment, The Pennsylvania State University
Contact information: Christopher J. Duffy & Lorne Leonard, Department of Civil & Environmental Engineering, The Pennsylvania State University, 212 Sackett Building, University Park, PA 16802, USA
Software required: Internet browser (later versions are recommended)
Program language: C++, C#, Microsoft SQL, ArcGIS, Silverlight, COM, HTML, JavaScript
Availability and cost: Any user can access HydroTerre web applications at no cost at: http://www.hydroterre.psu.edu
[email protected] (C.J. Duffy).
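The provenance idea in the abstract (keep a record of how a simulation was produced rather than the simulation output itself) can be illustrated with a minimal Python sketch. Nothing here is taken from HydroTerre: the `WorkflowProvenance` class, its field names, the dataset labels, and the example HUC-12 code are all hypothetical placeholders chosen for illustration. The sketch only shows that a small, deterministic record per HUC-12 is enough to identify a workflow so that it can be re-run on demand.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class WorkflowProvenance:
    """Hypothetical provenance record for one HUC-12 data-model workflow."""

    huc12: str             # 12-digit hydrologic unit code
    etv_datasets: tuple    # labels of the ETV inputs used (placeholders)
    model: str             # model name, e.g. "PIHM"
    model_version: str     # placeholder version label
    parameters: tuple = () # (name, value) pairs, kept ordered and immutable

    def __post_init__(self):
        # A level-12 HUC is written with exactly 12 digits.
        if len(self.huc12) != 12 or not self.huc12.isdigit():
            raise ValueError(f"not a 12-digit HUC code: {self.huc12!r}")

    def fingerprint(self) -> str:
        """Return a SHA-256 digest of the record's canonical JSON form."""
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Illustrative values only; not real HydroTerre identifiers or settings.
record = WorkflowProvenance(
    huc12="020503010101",
    etv_datasets=("elevation", "soils", "land_cover", "forcing"),
    model="PIHM",
    model_version="example-version",
    parameters=(("simulation_years", 1),),
)
print(record.fingerprint())
```

Two records with identical fields always produce the same fingerprint, so a repository of such records can serve as an index of reproducible workflows while taking up only a few hundred bytes per catchment.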
Related resources
Automating Data-Model Workflows at a Level-12 HUC Scale in a Distributed Computing Environment
The HydroTerre web services provide the Essential Terrestrial Variable (ETV) datasets to create common hydrological models anywhere in the continental United States (CONUS). These services allow web users to download data for their own purposes in their own computing environment. The datasets are provided using standard Geographic Information System formats and the data transformation is depend...
Runoff and sediment yield modeling using WEPP in a semi-arid environment (Case study: Orazan Watershed)
Water erosion is a major environmental problem in many parts of the world. The majority of semi-arid countries are affected because of their specific climate and soil sensitivity, and also because of the recent intensification of human activities and agricultural practices. Accurate estimation of water erosion for various land-use and climate scenarios is therefore an important key to define sustainable...
The flow hydrograph modeling using GIS and distributed hyrdological model, in Dinvar Watershed, Karkheh, Iran
In this paper, the modeling is based on division of the catchment into a grid mesh. Each cell has a unique response function independent of the functioning of other cells. Summing the flow responses from the cells gives the flow hydrograph for the basin. A method is presented to simulate the flow hydrograph within a river basin using the hydrological model WetSpa. WetSpa is a GIS-b...
A Graphical Modeling Environment for the Generation of Workflows for the Globus Toolkit
Grid computing aims at managing resources in a heterogeneous distributed environment. The Globus Toolkit provides a set of components that can be used to build applications that function in a grid computing system. Presently, applications are typically handcrafted either by using an Application Programming Interface (API) interacting through a set of command line interfaces, or by using a set o...
Improving the palbimm scheduling algorithm for fault tolerance in cloud computing
Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...
Journal title: Environmental Modelling and Software
Volume: 61
Publication date: 2014