High Performance I/O and Data Management
Abstract
A library for parallel IO and data management has been developed for large-scale multi-physics simulations. The goal of the library is to provide sustainable, interoperable, efficient, scalable, and convenient tools for parallel IO and data management for high-level data structures in applications, and to provide tools for connecting applications. The high-level data structures include one- and multi-dimensional arrays, structured meshes, unstructured meshes, and meshes generated through adaptive mesh refinement. The IO mechanism can be collective or noncollective. The library is suitable for both large and small data sets. Even for small data sets, the IO performance is close to that of MPI-IO.
Keywords: IO, data structure, data management, high performance.
Similar resources
High-performance scientific data management system
Many scientific applications have large I/O requirements, in terms of both the size of data and the number of files or data sets. Management, storage, efficient access, and analysis of this data present an extremely challenging task. Traditionally, two different solutions have been used for this task: file I/O or databases. File I/O can provide high performance but is tedious to use with large ...
Flexible and Efficient Parallel I/O for Large-Scale Multi-Component Simulations
In this paper, we discuss our experience of providing high performance parallel I/O for a large-scale, on-going, multi-disciplinary simulation project for solid propellant rockets. We describe the performance and data management issues observed in this project and present our solutions, including (1) support for relatively fine-grained distribution of irregular datasets in parallel I/O, (2) a f...
I/O Optimization and Evaluation for Tertiary Storage Systems
Large-scale parallel scientific applications are generating such huge amounts of data that tertiary storage systems have emerged as a popular place to hold it. SRB, a uniform interface to various storage systems, including tertiary storage systems such as HPSS and UniTree, has become an important and convenient way to access tertiary data across networks in a distributed environment. But SRB is not optimi...
Operating System Enhancements for Data-Intensive Server Systems
Recent studies on operating system support for concurrent server systems mostly target CPU-intensive workloads with light disk I/O activities. However, an important class of server systems that access a large amount of disk-resident data, such as the index searching server of large-scale Web search engines, has received limited attention. In this thesis work, we examine operating system techniq...
Parallel I/O Scheduling and Buffer Management
Parallel I/O systems are an integral component of modern high performance systems, providing large secondary storage capacity, and having the potential to alleviate the I/O bottleneck of data intensive applications. In these systems the I/O buffer can be used for two purposes: (a) improve I/O parallelism by buffering prefetched blocks and making the load on disks more uniform, and (b) improve I/...
I/O-aware bandwidth allocation for petascale computing systems
In the Big Data era, the gap between the storage performance and an application’s I/O requirement is increasing. I/O congestion caused by concurrent storage accesses from multiple applications is inevitable and severely harms the performance. Conventional approaches either focus on optimizing an application’s access pattern individually or handle I/O requests on a low-level storage layer withou...