Designing multicore scalable filesystems with durability and crash consistency
Abstract
It is challenging to simultaneously achieve multicore scalability and high disk throughput in a file system. For example, data structures that are on separate cache lines in memory (e.g., directory entries) are grouped together in a transaction log when the file system writes them to disk. This grouping results in cache line conflicts, thereby limiting scalability. McoreFS is a novel file system design that decouples the in-memory file system from the on-disk file system using per-core operation logs. This design facilitates the use of highly concurrent data structures for the in-memory representation, which allows commutative operations to proceed without conflicts and hence scale perfectly. McoreFS logs operations in a per-core log so that it can delay propagating updates to the disk representation until an fsync. The fsync call merges the per-core logs and applies the operations to disk. McoreFS uses several techniques to perform the merge correctly while achieving good performance: timestamped linearization points to order updates without introducing cache line conflicts, absorption of logged operations, and dependency tracking across operations. Experiments with a prototype of McoreFS show that its implementation is conflict-free for 99% of test cases involving commutative operations generated by Commuter, scales well on an 80-core machine, and provides disk performance that matches or exceeds that of Linux ext4.
Thesis Supervisor: M. Frans Kaashoek
Title: Charles Piper Professor of Electrical Engineering and Computer Science
Thesis Supervisor: Nickolai Zeldovich
Title: Associate Professor of Electrical Engineering and Computer Science
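The per-core logging and merge-on-fsync idea described in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the McoreFS implementation: the names `Op`, `OpLogs`, `log`, and `fsync` are hypothetical, the "disk" is a plain set of file names, only `create` and `unlink` are modeled, and a shared Python counter stands in for the timestamps (a real design would read a synchronized per-core hardware counter, since a shared counter is itself a contended cache line). Dependency tracking is not modeled.

```python
import heapq
import itertools
from dataclasses import dataclass, field

# Assumption: stand-in for a synchronized timestamp source.
_clock = itertools.count()


@dataclass(order=True)
class Op:
    ts: int                              # timestamp taken at the linearization point
    kind: str = field(compare=False)     # "create" or "unlink"
    name: str = field(compare=False)     # file name the operation acts on


class OpLogs:
    """Toy model: one in-memory operation log per core, merged on fsync."""

    def __init__(self, ncores):
        self.logs = [[] for _ in range(ncores)]
        self.disk = set()                # hypothetical on-disk namespace

    def log(self, core, kind, name):
        # Appending to the core's own log touches no other core's state.
        self.logs[core].append(Op(next(_clock), kind, name))

    def fsync(self):
        # Each per-core log is already in timestamp order, so a k-way
        # merge yields one globally ordered sequence of operations.
        merged = list(heapq.merge(*self.logs))
        for per_core in self.logs:
            per_core.clear()

        # Absorption: an unlink cancels an earlier create of the same
        # name in this batch, so neither has to reach the disk.
        pending = []
        for op in merged:
            if op.kind == "unlink":
                match = next(
                    (i for i in range(len(pending) - 1, -1, -1)
                     if pending[i].kind == "create" and pending[i].name == op.name),
                    None,
                )
                if match is not None:
                    del pending[match]
                    continue
            pending.append(op)

        # Apply the surviving operations to the on-disk state in order.
        for op in pending:
            if op.kind == "create":
                self.disk.add(op.name)
            else:
                self.disk.discard(op.name)
        return pending


fs = OpLogs(ncores=2)
fs.log(0, "create", "a")
fs.log(1, "create", "tmp")
fs.log(0, "unlink", "tmp")   # absorbed together with the create of "tmp"
fs.log(1, "create", "b")
applied = fs.fsync()
```

In this run, the create and unlink of `tmp` are logged on different cores yet still cancel after the merge, so only the creations of `a` and `b` are applied to the modeled disk.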
Related resources
vDrive: An Efficient and Consistent Virtual I/O System
The most popular methods for managing storage and providing crash consistency are I/O virtualization and journaled filesystems respectively. This popularity is due to their widespread use in production environments. However, both of these methods have evolved separately in different contexts in the past. This paper presents a first look at providing crash consistency for virtual I/O caches thro...
Fast Databases with Fast Durability and Recovery Through Multicore Parallelism
Multicore in-memory databases for modern machines can support extraordinarily high transaction rates for online transaction processing workloads. A potential weakness, however, is recovery from crash failures. Can classical techniques, such as checkpoints, be made both efficient enough to keep up with current systems’ memory sizes and transaction rates, and smart enough to avoid additional cont...
Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems
Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate and flexible communications for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...
Durability and Crash Recovery in Distributed In-memory Storage Systems a Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
This dissertation presents fast crash recovery for the RAMCloud distributed in-memory data center storage system. RAMCloud is designed to operate on thousands or tens-of-thousands of machines, and it stores all data in DRAM. Rather than replicating in DRAM for redundancy, it provides inexpensive durability and availability by recovering quickly after server crashes. Overall, its goal is to reco...
Scalable Database Logging for Multicores
Modern databases, guaranteeing atomicity and durability, store transaction logs in a volatile, central log buffer and then flush the log buffer to non-volatile storage by the write-ahead logging principle. Buffering logs in a central log store has recently faced a severe multicore scalability problem, and log flushing has been challenged by synchronous I/O delay. We have designed and implemented ...