A Parallel File System for Petascale Computing.
GSoC '08 Project Ideas
The Google Summer of Code program lets students spend the summer contributing to open source projects, working on ideas that are useful to the community. We provide a number of ideas for the PVFS open source project:
PVFS over ZFS
PVFS provides a well-defined object storage interface for backend server storage that enables high-performance access to metadata and data. Traditionally, the storage backends have been built on disk filesystems (such as ext3 or XFS) using system calls. This encapsulates the storage behind a standard interface, but often at the cost of performance and transactional guarantees.
The open source ZFS software provides a transactional object interface to storage through its DMU (Data Management Unit).
This project consists of implementing the PVFS storage interface to the ZFS DMU code, providing high-performance transactional semantics for stored objects in a parallel filesystem.
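To make "transactional semantics for stored objects" concrete, here is a minimal in-memory sketch of an object store whose writes are staged in a transaction and applied all-or-nothing at commit. It is a toy under stated assumptions: `ObjectStore`, `Transaction`, and their methods are hypothetical names chosen for illustration, and they are neither the PVFS storage interface nor the ZFS DMU API.

```python
class ObjectStore:
    """Toy in-memory object store (hypothetical; not PVFS or ZFS code)."""

    def __init__(self):
        self._objects = {}  # object handle -> bytearray of contents

    def begin(self):
        """Start a transaction that stages writes until commit."""
        return Transaction(self)

    def read(self, handle, offset, length):
        """Return up to `length` bytes of an object, starting at `offset`."""
        data = self._objects.get(handle, b"")
        return bytes(data[offset:offset + length])


class Transaction:
    """Groups writes to several objects so they apply together or not at all."""

    def __init__(self, store):
        self._store = store
        self._writes = []  # staged (handle, offset, data) tuples

    def write(self, handle, offset, data):
        """Stage a write; nothing becomes visible until commit()."""
        self._writes.append((handle, offset, bytes(data)))

    def commit(self):
        """Apply every staged write, or none if any of them is invalid."""
        updated = {}  # private copies of the objects this transaction touches
        for handle, offset, data in self._writes:
            if offset < 0:
                raise ValueError("negative offset")
            buf = updated.get(handle)
            if buf is None:
                buf = bytearray(self._store._objects.get(handle, b""))
                updated[handle] = buf
            end = offset + len(data)
            if len(buf) < end:
                buf.extend(b"\0" * (end - len(buf)))  # zero-fill any gap
            buf[offset:end] = data
        # Reached only if every write validated: publish all copies at once.
        self._store._objects.update(updated)
        self._writes.clear()

    def abort(self):
        """Discard all staged writes."""
        self._writes.clear()
```

Because commit works on private copies and publishes them in a single step, a failure part-way through leaves the store untouched. That is the property a metadata update and its matching data update would rely on.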
Short FUSE: High performance modifications to FUSE
PVFS provides a Linux kernel module that integrates with the Linux kernel VFS layer. It is designed and implemented to provide appropriate semantics for distributed and parallel applications, as well as high-performance I/O between the kernel and userspace.
FUSE is an interface that allows filesystem drivers to be written in userspace, removing the need for a kernel VFS implementation, which can be difficult both to implement and to manage. The current FUSE implementation provides neither good performance for large I/O requests nor appropriate semantic constraints for distributed filesystems, so its use in the distributed computing community has been limited.
This project idea consists of incorporating the semantic constraints and high-performance design concepts from the PVFS kernel VFS implementation into FUSE, so that it can be used by high-performance filesystems such as Ceph and PVFS.
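One way to see why large I/O requests can be expensive in a userspace filesystem is to count kernel-to-userspace round trips. The sketch below assumes, purely for illustration, that each round trip carries at most a fixed number of bytes. The text above does not say this is the cause of the FUSE limitation, and the sizes used are hypothetical rather than measured FUSE limits.

```python
def round_trips(io_size: int, max_request: int) -> int:
    """Number of kernel<->userspace requests needed to move io_size bytes
    when a single request can carry at most max_request bytes."""
    if io_size < 0 or max_request <= 0:
        raise ValueError("io_size must be >= 0 and max_request must be > 0")
    return -(-io_size // max_request)  # ceiling division on integers


KIB = 1024
MIB = 1024 * KIB

# Hypothetical per-request caps, chosen only to show the scaling.
small_cap = round_trips(4 * MIB, 4 * KIB)    # many small requests
large_cap = round_trips(4 * MIB, 128 * KIB)  # fewer, larger requests
```

Under these assumed caps, a 4 MiB write takes 1024 round trips at 4 KiB per request and 32 at 128 KiB. Each round trip adds a context switch and a copy, so a larger request size lowers the fixed overhead per byte.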
PVFS over SCTP
PVFS provides a high-performance networking abstraction API (which we call BMI) that supports common HPC networking fabrics, such as InfiniBand, Myrinet, and TCP/IP. BMI is message-oriented rather than byte-oriented, so it is a better match for the message-oriented design of the new SCTP transport protocol.
This project idea consists of implementing a BMI backend method for SCTP. This will allow commodity clusters with Ethernet networks to take advantage of the performance and reliability benefits of the SCTP protocol for access to storage.
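The difference between message-oriented and byte-oriented transports can be sketched in a few lines. A byte stream such as TCP does not preserve message boundaries, so a message-oriented layer on top of it has to add its own framing and reassembly. SCTP delivers whole messages, so that step is not needed there. The length-prefix format and the names `frame` and `StreamReassembler` below are assumptions made for illustration; they are not BMI's actual wire format or API.

```python
import struct

HEADER = struct.Struct("!I")  # hypothetical 4-byte big-endian length prefix


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so it can be recovered from a stream."""
    return HEADER.pack(len(message)) + message


class StreamReassembler:
    """Rebuilds whole messages from arbitrarily sized byte-stream chunks."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list:
        """Add received bytes and return every message completed so far."""
        self._buf.extend(chunk)
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf, 0)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # body not fully received yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]  # drop the consumed frame
        return messages


# Usage: three messages arrive as 3-byte chunks that ignore message boundaries.
stream = frame(b"hello") + frame(b"") + frame(b"world!")
reassembler = StreamReassembler()
received = []
for i in range(0, len(stream), 3):
    received.extend(reassembler.feed(stream[i:i + 3]))
```

After the loop, `received` holds the three original messages in order. On a transport that preserves message boundaries, each receive would already return one complete message, and the buffering logic above would be unnecessary.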