HDF5: State of the Union
Authors: Mr. Quincey Koziol (Lawrence Berkeley National Laboratory)
Abstract: A forum for HDF developers and users to interact. HDF developers will describe the current status of HDF5 and discuss future plans – including connectors for Big Data and Cloud – followed by an open discussion and our first annual community award for the best use of HDF5 technology.
Long Description: HDF5 is an open-source technology suite for managing complex and high-volume data in heterogeneous computing and storage environments. HDF5 is the latest evolution of nearly 30 years of refinement across diverse fields, e.g. aerospace, energy, finance, oil & gas, and biotech. Every major HPC vendor includes the HDF5 suite in its core software because of its broad adoption by scientists and researchers in government, academia, and industry. Thanks to our passionate and dedicated community, with over 650 projects on GitHub, HDF5 is supported in nearly every programming language imaginable.
HDF5 includes: (1) a versatile, self-describing data model that can represent very complex data objects, relationships, and metadata; (2) a completely portable binary file format with no limit on the number or size of data objects; (3) a software library optimized for high-speed and compact reads and writes; and (4) tools for managing, manipulating, viewing, and analyzing HDF5 data.
We're excited to share the latest advances in HDF5, as well as discuss future directions in supporting the Cloud (e.g. AWS S3) and Big Data (e.g. Spark). Ample time will be allowed for questions and discussion, focusing on feedback from our community: successes, challenges, and requests.