If things work well, the data and metadata standards used will rarely be seen by end users, but they will enable the storage of data within data server instances and the seamless transfer of data into or out of these systems. Flexible standards must use formats that are accessible from many programming languages: C, C++, and Fortran for quantum chemistry calculations on supercomputers and other high-performance computing resources; Python for data analysis; JavaScript/TypeScript in web frontends; and C/C++ in desktop applications. They should also suit the needs of persistent data publication, independent of any particular database technology or programming language.
These considerations led to the choice of JavaScript Object Notation (JSON) as a core standard for data and metadata, with a view to using related technologies such as JSON-LD (JSON for Linked Data) where appropriate. Large data sets are better stored in binary formats; HDF5 was seen as one strong contender, and more recently MessagePack, owing to its JSON-like structure and the wide language support that its simple binary specification makes possible. JSON also maps naturally onto BSON and jsonb, the binary JSON representations used in MongoDB and PostgreSQL respectively.
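As a rough illustration of why JSON travels well between languages, the following minimal Python sketch builds a small molecular record and round-trips it through JSON text using only the standard library. The key names follow the general shape of Chemical JSON as we understand it and should be read as assumptions rather than a normative schema; the coordinates are approximate, illustrative values for water, and the helper \texttt{positions} is hypothetical.

```python
import json

# Hypothetical record in the general shape of Chemical JSON: flat arrays of
# atomic numbers and 3D coordinates (x, y, z per atom, in angstroms).
# Key names and coordinate values are illustrative assumptions.
water = {
    "chemical json": 0,
    "name": "water",
    "atoms": {
        "elements": {"number": [8, 1, 1]},
        "coords": {
            "3d": [
                0.0, 0.0, 0.117,
                0.0, 0.757, -0.469,
                0.0, -0.757, -0.469,
            ]
        },
    },
    "bonds": {
        # Pairs of atom indices: O-H and O-H.
        "connections": {"index": [0, 1, 0, 2]},
        "order": [1, 1],
    },
}


def positions(record):
    """Group the flat coordinate array into one (x, y, z) tuple per atom."""
    flat = record["atoms"]["coords"]["3d"]
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


# Serialize to text and parse it back; the structure survives unchanged.
text = json.dumps(water, indent=2)
restored = json.loads(text)
```

The same text could be parsed by any language with a JSON library, and the nested dictionary-and-array structure is the kind of data that binary encodings such as MessagePack or BSON are designed to carry.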
Figures with examples of CJSON
To share chemical data effectively, we must establish data and metadata standards capable of representing everything we wish to communicate. Further, these standards must offer routes to extension without causing breakage and churn in existing data. Ideally, communities should form to establish best practices and propagate them to a number of codes, proving viability and building a body of work that demonstrates the advantages of the approaches shown. Existing formats in use include XYZ, SDF, XML-based formats such as CML \cite{Phadungsukanan2012,Murray-Rust2011,Murray-Rust2011a,de2013}, and more recently JSON-based formats such as Chemical JSON \cite{Hanwell2017}.
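One way to picture extension without breakage is a reader that works with the keys it understands and carries any unfamiliar keys through untouched. The sketch below is a minimal illustration under that assumption, not a description of how any existing code behaves; the set \texttt{KNOWN}, both helper functions, and the \texttt{vibrations} key with its value are hypothetical.

```python
import json

# Hypothetical set of top-level keys that an older tool understands.
KNOWN = {"name", "atoms", "bonds"}


def split(record):
    """Separate the keys this tool understands from those it does not."""
    known = {k: v for k, v in record.items() if k in KNOWN}
    extra = {k: v for k, v in record.items() if k not in KNOWN}
    return known, extra


def roundtrip(text):
    """Read a record, then write it back without dropping unknown keys."""
    known, extra = split(json.loads(text))
    # An older tool would operate on `known` only; `extra` is preserved as-is.
    return json.dumps({**known, **extra}, sort_keys=True)


# A newer producer adds a key that the older tool has never seen.
source = json.dumps({
    "name": "water",
    "atoms": {"elements": {"number": [8, 1, 1]}},
    "vibrations": {"frequencies": [1595.0]},  # illustrative value only
})
result = roundtrip(source)
```

Because unknown keys are kept rather than rejected, data written by newer tools remains readable by older ones, which is one plausible route to extending a format without churn in existing files.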