# Creators
Moritz Thuerlemann, Sereina Riniker, Computational Chemistry Group, ETH Zurich

# License
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

# Dataset with Energies, Gradients and Atomic Properties of Small Molecules
Dataset used to train the multipole model and the intramolecular potential in the ANA2B publication. The dataset was created with PSI4 (1.4) using PBE0-D3BJ/def2-TZVP.

Packages used for DFT calculations:
- PSI4 (1.4)

Packages used to save the dataset:
- h5py (3.9)

## Structure
Each molecule can be accessed by its hash. For each molecule, per-atom properties are stored as arrays with leading dimensions [BxN], where B is the number of conformations and N the number of atoms; 'elements' ([N]) and 'energy' ([B]) are the exceptions. Units are given in square brackets after the description; A denotes angstrom and [1] a dimensionless quantity.

SYSTEM_NAME # Hash identifier of the system
- 'elements': [N], array(bytes) # Elements of the molecule.
- 'coordinates': [BxNx3], array(float32) # Coordinates [A] of B conformations.
- 'gradients': [BxNx3], array(float32) # Gradients [kJ/(mol A)] of B conformations.
- 'energy': [B], array(float32) # Energy [kJ/mol] of B conformations.
- 'ratios': [BxNx1], array(float32) # MBIS atomic volume ratios [1] of B conformations.
- 'monos': [BxNx1], array(float32) # MBIS monopoles [e] of B conformations.
- 'dipos': [BxNx3], array(float32) # MBIS dipoles [eA] of B conformations.
- 'quads': [BxNx3x3], array(float32) # MBIS traceless quadrupoles [eAA] of B conformations.

## Access
The data can be read with h5py:

```
import h5py

# Open the file read-only; the context manager closes it on exit.
with h5py.File('dataset_intra.hdf5', 'r') as f:
    for SYSTEM_KEY in f:  # one group per molecule hash
        for ENTRY_KEY in f[SYSTEM_KEY]:  # 'elements', 'coordinates', ...
            ...
            array = f[SYSTEM_KEY][ENTRY_KEY][()]  # read the full dataset into a NumPy array
```
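
To work with a single molecule rather than looping over the whole file, the layout described under Structure can be wrapped in a small helper. The following is a minimal sketch, not part of the dataset's own tooling: the function name `load_molecule`, the shape table `PER_ATOM_TRAILING`, and the extra `forces` entry are assumptions introduced here for illustration. It reads all entries of one system, decodes the byte-valued element symbols, checks each array against the documented shape, and adds forces as the negative of the stored gradients.

```python
import h5py
import numpy as np

# Trailing dimensions after [B, N] for each per-atom property,
# taken from the list under "Structure".
PER_ATOM_TRAILING = {
    "coordinates": (3,),
    "gradients": (3,),
    "ratios": (1,),
    "monos": (1,),
    "dipos": (3,),
    "quads": (3, 3),
}


def load_molecule(path, system_key):
    """Read all entries of one molecule into a dict and check their shapes.

    Hypothetical helper for illustration only.
    """
    with h5py.File(path, "r") as f:
        group = f[system_key]
        # [()] copies each dataset into memory, so the dict stays
        # valid after the file is closed.
        data = {key: group[key][()] for key in group}

    # 'elements' is stored as bytes; decode to plain Python strings.
    data["elements"] = [
        e.decode() if isinstance(e, bytes) else str(e) for e in data["elements"]
    ]

    n_atoms = len(data["elements"])
    n_conf = data["energy"].shape[0]
    for key, trailing in PER_ATOM_TRAILING.items():
        expected = (n_conf, n_atoms) + trailing
        if data[key].shape != expected:
            raise ValueError(
                f"{key}: expected shape {expected}, got {data[key].shape}"
            )

    # Forces are the negative of the energy gradients (same units).
    data["forces"] = -np.asarray(data["gradients"])
    return data
```

A call such as `mol = load_molecule('dataset_intra.hdf5', key)`, with `key` taken from iterating over the open file, then gives `mol['coordinates'][i]` as the geometry of conformation `i` and `mol['energy'][i]` as its energy.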