
Equivariant potentials are the (relatively) new kid on the block, with promisingly high accuracy in published benchmarks. One of them is MACE, which we have now added to the zoo of machine learning potentials (MLPs) available through the interfaces in MLatom. See the figure above for an overview of the MLPs supported by MLatom (in bold) and other representatives (modified from our MLP benchmark paper). We have just released MLatom 3.1.0 with MACE support and show how to use it here.
Installation
pip install mlatom
git clone https://github.com/ACEsuit/mace.git
pip install ./mace
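After installation, a quick way to confirm that both packages are visible to your Python interpreter is to look them up without importing them. The sketch below assumes the packages are importable under the module names mlatom and mace; the helper is_installed is a hypothetical name used only for illustration.

```python
import importlib.util

def is_installed(module_name):
    """Return True if the module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None

# Assumed import names of the two packages installed above
status = {name: is_installed(name) for name in ("mlatom", "mace")}
for name, found in status.items():
    print(f"{name}: {'found' if found else 'not found'}")
```

If either package is reported as not found, the most likely cause is that pip installed it into a different environment than the one running the script.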
Data preparation
Below we provide a 1000-point dataset randomly sampled from the MD17 dataset for the ethanol molecule as training data (xyz.dat, en.dat, and grad.dat, which store the geometries, potential energies, and energy gradients, respectively), along with test data of another 1000 points (file names beginning with "test_").
mace_tutorial
(Note that the energies are in Hartree and the distances are in Ångstrom.)
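Before training, it can be worth checking that the geometry and energy files describe the same number of points. The sketch below assumes that xyz.dat is a concatenation of standard XYZ blocks (atom count, comment line, one line per atom) and that en.dat holds one energy per line. The helper names and the tiny two-point demo data are made up for illustration and are not part of MLatom or the ethanol dataset.

```python
import os
import tempfile

def count_xyz_blocks(path):
    """Count geometries in a multi-molecule XYZ file (assumed layout:
    atom count, comment line, then one line per atom)."""
    with open(path) as f:
        lines = f.read().splitlines()
    count, i = 0, 0
    while i < len(lines):
        if not lines[i].strip():      # skip stray blank lines between blocks
            i += 1
            continue
        natoms = int(lines[i].split()[0])
        i += 2 + natoms               # count line + comment line + atom lines
        count += 1
    return count

def count_values(path):
    """Count non-empty lines, i.e. one scalar value per line."""
    with open(path) as f:
        return sum(1 for line in f if line.strip())

# Made-up demo files with two three-atom geometries
demo_dir = tempfile.mkdtemp()
xyz_path = os.path.join(demo_dir, "xyz.dat")
en_path = os.path.join(demo_dir, "en.dat")
with open(xyz_path, "w") as f:
    f.write("3\n\nO 0.0 0.0 0.0\nH 0.0 0.0 0.96\nH 0.93 0.0 -0.24\n"
            "3\n\nO 0.0 0.0 0.0\nH 0.0 0.0 0.97\nH 0.94 0.0 -0.24\n")
with open(en_path, "w") as f:
    f.write("-76.40\n-76.39\n")

n_geoms = count_xyz_blocks(xyz_path)
n_energies = count_values(en_path)
print(f"{n_geoms} geometries, {n_energies} energies")
```

Pointing the two paths at the real xyz.dat and en.dat should report matching counts if the files follow the assumed layout.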
Training, testing and using MACE can be done through input files, command line, and Python API. Below we show how.
Training and testing with input file and command line
createMLmodel
XYZfile=xyz.dat
Yfile=en.dat
YgradXYZfile=grad.dat
MLmodelType=MACE
mace.max_num_epochs=100
MLmodelOut=mace.pt
You can save the input above in the file train.inp and then run it with MLatom in your terminal:
> mlatom train.inp
Alternatively, you can pass all the options on the command line:
> mlatom createMLmodel XYZfile=xyz.dat Yfile=en.dat YgradXYZfile=grad.dat MLmodelType=MACE mace.max_num_epochs=100 MLmodelOut=mace.pt
You can also submit a job to our XACS cloud computing platform or use its online terminal. It's free, but training on CPUs only can be very slow. To speed up the test, you can comment out or delete the line YgradXYZfile=grad.dat; the model is then trained on energies only, which is faster.
The web interface of XACS cloud computing's job submitter
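If you want to switch between training on energies plus gradients and the faster energies-only variant, the input text can be generated rather than edited by hand. The sketch below only assembles the option lines already shown in this tutorial; the function build_train_input and its parameters are hypothetical names, not part of MLatom.

```python
def build_train_input(use_gradients=True, epochs=100):
    """Return the text of a MACE training input for MLatom."""
    lines = [
        "createMLmodel",
        "XYZfile=xyz.dat",
        "Yfile=en.dat",
    ]
    if use_gradients:
        # Dropping this line trains on energies only (faster on CPUs)
        lines.append("YgradXYZfile=grad.dat")
    lines += [
        "MLmodelType=MACE",
        f"mace.max_num_epochs={epochs}",
        "MLmodelOut=mace.pt",
    ]
    return "\n".join(lines) + "\n"

energies_only = build_train_input(use_gradients=False)
print(energies_only)
```

The returned text can be written to a file such as train.inp and run with mlatom as described above.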

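To test the trained model, first use it to predict the energies and gradients for the test set:
useMLmodel
XYZfile=test_xyz.dat
YgradXYZestFile=test_gradest.dat
Yestfile=test_enest.dat
MLmodelType=MACE
MLmodelIn=mace.pt
Then compare the predictions with the reference values:
analyze
Yfile=test_en.dat
YgradXYZfile=test_grad.dat
Yestfile=test_enest.dat
YgradXYZestFile=test_gradest.dat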
The analysis results look like this (note that the original units are Hartree and Hartree/Ångstrom):
This corresponds to errors of around 0.45 kcal/mol for energies and 0.76 kcal/mol/Å for gradients.
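Converting the reported errors from Hartree to kcal/mol is a single multiplication. Because the distances are in Ångstrom, the same factor converts Hartree/Å to kcal/mol/Å. The sketch below uses the approximate factor 1 Hartree ≈ 627.5095 kcal/mol; the input value is a made-up number, not a result from this tutorial.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # approximate conversion factor

def hartree_to_kcal_per_mol(value):
    """Convert an energy (or an energy-per-Angstrom gradient) from Hartree units."""
    return value * HARTREE_TO_KCAL_PER_MOL

# Hypothetical error value in Hartree, for illustration only
example_error_hartree = 0.001
example_error_kcal = hartree_to_kcal_per_mol(example_error_hartree)
print(f"{example_error_kcal:.3f} kcal/mol")
```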
Training and using MACE in Python
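MLatom can also be used through its Python API, which offers great flexibility; you can check the documentation for details. First, let's import MLatom:
import mlatom as ml
Then load the training data into a molecular database: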
molDB = ml.data.molecular_database.from_xyz_file(filename='xyz.dat')
molDB.add_scalar_properties_from_file('en.dat', 'energy')
molDB.add_xyz_vectorial_properties_from_file('grad.dat', 'energy_gradients')
Then define a MACE model and train with the database:
model = ml.models.mace(model_file='mace.pt', hyperparameters={'max_num_epochs': 100})
model.train(molDB, property_to_learn='energy', xyz_derivative_property_to_learn='energy_gradients')
Making predictions with the model:
test_molDB = ml.data.molecular_database.from_xyz_file(filename='test_xyz.dat')
test_molDB.add_scalar_properties_from_file('test_en.dat', 'energy')
test_molDB.add_xyz_vectorial_properties_from_file('test_grad.dat', 'energy_gradients')
model.predict(molecular_database=test_molDB, property_to_predict='mace_energy', xyz_derivative_property_to_predict='mace_gradients')
Then you can do whatever analysis you like, e.g., calculate the RMSE:
ml.stats.rmse(test_molDB.get_properties('energy'), test_molDB.get_properties('mace_energy'))*ml.constants.Hartree2kcalpermol
ml.stats.rmse(test_molDB.get_xyz_vectorial_properties('energy_gradients').flatten(), test_molDB.get_xyz_vectorial_properties('mace_gradients').flatten())*ml.constants.Hartree2kcalpermol
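For reference, the RMSE is the square root of the mean squared difference between reference and predicted values. The library-independent sketch below illustrates that definition with made-up numbers; it is not MLatom's implementation, and the helper names rmse and flatten are chosen here only for illustration. Gradients are flattened into one long list first, mirroring the flatten() calls above.

```python
import math

def rmse(reference, predicted):
    """Root-mean-square error between two equally long sequences."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def flatten(nested):
    """Flatten a points x atoms x 3 nested list into one flat list."""
    return [c for point in nested for atom in point for c in atom]

# Made-up energies for three points
ref_energies = [0.0, 1.0, 2.0]
pred_energies = [0.0, 1.0, 5.0]
energy_rmse = rmse(ref_energies, pred_energies)

# Made-up gradients for one point with two atoms
ref_grads = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
pred_grads = [[[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]]
gradient_rmse = rmse(flatten(ref_grads), flatten(pred_grads))

print(energy_rmse, gradient_rmse)
```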
Using the model
After the model is trained, it can be used with MLatom for applications such as geometry optimization or MD; check MLatom's manual for details. Here is a brief example of what the input file for a geometry optimization looks like:
geomopt                  # Request geometry optimization
MLmodelType=MACE
MLmodelIn=mace.pt
XYZfile=ethanol_init.xyz # The file with the initial guess
optXYZ=eq_MACE.xyz
In Python, geometry optimization is also quite simple:
import mlatom as ml
mol = ml.data.molecule.from_xyz_file('ethanol_init.xyz')
print(mol.get_xyz_string())
model = ml.models.mace(model_file='mace.pt')
ml.optimize_geometry(model=model, molecule=mol, program='ASE')
print(mol.get_xyz_string())
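To see what the optimization changed, one option is to compare interatomic distances in the geometries printed before and after. The sketch below assumes the printed text is in standard XYZ format (atom count, comment line, one line per atom); the helper names and the two-atom demo geometry are made up for illustration and are not part of MLatom.

```python
import math

def parse_xyz_string(xyz_text):
    """Parse a single XYZ block into a list of (element, (x, y, z)) tuples."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0].split()[0])
    atoms = []
    for line in lines[2:2 + natoms]:   # skip the count and comment lines
        element, x, y, z = line.split()[:4]
        atoms.append((element, (float(x), float(y), float(z))))
    return atoms

def distance(atoms, i, j):
    """Distance between atoms i and j (0-based), in the units of the input."""
    return math.dist(atoms[i][1], atoms[j][1])

# Made-up two-atom geometry, not the ethanol molecule
demo_xyz = "2\n\nH 0.0 0.0 0.0\nH 0.0 0.0 0.8\n"
demo_atoms = parse_xyz_string(demo_xyz)
demo_distance = distance(demo_atoms, 0, 1)
print(f"{demo_distance:.3f}")
```

Applying parse_xyz_string to the two strings printed in the snippet above would let you track how a chosen bond length changes during the optimization.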
Summary