


ALFF: Active Learning Framework for generating Graph Neural Network Forcefields.

Developed and maintained by C.Thang Nguyen



ROOT_PATH = Path(__file__).parent module-attribute

__author__ = 'C.Thang Nguyen' module-attribute

__contact__ = '' module-attribute





00_train 01_md 02_dft 03_data


_get_mlp_engine(pdict) -> str
get_sys_index(task) -> list
pre_train(iter_idx, pdict, mdict)

This function prepares the training step: - collect data files - prepare training args based on the MLP engine

run_train(iter_idx, pdict, mdict)
post_train(iter_idx, pdict, mdict)
pre_md(iter_idx, pdict, mdict)

Prepare MD tasks - collect initial configurations - prepare MD args

run_md(iter_idx, pdict, mdict)
post_md(iter_idx, pdict, mdict)
pre_dft(iter_idx, pdict, mdict)

Prepare DFT tasks

run_dft(iter_idx, pdict, mdict)
post_dft(iter_idx, pdict, mdict)

Do post DFT tasks - collect DFT results - remove temporary files

al_iteration(configfile_param, configfile_machine)

Run main loop of active learning.
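The stage functions listed above suggest a train, MD, DFT cycle with pre/run/post steps per iteration. A minimal sketch of how such a loop could be organized is shown here; the stub stages, the `make_stage` helper and the `num_iters` argument are hypothetical and only illustrate the calling order, not ALFF's actual implementation.

```python
from typing import Callable


def make_stage(name: str, log: list) -> Callable[[int, dict, dict], None]:
    """Return a hypothetical stub stage that only records that it was called."""
    def stage(iter_idx: int, pdict: dict, mdict: dict) -> None:
        log.append(f"iter{iter_idx}:{name}")
    return stage


def al_loop_sketch(pdict: dict, mdict: dict, num_iters: int) -> list:
    """Call pre/run/post for train, md and dft, in that order, each iteration."""
    log: list = []
    stage_names = [
        f"{step}_{task}"
        for task in ("train", "md", "dft")
        for step in ("pre", "run", "post")
    ]
    stages = [make_stage(name, log) for name in stage_names]
    for iter_idx in range(num_iters):
        for stage in stages:
            stage(iter_idx, pdict, mdict)
    return log


calls = al_loop_sketch(pdict={}, mdict={}, num_iters=2)
print(calls[:3])
```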



pre_finetune(pdict: dict, mdict: dict)

This function prepares the fine-tuning step: - collect data files - prepare training args based on the MLP engine

run_finetune(pdict: dict, mdict: dict)
post_finetune(pdict: dict, mdict: dict)
fine_tuning(configfile_param: str, configfile_machine: str)

Fine-tune existing ML models or train a new ML model.



pre_md_ase_sevenn(work_dir, pdict, mdict)

This function does: - prepare MD args - generate task_dirs for ranges of temperature and stress - establish MD tasks and ASE_run_file

run_md_ase_sevenn(work_dir, pdict, mdict)

Refer to the rungen_gpaw_optimize() function.

post_md_ase_sevenn(work_dir, pdict, mdict)

This function does: - collect MD results & compute committee_error (computed on the remote machine) - select candidate configurations for DFT calculation Note: these steps are currently done on the remote machine.



_get_lammps_arg_default(iter_idx, pdict)
pre_md_lammps_sevenn(iter_idx, pdict, mdict)
run_md_lammps_sevenn(iter_idx, pdict, mdict)
check_cluster(conf_name, fp_cluster_vacuum, fmt='lammps/dump')
check_bad_box(conf_name, criteria, fmt='lammps/dump')



committee_error_e(atoms: Atoms, calc_list: list[object])

Committee error for energy on a single configuration


  • atoms (Atoms) –

    Atoms object

  • calc_list (list[object]) –

    list of ASE's calculators of ML models in the committee.


  • e_std ( float ) –

    standard deviation of the energy
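As an illustration of the quantity described above, the following sketch computes the spread of committee energies for one configuration. It assumes the energies have already been obtained from each calculator (for ASE calculators, typically via `get_potential_energy()`); whether ALFF uses the population or the sample standard deviation is not stated here, so the population form is an assumption.

```python
import statistics


def energy_std(energies: list) -> float:
    """Population standard deviation of the committee energies (assumed form)."""
    return statistics.pstdev(energies)


# Hypothetical energies predicted by three committee members for one configuration.
committee_energies = [-10.02, -10.00, -9.98]
e_std = energy_std(committee_energies)
print(f"e_std = {e_std:.5f}")
```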

committee_error_f(atoms, calc_list: list[object], rel_force: float = None)

Committee error for forces on a single configuration


  • atoms (Atoms) –

    Atoms object

  • calc_list (list[object]) –

    list of ASE's calculators of ML models in the committee.

  • rel_force (float, default: None ) –

    relative force. Defaults to None.


  • f_std_mean ( float ) –

    mean of the standard deviation of atomic forces in the configuration

  • f_std_max ( float ) –

    maximum of the standard deviation

  • f_std_min ( float ) –

    minimum of the standard deviation
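The three returned values can be pictured as statistics over a per-atom force deviation. The sketch below is one plausible definition, not necessarily the one ALFF uses: for each atom it takes the root-mean-square distance of each model's force vector from the committee mean, then reports the mean, maximum and minimum over atoms. The `rel_force` option is not modelled.

```python
import math


def force_std_per_atom(forces_by_model):
    """forces_by_model[m][i] is the (fx, fy, fz) force on atom i from model m."""
    n_models = len(forces_by_model)
    n_atoms = len(forces_by_model[0])
    stds = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in forces_by_model) / n_models for k in range(3)]
        # Mean squared distance of each model's force vector from the committee mean.
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in forces_by_model
        ) / n_models
        stds.append(math.sqrt(msd))
    return stds


# Hypothetical forces from two models on a two-atom configuration.
forces = [
    [(0.0, 0.0, 1.0), (0.5, 0.0, 0.0)],  # model 1
    [(0.0, 0.0, 1.2), (0.5, 0.0, 0.0)],  # model 2
]
f_std = force_std_per_atom(forces)
f_std_mean = sum(f_std) / len(f_std)
f_std_max = max(f_std)
f_std_min = min(f_std)
```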

committee_error_s(atoms: Atoms, calc_list: list[object], rel_stress: float = None)

Committee error for stress on a single configuration


  • atoms (Atoms) –

    Atoms object

  • calc_list (list[object]) –

    list of ASE's calculators of ML models in the committee.

  • rel_stress (float, default: None ) –

    relative stress. Defaults to None.


  • s_std_mean ( float ) –

    mean of the standard deviation of the stress in the configuration

  • s_std_max ( float ) –

    maximum of the standard deviation

  • s_std_min ( float ) –

    minimum of the standard deviation

committee_error(extxyz_file: str, calc_list: list[object], rel_force: float = None, compute_stress: bool = True, rel_stress: float = None, outfile: str = 'committee_error.txt')

Committee error for energy, forces and stress on a list of configurations


  • extxyz_file (str) –

    extended xyz file containing the configurations

  • calc_list (list[object]) –

    list of ASE's calculators of ML models

  • rel_force (float, default: None ) –

    relative force. Defaults to None.

  • compute_stress (bool, default: True ) –

    whether to compute stress. Defaults to True.

  • rel_stress (float, default: None ) –

    relative stress. Defaults to None.

  • outfile (str, default: 'committee_error.txt' ) –

    output file. Defaults to "committee_error.txt".


  • outfile "committee_error.txt" with the following columns:

select_candidate(extxyz_file: str, committee_error_file: str, e_std_hi: float = 0.1, e_std_lo: float = 0.0, f_std_hi: float = 0.1, f_std_lo: float = 0.0, s_std_hi: float = None, s_std_lo: float = 0.0)

Select candidate configurations for DFT calculation


  • extxyz_file (str) –

    extended xyz file

  • committee_error_file (str) –

    committee error file

  • e_std_hi (float, default: 0.1 ) –

    energy std high. Defaults to 0.1.

  • e_std_lo (float, default: 0.0 ) –

    energy std low. Defaults to 0.0.

  • f_std_hi (float, default: 0.1 ) –

    force std high. Defaults to 0.1.

  • f_std_lo (float, default: 0.0 ) –

    force std low. Defaults to 0.0.

  • s_std_hi (float, default: None ) –

    stress std high. Defaults to None.

  • s_std_lo (float, default: 0.0 ) –

    stress std low. Defaults to 0.0.


  • extxyz_file

    candidate configurations

  • committee_error_file(s) –

    files for candidate, accurate and inaccurate configurations

Note: To select candidates based on only one term (e.g. energy only), set the lower bounds of the other terms (f_std_lo, s_std_lo) to a very large value, so that the criteria for those terms are never met. E.g., f_std_lo=1e6 for selecting candidates based on energy only.
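The note implies that the per-term criteria are combined so that a term whose window can never be entered simply drops out. A minimal sketch of that reading, assuming a configuration is a candidate when any term falls in the half-open window `[lo, hi)`; the exact combination rule and boundary handling in ALFF are not documented here, and `in_window` and `is_candidate` are hypothetical helpers.

```python
def in_window(value: float, lo: float, hi: float) -> bool:
    """True when lo <= value < hi (boundary handling is an assumption)."""
    return lo <= value < hi


def is_candidate(e_std, f_std, e_std_lo=0.0, e_std_hi=0.1, f_std_lo=0.0, f_std_hi=0.1):
    """Candidate if the energy OR the force error lies inside its window."""
    return in_window(e_std, e_std_lo, e_std_hi) or in_window(f_std, f_std_lo, f_std_hi)


# With a huge f_std_lo the force window is empty, so only energy decides.
print(is_candidate(e_std=0.05, f_std=0.05, f_std_lo=1e6))  # energy inside its window
print(is_candidate(e_std=0.20, f_std=0.05, f_std_lo=1e6))  # energy outside its window
```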

select_candidate_polar(extxyz_file: str, committee_error_file: str, e_std_hi: float = 0.1, e_std_lo: float = 0.0, f_std_hi: float = 0.1, f_std_lo: float = 0.0, s_std_hi: float = None, s_std_lo: float = 0.0)

Select candidate configurations for DFT calculation. Polars version (note: polars may be difficult to install on older systems).

committee_error_SevenNet(extxyz_file: str, checkpoint_files: list, sevenn_args: dict = {}, rel_force: float = None, compute_stress: bool = True, rel_stress: float = None, outfile: str = 'committee_error.txt')

Committee error for energy, forces and stress on a list of configurations


  • extxyz_file (str) –

    extended xyz file containing the configurations

  • checkpoint_files (list) –

    list of checkpoint files of SevenNet models

  • sevenn_args (dict, default: {} ) –

    arguments for SevenNetCalculator. Defaults to {}.


  • file "committee_error.txt" with the following columns:




Logger = create_logger('alff', level='INFO', log_file=FILE_LOG_ALFF) module-attribute


CLI for active learning


CLI for data generation


CLI for fine-tuning


CLI for phonon calculation


CLI for elastic constants calculation


CLI for converting the MPCHGNet dataset to XYZ format


Get the arguments from the command line


Validate the machine config






info_keys = ['uncorrected_total_energy', 'corrected_total_energy', 'energy_per_atom', 'ef_per_atom', 'e_per_atom_relaxed', 'ef_per_atom_relaxed', 'magmom', 'bandgap', 'mp_id'] module-attribute
chgnet_to_ase_atoms(datum: dict[str, dict[str, Any]]) -> list[Atoms]



build_structure(pdict, mdict)

Build structures based on input parameters

optimize_structure(pdict, mdict)

Optimize the structures

sampling_space(pdict, mdict)

Scale and perturb the structures. - Save 2 lists of paths: original and scaled structure paths

copy_structure_file(src_dir: str, dest_dir: str)

Copy the structure file (either labeled or unlabeled) from the source directory to the destination directory, and rename it to FILE_FRAME_unLABEL

scale_x_dim(struct_files: list, scale_x_list: list)

Scale the x dimension of the structures

scale_y_dim(struct_files: list, scale_y_list: list)

Scale the y dimension of the structures

scale_z_dim(struct_files: list, scale_z_list: list)

Scale the z dimension of the structures

perturb_structure(struct_files: list, perturb_num: int, perturb_disp: float)

Perturb the structures

_total_conf_num(pdict: dict)
run_dft(pdict, mdict)

Run DFT calculations

collect_data(pdict, mdict)

Collect data from DFT simulations

data_generator(configfile_param: str, configfile_machine: str)

Generate initial data for training ML models



split_atoms_list(atoms_list: List[Atoms], train_ratio: float = 0.9, valid_ratio: float = 0.1, seed: int = None) -> Tuple[List[Atoms], List[Atoms], List[Atoms]]

Split a dataset into training, validation, and test sets.

If input (train_ratio + valid_ratio) < 1, the remaining data will be used as the test set.


  • atoms_list (List[Atoms]) –

    List of ASE Atoms objects.

  • train_ratio (float, default: 0.9 ) –

    Ratio of training set. Defaults to 0.9.

  • valid_ratio (float, default: 0.1 ) –

    Ratio of validation set. Defaults to 0.1.

  • seed (Optional[int], default: None ) –

    Random seed for reproducibility. Defaults to None.


  • Tuple[List[Atoms], List[Atoms], List[Atoms]]

    Tuple[List[Atoms], List[Atoms], List[Atoms]]: Split datasets as train, valid, and test.
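The splitting rule above (whatever is left after the train and validation shares becomes the test set) can be sketched on plain Python objects. This is an illustrative version under stated assumptions: `split_list` is a hypothetical helper, the data is shuffled before splitting, and set sizes are rounded to the nearest integer; the real function operates on ASE Atoms objects and may round differently.

```python
import random


def split_list(items, train_ratio=0.9, valid_ratio=0.1, seed=None):
    """Shuffle, then cut into train / valid / test (test takes the remainder)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    n_valid = round(len(shuffled) * valid_ratio)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# 70 % train + 20 % valid leaves 10 % for the test set.
train, valid, test = split_list(range(100), train_ratio=0.7, valid_ratio=0.2, seed=0)
print(len(train), len(valid), len(test))
```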

split_extxyz_dataset(extxyz_files: List[str], train_ratio: float = 0.9, valid_ratio: float = 0.1, seed: int = None, outfile_prefix: str = 'dataset')

Split a dataset into training, validation, and test sets.

If input (train_ratio + valid_ratio) < 1, the remaining data will be used as the test set.


  • extxyz_files (List[str]) –

    List of file paths in EXTXYZ format.

  • train_ratio (float, default: 0.9 ) –

    Ratio of training set. Defaults to 0.9.

  • valid_ratio (float, default: 0.1 ) –

    Ratio of validation set. Defaults to 0.1.

  • seed (Optional[int], default: None ) –

    Random seed. Defaults to None.

  • outfile_prefix (str, default: 'dataset' ) –

    Prefix for output file names. Defaults to "dataset".

read_list_extxyz(extxyz_files: list[str]) -> List[Atoms]

Read a list of EXTXYZ files and return a list of ASE Atoms objects.


NOTE: - work_dir is a folder relative to the run_dir - task_dirs are folders relative to the work_dir


pregen_gpaw_optimize(work_dir, pdict)

This function does: - Prepare task_dirs: select only unlabeled structures to compute at clusters. - Prepare ase_args for GPAW and gpaw_run_file

Note: in this function, - struct_dirs are relative to run_dir. - task_dirs are relative to work_dir.

rungen_gpaw_optimize(work_dir, pdict, mdict)

This function does: - Read task_dirs from .yaml file - Prepare the task_list - Prepare forward & backward files - Prepare command_list - Submit jobs to the cluster - Download the results when finished

postgen_gpaw_optimize(work_dir, pdict)

This function does: - Remove unlabeled .extxyz files, just keep the labeled ones.

pregen_gpaw_singlepoint(work_dir, pdict)

Refer to the pregen_gpaw_optimize() function.

rungen_gpaw_singlepoint(work_dir, pdict, mdict)

Refer to the rungen_gpaw_optimize() function.

postgen_gpaw_singlepoint(work_dir, pdict)

Refer to the postgen_gpaw_optimize() function.

temperature_stress_mdarg_ase(struct_dirs: list, temperature_list: list = [], stress_list: list = [], ase_args: dict = {})

Generate the task_dirs for ranges of temperatures and stresses.


  • struct_dirs (list) –

    List of directories containing configuration files.

  • temperature_list (list, default: [] ) –

    List of temperatures.

  • stress_list (list, default: [] ) –

    List of stresses.
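Generating task directories for ranges of temperatures and stresses amounts to a Cartesian product over the inputs. A minimal sketch follows, assuming a hypothetical naming scheme (`T<temperature>_S<stress>`) and treating an empty list as "do not sweep this variable"; the actual directory names and the handling of `ase_args` in ALFF are not shown in this reference.

```python
from itertools import product
from pathlib import PurePosixPath


def temperature_stress_dirs(struct_dirs, temperature_list, stress_list):
    """Return one task dir per (structure, temperature, stress) combination."""
    temps = temperature_list or [None]    # empty list: no temperature sweep
    stresses = stress_list or [None]      # empty list: no stress sweep
    task_dirs = []
    for struct_dir, temp, stress in product(struct_dirs, temps, stresses):
        parts = []
        if temp is not None:
            parts.append(f"T{temp}")
        if stress is not None:
            parts.append(f"S{stress}")
        name = "_".join(parts) if parts else "md"  # "md" is a hypothetical fallback
        task_dirs.append(str(PurePosixPath(struct_dir) / name))
    return task_dirs


dirs = temperature_stress_dirs(["conf_0"], [300, 600], [0.0, 1.0])
print(dirs)
```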

pregen_gpaw_aimd(work_dir, pdict)

Refer to the pregen_gpaw_optimize() function. Note: - structure_dirs: contains the optimized structures without scaling. - scale_structure_dirs: contains the scaled structures.

rungen_gpaw_aimd(work_dir, pdict, mdict)

Refer to the rungen_gpaw_optimize() function.

postgen_gpaw_aimd(work_dir, pdict)

Refer to the postgen_gpaw_optimize() function.

sort_dft_task_dirs(task_dirs: list, work_dir: str) -> list

Sort the structure paths by supercell size. This helps to chunk tasks with similar supercell sizes together (a similar supercell size means a similar number of k-points), so that the DFT calculations in a chunk take a similar time, avoiding the situation where some tasks have finished while others are still running.
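The sort-then-chunk idea can be shown in a few lines. The sketch below assumes the "supercell size" is measured by the number of atoms and that the sizes are already known per task; `sort_and_chunk` and the task names are hypothetical.

```python
def sort_and_chunk(sizes: dict, chunk_size: int) -> list:
    """Sort task names by size, then group neighbours into chunks."""
    ordered = sorted(sizes, key=sizes.get)
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


# Hypothetical atom counts per task directory.
n_atoms = {"task_a": 64, "task_b": 8, "task_c": 216, "task_d": 16}
chunks = sort_and_chunk(n_atoms, chunk_size=2)
print(chunks)
```

Each chunk then holds tasks of comparable cost, so jobs submitted together tend to finish at about the same time.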





relax_initial_structure(pdict, mdict)

Relax the structure by DFT/MD

scale_and_relax(pdict, mdict)

Scale and relax the structures while fixing the box size. Use this when computing phonons at different volumes.

compute_stress_strain(pdict: dict, mdict: dict)

Compute stress and strain tensors for each scale-relaxed-structure by DFT/MD.

compute_stress_single_structure(work_dir, pdict, mdict)

The function does the following: - generate supercells with small deformations and compute the corresponding strain tensors - run DFT/MD minimization to compute the stress tensor for each supercell - collect the stress and strain tensors for each supercell

compute_elastic_tensor_single_structure(work_dir, pdict: dict, mdict: dict)

Compute elastic tensor for a single structure. - Collect stress and strain tensors from calculations on deformed structures. - Compute elastic constants by fitting stress-strain relations.

compute_elastic(pdict: dict, mdict: dict)

Compute elastic constants from stress-strain tensors.

elastic_calc(configfile_param: str, configfile_machine: str)

Generate initial data for training ML models



postelast_lammps_optimize(work_dir, pdict)

This function does: - Remove unlabeled .extxyz files, just keep the labeled ones. - Convert LAMMPS output to extxyz_labeled.

postelast_lammps_singlepoint(work_dir, pdict)

This function does: - Clean up unlabelled extxyz files - Collect forces from the output files



  • Elasticity

    Main class to compute the elastic stiffness tensor of the crystal.

  • ElasticConstant

    Class to manage elastic constants and compute elastic properties.


Elasticity(ref_cryst: Atoms, symprec: float = 1e-05)

Bases: object

Main class to compute the elastic stiffness tensor of the crystal. Steps to compute the elastic tensor: - Initialize the class with the reference structure. - Generate deformed structures with 'elementary deformations' - Compute stress for each deformed structure by DFT/MD. - Input the deformed structures with stress tensors to the method fit_elastic_tensor


  • ref_cryst (Atoms) –

    ASE Atoms object, reference structure (relaxed/optimized structure)

  • symprec (float, default: 1e-05 ) –

    symmetry precision to check the symmetry of the crystal


  • generate_deformations

    Generate deformed structures with 'elementary deformations' for elastic tensor calculation.

  • fit_elastic_tensor

    Calculate elastic tensor from the stress-strain relation by fitting this relation to the set of linear equations, strains and stresses.

  • get_pressure

    Return external isotropic (hydrostatic) pressure in ASE units.

  • write_cij

    Write the elastic constants to a text file.

  • fit_BM_EOS

    Calculate Birch-Murnaghan Equation of State for the crystal.

  • get_bulk_modulus

    Calculate bulk modulus using the Birch-Murnaghan equation of state.

  • write_MB_EOS

    Write the Birch-Murnaghan EOS parameters to a text file.

  • write_MB_EOS_pv_data

    Write the volume-pressure data to a text file.


ref_cryst = ref_cryst instance-attribute
symprec = symprec instance-attribute
bravais = get_lattice_type(self.ref_cryst, self.symprec)[0] instance-attribute
strain_list = None instance-attribute
stress_list = None instance-attribute
pressure = None instance-attribute
Cij = None instance-attribute
generate_deformations(delta: float = 0.01, n: int = 5)

Generate deformed structures with 'elementary deformations' for elastic tensor calculation. The deformations are created based on the symmetry of the crystal.


  • delta (float, default: 0.01 ) –

    the maximum magnitude of deformation in Angstrom and degrees.

  • n (int, default: 5 ) –

    number of deformations on each non-equivalent axis (number of deformations in each direction)


  • list[Atoms]: list of deformed structures. Number of structures = (n * number_of_axes). These structures are then used in MD/DFT to compute the stress tensor.

fit_elastic_tensor(deform_crysts: list[Atoms]) -> tuple[np.array, np.array]

Calculate elastic tensor from the stress-strain relation by fitting this relation to the set of linear equations, strains and stresses. The number of linear equations depends on the symmetry of the crystal.

It is assumed that the crystal is converged (relaxed/optimized) under the intended pressure/stress. The geometry and stress of this crystal are taken as the reference point. No additional optimization will be run. Then, the strain and stress tensors are computed for each of the deformed structures (more precisely, the stress difference from the reference point).

This function returns tuple of Cij elastic tensor, and the fitting results returned by numpy.linalg.lstsq: Birch coefficients, residuals, solution rank, singular values.


  • deform_crysts (list[Atoms]) –

    list of Atoms objects with calculated deformed structures


  • tuple ( tuple[array, array] ) –

    tuple of Cij elastic tensor and fitting results. - Cij: in vector form of Voigt notation. - Bij: float vector, residuals, solution rank, singular values
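To make the fitting step concrete, here is a sketch for the cubic case, where only C11, C12 and C44 are independent. It builds the symmetry-reduced equations for synthetic strains and recovers the constants with `numpy.linalg.lstsq`. The strain components are taken as tensor (not engineering) strains, hence the factor 2 on the shear rows; this convention, the `cubic_strain_matrix` helper and the numbers are assumptions made for illustration.

```python
import numpy as np


def cubic_strain_matrix(u):
    """Rows map (C11, C12, C44) to the six Voigt stress components."""
    uxx, uyy, uzz, uyz, uxz, uxy = u
    return np.array([
        [uxx, uyy + uzz, 0.0],
        [uyy, uxx + uzz, 0.0],
        [uzz, uxx + uyy, 0.0],
        [0.0, 0.0, 2.0 * uyz],
        [0.0, 0.0, 2.0 * uxz],
        [0.0, 0.0, 2.0 * uxy],
    ])


true_cij = np.array([250.0, 150.0, 100.0])  # hypothetical C11, C12, C44
strains = [
    [0.01, 0.0, 0.0, 0.0, 0.0, 0.0],
    [-0.01, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.005, 0.0, 0.0],
]
A = np.vstack([cubic_strain_matrix(u) for u in strains])
b = A @ true_cij  # synthetic stress differences from the reference state
fit, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)
print(fit)
```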

get_pressure(stress) -> float

Return external isotropic (hydrostatic) pressure in ASE units. If the pressure is positive the system is under external pressure. This is a convenience function to convert output of get_stress function into external pressure.


  • stress (np.array) –

    stress tensor in Voigt (vector) notation as returned by the .get_stress() method.


float: external hydrostatic pressure in ASE units.
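A short sketch of this conversion, assuming the Voigt order (xx, yy, zz, yz, xz, xy) and ASE's sign convention in which a compressed system has negative normal stress; `hydrostatic_pressure` is a hypothetical stand-alone helper, not the method itself.

```python
def hydrostatic_pressure(stress):
    """External pressure: minus the mean of the three normal stress components."""
    return -sum(stress[:3]) / 3.0


# A uniformly compressed cell: negative normal stresses give a positive pressure.
p = hydrostatic_pressure([-0.02, -0.02, -0.02, 0.0, 0.0, 0.0])
print(p)
```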

write_cij(filename: str = 'cij.txt')

Write the elastic constants to a text file.


  • filename (str, default: 'cij.txt' ) –

    output file name

fit_BM_EOS(deform_crysts: list[Atoms])

Calculate Birch-Murnaghan Equation of State for the crystal.

\[ P(V) = \frac{B_0}{B'_0}\left[\left({\frac{V}{V_0}}\right)^{-B'_0} - 1\right] \]

Its coefficients are estimated using n single-point structures generated from the crystal (cryst) by the scan_volumes function between two relative volumes. The BM EOS is fitted to the computed points by the least squares method.
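The P(V) expression above can be written directly as a function. The sketch evaluates it and checks numerically that the bulk modulus, defined as -V dP/dV, equals B_0 at V = V_0. The name `eos_pressure` and the parameter values are hypothetical; the least-squares fit itself is not shown.

```python
def eos_pressure(v, v0, b0, b0p):
    """P(V) = B0/B0' * [(V/V0)^(-B0') - 1], as given in the formula above."""
    return (b0 / b0p) * ((v / v0) ** (-b0p) - 1.0)


v0, b0, b0p = 16.0, 0.5, 4.0  # hypothetical parameters in ASE units (A^3, eV/A^3)

# Central finite difference for -V dP/dV at the equilibrium volume.
h = 1e-6
dp_dv = (eos_pressure(v0 + h, v0, b0, b0p) - eos_pressure(v0 - h, v0, b0, b0p)) / (2 * h)
bulk_at_v0 = -v0 * dp_dv
print(eos_pressure(v0, v0, b0, b0p), bulk_at_v0)
```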


  • cryst (Atoms) –

    Atoms object, reference structure (relaxed/optimized structure)

  • deform_crysts (list[Atoms]) –

    list of Atoms objects with calculated deformed structures


  • tuple

    tuple of EOS parameters ([V0, B0, B0p], pv data).

get_bulk_modulus(deform_crysts: list[Atoms])

Calculate bulk modulus using the Birch-Murnaghan equation of state. The bulk modulus is the B_0 coefficient of the B-M EOS. The units of the result are defined by ASE. To get the result in particular units (e.g. GPa), divide it by the corresponding ase.units constant (e.g. ase.units.GPa).



  • cryst (Atoms) –

    Atoms object, reference structure (relaxed/optimized structure)

  • deform_crysts (list[Atoms]) –

    list of Atoms objects with calculated deformed structures


  • float

    bulk modulus B_0 in ASE units.
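Since ASE expresses pressure-like quantities in eV/Å^3, the conversion to GPa is a fixed factor. The sketch below derives that factor from the elementary charge instead of importing ASE, so it runs stand-alone; with ASE installed, dividing by `ase.units.GPa` gives approximately the same result (the exact digits depend on the CODATA version ASE uses). `to_gpa` is a hypothetical helper.

```python
# 1 eV = 1.602176634e-19 J and 1 A^3 = 1e-30 m^3, so 1 eV/A^3 is about 160.2 GPa.
EV_PER_A3_IN_GPA = 1.602176634e-19 / 1e-30 / 1e9


def to_gpa(value_in_ase_units: float) -> float:
    """Convert a pressure or modulus from eV/A^3 to GPa."""
    return value_in_ase_units * EV_PER_A3_IN_GPA


b0_ase = 0.5  # hypothetical bulk modulus in eV/A^3
print(f"B_0 = {to_gpa(b0_ase):.2f} GPa")
```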

write_MB_EOS(filename: str = 'BMeos.txt')

Write the Birch-Murnaghan EOS parameters to a text file.


  • filename (str, default: 'BMeos.txt' ) –

    output file name

write_MB_EOS_pv_data(filename: str = 'BMeos_pv_data.txt')

Write the volume-pressure data to a text file.


  • filename (str, default: 'BMeos_pv_data.txt' ) –

    output file name

ElasticConstant(cij_mat: np.array = None, cij_dict: dict = None, bravais_lattice: str = 'Cubic')

Bases: object

Class to manage elastic constants and compute elastic properties.


  • Cij (array) –

    (6, 6) array of Voigt representation of elastic stiffness.

  • bravais_lattice (str, default: 'Cubic' ) –

    Bravais lattice name of the crystal.

  • **kwargs

    dictionary of elastic constants Cij. Where C11, C12, ... C66 : float,


  • Cij

    The elastic stiffness constants in Voigt 6x6 format

  • Sij

    The compliance constants in Voigt 6x6 format

  • bulk

    Returns a bulk modulus estimate.

  • shear

    Returns a shear modulus estimate.


bravais = bravais_lattice instance-attribute
Cij() -> np.ndarray

The elastic stiffness constants in Voigt 6x6 format

Sij() -> np.ndarray

The compliance constants in Voigt 6x6 format

bulk(style: str = 'Hill') -> float

Returns a bulk modulus estimate.


  • style (str) –

    style of bulk modulus. Default value is 'Hill'. - 'Voigt': Voigt estimate. Uses Cij. - 'Reuss': Reuss estimate. Uses Sij. - 'Hill': Hill estimate (average of Voigt and Reuss).
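The three styles can be illustrated with the standard Voigt, Reuss and Hill expressions for the bulk modulus computed from a 6x6 stiffness matrix. This is a sketch using textbook formulas, not the class's own code; `bulk_modulus` and the cubic constants are hypothetical.

```python
import numpy as np


def bulk_modulus(cij, style="Hill"):
    """Bulk modulus estimate from a 6x6 Voigt stiffness matrix."""
    c = np.asarray(cij, dtype=float)
    s = np.linalg.inv(c)  # compliance matrix Sij
    voigt = (c[0, 0] + c[1, 1] + c[2, 2] + 2.0 * (c[0, 1] + c[0, 2] + c[1, 2])) / 9.0
    reuss = 1.0 / (s[0, 0] + s[1, 1] + s[2, 2] + 2.0 * (s[0, 1] + s[0, 2] + s[1, 2]))
    return {"Voigt": voigt, "Reuss": reuss, "Hill": 0.5 * (voigt + reuss)}[style]


# Hypothetical cubic crystal: C11 = 250, C12 = 150, C44 = 100 (same units throughout).
c11, c12, c44 = 250.0, 150.0, 100.0
cij = np.zeros((6, 6))
cij[:3, :3] = c12
np.fill_diagonal(cij, [c11, c11, c11, c44, c44, c44])

for style in ("Voigt", "Reuss", "Hill"):
    print(style, bulk_modulus(cij, style))
```

For a cubic crystal all three estimates coincide at (C11 + 2*C12)/3, which makes this a convenient sanity check.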

shear(style: str = 'Hill') -> float

Returns a shear modulus estimate.


  • style (str) –

    style of shear modulus. Default value is 'Hill'. - 'Voigt': Voigt estimate. Uses Cij. - 'Reuss': Reuss estimate. Uses Sij. - 'Hill': Hill estimate (average of Voigt and Reuss).

func_MEOS(v, v0, b0, b0p)
func_BMEOS(v, v0, b0, b0p)
get_lattice_type(cryst: Atoms, symprec=1e-05) -> tuple[int, str, str, int]

Identify the lattice type and the Bravais lattice of the crystal. The lattice type numbers are (numbering starts from 1): Triclinic (1), Monoclinic (2), Orthorhombic (3), Tetragonal (4), Trigonal (5), Hexagonal (6), Cubic (7)


  • cryst (Atoms) –

    ASE Atoms object

  • symprec (float, default: 1e-05 ) –

    symmetry precision to check the symmetry of the crystal


  • tuple ( tuple[int, str, str, int] ) –

    Bravais name, lattice type number (1-7), space-group name, space-group number

generate_elementary_deformations(cryst: Atoms, delta: float = 0.01, n: int = 5, bravais_lattice: str = 'Cubic') -> list[Atoms]

Generate deformed structures with 'elementary deformations' for elastic tensor calculation. The deformations are created based on the symmetry of the crystal and are limited to the non-equivalent axes of the crystal.


  • cryst (Atoms) –

    Atoms object, reference structure (relaxed/optimized structure)

  • delta (float, default: 0.01 ) –

    the maximum magnitude of deformation in Angstrom and degrees.

  • n (int, default: 5 ) –

    number of deformations on each non-equivalent axis (number of deformations in each direction)

  • symprec (float) –

    symmetry precision to check the symmetry of the crystal


  • list[Atoms]

    list[Atoms] list of deformed structures. Number of structures = (n * number_of_axes)

deform_1axis(cryst: Atoms, axis: int = 0, delta: float = 0.01) -> Atoms

Return the deformed structure along one of the cartesian directions. The axis is specified as follows:

- tetragonal deformation: 0,1,2 = x,y,z.
- shear deformation: 3,4,5 = yz, xz, xy.


  • cryst (Atoms) –

    reference structure (structure to be deformed)

  • axis (int, default: 0 ) –

    direction of deformation. 0,1,2 = x,y,z; 3,4,5 = yz, xz, xy.

  • delta (float, default: 0.01 ) –

    magnitude of the deformation. Angstrom and degrees.


ase.Atoms: deformed structure
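The axis convention above can be sketched on a bare cell matrix. This version assumes that `delta` acts as a dimensionless fractional strain and that shear is applied symmetrically; how ALFF actually interprets `delta` (the docstring mentions Angstrom and degrees) may differ, so treat `deform_cell` as a hypothetical illustration of the axis numbering only.

```python
import numpy as np


def deform_cell(cell, axis=0, delta=0.01):
    """Apply one elementary deformation to a 3x3 cell (rows are lattice vectors)."""
    cell = np.asarray(cell, dtype=float)
    deformation = np.eye(3)
    if axis < 3:
        # 0, 1, 2: stretch along x, y, z
        deformation[axis, axis] += delta
    else:
        # 3, 4, 5: symmetric shear in yz, xz, xy
        i, j = {3: (1, 2), 4: (0, 2), 5: (0, 1)}[axis]
        deformation[i, j] += delta / 2.0
        deformation[j, i] += delta / 2.0
    return cell @ deformation


ref_cell = 4.0 * np.eye(3)
stretched = deform_cell(ref_cell, axis=0, delta=0.01)
sheared = deform_cell(ref_cell, axis=3, delta=0.01)
print(stretched[0], sheared[1])
```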

strain_voigt_to_symmetry_matrix(u: list, bravais_lattice: str = 'Cubic') -> np.array

Return the strain matrix to be used in the stress-strain equation to compute the elastic tensor. The number of Cij constants depends on the symmetry of the crystal. This strain matrix is computed based on the symmetry to reduce the number of equations needed in the fitting procedure (and therefore the number of calculations). Refer to Landau's textbook for the details.

- Triclinic: C11, C22, C33, C12, C13, C23, C44, C55, C66, C16, C26, C36, C46, C56, C14, C15, C25, C45
- Monoclinic: C11, C22, C33, C12, C13, C23, C44, C55, C66, C16, C26, C36, C45
- Orthorhombic: C11, C22, C33, C12, C13, C23, C44, C55, C66
- Tetragonal: C11, C33, C12, C13, C44, C66
- Trigonal: C11, C33, C12, C13, C44, C14
- Hexagonal: C11, C33, C12, C13, C44
- Cubic: C11, C12, C44


  • u (list) –

    vector of strain in Voigt notation [ u_xx, u_yy, u_zz, u_yz, u_xz, u_xy ]

  • bravais_lattice (str, default: 'Cubic' ) –

    Bravais lattice name of the lattice


  • array

    np.array: Symmetry defined stress-strain equation matrix

get_cij_list(bravais_lattice: str = 'Cubic') -> list[str]

Return the order of elastic constants for the structure


  • bravais_lattice (str, default: 'Cubic' ) –

    Bravais lattice name of the lattice


list: list of strings C_ij the order of elastic constants

get_cij_6x6matrix(cij_dict: dict[float], bravais_lattice: str = 'Cubic') -> np.array

Return the Cij matrix for the structure based on the symmetry of the crystal.


  • cij_dict (dict) –

    dictionary of elastic constants Cij. Where C11, C12, ... C66 : float, Individual components of Cij for a standardized representation:

    • Triclinic: all Cij where i <= j
    • Monoclinic: C11, C12, C13, C15, C22, C23, C25, C33, C35, C44, C46, C55, C66
    • Orthorhombic: C11, C12, C13, C22, C23, C33, C44, C55, C66
    • Tetragonal: C11, C12, C13, C16, C33, C44, C66 (C16 optional)
    • Trigonal: C11, C12, C13, C14, C33, C44
    • Hexagonal: C11, C12, C13, C33, C44, C66 (2*C66=C11-C12)
    • Cubic: C11, C12, C44
    • Isotropic: C11, C12, C44 (2*C44=C11-C12)
  • bravais_lattice (str, default: 'Cubic' ) –

    Bravais lattice name of the lattice

get_voigt_strain_vector(cryst: Atoms, ref_cryst: Atoms = None) -> np.array

Calculate the strain tensor between the deformed structure and the reference structure. Return strain in vector form of Voigt notation, component order: u_{xx}, u_{yy}, u_{zz}, u_{yz}, u_{xz}, u_{xy}.


  • cryst (Atoms) –

    deformed structure

  • ref_cryst (Atoms, default: None ) –

    reference, undeformed structure


  • array

    np.array: vector of strain in Voigt notation.
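A minimal sketch of this calculation on bare cell matrices, assuming small (linear) strains and cells stored with lattice vectors as rows: the displacement gradient is obtained from the two cells and symmetrized, then unpacked in the stated Voigt order. `voigt_strain` is a hypothetical stand-alone helper.

```python
import numpy as np


def voigt_strain(cell, ref_cell):
    """Small-strain tensor between two cells, returned as (xx, yy, zz, yz, xz, xy)."""
    cell = np.asarray(cell, dtype=float)
    ref = np.asarray(ref_cell, dtype=float)
    grad = np.linalg.inv(ref) @ (cell - ref)  # displacement gradient
    u = 0.5 * (grad + grad.T)                 # symmetric part = strain tensor
    return np.array([u[0, 0], u[1, 1], u[2, 2], u[1, 2], u[0, 2], u[0, 1]])


ref_cell = 4.0 * np.eye(3)
deformed = ref_cell @ (np.eye(3) + np.diag([0.01, 0.0, 0.0]))  # 1 % stretch along x
strain = voigt_strain(deformed, ref_cell)
print(strain)
```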






prepho_gpaw_optimize_fixbox(work_dir, pdict)

Refer to the pregen_gpaw_optimize() function. Only change gpaw_dft for fixed cell optimization.

postpho_gpaw_singlepoint(work_dir, pdict)

This function does: - Clean up unlabelled extxyz files - Collect forces from the output files



prepho_lammps_optimize(work_dir, pdict)

This function does: - Prepare task_dirs: select only unlabeled structures to compute at clusters. - Prepare lammps_optimize and lammps_input files. - Convert extxyz to lmpdata. - Copy potential file to work_dir.

runpho_lammps_optimize(work_dir, pdict, mdict)

This function does: - Read task_dirs from .yaml file - Prepare the task_list - Prepare forward & backward files - Prepare command_list - Submit jobs to the cluster - Download the results when finished

postpho_lammps_optimize(work_dir, pdict)

This function does: - Remove unlabeled .extxyz files, just keep the labeled ones. - Convert LAMMPS output to extxyz_labeled.

prepho_lammps_optimize_fixbox(work_dir, pdict)

This function does: - Prepare task_dirs: select only unlabeled structures to compute at clusters. - Prepare lammps_optimize and lammps_input files. - Convert extxyz to lmpdata. - Copy potential file to work_dir.

prepho_lammps_singlepoint(work_dir, pdict)

This function does: - Prepare task_dirs: select only unlabeled structures to compute at clusters. - Prepare lammps_optimize and lammps_input files. - Convert extxyz to lmpdata. - Copy potential file to work_dir.

postpho_lammps_singlepoint(work_dir, pdict)

This function does: - Clean up unlabelled extxyz files - Collect forces from the output files



convert_phonopy2ase(atoms: PhonopyAtoms) -> Atoms
convert_ase2phonopy(atoms: Atoms) -> PhonopyAtoms
get_band_path(atoms: Atoms, path_str: str = None, npoints: int = 61, path_frac=None, labels=None)
get_band_structure(work_dir, pdict)
get_DOS_n_PDOS(work_dir, pdict)
get_thermal_properties(work_dir, pdict)
_ref_phonon_calc(atoms: Atoms, calc: object, supercell_matrix=[[2, 0, 0], [0, 2, 0], [0, 0, 2]], displacement=0.01, NAC: bool = False) -> object

NOTE: this function is not used; it is kept for reference only.


  • atoms (Atoms) –

    ASE's structure object which is already optimized/relaxed as the ground state.

  • calc (object) –

    ASE calculator object.

  • supercell_matrix (list, default: [[2, 0, 0], [0, 2, 0], [0, 0, 2]] ) –

    The supercell matrix for the phonon calculation.

  • displacement (float, default: 0.01 ) –

    The atomic displacement distance in Angstrom.

  • NAC (bool, default: False ) –

    Whether to use non-analytical corrections (NAC) for the phonon calculation.

NOTE: not yet finished



build_structure_phonon(pdict, mdict)
relax_initial_structure(pdict, mdict)

Relax the structure by DFT/MD

scale_and_relax(pdict, mdict)

Scale and relax the structures while fixing the box size. Use this when computing phonons at different volumes.

compute_force(pdict, mdict)

Compute forces for each scale-relaxed-structure by DFT/MD.

compute_force_single_structure(work_dir, pdict, mdict)

Run DFT/MD single-point calculations to compute forces for a list of supercells of a single structure. The function does the following: - initialize the phonopy object - generate supercells with displacements - run DFT/MD single-point calculations to compute forces for each supercell - assign forces back to the phonopy object - save the phonopy object to a file for later post-processing

compute_phonon(pdict, mdict)

Compute phonon properties by phonopy functions.

phonon_calc(configfile_param: str, configfile_machine: str)

Generate initial data for training ML models




alff accepts a configuration file in YAML/JSON/JSONC format.



ALFF parameters.

mlp_engine: str
    The engine to use for training the MLP model. Choices: 'sevenn', 'mace'
num_models: int
    Number of models to train.
init_data_paths: list[str]
    List of paths to the initial data.
distributed: bool
    Whether to use distributed training.
distributed_backend: str
    The Pytorch backend to use for distributed training. Choices: 'nccl', 'mpi'
sevenn_args: dict
    SevenNet's parameters.
mace_args: dict
    Mace's parameters.
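As an illustration of the parameters listed above, the sketch below assembles them into a dictionary, checks the two documented choice lists, and serializes the result to JSON. The values, the file path and the flat top-level layout are assumptions; the real configuration schema may nest these keys differently.

```python
import json

# Hypothetical parameter values; only the key names come from the list above.
param_config = {
    "mlp_engine": "sevenn",
    "num_models": 4,
    "init_data_paths": ["data/init_0.extxyz"],  # hypothetical path
    "distributed": False,
    "distributed_backend": "nccl",
    "sevenn_args": {},
}

# Choices as documented for the two enumerated options.
ALLOWED = {
    "mlp_engine": {"sevenn", "mace"},
    "distributed_backend": {"nccl", "mpi"},
}


def invalid_choices(cfg: dict) -> list:
    """Return the keys whose values are outside the documented choices."""
    return [key for key, allowed in ALLOWED.items() if key in cfg and cfg[key] not in allowed]


config_text = json.dumps(param_config, indent=2)
print(config_text)
print("invalid keys:", invalid_choices(param_config))
```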

SevenNet parameters that are not applicable.

These parameters are either generated by the ALFF or are not required for running the ALFF.

train.random_seed: int
    Random seed for reproducibility.
data.load_dataset_path: list[str]
    List of paths to the dataset.

ALFF parameters for running on clusters.



  • build_conf

    Build atomic configuration, using library

  • scale_atoms

    Scale the atoms by the given factors along the three directions.

  • perturb_atoms

    Perturb the atoms by random displacements. This method adds random displacements to the atomic positions.

  • align_atom_to_origin

    Align min atoms position to the origin.

  • add_vacuum

    Add vacuum to the atoms.

  • make_cell_triangular

    A cell that is an upper triangular matrix is a requirement to use the NPT class in ASE.

  • make_cell_triangular_extxyz

    Make the cell of atoms in extxyz file to be triangular.

  • poscar2lmpdata

    Convert POSCAR file to LAMMPS data file.

  • extxyz2lmpdata

    Convert extxyz file to LAMMPS data file.

  • lmpdata2extxyz

    Convert LAMMPS data file to extxyz file.

  • lmpdump2extxyz

    Convert LAMMPS dump file to extxyz file.

  • write_extxyz

    Write a list of Atoms objects to an extxyz file. The existing function does not support writing a file if the parent directory does not exist; this function overcomes that limitation.

  • read_extxyz

    Read an extxyz file. The existing function returns a single Atoms object if the file contains only one frame; this function always returns a list of Atoms objects.

  • change_key_in_extxyz

    NOTE: when an Atoms object contains reversed_keys (energy, forces, stress, momenta, free_energy, ...), it will have a SinglePointCalculator object attached to the Atoms, and these keys can be accessed via atoms.calc.results or .get_() methods.

  • remove_key_in_extxyz

    Remove unwanted keys from extxyz file to keep it clean.

  • select_extxyz_frames

    Choose frames from an extxyz trajectory file, based on some criteria.

  • find_primitive_cell

    Find the primitive cell of the given atoms object.

build_conf(pdict: dict)

Build atomic configuration, using library

Supported structure types: - bulk: sc, fcc, bcc, tetragonal, bct, hcp, rhombohedral, orthorhombic, mcl, diamond, zincblende, rocksalt, cesiumchloride, fluorite or wurtzite. - molecule: molecule - mx2: MX2


  • pdict (dict) –

    Parameters dictionary


  • outfile

    Save atomic configuration with format specified by ext of outfile. All ASE supported formats are allowed.

scale_atoms(atoms: Atoms, factors: list = [1, 1, 1]) -> Atoms

Scale the atoms by the given factors along the three directions.

perturb_atoms(atoms: Atoms, std_disp: float) -> Atoms

Perturb the atoms by adding random displacements to the atomic positions.

align_atom_to_origin(atoms: Atoms) -> Atoms

Align the minimum atomic position to the origin.
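A minimal sketch of the idea, assuming that aligning the minimum position means translating all atoms so that the smallest x, y and z coordinates become zero. It works on plain coordinate lists rather than an ASE Atoms object, and shift_min_to_origin is a hypothetical name, not the ALFF implementation.

```python
def shift_min_to_origin(positions):
    """Translate positions so the per-axis minimum coordinate is 0."""
    # Smallest coordinate along each of the three axes.
    mins = [min(p[axis] for p in positions) for axis in range(3)]
    # Subtract the per-axis minimum from every position.
    return [[p[axis] - mins[axis] for axis in range(3)] for p in positions]


positions = [[1.0, 2.0, 3.0], [2.5, 2.0, 4.0], [1.5, 3.0, 3.5]]
shifted = shift_min_to_origin(positions)
print(shifted[0])  # [0.0, 0.0, 0.0]
```

Relative distances between atoms are unchanged; only the overall offset is removed.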

add_vacuum(atoms: Atoms, distances: list = [0, 0, 0]) -> Atoms

Add vacuum to the atoms.

make_cell_triangular(atoms: Atoms) -> Atoms

ASE's NPT class requires the cell of the Atoms object to be an upper triangular matrix. This function normalizes the atoms' cell matrix to an upper triangular matrix. REF: this comment

make_cell_triangular_extxyz(extxyz_file: str) -> None

Make the cell of atoms in extxyz file to be triangular.

poscar2lmpdata(poscar_file: str, lmpdata_file: str, atom_style: str = 'atomic') -> list[str]

Convert POSCAR file to LAMMPS data file.

extxyz2lmpdata(extxyz_file: str, lmpdata_file: str, atom_style: str = 'atomic') -> list[str]

Convert extxyz file to LAMMPS data file. NOTE: the original_cell must be saved to be able to recover the original orientation of the crystal.

lmpdata2extxyz(lmpdata_file: str, extxyz_file: str, original_cell_file: str = None)

Convert LAMMPS data file to extxyz file.

lmpdump2extxyz(lmpdump_file: str, extxyz_file: str, original_cell_file: str = None, stress_file: str = None, lammps_units: str = 'metal')

Convert LAMMPS dump file to extxyz file.


  • lmpdump_file (str) –

    Path to the LAMMPS dump file.

  • extxyz_file (str) –

    Path to the output extxyz file.

  • original_cell_file (str, default: None ) –

    Path to the text file containing original_cell. It should be a simple text file that can be written and read with numpy. If not provided, the function tries to find a file with the extension .original_cell in the same directory as lmpdump_file. Defaults to None.

  • stress_file (str, default: None ) –

    Path to the text file containing the stress tensor. Defaults to None.

  • Current version: stress is mapped by frame_index, which requires the frames in the stress text file to have the same number and order as the frames in the LAMMPS dump file.
  • TODO: map based on timestep. This requires modifying ASE to read the timestep from the LAMMPS dump file.
write_extxyz(outfile: str, atoms: list)

Write a list of Atoms objects to an extxyz file. The existing function does not support writing a file if the parent directory does not exist; this function overcomes that limitation.


  • atoms (list) –

    List of Atoms object.

  • outfile (str) –

    Path to the output file.
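One common way to work around a missing parent directory is to create it before writing. The sketch below shows that pattern with plain text files; write_text_safely is a hypothetical helper, and whether write_extxyz does exactly this is an assumption.

```python
import tempfile
from pathlib import Path


def write_text_safely(outfile, text):
    """Write text to outfile, creating missing parent directories first."""
    path = Path(outfile)
    # Create the whole parent directory tree if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # "a/b" does not exist inside tmp before the call.
    target = write_text_safely(Path(tmp) / "a" / "b" / "conf.extxyz", "2\n")
    created = target.exists()
print(created)  # True
```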

read_extxyz(extxyz_file: str, index=':') -> list[Atoms]

Read an extxyz file. The existing reader returns a single Atoms object if the file contains only one frame; this function always returns a list of Atoms objects.


  • extxyz_file (str) –

    Path to the input extxyz file.


  • list ( list[Atoms] ) –

    List of Atoms object.

  • The existing reader returns a single Atoms object or a list of Atoms objects, depending on the index argument.
    • index=":" will always return a list.
    • index=0 or index=-1 will return a single Atoms object.
  • This function always returns a list of Atoms objects, even with index=0 or index=-1.
change_key_in_extxyz(extxyz_file: str, keys: dict[str, str])

NOTE: when an Atoms object contains reserved keys (energy, forces, stress, momenta, free_energy, ...), a SinglePointCalculator object is attached to the Atoms, and these keys can be accessed via atoms.calc.results or .get_() methods.

These keys are not stored in atoms.arrays. To access these properties via atoms.arrays, the keys must be renamed to names that differ from the reserved keys.


  • extxyz_file (str) –

    Path to the extxyz file.

  • keys (dict) –

    Dictionary of key pairs {"old_key": "new_key"} to change. Example: {"old_key": "new_key", "forces": "ref_forces", "stress": "ref_stress"}
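The {"old_key": "new_key"} mapping can be illustrated on a plain dictionary. This is only a sketch of the renaming logic: rename_keys and the sample frame_info are hypothetical, and the real function operates on the frames of an extxyz file.

```python
def rename_keys(data, keys):
    """Return a copy of data with keys renamed according to {"old": "new"}."""
    # Keys without an entry in the mapping are kept unchanged.
    return {keys.get(name, name): value for name, value in data.items()}


frame_info = {"energy": -3.72, "forces": [[0.0, 0.0, 0.1]], "config_type": "bulk"}
renamed = rename_keys(frame_info, {"forces": "ref_forces", "energy": "ref_energy"})
print(sorted(renamed))  # ['config_type', 'ref_energy', 'ref_forces']
```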

remove_key_in_extxyz(extxyz_file: str, keys: list[str])

Remove unwanted keys from extxyz file to keep it clean.

select_extxyz_frames(extxyz_file: str, has_symbols: list = None, only_symbols: list = None, exact_symbols: list = None, has_properties: list = None, only_properties: list = None, has_columns: list = None, only_columns: list = None, output_file: str = 'selected_frames.extxyz') -> list[Atoms]

Choose frames from an extxyz trajectory file, based on some criteria.


  • extxyz_file (str) –

    Path to the extxyz file.

  • has_symbols (list, default: None ) –

    List of symbols that each frame must have at least one of them.

  • only_symbols (list, default: None ) –

    List of symbols that each frame must have only these symbols.

  • exact_symbols (list, default: None ) –

    List of symbols that each frame must have exactly these symbols.

  • has_properties (list, default: None ) –

    List of properties that each frame must have at least one of them.

  • only_properties (list, default: None ) –

    List of properties that each frame must have only these properties.

  • has_columns (list, default: None ) –

    List of columns that each frame must have at least one of them.

  • only_columns (list, default: None ) –

    List of columns that each frame must have only these columns.

  • output_file (str, default: 'selected_frames.extxyz' ) –

    Path to the output file.
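The three symbol criteria differ only in the set relation they test. The sketch below reads "at least one of them" as a non-empty intersection, "only these symbols" as a subset test, and "exactly these symbols" as set equality. These readings are assumptions drawn from the parameter descriptions, and match_symbols is a hypothetical helper working on lists of chemical symbols.

```python
def match_symbols(symbols, has_symbols=None, only_symbols=None, exact_symbols=None):
    """Check one frame's chemical symbols against the optional criteria."""
    present = set(symbols)
    if has_symbols is not None and not present & set(has_symbols):
        return False  # none of the requested symbols is present
    if only_symbols is not None and not present <= set(only_symbols):
        return False  # the frame contains a symbol outside the allowed set
    if exact_symbols is not None and present != set(exact_symbols):
        return False  # the symbol set differs from the requested one
    return True


frames = [["Cu", "Cu"], ["Cu", "O"], ["O", "H", "H"]]
has_cu = [f for f in frames if match_symbols(f, has_symbols=["Cu"])]
only_cu_o = [f for f in frames if match_symbols(f, only_symbols=["Cu", "O"])]
exact_cu_o = [f for f in frames if match_symbols(f, exact_symbols=["Cu", "O"])]
print(len(has_cu), len(only_cu_o), len(exact_cu_o))  # 2 2 1
```

The same pattern would apply to the property and column criteria, with property or column names in place of chemical symbols.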

find_primitive_cell(atoms: Atoms, symprec=1e-05, angle_tolerance=-1.0) -> Atoms

Find the primitive cell of the given atoms object. NOTE: must use .get_scaled_positions() to define the cell in spglib.



  • prepare_submission

    Submit a job to the cluster.

  • submit_job_chunk

    Improved version of submit_job to split the task_dirs into chunks and submit them.

  • async_submit_job_chunk

    Async version of submit_job_chunk(). The caller only needs to wait for the completion of the entire for loop, without worrying about the specifics of each operation inside the loop.

  • info_current_dispatch

    Return the information of the current chunk of tasks.

  • remote_info

    Return the remote machine information.


COLOR_MAP = {0: 'blue', 1: 'green', 2: 'cyan', 3: 'yellow', 4: 'red', 5: 'purple'} module-attribute
fh = logging.FileHandler(FILE_LOG_DISPATCH) module-attribute
fmt = logging.Formatter('%(asctime)s | %(name)s-%(levelname)s: %(message)s', '%Y%b%d %H:%M:%S') module-attribute
prepare_submission(mdict_machine: dict, mdict_resources: dict, command_list: list[str], work_dir: str, task_dirs: list[str], forward_files: list[str], backward_files: list[str], forward_common_files: list[str], outlog: str, errlog: str)

Submit a job to the cluster:

  • Prepare the task list
  • Make the submission and wait for the job to finish
  • Download the results

submit_job_chunk(mdict_machine: dict, mdict_resources: dict, command_list: list[str], work_dir: str, task_dirs: list[str], forward_files: list[str], backward_files: list[str], forward_common_files: list[str], outlog: str, errlog: str, job_limit: int, machine_index=1)

Improved version of submit_job to split the task_dirs into chunks and submit them.
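Splitting the task directories can be sketched as below, assuming that job_limit is the maximum number of task directories per chunk. split_into_chunks is a hypothetical helper and the task names are made up for illustration.

```python
def split_into_chunks(task_dirs, job_limit):
    """Split task_dirs into consecutive chunks of at most job_limit items."""
    if job_limit < 1:
        raise ValueError("job_limit must be a positive integer")
    return [task_dirs[i:i + job_limit] for i in range(0, len(task_dirs), job_limit)]


task_dirs = [f"task_{i:06d}" for i in range(5)]
chunks = split_into_chunks(task_dirs, job_limit=2)
print([len(c) for c in chunks])  # [2, 2, 1]
```

Each chunk can then be submitted in turn, so that no more than job_limit tasks are dispatched at once.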

async_submit_job_chunk(mdict_machine: dict, mdict_resources: dict, command_list: list[str], work_dir: str, task_dirs: list[str], forward_files: list[str], backward_files: list[str], forward_common_files: list[str], outlog: str, errlog: str, job_limit: int, machine_index: int = 0) async

Async version of submit_job_chunk(). The caller only needs to wait for the completion of the entire for loop, without worrying about the specifics of each operation inside the loop.

NOTE:

  • An async function normally contains an await ... statement, which yields control to the event loop.
  • If the event loop is blocked by a synchronous function (one that does not yield control to the event loop), the async function waits for that synchronous function to complete and is therefore not executed asynchronously. Use await asyncio.to_thread() to run the synchronous function in a separate thread, so that the event loop is not blocked.
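The effect of asyncio.to_thread() can be shown with a small self-contained sketch. blocking_submit is a hypothetical stand-in for a synchronous submission call and the machine names are placeholders; this is not the ALFF dispatch code.

```python
import asyncio
import time


def blocking_submit(name, seconds):
    """Stand-in for a synchronous submission call that blocks its thread."""
    time.sleep(seconds)
    return f"{name} done"


async def submit_all(names):
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the event loop stays free to schedule the other coroutines.
    jobs = [asyncio.to_thread(blocking_submit, name, 0.2) for name in names]
    # gather returns the results in the order the jobs were passed in.
    return await asyncio.gather(*jobs)


results = asyncio.run(submit_all(["machine_0", "machine_1", "machine_2"]))
print(results)  # ['machine_0 done', 'machine_1 done', 'machine_2 done']
```

Calling blocking_submit() directly inside the coroutine would instead run the three calls one after another. asyncio.to_thread() requires Python 3.9 or newer.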

info_current_dispatch(task_dirs, job_limit, chunk_index, task_dirs_1chunk, old_time=None, new_time=None, machine_index=0) -> str

Return the information of the current chunk of tasks.

remote_info(mdict) -> str

Return the remote machine information.


  • mdict (dict) –

    The machine dictionary.



time_str = time.strftime('%Y%b%d_%H%M%S') module-attribute
DIR_LOG = 'log' module-attribute
FILE_LOG_ALFF = f'{DIR_LOG}/{time_str}_alff.log' module-attribute
DIR_TRAIN = '00_train' module-attribute
DIR_MD = '01_md' module-attribute
DIR_DFT = '02_dft' module-attribute
DIR_DATA = '03_data' module-attribute
DIR_TMP = 'tmp_dir' module-attribute
DIR_TMP_DATA = 'copied_data' module-attribute
DIR_TMP_MODEL = 'copied_model' module-attribute
FILE_ITER_LOG = '_iter.log' module-attribute
FILE_DATAPATH = 'data_paths.yaml' module-attribute
FILE_MODELPATH = 'model_paths.yaml' module-attribute
FILE_TRAIN_ARG = 'train_args.yaml' module-attribute
FILE_CHECKPOINT_PATH = 'checkpoint_paths.yaml' module-attribute
FILE_LAMMPS_SCRIPT = '' module-attribute
FILE_LAMMPS_ARG = 'lammps_args.yaml' module-attribute
FILE_ASE_ARG = 'ase_args.yaml' module-attribute
FILE_TRAJ_MD = 'traj_md.extxyz' module-attribute
FILE_TRAJ_MD_CANDIDATE = FILE_TRAJ_MD.replace('.extxyz', '_candidate.extxyz') module-attribute
FMT_ITER = '05d' module-attribute
FMT_STAGE = '02d' module-attribute
FMT_MODEL = '03d' module-attribute
FMT_CONF = '04d' module-attribute
FMT_TASK_MD = '06d' module-attribute
FMT_TASK_DFT = '06d' module-attribute
DIR_BUILD = '00_build_structure' module-attribute
DIR_SCALE = '01_scale' module-attribute
DIR_GENDATA = '02_gendata' module-attribute
FILE_FRAME_unLABEL = 'conf.extxyz' module-attribute
FILE_FRAME_LABEL = 'conf_label.extxyz' module-attribute
FILE_TRAJ_LABEL = 'traj_label.extxyz' module-attribute
FILE_FINAL_DATA = 'data_label.extxyz' module-attribute
FILE_COLLECT_DATA = 'collect_data_label.extxyz' module-attribute
DIR_SUPERCELL = '01_supercell' module-attribute
DIR_PHONON = '02_phonon' module-attribute
DIR_ELASTIC = '02_elastic' module-attribute
LIB_GPAW_PATH = f'{ROOT_PATH}/util/script/ase_script' module-attribute
SCHEMA_ARG_ASE = f'{LIB_GPAW_PATH}/schema_arg_ase.yaml' module-attribute
SCHEMA_ARG_GENDATA = f'{ROOT_PATH}/data/schema_arg_gendata.yaml' module-attribute
SCHEMA_ARG_FINETUNE = f'{ROOT_PATH}/al/schema_arg_finetune.yaml' module-attribute
SCHEMA_ARG_ACTIVE_LEARN = f'{ROOT_PATH}/al/schema_arg_active_learn.yaml' module-attribute
SCHEMA_ARG_PHONON = f'{ROOT_PATH}/phonon/schema_arg_phonon.yaml' module-attribute
SCHEMA_ARG_ELASTIC = f'{ROOT_PATH}/elastic/schema_arg_elastic.yaml' module-attribute
SCHEMA_ARG_MACHINE = f'{ROOT_PATH}/util/script/schema/schema_arg_machine.yaml' module-attribute
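The FMT_* constants are standard Python format specifications for zero-padded integers. The sketch below copies three constants from the list above and shows what they produce; the "iter_" and "md_" prefixes are hypothetical and only serve the illustration.

```python
FMT_ITER = "05d"
FMT_TASK_MD = "06d"
FILE_TRAJ_MD = "traj_md.extxyz"

# Zero-padded labels; the prefixes are assumptions, not ALFF's naming scheme.
iter_label = f"iter_{3:{FMT_ITER}}"
task_label = f"md_{42:{FMT_TASK_MD}}"

# Same expression as the FILE_TRAJ_MD_CANDIDATE attribute.
candidate = FILE_TRAJ_MD.replace(".extxyz", "_candidate.extxyz")

print(iter_label, task_label, candidate)
# iter_00003 md_000042 traj_md_candidate.extxyz
```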




Create the directory name for the structure






Some notes:

  • Run MD in ASE following this tutorial:
  • For an MD run, control the symmetry to avoid the error: broken symmetry.
  • Must set txt='calc.txt' in the GPAW calculator for backward files.
  • Defines some print functions that can be attached to ASE's dynamics object.
  • param_yaml must contain:
    • a dict ase_calc that defines the calculator.
    • a dict md with ASE MD parameters.



pdict = get_cli_args() module-attribute
ase_calc = pdict.get('calc', None) module-attribute
code_lines = module-attribute
struct_args = pdict['structure'] module-attribute
input_pbc = struct_args.get('pbc', False) module-attribute
extxyz_file = struct_args['from_extxyz'] module-attribute
atoms = read(extxyz_file, format='extxyz', index='-1') module-attribute
md_args = pdict.get('md', {}) module-attribute
dt = dt * units.fs module-attribute
temperature = md_args.get('temperature', 300) module-attribute
ensemble = md_args.get('ensemble', 'NVE') module-attribute
thermostat = md_args.get('thermostat', {}) module-attribute
thermostat_name = thermostat.get('name', 'Nose_Hoover_chain') module-attribute
support_thermostats = ['Langevin', 'Nose_Hoover', 'Nose_Hoover_chain'] module-attribute
barostat = md_args.get('barostat', {}) module-attribute
barostat_name = barostat.get('name', 'Parrinello_Rahman') module-attribute
support_barostats = ['Parrinello_Rahman', 'Iso_Nose_Hoover_chain', 'Aniso_Nose_Hoover_chain'] module-attribute
dyn = VelocityVerlet(atoms, timestep=dt) module-attribute
friction = thermostat.get('friction', 0.002) / units.fs module-attribute
tdamp = thermostat.get('tdamp', 50) module-attribute
tchain = thermostat.get('tchain', 3) module-attribute
stress = md_args.get('stress', None) module-attribute
stress_in_eVA3 = stress / units.GPa module-attribute
pfactor = md_args.get('pfactor', 2000000.0) module-attribute
mask = barostat.get('mask', None) module-attribute
pdamp = barostat.get('pdamp', 1000) module-attribute
pchain = barostat.get('pchain', 3) module-attribute
equil_steps = md_args.get('equil_steps', 0) module-attribute
num_frames = md_args.get('num_frames', 1) module-attribute
traj_freq = md_args.get('traj_freq', 1) module-attribute
nsteps = num_frames * traj_freq module-attribute

Get the arguments from the command line

print_dynamic(atoms=atoms, filename='calc_dyn_properties.txt')

Print the potential, kinetic and total energy. Note: stress is printed in this file in GPa, but saved in the EXTXYZ file in eV/Angstrom^3.
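Because the two outputs use different stress units, a small conversion helper is handy when comparing them. The sketch below uses the factor 1 eV/Angstrom^3 = 160.2176634 GPa, which follows from the elementary charge and 1 Angstrom = 1e-10 m. The function names are hypothetical.

```python
# 1 eV/Angstrom^3 in GPa: 1.602176634e-19 J / 1e-30 m^3 = 1.602176634e11 Pa.
EV_PER_A3_IN_GPA = 160.2176634


def ev_per_a3_to_gpa(stress):
    """Convert a stress value from eV/Angstrom^3 to GPa."""
    return stress * EV_PER_A3_IN_GPA


def gpa_to_ev_per_a3(stress):
    """Convert a stress value from GPa to eV/Angstrom^3."""
    return stress / EV_PER_A3_IN_GPA


print(round(ev_per_a3_to_gpa(0.01), 4))  # 1.6022
```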

write_dyn_extxyz(atoms=atoms, filename='traj_md.extxyz')

Some notes:

  • Run MD in ASE following this tutorial:
  • For an MD run, control the symmetry to avoid the error: broken symmetry.
  • Must set txt='calc.txt' in the GPAW calculator for backward files.
  • param_yaml must contain:
    • a dict gpaw_calc with GPAW parameters.
    • a dict md with ASE MD parameters.



pdict = get_cli_args() module-attribute
gpaw_args = pdict['calc'].get('gpaw', {}) module-attribute
gpaw_params = {'mode': {'name': 'pw', 'ecut': 500}, 'xc': 'PBE', 'convergence': {'energy': 1e-06, 'density': 0.0001, 'eigenstates': 1e-08}, 'occupations': {'name': 'fermi-dirac', 'width': 0.01}, 'txt': 'calc_singlepoint.txt'} module-attribute
calc1 = GPAW(**gpaw_params) module-attribute
dftd3_args = pdict['calc'].get('dftd3', {}) module-attribute
xc = gpaw_params['xc'] module-attribute
damping = dftd3_args.pop('damping', 'd3zero') module-attribute
calc2 = DFTD3(method=xc, damping=damping, **dftd3_args) module-attribute
calc = SumCalculator([calc1, calc2]) module-attribute
struct_args = pdict['structure'] module-attribute
input_pbc = struct_args.get('pbc', False) module-attribute
extxyz_file = struct_args['from_extxyz'] module-attribute
atoms = read(extxyz_file, format='extxyz', index='-1') module-attribute
md_args = pdict.get('md', {}) module-attribute
dt = dt * units.fs module-attribute
temperature = md_args.get('temperature', 300) module-attribute
ensemble = md_args.get('ensemble', 'NVE') module-attribute
thermostat = md_args.get('thermostat', {}) module-attribute
thermostat_name = thermostat.get('name', 'Nose_Hoover_chain') module-attribute
support_thermostats = ['Langevin', 'Nose_Hoover', 'Nose_Hoover_chain'] module-attribute
barostat = md_args.get('barostat', {}) module-attribute
barostat_name = barostat.get('name', 'Parrinello_Rahman') module-attribute
support_barostats = ['Parrinello_Rahman', 'Iso_Nose_Hoover_chain', 'Aniso_Nose_Hoover_chain'] module-attribute
dyn = VelocityVerlet(atoms, timestep=dt) module-attribute
friction = thermostat.get('friction', 0.002) / units.fs module-attribute
tdamp = thermostat.get('tdamp', 50) module-attribute
tchain = thermostat.get('tchain', 3) module-attribute
stress = md_args.get('stress', None) module-attribute
stress_in_eVA3 = stress / units.GPa module-attribute
pfactor = md_args.get('pfactor', 2000000.0) module-attribute
mask = barostat.get('mask', None) module-attribute
pdamp = barostat.get('pdamp', 1000) module-attribute
pchain = barostat.get('pchain', 3) module-attribute
equil_steps = md_args.get('equil_steps', 0) module-attribute
num_frames = md_args.get('num_frames', 2) module-attribute
traj_freq = md_args.get('traj_freq', 1) module-attribute
nsteps = num_frames * traj_freq module-attribute

Get the arguments from the command line

print_dynamic(atoms=atoms, filename='calc_dyn_properties.txt')

Print the potential, kinetic and total energy. Note: stress is printed in this file in GPa, but saved in the EXTXYZ file in eV/Angstrom^3.

write_dyn_extxyz(atoms=atoms, filename='traj_label.extxyz')

Some notes:

  • Must set txt='calc.txt' in the GPAW calculator for backward files.
  • param_yaml must contain:
    • a dict gpaw_calc with GPAW parameters.
    • a dict optimize with ASE optimization parameters.



pdict = get_cli_args() module-attribute
gpaw_args = pdict['calc'].get('gpaw', {}) module-attribute
gpaw_params = {'mode': {'name': 'pw', 'ecut': 500}, 'xc': 'PBE', 'convergence': {'energy': 1e-06, 'density': 0.0001, 'eigenstates': 1e-08}, 'occupations': {'name': 'fermi-dirac', 'width': 0.01}, 'txt': 'calc_optimize.txt'} module-attribute
calc1 = GPAW(**gpaw_params) module-attribute
dftd3_args = pdict['calc'].get('dftd3', {}) module-attribute
xc = gpaw_params['xc'] module-attribute
damping = dftd3_args.pop('damping', 'd3zero') module-attribute
calc2 = DFTD3(method=xc, damping=damping, **dftd3_args) module-attribute
calc = SumCalculator([calc1, calc2]) module-attribute
struct_args = pdict['structure'] module-attribute
input_pbc = struct_args.get('pbc', False) module-attribute
extxyz_file = struct_args['from_extxyz'] module-attribute
atoms = read(extxyz_file, format='extxyz', index='-1') module-attribute
opt_args = pdict.get('optimize', {}) module-attribute
mask = opt_args.get('mask', None) module-attribute
pbc = atoms.get_pbc() module-attribute
fmax = opt_args.get('fmax', 0.05) module-attribute
max_steps = opt_args.get('max_steps', 10000) module-attribute
atoms_filter = FrechetCellFilter(atoms, mask=mask) module-attribute
opt = BFGS(atoms_filter) module-attribute
pot_energy = atoms.get_potential_energy() module-attribute
forces = atoms.get_forces() module-attribute
stress = atoms.get_stress() module-attribute
output_file = extxyz_file.replace('.extxyz', '_label.extxyz') module-attribute

Get the arguments from the command line


Some notes:

  • Must set txt='calc.txt' in the GPAW calculator for backward files.
  • param_yaml must contain:
    • a dict gpaw_calc with GPAW parameters.



pdict = get_cli_args() module-attribute
gpaw_args = pdict['calc'].get('gpaw', {}) module-attribute
gpaw_params = {'mode': {'name': 'pw', 'ecut': 500}, 'xc': 'PBE', 'convergence': {'energy': 1e-06, 'density': 0.0001, 'eigenstates': 1e-08}, 'occupations': {'name': 'fermi-dirac', 'width': 0.01}, 'txt': 'calc_singlepoint.txt'} module-attribute
calc1 = GPAW(**gpaw_params) module-attribute
dftd3_args = pdict['calc'].get('dftd3', {}) module-attribute
xc = gpaw_params['xc'] module-attribute
damping = dftd3_args.pop('damping', 'd3zero') module-attribute
calc2 = DFTD3(method=xc, damping=damping, **dftd3_args) module-attribute
calc = SumCalculator([calc1, calc2]) module-attribute
struct_args = pdict['structure'] module-attribute
input_pbc = struct_args.get('pbc', False) module-attribute
extxyz_file = struct_args['from_extxyz'] module-attribute
atoms = read(extxyz_file, format='extxyz', index='-1') module-attribute
pot_energy = atoms.get_potential_energy() module-attribute
forces = atoms.get_forces() module-attribute
stress = atoms.get_stress() module-attribute
output_file = extxyz_file.replace('.extxyz', '_label.extxyz') module-attribute

Get the arguments from the command line



generate_input_lammps_md(file_data: str, pair_style: str = ['e3gnn/parallel'], pair_coeff: str = ['* * numb_layers /path/to/potential Cu'], dt: float = 0.001, temp: float = 300, press: float = 0.0, tau_t: int = 100, tau_p: int = 1000, sampling_ensemble: str = 'npt', relax_ensemble: str = None, relax_steps: int = 10000, collect_frames: int = 100, traj_freq: int = 500, thermo_freq: int = 5000, file_plumed: str = None, units: str = 'metal', atom_style: str = 'atomic', pbc: list = [1, 1, 1], dir_output: str = 'output_md', file_output: str = '')

Generate lammps input file for MD simulation.


  • relax_ensemble (str, default: None ) –

    Ensemble for relaxation before sampling. If None, use the same ensemble as sampling_ensemble.

  • collect_frames (int, default: 100 ) –

    Number of frames to be collected. Then total MD nsteps = collect_frames * traj_freq

  • The pattern "sub_text\s+" matches sub_text followed by at least one whitespace character.
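The two parameter rules above (the relaxation ensemble falls back to the sampling ensemble, and the total number of MD steps is collect_frames * traj_freq) can be sketched as follows. resolve_md_plan is a hypothetical helper that only mirrors the documented defaults; it does not generate any LAMMPS input.

```python
def resolve_md_plan(sampling_ensemble="npt", relax_ensemble=None,
                    collect_frames=100, traj_freq=500):
    """Resolve the relaxation ensemble and the total number of sampling steps."""
    # Fall back to the sampling ensemble when no relaxation ensemble is given.
    relax = sampling_ensemble if relax_ensemble is None else relax_ensemble
    return {
        "relax_ensemble": relax,
        "sampling_ensemble": sampling_ensemble,
        # One frame is written every traj_freq steps.
        "nsteps": collect_frames * traj_freq,
    }


plan = resolve_md_plan()
print(plan["relax_ensemble"], plan["nsteps"])  # npt 50000
```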
_revise_lammps_npt(lines, relax_ensemble, sampling_ensemble)

Revise lammps input file to use npt ensemble

_revise_lammps_nvt(lines, relax_ensemble, sampling_ensemble)

Revise lammps input file to use nvt ensemble

_revise_lammps_nve(lines, relax_ensemble, sampling_ensemble)

Revise lammps input file to use nve ensemble

_revise_lammps_plumed(lines, file_plumed)

Revise lammps input file to use plumed

generate_script_lammps_singlepoint(units: str = 'metal', atom_style: str = 'atomic', dimension: int = 3, pbc: list = [1, 1, 1], read_data: str = 'path_to_file.lmpdata', read_restart: str = None, pair_style: list[str] = ['eam/alloy'], pair_coeff: list[str] = ['* * Cu_Mishin2001.eam.alloy Cu'], dir_output: str = 'output_md', save_script: str = '')

Generate lammps script for single-point calculation.

generate_script_lammps_minimize(units: str = 'metal', atom_style: str = 'atomic', dimension: int = 3, pbc: list = [1, 1, 1], read_data: str = 'path_to_file.lmpdata', read_restart: str = None, pair_style: list[str] = ['eam/alloy'], pair_coeff: list[str] = ['* * Cu_Mishin2001.eam.alloy Cu'], min_style: str = 'cg', etol: float = 1e-09, ftol: float = 1e-09, maxiter: int = 100000, maxeval: int = 100000, dmax: float = 0.01, press: list = [None, None, None], couple: str = 'none', dir_output: str = 'output_md', save_script: str = '')

Generate lammps script for minimization.


  • etol (float, default: 1e-09 ) –

    Energy tolerance for minimization. Default 1.0e-9

  • ftol (float, default: 1e-09 ) –

    Force tolerance for minimization. Default 1.0e-9

  • maxiter (int, default: 100000 ) –

    Maximum number of iterations. Default 100000

  • maxeval (int, default: 100000 ) –

    Maximum number of evaluations. Default 100000

  • dmax (float, default: 0.01 ) –

    Maximum distance that the line search may move (distance units). Default: 0.01

_pbc_string(pbc: list = [1, 1, 0]) -> str

Convert pbc list to string. [1, 1, 0] -> "p p f".

Acceptable values: 1, 0, p, f, s, m
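A minimal sketch of such a conversion, assuming that 1 maps to "p", 0 maps to "f", and that the letters p, f, s, m are passed through unchanged. pbc_to_string is a hypothetical re-implementation for illustration, not the private ALFF function.

```python
def pbc_to_string(pbc):
    """Convert a pbc list such as [1, 1, 0] to a string such as "p p f"."""
    number_map = {1: "p", 0: "f"}
    letters = {"p", "f", "s", "m"}
    parts = []
    for value in pbc:
        if isinstance(value, str) and value in letters:
            parts.append(value)  # already a boundary letter
        elif not isinstance(value, str) and value in number_map:
            parts.append(number_map[value])  # 1 -> periodic, 0 -> fixed
        else:
            raise ValueError(f"Unsupported pbc value: {value!r}")
    return " ".join(parts)


print(pbc_to_string([1, 1, 0]))  # p p f
```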

_pressure_string(press: Union[list, float] = [0.0, 0.0, 0.0]) -> str
_revise_input_pressure(press: list, pbc: list = [1, 1, 1]) -> list

Revise pressure string based on pbc

lmp_section_atom_forcefield(units: str = 'metal', atom_style: str = 'atomic', dimension: int = 3, pbc: list = [1, 1, 1], read_data: str = 'path_to_file.lmpdata', read_restart: str = None, pair_style: list[str] = ['eam/alloy'], pair_coeff: list[str] = ['* * Cu_Mishin2001.eam.alloy Cu']) -> list[str]

Generate lammps input block for atom and forcefield.


  • read_data (str, default: 'path_to_file.lmpdata' ) –

    Path to the data file. e.g. "path_to_lmpdata"

  • read_restart (str, default: None ) –

    Path to the restart file. e.g. "path_to_restart". If provided, read_restart is used instead of read_data.

lmp_section_common_setting(dir_output: str = 'output_md') -> list[str]
lmp_section_minimize(min_style: str = 'cg', etol: float = 1e-09, ftol: float = 1e-09, maxiter: int = 100000, maxeval: int = 100000, dmax: float = 0.01, press: list = [None, None, None], couple: str = 'none') -> list[str]

Generate lammps input block for minimization.

lmp_section_dynamic_setting(dt: float) -> list[str]
lmp_section_custom_lines(lines: list[str]) -> list[str]


Some common utilities for generator, auto_test and data.


text_pkg_info(modules=['numpy', 'scipy', 'ase', 'thutil', 'sevenn', 'phonopy'])
write_iter_log(filename: str, iter_idx: int, stage_idx: int, stage_name: str)
read_iter_log(filename: str) -> list[int]

Read the last line of the iter log file.

log_text_stage(iter_idx, stage_idx, stage_name)
iter_str(iter_idx: int) -> str
replace(file_name, pattern, subst)
copy_file_list(file_list, from_path, to_path)
cmd_append_log(cmd, log_file)
repeat_to_length(input_str: str, length) -> str
expand_idx(in_list: list[int, str]) -> list[int]

Expand the input list of indices to a list of integers. Example: in_list = [1, 2, "3-5:2", "6-10"]
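The string syntax is not specified further here, so the sketch below rests on two assumptions: "a-b" means every integer from a to b inclusive, and "a-b:s" means the same range with step s. expand_indices is a hypothetical re-implementation under those assumptions.

```python
def expand_indices(in_list):
    """Expand integers and "start-stop[:step]" strings into a list of integers."""
    out = []
    for item in in_list:
        if isinstance(item, int):
            out.append(item)
            continue
        # Split "start-stop:step"; the step part is optional.
        span, _, step = item.partition(":")
        start, stop = span.split("-")
        # The stop value is assumed to be inclusive.
        out.extend(range(int(start), int(stop) + 1, int(step) if step else 1))
    return out


print(expand_indices([1, 2, "3-5:2", "6-10"]))
# [1, 2, 3, 5, 6, 7, 8, 9, 10]
```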
