API Documentation
Density fitting
A set of utility functions for density fitting. Partial credit goes to: https://gitlab.com/jmargraf/kdf
- class dfa_recommender.df_class.DensityFitting(wfnpath: str, xyzfile: str, basis: str, charge: int = 0, spin: int = 1, wfnpath2: str = 'NA')[source]¶
Bases: object
Density fitting class to project the electron density onto auxiliary basis sets.
- calc_powerspec() array[source]¶
Calculates the power spectrum to yield an invariant representation from the density fitting coefficients.
- Returns:
powerspec – powerspectrum derived from density fitting coefficients.
- Return type:
np.ndarray
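The invariance can be illustrated with a minimal sketch. It assumes the power spectrum is the per-shell sum of squared coefficients, p_l = sum_m c_lm^2; the helper name `shell_power` and the shell layout are hypothetical, and the library's actual implementation may combine shells differently.

```python
import numpy as np

def shell_power(coeffs, shells):
    """Sum of squared coefficients within each shell (hypothetical helper).

    coeffs: 1D array of coefficients, shells concatenated in order.
    shells: angular momentum l of each shell; a shell holds 2l + 1 values.
    """
    out, start = [], 0
    for l in shells:
        width = 2 * l + 1
        out.append(np.sum(coeffs[start:start + width] ** 2))
        start += width
    return np.array(out)

# One s shell (l=0) and one p shell (l=1): 1 + 3 = 4 coefficients.
shells = [0, 1]
c = np.array([0.5, 1.0, 2.0, -1.0])

# Rotate the p block with an orthogonal 3x3 matrix (rotation about one axis).
# For real spherical harmonics with l=1, a rotation acts as such a matrix.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
c_rot = c.copy()
c_rot[1:] = rot @ c[1:]

p = shell_power(c, shells)
p_rot = shell_power(c_rot, shells)
```

The individual coefficients change under the rotation, while `p` and `p_rot` agree, which is the property a rotation-invariant descriptor needs.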
- calc_utilities() None[source]¶
Calculate the shell numbers and number of basis functions in each shell.
- compensate_charges()[source]¶
Compensate charges in the density fitting. Note that this currently only works for alpha.
- convert_CP2e3nn() None[source]¶
Match the ordering of m between the psi4 and e3nn conventions within the same l.
For example, for l=1, psi4 m: [0, 1, -1]; e3nn m: [-1, 0, 1].
For l=2, psi4 m: [0, 1, -1, 2, -2]; e3nn m: [-2, -1, 0, 1, 2].
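The reordering quoted above can be sketched as an index permutation. The function names are hypothetical, and the sketch covers only the ordering of m; any sign or normalization differences between the two packages are not addressed here.

```python
def psi4_m_order(l):
    """m values in the psi4 order quoted above: 0, 1, -1, 2, -2, ..."""
    order = [0]
    for m in range(1, l + 1):
        order += [m, -m]
    return order

def psi4_to_e3nn_perm(l):
    """Indices that reorder a psi4-ordered block into m = -l, ..., l."""
    src = psi4_m_order(l)
    return [src.index(m) for m in range(-l, l + 1)]

perm_p = psi4_to_e3nn_perm(1)
perm_d = psi4_to_e3nn_perm(2)

# Apply the permutation to a d-shell block labelled by its m value.
block = ["m=0", "m=+1", "m=-1", "m=+2", "m=-2"]
reordered = [block[i] for i in perm_d]
```

Indexing a coefficient block with the permutation for its l yields the block in ascending m order.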
- get_dab() None[source]¶
Build the dab_P tensor, i.e. the tensor before contraction to the auxiliary coefficients (np.ndarray).
- pad_df_coeffs() None[source]¶
Convert self.C_P (a 1D array) to self.C_P_pad (N_atoms x M), where M corresponds to the largest dimension of the coefficients over all atoms.
For example, H2O with the def2-universal-jkfit basis has 113 coefficients:
H -> [0, 0, 1, 1, 2, 2] -> 18 coeffs
O -> [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4] -> 77 coeffs
Then self.C_P_pad is a (3, 77) np.array with zero-padding at the corresponding irreps.
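The coefficient counts in the H2O example follow from each shell of angular momentum l contributing 2l + 1 coefficients. The sketch below checks that arithmetic and shows one possible irrep-aligned padding. The helper names and the exact column layout are assumptions, not the library's implementation.

```python
import numpy as np

def n_coeffs(shells):
    """Number of coefficients for a list of shell angular momenta."""
    return sum(2 * l + 1 for l in shells)

def pad_atoms(shells_per_atom, coeffs_per_atom):
    """Zero-pad per-atom coefficients into a common irrep-aligned layout."""
    lmax = max(max(s) for s in shells_per_atom)
    # Largest number of shells of each l over all atoms.
    n_max = [max(s.count(l) for s in shells_per_atom) for l in range(lmax + 1)]
    # Column where the block of each l starts in the padded layout.
    widths = [n * (2 * l + 1) for l, n in enumerate(n_max)]
    offsets = np.concatenate([[0], np.cumsum(widths)])
    padded = np.zeros((len(shells_per_atom), int(offsets[-1])))
    for i, (shells, c) in enumerate(zip(shells_per_atom, coeffs_per_atom)):
        src = 0
        seen = [0] * (lmax + 1)  # shells of each l already placed for this atom
        for l in shells:
            width = 2 * l + 1
            dst = int(offsets[l]) + seen[l] * width
            padded[i, dst:dst + width] = c[src:src + width]
            src += width
            seen[l] += 1
    return padded

H = [0, 0, 1, 1, 2, 2]
O = [0] * 10 + [1] * 8 + [2] * 4 + [3] * 2 + [4]
shells = [O, H, H]
coeffs = [np.ones(n_coeffs(s)) for s in shells]  # dummy coefficients
C_P_pad = pad_atoms(shells, coeffs)
```

In this layout the two s shells of H occupy the first two of the ten s columns, and its p shells start at column 10, so coefficients of the same l line up across atoms.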
- property wfnpath: None¶
- property wfnpath2: None¶
- property xyzfile: None¶
- dfa_recommender.df_class.get_molecule(xyzfile: str, charge: int, spin: int, sym: str = 'c1') Tuple[Molecule, list][source]¶
Assemble a molecule object from xyzfile, charge and spin.
- Parameters:
xyzfile (str) – path to the xyz file of the input molecule.
charge (int) – charge of the input molecule.
spin (int) – spin multiplicity (2*S + 1) of the input molecule.
sym (str, Optional, default: c1) – point group symmetry of the input molecule.
- Returns:
mol (psi4.geometry object) – psi4.geometry object for the input molecule
symbols (list) – list of atom symbols
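As a rough illustration of the inputs, the sketch below turns the text of an xyz file into a psi4-style geometry string plus a list of atom symbols. The function name `xyz_to_psi4_string` is hypothetical, and the sketch assumes a standard xyz layout (atom count, comment line, then one atom per line); how get_molecule actually parses the file is not shown in this documentation.

```python
def xyz_to_psi4_string(xyz_text, charge, spin, sym="c1"):
    """Build a geometry string and symbol list from xyz text (hypothetical helper)."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0])               # first line: number of atoms
    atom_lines = lines[2:2 + natoms]     # skip the comment line
    symbols = [ln.split()[0] for ln in atom_lines]
    body = "\n".join(ln.strip() for ln in atom_lines)
    # First line holds charge and spin multiplicity (2*S + 1).
    geom = f"{charge} {spin}\n{body}\nsymmetry {sym}"
    return geom, symbols

xyz = """3
water
O  0.000  0.000  0.117
H  0.000  0.757 -0.471
H  0.000 -0.757 -0.471
"""
geom, symbols = xyz_to_psi4_string(xyz, charge=0, spin=1)
# A string of this form could then be passed to psi4.geometry(geom).
```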
- dfa_recommender.df_utils.get_spectra(densfit: DensityFitting, fock: bool = False, H: bool = False, t: str = 'alpha') array[source]¶
Compute the final power spectrum for the DensityFitting object
- Parameters:
densfit (DensityFitting object) – created from .xyz, .wfn, and basis set
fock (bool, Optional, default: False) – Fock fitting or not
H (bool, Optional, default: False) – Hamiltonian (potential + kinetic) fitting or not
t (str, Optional, default: alpha) – alpha or beta spin orbitals
- Returns:
powerspec (np.ndarray) – power spectrum derived from density fitting coefficients
- dfa_recommender.df_utils.get_subtracted_spectra(densfit: DensityFitting, fock=False, t='alpha') array[source]¶
Compute the final power spectrum (w.r.t. a second .wfn file) for the DensityFitting object
- Parameters:
densfit (DensityFitting object) – created from .xyz, .wfn, and basis set
fock (bool, Optional, default: False) – Fock fitting or not
H (bool, Optional, default: False) – Hamiltonian (potential + kinetic) fitting or not
t (str, Optional, default: alpha) – alpha or beta spin orbitals
- Returns:
powerspec (np.ndarray) – power spectrum derived from density fitting coefficients
Behler-Parrinello type gated networks with density fitting features
Gated network for energy prediction.
- class dfa_recommender.net.ElementalGate(elements, n_out, onehot=True, trainable=False)[source]¶
Bases: Module
Element-based masking. Produces an Nbatch x Natoms x Nelem mask depending on the nuclear charges passed as an argument. The purpose is to create element-wise activation based on the block-wise weights in self.gate. If onehot is set, the mask is a one-hot mask; otherwise a random embedding is used. If the trainable flag is set to True, the gate values can be adapted during training.
It is recommended to create a mapping dictionary for your elements. For example: mapping = {"X": 0, "H": 1, "C": 2, "N": 3, "O": 4, "F": 5}
- forward(inputs: Tensor) Tensor[source]¶
Compute output.
- Parameters:
inputs (torch.Tensor) – model input as atomic numbers
- Returns:
outputs (torch.Tensor) – model output which is unity at the position of the element and zero otherwise.
- training: bool¶
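A one-hot element mask of the shape described above can be sketched with NumPy broadcasting. This is an illustration of the idea, not the library's torch implementation; the element set and the use of 0 as the padding atomic number are assumptions.

```python
import numpy as np

elements = [1, 6, 8]  # H, C, O (hypothetical element set)

def onehot_mask(z, elements):
    """Return a (Nbatch, Natoms, Nelem) mask; padded atoms (z = 0) give all zeros."""
    z = np.asarray(z)
    return (z[..., None] == np.asarray(elements)).astype(float)

# Two molecules padded to three atoms: H2O and CO plus one padding slot.
z = [[8, 1, 1], [6, 8, 0]]
mask = onehot_mask(z, elements)
```

Each atom row is unity at the column of its element and zero elsewhere, so multiplying per-element outputs by the mask keeps only the output of the matching network.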
- class dfa_recommender.net.GatedNetwork(nin: int, n_out: int, elements: list, n_hidden: int = 50, n_layers: int = 3, trainable: bool = False, onehot: bool = True, droprate: float = 0.2)[source]¶
Bases: Module
Behler-Parrinello type gated network that combines all the building blocks of this module.
- forward(inputs: Tensor, update_batch_stats: bool = True) Tensor[source]¶
Compute output.
- Parameters:
inputs (torch.Tensor,) – model inputs, [batch_size, max(natoms), :-1] are the molecule features, [batch_size, max(natoms), -1] encode the element type.
update_batch_stats (bool, Optional, default: True) – used only in batch normalization
- Returns:
outputs (torch.Tensor) – model outputs.
- training: bool¶
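The input layout and the gating idea can be sketched as follows. This assumes the classic Behler-Parrinello scheme in which gated atomic contributions are summed over atoms, and it uses one linear map per element in place of the per-element MLPs; the library additionally passes the gated outputs through finalMLP, which is not reproduced here. All names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
elements = [1, 8]   # hypothetical element set: H, O
n_feat = 4

# inputs[..., :-1] are the features, inputs[..., -1] encodes the element type.
feats = rng.normal(size=(1, 3, n_feat))
z = np.array([[8.0, 1.0, 1.0]])
inputs = np.concatenate([feats, z[..., None]], axis=-1)

# One linear "network" per element stands in for the per-element MLPs.
weights = {el: rng.normal(size=n_feat) for el in elements}

def gated_energy(inputs):
    """Sum over atoms of the output of each atom's own element network."""
    x, z = inputs[..., :-1], inputs[..., -1]
    # Apply every element network to every atom: (Nbatch, Natoms, Nelem).
    per_element = np.stack([x @ weights[el] for el in elements], axis=-1)
    # One-hot gate selects the network matching each atom's element.
    gate = (z[..., None] == np.array(elements)).astype(float)
    atomic = (per_element * gate).sum(axis=-1)
    return atomic.sum(axis=-1)

energy = gated_energy(inputs)
energy_permuted = gated_energy(inputs[:, ::-1])
```

Because identical elements share weights and the atomic terms are summed, reordering the atoms leaves the result unchanged.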
- class dfa_recommender.net.MLP(n_in: int, n_out: int, n_hidden: int = 50, n_layers: int = 3, droprate: float = 0.2)[source]¶
Bases: Module
Multilayer fully connected neural network. Each element type has its own MLP, and atoms of the same element share the same MLP (i.e., weight sharing).
- forward(inputs: Tensor) Tensor[source]¶
Compute output.
- Parameters:
inputs (torch.Tensor) – model input.
- Returns:
outputs (torch.Tensor) – model output.
- training: bool¶
- class dfa_recommender.net.MySoftplus(beta: int = 1, threshold: int = 20)[source]¶
Bases: Module
Shifted Softplus such that MySoftplus(0) = 0.
- beta: int¶
- extra_repr() str[source]¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- forward(input: Tensor) Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- threshold: int¶
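A scalar sketch of a shifted softplus is given below. It assumes the shift is log(2)/beta, which is what makes the function vanish at zero, and that beta and threshold behave as in torch.nn.Softplus (linear regime once beta * x exceeds threshold). The function name is hypothetical.

```python
import math

def shifted_softplus(x, beta=1, threshold=20):
    """Softplus minus its value at zero, so that f(0) = 0 (assumed form)."""
    if beta * x > threshold:
        sp = x  # linear regime for numerical stability
    else:
        sp = math.log1p(math.exp(beta * x)) / beta
    return sp - math.log(2.0) / beta

value_at_zero = shifted_softplus(0.0)
value_large = shifted_softplus(100.0)
```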
- class dfa_recommender.net.TiledMultiLayerNN(n_in: int, n_out: int, n_tiles: int, n_hidden: int = 50, n_layers: int = 3, droprate: float = 0.2)[source]¶
Bases: Module
Tiled multilayer networks: a list of MLPs. These MLPs are applied to the input, after which their outputs are concatenated. The purpose is to create element-wise predictions. Note that n_tiles should be the same as the number of element types in your data set.
- forward(inputs: Tensor) Tensor[source]¶
Compute output.
- Parameters:
inputs (torch.Tensor,) – model input.
- Returns:
outputs (list) – model output as list of torch.Tensor
- training: bool¶
- dfa_recommender.net.call_bn(bn: BatchNorm1d, x: Tensor, update_batch_stats: bool = True) None[source]¶
Call for batch normalization
- class dfa_recommender.net.finalMLP(elements, n_out, droprate=0.2)[source]¶
Bases: Module
The final fully connected neural network that maps the outputs from ElementalGate to the final outputs.
- forward(inputs: Tensor, update_batch_stats: bool = True) Tensor[source]¶
Compute output.
- Parameters:
inputs (torch.Tensor,) – model inputs
update_batch_stats (bool, Optional, default: True) – used only in batch normalization
- Returns:
outputs (torch.Tensor) – model outputs.
- training: bool¶
Virtual adversarial training
Virtual adversarial training
- class dfa_recommender.vat.VAT(device, eps, xi, alpha, k=1, use_entmin=False)[source]¶
Bases: object
Implementation of virtual adversarial training. See https://arxiv.org/abs/1704.03976 for more details.
- dfa_recommender.vat.df_l2_normalize(d, l_x, cut=True)[source]¶
Normalize d with a zero masking.
- Parameters:
d (torch.Tensor) – random perturbation in the input space
l_x (torch.Tensor) – a tensor based on which the mask is created
cut (bool, default: True) – whether to apply the mask
- Returns:
dn – normalized random perturbation in the input space
- Return type:
torch.Tensor
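The masked normalization can be sketched in NumPy. The sketch assumes the mask marks the nonzero entries of l_x (so zero-padded features receive no perturbation) and that the L2 norm is taken per sample; the function name and the eps guard are hypothetical, and the library version operates on torch tensors.

```python
import numpy as np

def l2_normalize_masked(d, l_x, cut=True, eps=1e-8):
    """Zero out d where l_x is zero, then scale each sample to unit L2 norm."""
    d = np.asarray(d, dtype=float)
    if cut:
        d = d * (np.asarray(l_x) != 0)
    # Per-sample norm over all remaining dimensions, shaped for broadcasting.
    flat = d.reshape(d.shape[0], -1)
    norm = np.linalg.norm(flat, axis=1).reshape((-1,) + (1,) * (d.ndim - 1))
    return d / (norm + eps)

# Two samples, two atoms, three features; the second atom of sample 0 is padding.
d = np.ones((2, 2, 3))
l_x = np.array([[[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]],
                [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]])
dn = l2_normalize_masked(d, l_x)
```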
PyTorch utility functions
- class dfa_recommender.dataset.SubsetDataset(dataset, indices)[source]¶
Bases: Dataset
Subset of a torch.utils.data.Dataset object.
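The subsetting idea amounts to index remapping, as in the minimal sketch below. The class `SubsetSketch` is hypothetical and deliberately does not inherit from torch.utils.data.Dataset so that it runs without torch; any object supporting integer indexing can play the role of the dataset.

```python
class SubsetSketch:
    """View of a dataset restricted to the given indices (hypothetical class)."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __getitem__(self, idx):
        # Translate the subset position into an index of the full dataset.
        return self.dataset[self.indices[idx]]

    def __len__(self):
        return len(self.indices)

data = ["a", "b", "c", "d"]
subset = SubsetSketch(data, [3, 1])
```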
- dfa_recommender.evaluate.evaluate_regressor(regressor, loader, device, y_scaler)[source]¶
Evaluate the model performance on a single regression task
- Parameters:
regressor (torch.nn.Module) – trained regression model
loader (torch.utils.data.DataLoader) – your torch dataloader
device (torch.device) – the device at which this evaluation is performed
y_scaler (sklearn.preprocessing.StandardScaler) – the scaler used to normalize the labels of the training data
- Returns:
mae (float) – MAE
scaled_mae (float) – scaled MAE
rval (float) – Pearson’s correlation coefficient
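The three returned metrics can be sketched in NumPy. The sketch assumes that scaled MAE is measured in standardized label units and MAE in the original units after undoing a StandardScaler-style transform (y = y_scaled * std + mean); the function name and the numbers are hypothetical.

```python
import numpy as np

def regression_metrics(y_true_scaled, y_pred_scaled, mean, std):
    """Return (mae, scaled_mae, rval) for standardized labels and predictions."""
    y_true_scaled = np.asarray(y_true_scaled, dtype=float)
    y_pred_scaled = np.asarray(y_pred_scaled, dtype=float)
    scaled_mae = np.mean(np.abs(y_pred_scaled - y_true_scaled))
    # Undo the standardization to report the error in original units.
    y_true = y_true_scaled * std + mean
    y_pred = y_pred_scaled * std + mean
    mae = np.mean(np.abs(y_pred - y_true))
    rval = np.corrcoef(y_true, y_pred)[0, 1]
    return mae, scaled_mae, rval

mae, scaled_mae, rval = regression_metrics(
    [-1.0, 0.0, 1.0], [-0.5, 0.0, 0.5], mean=10.0, std=2.0
)
```

Under this assumption the two errors differ exactly by the factor std, while the Pearson coefficient is unaffected by the rescaling.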
Utility functions for preparing datasets, model training and evaluation.
- dfa_recommender.ml_utils.numpy_to_dataset(X, y, regression=False)[source]¶
Assemble numpy arrays into a torch tensor data set
- Parameters:
X (np.array) – features
y (np.array) – targets
regression (bool, default: False) – whether this is a regression task
- Returns:
data – assembled data set
- Return type:
torch.utils.data.TensorDataset