fast.msm_gen

fast.msm_gen.clustering module

class fast.msm_gen.clustering.ClusterWrap(base_struct, base_clust_obj=None, atom_indices=None, build_full=True, n_procs=1, mem_efficient=False)

Bases: base

Clustering wrapper class

Parameters
  • base_struct (str or md.Trajectory) – A structure with the same topology as the trajectories to load.

  • base_clust_obj (enspara.msm.MSM, default=None) – The enspara object to use for clustering.

  • atom_indices (str or list, default=None) – The atom indices of base_struct to use for clustering.

  • build_full (bool, default=True) – Flag for building from scratch.

  • n_procs (int, default=1) – The number of processes to use when loading, clustering, and saving conformations.

  • mem_efficient (bool, default=False) – Optionally save memory by not loading all of the atoms of the trajectories. If set to True, full cluster centers should be saved with save_states.

check_clustering(msm_dir, gen_num, n_kids, verbose=True)
property class_name
property config
run()
set_filenames(msm_dir)
fast.msm_gen.clustering.load_trjs(trj_filenames, n_procs=1, **kwargs)

Parallelize loading trajectories from msm directory.

fast.msm_gen.save_states module

class fast.msm_gen.save_states.SaveWrap(save_routine='full', centers='auto', gen_num=0, largest_center=inf, save_xtc_centers=False, n_procs=1)

Bases: base

Save states wrapping object

Parameters
  • save_routine (str, default='full') – The type of states to save. Three options: 1) 'masses' saves only in centers_masses, 2) 'restarts' saves only the restarts, and 3) 'full' saves both.

  • centers (str, default='auto') – The set of centers to save. Four options: 1) 'all' saves every center, 2) 'none' saves no centers, 3) 'restarts' saves only the centers used for restarting simulations, and 4) 'auto' saves only the new states discovered in the previous round of sampling.

  • gen_num (int, default=0) – The generation number of adaptive sampling. Only used if centers is set to 'restarts'.

  • largest_center (float, default=np.inf) – The largest expected distance to a cluster center. Can be used to speed up the search for cluster centers. A reasonable value is the distance cutoff used for clustering.

  • save_xtc_centers (bool, default=False) – Optionally save centers as an xtc in data.

  • n_procs (int, default=1) – The number of processes to use when saving states.
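
The three save_routine options can be summarised as a small lookup. This is an illustrative sketch only: outputs_for_routine is a hypothetical helper, not part of fast, and the strings 'masses' and 'restarts' are used here purely as labels for the two kinds of output described above.

```python
def outputs_for_routine(save_routine="full"):
    """Map a save_routine value to the kinds of output it covers.

    Hypothetical helper for illustration; not part of the fast package.
    """
    routines = {
        "masses": ("masses",),              # processed centers only
        "restarts": ("restarts",),          # restart structures only
        "full": ("masses", "restarts"),     # both
    }
    if save_routine not in routines:
        raise ValueError(
            f"save_routine must be one of {sorted(routines)}, got {save_routine!r}"
        )
    return routines[save_routine]


print(outputs_for_routine("full"))
```

Rejecting unknown values early is a choice made for this sketch; how the package itself handles an unrecognised save_routine is not stated here.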

check_save_states(msm_dir)
property class_name
property config
run(msm_dir='.')
fast.msm_gen.save_states.chunks(lst, n)

Yield successive n-sized chunks from lst.
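
A generator with the documented behaviour can be sketched as follows. This is a minimal stand-in written for illustration, not necessarily how the package implements it; the check on n is an addition made for this sketch.

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    if n < 1:
        # Guard added for this sketch; a step of 0 would make range() fail.
        raise ValueError("n must be a positive integer")
    for start in range(0, len(lst), n):
        # The final chunk may be shorter than n.
        yield lst[start:start + n]


state_nums = [0, 1, 2, 3, 4, 5, 6]
batches = list(chunks(state_nums, 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Splitting a list of state numbers this way is one plausible use, for example to hand each batch to a separate process.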

fast.msm_gen.save_states.save_states(assignments, distances, state_nums=None, save_routine='full', largest_center=inf, n_confs=1, n_procs=1, msm_dir='.')

Saves the specified state numbers by searching through the assignments and distances and pulling single frames from trajectories. This is a specially tailored helper function with a hard-coded directory structure. A largest distance to a cluster center can be specified to save computational time when searching for minimum distances. If multiple conformations are saved, the center is saved as conf-0 and the additional conformations are sampled from the cluster.
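
The center search described above can be sketched in plain Python. This is an illustration under stated assumptions, not the package's implementation: find_centers is a hypothetical name, nested lists stand in for the assignments and distances arrays, and the center of a state is taken to be the frame with the smallest distance among frames assigned to it.

```python
import math


def find_centers(assignments, distances, state_nums=None, largest_center=math.inf):
    """Return {state: (traj_index, frame_index)} of the closest frame per state.

    Hypothetical helper for illustration; not part of the fast package.
    """
    wanted = None if state_nums is None else set(state_nums)
    best = {}  # state -> (distance, traj_index, frame_index)
    for traj_i, (traj_assign, traj_dist) in enumerate(zip(assignments, distances)):
        for frame_i, (state, dist) in enumerate(zip(traj_assign, traj_dist)):
            if wanted is not None and state not in wanted:
                continue
            if dist > largest_center:
                continue  # the cutoff skips frames too far away to be a center
            if state not in best or dist < best[state][0]:
                best[state] = (dist, traj_i, frame_i)
    return {state: (traj_i, frame_i) for state, (_, traj_i, frame_i) in best.items()}


assignments = [[0, 0, 1], [1, 2, 2]]
distances = [[0.3, 0.0, 0.2], [0.0, 0.4, 0.1]]
centers = find_centers(assignments, distances, largest_center=0.25)
print(centers)  # {0: (0, 1), 1: (1, 0), 2: (1, 2)}
```

In this sketch, a cutoff smaller than a state's minimum distance leaves that state without an entry, which is why a value near the clustering distance cutoff is a sensible choice.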

Inputs

  • assignments (array, shape=(n_trajectories, n_frames)) – Assigned cluster for each frame in each trajectory.

  • distances (array, shape=(n_trajectories, n_frames)) – The distance to the cluster center for each frame in each trajectory.

  • state_nums (array, shape=(n_states, ), default=None) – The specific state numbers to save. If None, every state is saved.

  • save_routine (str, default='full') – The routine for saving states, either 'full', 'masses', or 'restarts'. 'masses' saves only the processed cluster centers, 'restarts' saves only the full system centers, and 'full' saves both.

  • largest_center (float, default=np.inf) – The largest expected distance from any frame to a cluster center. Specifying a small number can save computational time. Defaults to np.inf, which considers every frame when finding cluster centers.

  • n_confs (int, default=1) – The number of representative conformations to save for each cluster. The first conformation is always the cluster center, and subsequent conformations are sampled randomly from the frames assigned to that cluster.

  • n_procs (int, default=1) – The number of processes to use when saving states.

  • msm_dir (str, default='.') – Location of the msm directory containing trajectories and folders for saving states.
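
The n_confs behaviour (center first, then random members of the cluster) can be sketched as below. The helper pick_conformations and its seed argument are hypothetical, and sampling without replacement while excluding the center is an assumption of this sketch; the description above only says the extra conformations are sampled randomly.

```python
import random


def pick_conformations(frames, center, n_confs=1, seed=None):
    """Return the center followed by up to n_confs - 1 random cluster frames.

    Hypothetical helper for illustration; not part of the fast package.
    """
    rng = random.Random(seed)
    # Assumption: extra conformations exclude the center and do not repeat.
    others = [frame for frame in frames if frame != center]
    n_extra = min(max(n_confs - 1, 0), len(others))
    return [center] + rng.sample(others, n_extra)


# (traj_index, frame_index) pairs for the frames assigned to one cluster
cluster_frames = [(0, 4), (0, 5), (1, 2), (2, 7)]
confs = pick_conformations(cluster_frames, center=(1, 2), n_confs=3, seed=0)
print(confs[0])  # (1, 2), the center, which would be saved as conf-0
```

If a cluster has fewer frames than n_confs, this sketch returns as many conformations as are available.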