fast.sampling
fast.sampling.core module
- class fast.sampling.core.AdaptiveSampling(initial_state, n_gens=1, n_kids=1, sim_obj=None, cluster_obj=None, save_state_obj=None, msm_obj=None, analysis_obj=None, ranking_obj=None, spreading_func=None, update_freq=inf, continue_prev=False, sub_obj=None, q_check_obj=None, q_check_obj_sim=None, output_dir='adaptive_sampling', verbose=True)
Bases: base
Performs adaptive sampling.
- Parameters
initial_state (str or MDTraj object) – The starting structure for adaptive sampling.
n_gens (int, default=1) – The number of generations of sampling to perform.
n_kids (int, default=1) – The number of simulations per generation of adaptive sampling.
sim_obj (object, default=None) – An object that can run simulations. Currently supported within this package are Gromacs and Upside wrappers.
cluster_obj (object, default=None) – A cluster wrapper that dictates how simulations are clustered.
save_state_obj (object, default=None) – An optional object that dictates how states are saved.
msm_obj (enspara.msm.MSM object) – An enspara MSM object. This is used to fit assignments at each generation of sampling.
analysis_obj (object, default=None) – Type of analysis to perform on each cluster center. Can be used in state rankings.
ranking_obj (rankings object) – This is an object with at least two functions: __init__(**args) and select_states(msm, n_clones). The output of this object is a list of states to simulate.
spreading_func (func, default=None) – Optionally spread state selection by minimizing a similarity penalty, calculated using the provided metric for state distances, e.g. md.rmsd.
update_freq (int, default=np.inf) – The number of generations between a full reclustering of states and analysis of cluster centers. Defaults to never reclustering (continually adds new cluster centers without changing previously discovered centers).
continue_prev (bool, default=False) – Flag to indicate if sampling is continuing from a previous run. Avoids accidentally overwriting a previous run of sampling.
sub_obj (object, default=None) – A submission object that handles submitting clustering, MSM, analysis, and save_state routines. Wrappers are available for Slurm queueing systems as well as local machines (subprocess calls).
q_check_obj (object, default=None) – An object that handles checking the queueing system for jobs that are still running.
q_check_obj_sim (object, default=None) – An object that handles checking the queueing system for jobs that are still running.
output_dir (str, default='adaptive_sampling') – The output directory name for the adaptive sampling run.
- property class_name
- property config
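The ranking_obj contract described above (an __init__(**args) plus a select_states(msm, n_clones) that returns a list of states to simulate) can be illustrated with a minimal stand-in. Everything in this sketch is hypothetical: LeastCountsRanking is not part of the package, and FakeMSM with its tcounts_ attribute is only a stub standing in for a fitted MSM object. The package's own ranking classes may expect more than this.

```python
class FakeMSM:
    """Hypothetical stub standing in for a fitted MSM with a count matrix."""

    def __init__(self, tcounts):
        self.tcounts_ = tcounts  # assumed attribute name for the count matrix


class LeastCountsRanking:
    """Hypothetical ranking object that picks the least-sampled states."""

    def __init__(self, **kwargs):
        # Accept arbitrary keyword options, mirroring __init__(**args).
        self.options = kwargs

    def select_states(self, msm, n_clones):
        # Total observed transitions out of each state (row sums).
        counts = [sum(row) for row in msm.tcounts_]
        # State indices sorted by ascending count; ties keep index order.
        order = sorted(range(len(counts)), key=lambda i: counts[i])
        return order[:n_clones]


msm = FakeMSM([[5, 1, 0], [1, 2, 1], [0, 1, 0]])
selected = LeastCountsRanking().select_states(msm, n_clones=2)
print(selected)
```

The only point here is the shape of the interface: any object exposing select_states(msm, n_clones) and returning state indices fits the description given for ranking_obj.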
fast.sampling.rankings module
- class fast.sampling.rankings.FAST(state_rankings=None, directed_scaling=feature_scale(maximize=True), statistical_component=counts(maximize_ranking=False), statistical_scaling=feature_scale(maximize=False), alpha=1, alpha_percent=False, maximize_ranking=True, **kwargs)
Bases: base_ranking
FAST ranking object.
- property class_name
- property config
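The default arguments hint at how a FAST ranking is put together: a directed component scaled so that larger values are favored, a counts-based statistical component scaled so that smaller values are favored, and alpha weighting between the two. The following is a minimal sketch of that combination, assuming min-max feature scaling and a simple weighted sum. The helpers feature_scale_sketch and fast_scores are hypothetical and are not the package's implementation.

```python
def feature_scale_sketch(values, maximize=True):
    """Min-max scale to [0, 1]; invert when small values should score high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # no spread, so no preference
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if maximize else [1.0 - s for s in scaled]


def fast_scores(directed, counts, alpha=1.0):
    """Assumed combination: scaled directed term + alpha * scaled counts term."""
    directed_part = feature_scale_sketch(directed, maximize=True)
    statistical_part = feature_scale_sketch(counts, maximize=False)
    return [d + alpha * s for d, s in zip(directed_part, statistical_part)]


directed = [0.2, 0.8, 0.5]   # hypothetical per-state directed values
counts = [100, 10, 55]       # hypothetical per-state observation counts
scores = fast_scores(directed, counts, alpha=1.0)
best = max(range(len(scores)), key=lambda i: scores[i])
print(best, scores)
```

In this toy input the second state has both the largest directed value and the fewest counts, so it receives the highest combined score.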
- class fast.sampling.rankings.base_ranking(maximize_ranking=True, state_centers=None, distance_metric=None, width=1.0)
Bases: base
Base ranking class. Separates the selection of states from the independent rankings.
- class fast.sampling.rankings.counts(maximize_ranking=False, scaling=None, **kwargs)
Bases: base_ranking
Min-counts ranking object. Ranks states based on their raw counts.
- property class_name
- property config
- class fast.sampling.rankings.evens
Bases: base
Evens ranking object.
- property class_name
- property config
- fast.sampling.rankings.generate_aij(tcounts, spreading=False)
Generates the adjacency matrix used for page ranking.
- Parameters
tcounts (matrix, shape=(n_states, n_states)) – The count matrix of an MSM. Can be dense or sparse.
spreading (bool, default=False) – Optionally transposes matrix to do counts spreading instead of page rank.
- Returns
aij – The adjacency matrix used for page ranking.
- Return type
matrix, shape=(n_states, n_states)
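One way to picture this step, offered as a sketch and not as the package's actual code: take the count matrix, optionally transpose it, and normalize each column so that it sums to one. The column normalization and the helper name adjacency_sketch are assumptions made for illustration. Only the optional transpose comes from the description above, and this sketch handles dense list-of-lists input only.

```python
def adjacency_sketch(tcounts, spreading=False):
    """Hypothetical helper: column-normalized (optionally transposed) counts."""
    n = len(tcounts)
    if spreading:
        # Transpose, mirroring the `spreading` switch described above.
        m = [[tcounts[j][i] for j in range(n)] for i in range(n)]
    else:
        m = [list(row) for row in tcounts]
    # Column sums; a column with no counts is left as all zeros.
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [
        [m[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


tcounts = [[2, 1], [2, 3]]
aij = adjacency_sketch(tcounts)
aij_spread = adjacency_sketch(tcounts, spreading=True)
print(aij)
print(aij_spread)
```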
- fast.sampling.rankings.get_unique_states(msm)
Returns a list of the visited states within an MSM object.
- class fast.sampling.rankings.page_ranking(d, init_pops=True, max_iters=100000, norm=True, spreading=False, maximize_ranking=True, **kwargs)
Bases: base_ranking
Page ranking. ri = (1-d)*init_ranks + d*aij
- property class_name
- property config
- fast.sampling.rankings.rank_aij(aij, d=0.85, Pi=None, max_iters=100000, norm=True)
Ranks the adjacency matrix.
- Parameters
aij (matrix) – The adjacency matrix used for ranking.
d (float) – The weight given to the page ranks, in the range [0, 1]. A value of 1 is pure page rank and a value of 0 uses only the initial ranks.
Pi (array, default=None) – The prior ranks.
max_iters (int, default=100000) – The maximum number of iterations to check for convergence.
norm (bool, default=True) – Normalizes the output ranks.
- Return type
The rankings of each state
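The role of d is easiest to see in a damped power iteration. The sketch below assumes a column-normalized aij and the update r = (1 - d) * Pi + d * (aij · r), repeated until the ranks stop changing. The name rank_sketch, the convergence tolerance, and the uniform default prior are illustrative assumptions, and the package's implementation may differ.

```python
def rank_sketch(aij, d=0.85, Pi=None, max_iters=100000, norm=True, tol=1e-12):
    """Hypothetical damped power iteration over a column-normalized matrix."""
    n = len(aij)
    # Assumed default: a uniform prior when none is supplied.
    prior = [1.0 / n] * n if Pi is None else list(Pi)
    ranks = list(prior)
    for _ in range(max_iters):
        new = [
            (1 - d) * prior[i] + d * sum(aij[i][j] * ranks[j] for j in range(n))
            for i in range(n)
        ]
        converged = max(abs(a - b) for a, b in zip(new, ranks)) < tol
        ranks = new
        if converged:
            break
    if norm:
        total = sum(ranks)
        ranks = [r / total for r in ranks]
    return ranks


aij = [[0.5, 0.25], [0.5, 0.75]]
ranks = rank_sketch(aij)                              # default damping
only_prior = rank_sketch(aij, d=0.0, Pi=[0.9, 0.1])   # d=0 returns the prior
print(ranks, only_prior)
```

With d=0 the matrix term vanishes and the prior is returned unchanged, which matches the description of d above. As d approaches 1 the result approaches the stationary vector of aij.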
- class fast.sampling.rankings.string(start_states, end_states, statistical_component=None, maximize_ranking=False, **kwargs)
Bases: base_ranking
Uses the string method with MSMs to relax a pathway. Samples from states along the highest-flux pathway between the start states and end states. Uses the statistical component to rank the states on this pathway.
- Parameters
start_states (int or array-like, shape = (n_start_states, )) – The starting states for defining the pathway.
end_states (int or array-like, shape = (n_end_states, )) – The ending states for defining the pathway.
statistical_component (ranking function) – A ranking class object to rank the pathway states. If none is selected, evens is used.
maximize_ranking (bool, default=False) – Optionally maximize the ranking. This will favor states with high statistical components, i.e. favor states with high counts (unlikely to be desirable).
- property class_name
- property config