fast.sampling

fast.sampling.core module

class fast.sampling.core.AdaptiveSampling(initial_state, n_gens=1, n_kids=1, sim_obj=None, cluster_obj=None, save_state_obj=None, msm_obj=None, analysis_obj=None, ranking_obj=None, spreading_func=None, update_freq=inf, continue_prev=False, sub_obj=None, q_check_obj=None, q_check_obj_sim=None, output_dir='adaptive_sampling', verbose=True)[source]

Bases: base

Performs adaptive sampling

Parameters
  • initial_state (str or MDTraj object) – The starting structure for adaptive sampling.

  • n_gens (int, default=1) – The number of generations of sampling to perform.

  • n_kids (int, default=1) – The number of simulations per generation of adaptive sampling.

  • sim_obj (object, default=None) – An object that can run simulations. This package currently provides wrappers for Gromacs and Upside.

  • cluster_obj (object, default=None) – A cluster wrapper that dictates how simulations are clustered.

  • save_state_obj (object, default=None) – An optional object that dictates how states are saved.

  • msm_obj (enspara.msm.MSM object) – An enspara MSM object. This is used to fit assignments at each generation of sampling.

  • analysis_obj (object, default=None) – The type of analysis to perform on each cluster center. Can be used in state rankings.

  • ranking_obj (rankings object) – An object with at least two methods: __init__(**args) and select_states(msm, n_clones). select_states returns the list of states to simulate.

  • spreading_func (func, default=None) – Optionally spreads state selection by minimizing a similarity penalty, calculated with the provided state-distance metric, e.g. md.rmsd.

  • update_freq (int, default=np.inf) – The number of generations between a full reclustering of states and analysis of cluster centers. Defaults to never reclustering (continually adds new cluster centers without changing previously discovered centers).

  • continue_prev (bool, default=False) – Flag to indicate if sampling is continuing from a previous run. Avoids accidentally overwriting a previous run of sampling.

  • sub_obj (object, default=None) – A submission object that handles submitting clustering, MSM, analysis, and save_state routines. Wrappers are available for Slurm queueing systems as well as local machines (subprocess calls).

  • q_check_obj (object, default=None) – An object that handles checking the queueing system for jobs that are still running.

  • q_check_obj_sim (object, default=None) – An object that handles checking the queueing system for jobs that are still running.

  • output_dir (str, default='adaptive_sampling') – The output directory name for the adaptive sampling run.
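
The ranking_obj interface described above (an __init__ that accepts keyword arguments and a select_states(msm, n_clones) method that returns a list of states) can be illustrated with a minimal sketch. The class name LeastSampledRanking and the toy MSM stand-in are hypothetical, and the sketch assumes the MSM exposes a dense count matrix as tcounts_. It is not one of the ranking classes shipped in fast.sampling.rankings.

```python
import numpy as np
from types import SimpleNamespace


class LeastSampledRanking:
    """Hypothetical ranking object: favors states with the fewest counts."""

    def __init__(self, **kwargs):
        # Accept arbitrary keyword options, as the interface requires.
        self.options = kwargs

    def select_states(self, msm, n_clones):
        # Assumption: the MSM exposes a dense (n_states, n_states)
        # transition count matrix as `tcounts_`.
        counts = np.asarray(msm.tcounts_).sum(axis=1)
        # Stable sort so ties resolve to the lowest state index.
        order = np.argsort(counts, kind="stable")
        return order[:n_clones].tolist()


# Stand-in for a fitted MSM (not an enspara object).
toy_msm = SimpleNamespace(
    tcounts_=np.array([[5, 1, 0],
                       [1, 2, 1],
                       [0, 1, 9]])
)
selected = LeastSampledRanking().select_states(toy_msm, n_clones=2)
print(selected)
```

Any object with this shape could in principle be passed as ranking_obj; whether AdaptiveSampling requires further attributes (for example config or class_name) is not stated here.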

property class_name
property config
print_parameters()[source]
run()[source]
fast.sampling.core.push_forward(s, num=0)[source]

fast.sampling.rankings module

class fast.sampling.rankings.FAST(state_rankings=None, directed_scaling=feature_scale(maximize=True), statistical_component=counts(maximize_ranking=False), statistical_scaling=feature_scale(maximize=False), alpha=1, alpha_percent=False, maximize_ranking=True, **kwargs)[source]

Bases: base_ranking

FAST ranking object

property class_name
property config
rank(msm, unique_states=None)[source]
class fast.sampling.rankings.base_ranking(maximize_ranking=True, state_centers=None, distance_metric=None, width=1.0)[source]

Bases: base

Base ranking class. Separates the selection of states from the independent rankings.

select_states(msm, n_clones)[source]
class fast.sampling.rankings.counts(maximize_ranking=False, scaling=None, **kwargs)[source]

Bases: base_ranking

Min-counts ranking object. Ranks states based on their raw counts.

property class_name
property config
rank(msm, unique_states=None)[source]
class fast.sampling.rankings.evens[source]

Bases: base

Evens ranking object

property class_name
property config
rank(msm, unique_states=None)[source]
select_states(msm, n_clones)[source]
fast.sampling.rankings.generate_aij(tcounts, spreading=False)[source]

Generates the adjacency matrix used for page ranking.

Parameters
  • tcounts (matrix, shape=(n_states, n_states)) – The count matrix of an MSM. Can be dense or sparse.

  • spreading (bool, default=False) – Optionally transposes matrix to do counts spreading instead of page rank.

Returns

aij – The adjacency matrix used for page ranking.

Return type

matrix, shape=(n_states, n_states)

fast.sampling.rankings.get_unique_states(msm)[source]

Returns a list of the visited states within an MSM object.

class fast.sampling.rankings.page_ranking(d, init_pops=True, max_iters=100000, norm=True, spreading=False, maximize_ranking=True, **kwargs)[source]

Bases: base_ranking

Page ranking: ri = (1 - d) * init_ranks + d * aij

property class_name
property config
rank(msm, unique_states=None)[source]
fast.sampling.rankings.rank_aij(aij, d=0.85, Pi=None, max_iters=100000, norm=True)[source]

Ranks the adjacency matrix.

Parameters
  • aij (matrix) – The adjacency matrix used for ranking.

  • d (float) – The weight of the page ranks, in the range [0, 1]. A value of 1 is pure page rank and a value of 0 returns only the initial ranks.

  • Pi (array, default=None) – The prior ranks.

  • max_iters (int, default=100000) – The maximum number of iterations to check for convergence.

  • norm (bool, default=True) – Normalizes the output ranks.

Returns

The rankings of each state.
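
The role of d, the prior ranks, max_iters, and norm can be seen in a minimal sketch of a damped ranking iteration. This is an illustration of the general page-rank update, not the library's implementation: the function name damped_rank, the tol convergence threshold, the uniform default prior, and the column-stochastic convention for aij are all assumptions made for this example.

```python
import numpy as np


def damped_rank(aij, d=0.85, prior=None, max_iters=100000, tol=1e-12, norm=True):
    """Iterate r <- (1 - d) * prior + d * (aij @ r) until r stops changing."""
    aij = np.asarray(aij, dtype=float)
    n_states = aij.shape[0]
    if prior is None:
        # Assumption: with no prior supplied, start from uniform ranks.
        prior = np.full(n_states, 1.0 / n_states)
    prior = np.asarray(prior, dtype=float)
    ranks = prior.copy()
    for _ in range(max_iters):
        new_ranks = (1.0 - d) * prior + d * (aij @ ranks)
        converged = np.abs(new_ranks - ranks).sum() < tol
        ranks = new_ranks
        if converged:
            break
    if norm:
        # Rescale so the ranks sum to one.
        ranks = ranks / ranks.sum()
    return ranks


# Toy column-stochastic matrix: column j holds the weights leaving state j.
toy_aij = np.array([[0.0, 0.5, 1.0],
                    [0.5, 0.0, 0.0],
                    [0.5, 0.5, 0.0]])
ranks = damped_rank(toy_aij)
print(ranks)
```

With d=0 the update returns the prior unchanged, and with d close to 1 the result is dominated by the structure of aij, which matches the description of d above.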

class fast.sampling.rankings.string(start_states, end_states, statistical_component=None, maximize_ranking=False, **kwargs)[source]

Bases: base_ranking

Uses the string method with MSMs to relax a pathway. Samples from states along the highest flux pathway between start-states and end-states. Uses the statistical component to rank the states on this pathway.

Parameters
  • start_states (int or array-like, shape = (n_start_states, )) – The starting states for defining the pathway.

  • end_states (int or array-like, shape = (n_end_states, )) – The ending states for defining the pathway.

  • statistical_component (ranking function) – A ranking class object to rank the pathway states. If none is selected, evens is used.

  • maximize_ranking (bool, default=False) – Optionally maximize the ranking. This will favor states with high statistical components, i.e. favor states with high counts (unlikely to be desirable).

property class_name
property config
rank(msm, unique_states=None)[source]

fast.sampling.scalings module

class fast.sampling.scalings.feature_scale(maximize=True)[source]

Bases: base

Feature scales data: (x - xmin) / (xmax - xmin)
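
The formula above can be worked through in a short sketch. The function name feature_scale_sketch is hypothetical, and two behaviors are assumptions made for illustration rather than documented facts: that maximize=False flips the scale so the smallest input maps to 1, and that constant input is mapped to zeros to avoid dividing by zero.

```python
import numpy as np


def feature_scale_sketch(values, maximize=True):
    """Apply (x - xmin) / (xmax - xmin) to a 1D sequence of values."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # Assumption: identical values carry no ranking information.
        return np.zeros_like(values)
    scaled = (values - values.min()) / span
    # Assumption: maximize=False inverts the scale (smallest input -> 1).
    return scaled if maximize else 1.0 - scaled


scaled_up = feature_scale_sketch([2.0, 4.0, 6.0])
scaled_down = feature_scale_sketch([2.0, 4.0, 6.0], maximize=False)
print(scaled_up, scaled_down)
```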

property class_name
property config
scale(values)[source]
class fast.sampling.scalings.sigmoid_scale(maximize=True, a=3)[source]

Bases: object

Scales values with a sigmoid

scale(values)[source]