swamp.search.searchtarget module

class SearchTarget(workdir, conpred, sspred, conformat='psicov', nthreads=1, template_subset=None, logger=None, target_pdb_benchmark=None, alignment_algorithm_name='mapalign', n_contacts_threshold=28, python_interpreter='/empty/path/bin/ccp4-python', platform='sge', queue_name=None, queue_environment=None)

Bases: object

Class to search the SWAMP library and rank search models according to their CMO with a given target.

Using CMO alignment tools, determine the best fragments in the library to be used as search models for a given target. First, the target is split into several subtargets (one for each helical pair with enough interhelical contact information). Then a SearchJob instance is executed for each of these subtargets.

Parameters:
  • workdir (str) – working directory for the SearchJob instances
  • conpred (str) – contact prediction file of the target
  • sspred (str) – secondary structure prediction file of the target (must be in TOPCONS format)
  • conformat (str) – format of the contact prediction file provided for the target (default: ‘psicov’)
  • nthreads (int) – number of parallel threads to use in the library search (default: 1)
  • template_subset (tuple) – set of templates to be used instead of the full fragment library (default: None)
  • target_pdb_benchmark (str) – provide a target’s pdb file for benchmark purposes (default: None)
  • alignment_algorithm_name (str) – algorithm used for CMO calculation (default: ‘mapalign’)
  • logger (SwampLogger) – logging interface for the search (default None)
  • n_contacts_threshold (int) – minimum number of interhelical contacts required to include a subtarget in the search (default: 28)
  • platform (str) – scheduler system where the array will be executed (default ‘sge’)
  • queue_name (str) – name of the scheduler queue where the tasks should be sent (default None)
  • queue_environment (str) – name of the scheduler environment where the tasks should be sent (default None)
  • python_interpreter (str) – python interpreter to be used for task execution (default ‘$CCP4/bin/ccp4-python’)
Variables:
  • shell_interpreter (str) – shell interpreter to be used for task execution (default ‘/bin/bash’)
  • error (bool) – True if errors have occurred at some point in the pipeline
  • target (TargetSplit) – contains information about the target and subtargets
  • con_precision_dict (dict) – a dictionary with the contact precision of the prediction for each subtarget
  • search_pickle_dict (dict) – a dictionary with the pickle_fname created by each SearchJob instance in this search
  • results (list) – a nested list with the results obtained in the search against the library
  • scripts (list) – a list with the instances of pyjob.Script that will be executed to complete the search
  • ranked_searchmodels (pandas.DataFrame) – a dataframe with the search models ranked by the CMO obtained in the search
Example:
>>> from swamp.search import SearchTarget
>>> my_rank = SearchTarget('<workdir>', '<conpred>', '<sspred>')
>>> my_rank.search()
>>> my_rank.rank()
library_format

Dictionary specifying the template library format to be used in each SearchJob instance

rank(consco_threshold=0.75, combine_searchmodels=False)

Get the top search models as indicated by the results of the search. This method can also try to combine multiple search models by adding their CMOs. When doing so, it takes into consideration that the same combination of fragments may appear more than once, with the fragments matched to different subtargets.

Parameters:
  • consco_threshold (float) – CMO threshold to consider an alignment valid (default 0.75)
  • combine_searchmodels (bool) – if True combine search models matching different subtargets (default False)
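The thresholding and combination described above can be sketched in plain Python. This is only an illustration and not the SWAMP implementation: the function names rank_hits and combine_hits, the (subtarget, fragment, CMO) tuple layout, and the choice of an inclusive threshold are all assumptions made for the example.

```python
from itertools import combinations


def rank_hits(hits, consco_threshold=0.75):
    """Keep hits whose CMO reaches the threshold, best first.

    `hits` is a hypothetical list of (subtarget, fragment, cmo) tuples.
    Treating the threshold as inclusive is an assumption.
    """
    valid = [hit for hit in hits if hit[2] >= consco_threshold]
    return sorted(valid, key=lambda hit: hit[2], reverse=True)


def combine_hits(hits):
    """Pair hits from different subtargets and add their CMOs.

    The same pair of fragments can show up more than once (each fragment
    matched to a different subtarget), so pairs are keyed by an unordered
    set of fragment names and only the best combined score is kept.
    """
    combined = {}
    for (sub_a, frag_a, cmo_a), (sub_b, frag_b, cmo_b) in combinations(hits, 2):
        if sub_a == sub_b or frag_a == frag_b:
            continue  # need two distinct fragments on two distinct subtargets
        key = frozenset((frag_a, frag_b))
        score = cmo_a + cmo_b
        if score > combined.get(key, 0.0):
            combined[key] = score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up hits for illustration only
hits = [
    ("subtarget_1", "frag_A", 0.90),
    ("subtarget_1", "frag_B", 0.60),  # below the threshold, dropped
    ("subtarget_1", "frag_C", 0.76),
    ("subtarget_2", "frag_C", 0.80),
    ("subtarget_2", "frag_A", 0.78),
]
ranked = rank_hits(hits)
combos = combine_hits(ranked)
```

Here the pair frag_A/frag_C occurs twice (once in each subtarget assignment), and only the higher summed CMO is retained for it.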
recover_results()

Recover the results from every pickle file (pickle_fname) listed in search_pickle_dict

Returns: a list with the results loaded from the pickle files listed in search_pickle_dict
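As a rough sketch of what recovering results from a dictionary of pickle files can look like, the snippet below uses only the standard library. The helper name load_pickled_results, the dictionary layout (job identifier mapped to file path), and the decision to skip missing files are assumptions for illustration; the actual method may behave differently.

```python
import os
import pickle
import tempfile


def load_pickled_results(search_pickle_dict):
    """Load and concatenate the result lists stored in each pickle file.

    `search_pickle_dict` is assumed to map a job identifier to the path
    of the pickle file written by that job. Missing files are skipped.
    """
    results = []
    for job_id, pickle_fname in sorted(search_pickle_dict.items()):
        if not os.path.isfile(pickle_fname):
            continue  # assumption: a job that produced no file is ignored
        with open(pickle_fname, "rb") as fhandle:
            results.extend(pickle.load(fhandle))
    return results


# Demo with two made-up result files and one path that does not exist
with tempfile.TemporaryDirectory() as tmpdir:
    demo_paths = {}
    for idx in (1, 2):
        fname = os.path.join(tmpdir, "job_{}.pckl".format(idx))
        with open(fname, "wb") as fhandle:
            pickle.dump([["subtarget_{}".format(idx), "frag_X", 0.5 * idx]], fhandle)
        demo_paths[idx] = fname
    demo_paths[3] = os.path.join(tmpdir, "missing.pckl")
    recovered = load_pickled_results(demo_paths)
```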
search()

Search the library by calculating the CMO between the observed contacts and the query predicted contacts

This method runs a SearchJob instance for each subtarget in swamp.utils.targetsplit.TargetSplit.ranked_subtargets that meets the selected n_contacts_threshold
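The subtarget selection step can be pictured with a minimal sketch. The function name select_subtargets, the (name, contact count) pairs, and the inclusive comparison are assumptions for illustration only; they do not reflect how TargetSplit stores its subtargets.

```python
def select_subtargets(ranked_subtargets, n_contacts_threshold=28):
    """Return the names of subtargets with enough interhelical contacts.

    `ranked_subtargets` is a hypothetical list of (name, n_contacts) pairs.
    Whether the real check is inclusive is an assumption.
    """
    return [
        name
        for name, n_contacts in ranked_subtargets
        if n_contacts >= n_contacts_threshold
    ]


# Made-up helical pairs with their predicted interhelical contact counts
subtargets = [("helix_1_2", 41), ("helix_2_3", 28), ("helix_3_4", 12)]
selected = select_subtargets(subtargets)
```

Under these assumptions, one search job would then be set up for each name in selected.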

search_header

Header displayed when initiating SwampLogger

template_library

Location of the template library to be used with the SearchJob instances