swamp.search.searchtarget module
-
class SearchTarget(workdir, conpred, sspred, conformat='psicov', nthreads=1, template_subset=None, logger=None, target_pdb_benchmark=None, alignment_algorithm_name='mapalign', n_contacts_threshold=28, python_interpreter='/empty/path/bin/ccp4-python', platform='sge', queue_name=None, queue_environment=None)
Bases: object
Class to search the SWAMP library and rank search models according to their CMO with a given target.
Using CMO alignment tools, determine the best fragments in the library to be used as search models for a given target. First, the target is split into several subtargets (one for each helical pair with enough interhelical contact information), and then several SearchJob instances are executed.
Parameters:
- workdir (str) – working directory for the SearchJob instances
- conpred (str) – contact prediction file of the target
- sspred (str) – secondary structure prediction file of the target (must be a topcons format file)
- conformat (str) – format of the contact prediction file provided for the target (default: ‘psicov’)
- nthreads (int) – number of parallel threads to use in the library search (default: 1)
- template_subset (tuple) – set of templates to be used instead of the full fragment library (default: None)
- target_pdb_benchmark (str) – provide a target’s pdb file for benchmark purposes (default: None)
- alignment_algorithm_name (str) – algorithm used for CMO calculation (default: ‘mapalign’)
- logger (SwampLogger) – logging interface for the search (default: None)
- n_contacts_threshold (int) – minimum number of interhelical contacts required to include a subtarget in the search (default: 28)
- platform (str) – scheduler system where the array will be executed (default: ‘sge’)
- queue_name (str) – name of the scheduler queue where the tasks should be sent (default: None)
- queue_environment (str) – name of the scheduler environment where the tasks should be sent (default: None)
- python_interpreter (str) – python interpreter to be used for task execution (default: ‘$CCP4/bin/ccp4-python’)
Variables:
- shell_interpreter (str) – shell interpreter to be used for task execution (default: ‘/bin/bash’)
- error (bool) – True if errors have occurred at some point in the pipeline
- target (TargetSplit) – contains information about the target and subtargets
- con_precision_dict (dict) – a dictionary with the contact precision for each given subtarget prediction
- search_pickle_dict (dict) – a dictionary with the pickle_fname created by each SearchJob instance in this search
- results (list) – a nested list with the results obtained in the search against the library
- scripts (list) – a list with the instances of pyjob.Script that will be executed to complete the search
- ranked_searchmodels (pandas.DataFrame) – a dataframe with the search models ranked by the CMO obtained in the search
Example:
>>> from swamp.search import SearchTarget
>>> my_rank = SearchTarget('<workdir>', '<conpred>', '<sspred>')
>>> my_rank.search()
>>> my_rank.rank()
-
library_format
Dictionary specifying the template library format to be used in each SearchJob instance
-
rank(consco_threshold=0.75, combine_searchmodels=False)
Get the top search models as indicated by the results of the search. This method can also try to combine multiple search models by adding their CMOs. When combining, it takes into account that the same combination of fragments may appear more than once, with fragments in matching subtargets.
Parameters:
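The ranking and combination idea described for rank() can be illustrated with a minimal, standalone sketch. It does not use SWAMP itself, and everything in it is an assumption made for illustration: the row layout (subtarget, fragment, CMO), the helper names rank_single and rank_combined, and the rule of keeping the highest summed CMO when the same pair of fragments appears more than once. The role of consco_threshold is not described on this page, so the sketch leaves it out.

```python
from itertools import combinations


def rank_single(results):
    """Sort result rows (subtarget, fragment, cmo) by CMO, highest first."""
    return sorted(results, key=lambda row: row[2], reverse=True)


def rank_combined(results):
    """Pair fragments matched to different subtargets and add their CMOs.

    The same pair of fragments can appear more than once (each fragment
    matched to a different subtarget); only the best sum is kept here.
    """
    best = {}
    for row_a, row_b in combinations(results, 2):
        sub_a, frag_a, cmo_a = row_a
        sub_b, frag_b, cmo_b = row_b
        if sub_a == sub_b or frag_a == frag_b:
            # Skip pairs competing for one subtarget or repeating a fragment
            continue
        key = frozenset((frag_a, frag_b))
        total = cmo_a + cmo_b
        if total > best.get(key, float("-inf")):
            best[key] = total
    ranked = [(tuple(sorted(key)), total) for key, total in best.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical search results: (subtarget, fragment, cmo)
demo_results = [
    ("h1_h2", "frag_a", 0.5),
    ("h1_h2", "frag_b", 0.25),
    ("h3_h4", "frag_a", 0.125),
    ("h3_h4", "frag_b", 0.75),
    ("h3_h4", "frag_c", 0.25),
]

single = rank_single(demo_results)
combined = rank_combined(demo_results)
print(single[0])
print(combined)
```

In this toy data the pair (frag_a, frag_b) occurs twice, once for each assignment of fragments to subtargets, and only the higher combined score is retained.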
-
recover_results()
Recover the results from all the pickle_fname indicated in search_pickle_dict
Returns: a list with the results loaded from the pickle files at search_pickle_dict
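As a rough illustration of recovering results from a dictionary of pickle files, the following standalone sketch may help. It is not the SWAMP implementation: the helper name load_pickled_results, the dictionary layout (subtarget identifier mapped to a file name), the assumption that each pickle holds a list of result rows, and the choice to skip missing files are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickled_results(search_pickle_dict):
    """Load and concatenate the rows stored in each existing pickle file."""
    results = []
    for subtarget_id, pickle_fname in sorted(search_pickle_dict.items()):
        path = Path(pickle_fname)
        if not path.is_file():
            # Assumption: a job that failed or is unfinished left no pickle
            continue
        with path.open("rb") as fhandle:
            results.extend(pickle.load(fhandle))
    return results


# Hypothetical data: two subtargets, each with one result row
workdir = Path(tempfile.mkdtemp())
demo_rows = {
    "subtarget_1": [["subtarget_1", "frag_a", 0.61]],
    "subtarget_2": [["subtarget_2", "frag_b", 0.48]],
}
search_pickle_dict = {}
for subtarget_id, rows in demo_rows.items():
    fname = workdir / f"{subtarget_id}.pckl"
    with fname.open("wb") as fhandle:
        pickle.dump(rows, fhandle)
    search_pickle_dict[subtarget_id] = str(fname)

# A file that was never written is skipped rather than raising an error
search_pickle_dict["subtarget_3"] = str(workdir / "missing.pckl")

recovered = load_pickled_results(search_pickle_dict)
print(recovered)
```

Only unpickle files that your own pipeline produced, since pickle.load can execute arbitrary code from untrusted input.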
-
search()
Search the library by calculating the CMO between the observed contacts and the query predicted contacts.
This method will run a SearchJob instance for each subtarget at swamp.utils.targetsplit.TargetSplit.ranked_subtargets that passes the selected n_contacts_threshold threshold
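The subtarget selection step can be pictured with a small standalone sketch: count the predicted contacts between each pair of helices and keep only the pairs that reach the threshold. This is an illustration under stated assumptions and not SWAMP code. The function names, the residue-to-helix mapping, and the use of an inclusive (>=) comparison are hypothetical, and the demo uses a threshold of 2 instead of the default 28 to keep the data short.

```python
from collections import Counter


def count_interhelical_contacts(contacts, helix_of_residue):
    """Count predicted contacts between each pair of distinct helices."""
    counts = Counter()
    for res_a, res_b in contacts:
        helix_a = helix_of_residue.get(res_a)
        helix_b = helix_of_residue.get(res_b)
        if helix_a is None or helix_b is None or helix_a == helix_b:
            # Ignore residues outside helices and contacts within one helix
            continue
        counts[tuple(sorted((helix_a, helix_b)))] += 1
    return counts


def select_subtargets(counts, n_contacts_threshold=28):
    """Keep helical pairs that reach the threshold, most contacts first."""
    pairs = [pair for pair, n in counts.items() if n >= n_contacts_threshold]
    return sorted(pairs, key=lambda pair: counts[pair], reverse=True)


# Hypothetical topology: three helices covering residues 1-10, 21-30, 41-50
helix_of_residue = {}
for helix_id, start in ((1, 1), (2, 21), (3, 41)):
    for residue in range(start, start + 10):
        helix_of_residue[residue] = helix_id

# Hypothetical predicted contacts as (residue, residue) pairs
contacts = [(1, 30), (2, 29), (3, 28), (21, 50), (1, 2), (15, 25)]

counts = count_interhelical_contacts(contacts, helix_of_residue)
selected = select_subtargets(counts, n_contacts_threshold=2)
print(dict(counts))
print(selected)
```

Here helices 1 and 2 share three contacts and pass the toy threshold, while helices 2 and 3 share a single contact and are excluded from the search.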
-
search_header
Header displayed when initiating SwampLogger