swamp.search.searchtarget module
-
class SearchTarget(workdir, conpred, sspred, conformat='psicov', nthreads=1, template_subset=None, logger=None, target_pdb_benchmark=None, alignment_algorithm_name='mapalign', n_contacts_threshold=28, python_interpreter='/empty/path/bin/ccp4-python', platform='sge', queue_name=None, queue_environment=None)
Bases: object
Class to search the SWAMP library and rank search models according to their CMO with a given target.
Using CMO alignment tools, determine the best fragments in the library to be used as search models for a given target. The target is first split into several subtargets (one for each helical pair with enough interhelical contact information), and a SearchJob instance is then executed for each of these subtargets.
Parameters:
- workdir (str) – working directory for the SearchJob instances
- conpred (str) – contact prediction file of the target
- sspred (str) – secondary structure prediction file of the target (must be topcons format file)
- conformat (str) – format of the contact prediction file provided for the target (default: ‘psicov’)
- nthreads (str) – number of parallel threads to use in the library search (default: 1)
- template_subset (tuple) – set of templates to be used instead of the full fragment library (default: None)
- target_pdb_benchmark (str) – provide a target’s pdb file for benchmark purposes (default: None)
- alignment_algorithm_name (str) – algorithm used for CMO calculation (default: ‘mapalign’)
- logger (SwampLogger) – logging interface for the search (default None)
- n_contacts_threshold (int) – min. no. of interhelical cont. to include a subtarget in the search (default: 28)
- platform (str) – scheduler system where the array will be executed (default ‘sge’)
- queue_name (str) – name of the scheduler queue where the tasks should be sent (default None)
- queue_environment (str) – name of the scheduler environment where the tasks should be sent (default None)
- python_interpreter (str) – python interpreter to be used for task execution (default ‘$CCP4/bin/ccp4-python’)
Variables:
- shell_interpreter (str) – shell interpreter to be used for task execution (default ‘/bin/bash’)
- error (bool) – True if errors have occurred at some point in the pipeline
- target (TargetSplit) – contains information about the target and subtargets
- con_precision_dict (dict) – a dictionary with the contact precision for each given subtarget prediction
- search_pickle_dict (dict) – a dictionary with the pickle_fname created by each SearchJob instance in this search
- results (list) – a nested list with the results obtained in the search against the library
- scripts (list) – a list with the instances of pyjob.Script that will be executed to complete the search
- ranked_searchmodels (pandas.DataFrame) – a dataframe with the search models ranked by the CMO obtained in the search
Example:
>>> from swamp.search import SearchTarget
>>> my_rank = SearchTarget('<workdir>', '<conpred>', '<sspred>')
>>> my_rank.search()
>>> my_rank.rank()
-
library_format
Dictionary specifying the template library format to be used in each SearchJob instance
-
rank(consco_threshold=0.75, combine_searchmodels=False)
Get the top search models as indicated by the results of the search. This method can also try to combine multiple search models by adding their CMOs. It takes into consideration that the same combination of fragments may appear more than once, with fragments in matching subtargets.
Parameters:
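The ranking and combination logic described for rank() can be sketched in plain Python. This is only an illustration under stated assumptions, not the SWAMP implementation: the tuple layout of the results, the helper name rank_models, and the reading of consco_threshold as a minimum score that a hit must reach to be kept are all hypothetical, since this page does not define them.

```python
from itertools import combinations

# Hypothetical search results: (subtarget_id, searchmodel_id, cmo, consco).
# The real layout of SearchTarget.results is not documented on this page.
results = [
    ("sub_1", "frag_a", 0.62, 0.80),
    ("sub_1", "frag_b", 0.55, 0.90),
    ("sub_2", "frag_c", 0.48, 0.78),
    ("sub_2", "frag_d", 0.70, 0.40),  # dropped below: consco under the threshold
]


def rank_models(results, consco_threshold=0.75, combine_searchmodels=False):
    """Rank search models by CMO (hypothetical helper, not part of SWAMP)."""
    # Assumption: consco_threshold is a minimum score required to keep a hit.
    kept = [r for r in results if r[3] >= consco_threshold]

    # Single search models, keyed by a one-element tuple of model ids.
    best = {}
    for _, model_id, cmo, _ in kept:
        key = (model_id,)
        best[key] = max(cmo, best.get(key, cmo))

    if combine_searchmodels:
        # Combine models found for different subtargets by adding their CMOs.
        # Sorting the ids means a repeated combination maps to the same key,
        # so it is only counted once (the highest combined CMO is kept).
        for first, second in combinations(kept, 2):
            if first[0] == second[0]:
                continue
            key = tuple(sorted((first[1], second[1])))
            combined_cmo = first[2] + second[2]
            best[key] = max(combined_cmo, best.get(key, combined_cmo))

    # Highest CMO first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


ranked = rank_models(results)
combined = rank_models(results, combine_searchmodels=True)
```

Keying combinations by a sorted tuple of model identifiers is one simple way to avoid counting the same combination of fragments twice; the library may handle repeats differently.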
-
recover_results()
Recover the results from all the pickle_fname indicated in search_pickle_dict
Returns: a list with the results loaded from the pickle files at search_pickle_dict
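As a rough illustration of what recovering results from a dictionary of pickle files can look like, here is a minimal standard-library sketch. It assumes that search_pickle_dict maps a subtarget identifier to a pickle file name and that each pickle holds a list of result rows; both assumptions, the choice to skip missing files, and the standalone function below are hypothetical and may differ from the actual method.

```python
import os
import pickle
import tempfile


def recover_results(search_pickle_dict):
    """Load and concatenate pickled results (hypothetical stand-in)."""
    results = []
    for subtarget_id, pickle_fname in search_pickle_dict.items():
        # Assumption: a job that failed leaves no pickle, so skip missing files.
        if not os.path.isfile(pickle_fname):
            continue
        with open(pickle_fname, "rb") as fhandle:
            results.extend(pickle.load(fhandle))
    return results


# Build two small pickle files to stand in for finished SearchJob outputs.
workdir = tempfile.mkdtemp()
search_pickle_dict = {}
fake_outputs = {"sub_1": [["frag_a", 0.62]], "sub_2": [["frag_c", 0.48]]}
for subtarget_id, rows in fake_outputs.items():
    fname = os.path.join(workdir, subtarget_id + ".pckl")
    with open(fname, "wb") as fhandle:
        pickle.dump(rows, fhandle)
    search_pickle_dict[subtarget_id] = fname

# A third entry points at a file that was never written.
search_pickle_dict["sub_3"] = os.path.join(workdir, "missing.pckl")

recovered = recover_results(search_pickle_dict)
```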
-
search()
Search the library by calculating the CMO between the observed contacts and the query predicted contacts
This method will run a SearchJob instance for each subtarget at swamp.utils.targetsplit.TargetSplit.ranked_subtargets that passes the selected n_contacts_threshold threshold
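The subtarget selection step can be pictured with a short sketch. The Subtarget dataclass, its n_contacts field, and the select_subtargets helper are hypothetical stand-ins for the real TargetSplit objects, and treating the threshold as inclusive (>=) is an assumption based on the parameter being described as a minimum.

```python
from dataclasses import dataclass


@dataclass
class Subtarget:
    """Hypothetical stand-in for one helical pair of the target."""
    name: str
    n_contacts: int  # number of predicted interhelical contacts


def select_subtargets(ranked_subtargets, n_contacts_threshold=28):
    """Keep the subtargets with enough interhelical contacts to be searched."""
    # Assumption: the threshold is a minimum, so equality passes.
    return [s for s in ranked_subtargets if s.n_contacts >= n_contacts_threshold]


ranked_subtargets = [
    Subtarget("helices_1_2", 41),
    Subtarget("helices_2_3", 28),
    Subtarget("helices_3_4", 12),  # too few contacts, excluded from the search
]

selected = select_subtargets(ranked_subtargets)
```

In this picture, one search job would then be created for each entry of selected.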
-
search_header
Header displayed when initiating SwampLogger