swamp.mr.mrarray module

class MrArray(id, workdir, target_mtz, target_fa, platform='sge', queue_name=None, logger=None, max_array_size=None, queue_environment=None, phased_mtz=None, max_concurrent_nprocs=1, job_kill_time=None, silent=False)

Bases: swamp.mr.mr.Mr

An array of molecular replacement tasks to solve a given structure.

This class implements the data structures that hold all the MR tasks to be executed on a target. It provides methods to run these tasks, which are stored as MrJob instances, and to store their results.

Parameters:
  • id (str) – unique identifier for the MrArray instance
  • workdir (str) – working directory where the MrJob instances will be executed
  • target_mtz (str) – target’s mtz filename
  • target_fa (str) – target’s fasta filename
  • platform (str) – platform where the array of tasks will be executed (default ‘sge’)
  • queue_name (str) – name of the queue where the tasks should be submitted (default None)
  • queue_environment (str) – queue environment where the tasks should be submitted (default None)
  • phased_mtz (str) – target’s mtz filename containing phase information (default None)
  • max_concurrent_nprocs (int) – maximum number of concurrent tasks to be executed at any given time (default 1)
  • job_kill_time (int) – kill time assigned to MrJob instances (default None)
  • logger (SwampLogger) – logging interface for the MR pipeline (default None)
  • silent (bool) – if set to True, the logger will not print messages (default False)
  • max_array_size (int) – maximum number of pyjob.Script instances permitted in a submitted pyjob.ClusterTask (default None)
Variables:
  • results (list) – A list with the figures of merit obtained after the completion of the pipeline
  • error (bool) – True if errors have occurred at some point in the pipeline
  • job_list (list) – A list of the MrJob instances contained in this MrArray instance
  • job_dict (dict) – A dictionary of the MrJob instances contained in this MrArray instance. Keys correspond to swamp.mr.mrjob.MrJob.id
  • scripts (list) – List of pyjob.Script instances to be executed in this MrArray instance
  • shell_interpreter (str) – Indicates shell interpreter to execute MrJob (default ‘/bin/bash’)
Example:
>>> from swamp.mr import MrArray, MrJob
>>> mr_array = MrArray('<id>', '<workdir>', '<target_mtz>', '<target_fasta>')
>>> mr_array.add(MrJob('<id>', '<workdir>'))
>>> print(mr_array)
MrArray(id="<id>", njobs=1)
>>> mr_array.run()
add(value)

Add an instance of MrJob to the array. This includes both the MrJob object and its pyjob.Script attribute.

Parameters:value (MrJob) – MrJob instance to be added to the array for execution

Raises:
  • TypeError – value is not an instance of MrJob
  • ValueError – a MrJob instance with the same swamp.mr.mrjob.MrJob.id is already contained in the array
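
The contract described above (type check, duplicate-id check, and registration of both the job and its script) can be illustrated with a minimal, self-contained sketch. StubJob and StubArray are hypothetical stand-ins written for this illustration; they are not SWAMP classes, and the actual MrArray.add implementation may differ.

```python
class StubJob:
    """Hypothetical stand-in for MrJob: carries only an id and a script."""

    def __init__(self, id, script=None):
        self.id = id
        # Placeholder for the pyjob.Script attribute of a real MrJob
        self.script = script if script is not None else "script_{}".format(id)


class StubArray:
    """Hypothetical stand-in mirroring the documented add() behaviour."""

    def __init__(self, id):
        self.id = id
        self.job_list = []   # jobs in insertion order
        self.job_dict = {}   # jobs indexed by their id
        self.scripts = []    # one script per registered job

    def add(self, value):
        # Reject anything that is not a job instance
        if not isinstance(value, StubJob):
            raise TypeError("value must be a StubJob instance")
        # Reject a second job with an id that is already registered
        if value.id in self.job_dict:
            raise ValueError("a job with id {!r} is already in the array".format(value.id))
        self.job_list.append(value)
        self.job_dict[value.id] = value
        self.scripts.append(value.script)


array = StubArray("array_1")
array.add(StubJob("job_1"))

# Adding a second job with the same id triggers the documented ValueError
try:
    array.add(StubJob("job_1"))
except ValueError as exc:
    duplicate_message = str(exc)
```

Indexing the jobs by id in a dictionary, as assumed here, makes the duplicate check a constant-time lookup while the list preserves submission order.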
append_results()

Append the results obtained by each MrJob instance listed in job_list to results
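
As a rough sketch of this aggregation step, the snippet below merges per-job results into a single list. It assumes that each job exposes a results attribute holding a list of rows; that attribute, the collect_results helper, and the placeholder values are all assumptions made for illustration and are not taken from the SWAMP source.

```python
from types import SimpleNamespace


def collect_results(job_list):
    """Hypothetical helper: concatenate the results of every job in order."""
    results = []
    for job in job_list:
        # Assumption: job.results is a list of rows (possibly empty)
        results.extend(job.results)
    return results


# Stand-in jobs with placeholder rows; a job without results contributes nothing
jobs = [
    SimpleNamespace(id="job_1", results=[["job_1", "placeholder_a"]]),
    SimpleNamespace(id="job_2", results=[]),
    SimpleNamespace(id="job_3", results=[["job_3", "placeholder_b"]]),
]
merged = collect_results(jobs)
```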

cleanup_dir_list

List of directories to clean up after pipeline completion (workdir)

run(store_results=False)

Send the array for execution on the HPC platform using pyjob.TaskFactory

Parameters:store_results (bool) – Not implemented