rstoolbox.components.DesignSeries.add_reference_shift

DesignSeries.add_reference_shift(seqID, shift, shift_labels=False)

Add a reference_shift attached to a chain seqID.

What is shift? When the sequence does not start at residue 1, shift defines how its residues are numbered. This keeps the residue numbers shown in plots and analyses consistent with those of the PDB. There are two main ways to set the shift:

  1. Provide the number of the first residue of the chain; the rest will be numbered consecutively from there. This is the simplest option, for when it's just a matter of the structure not actually starting at the beginning of the real protein sequence.
  2. If the original PDB has breaks, provide an array with the number of each position. This is the only way to track those positions consistently.
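
The two options above can be illustrated with a small standalone sketch. The residue_numbers helper below is hypothetical and is not part of rstoolbox; it only shows, under the assumption that an integer shift numbers residues consecutively, how each kind of shift would translate into per-residue numbers. The length check mirrors the IndexError condition documented for this method.

```python
from typing import List, Sequence, Union


def residue_numbers(shift: Union[int, Sequence[int]], length: int) -> List[int]:
    """Hypothetical helper (not rstoolbox API): map a shift to residue numbers.

    shift  -- first residue number (int) or one number per residue (sequence)
    length -- number of residues in the reference sequence/structure
    """
    if isinstance(shift, int):
        # Option 1: consecutive numbering starting at the given residue.
        return list(range(shift, shift + length))
    numbers = list(shift)
    if len(numbers) != length:
        # Assumed to mirror the documented IndexError for mismatched lengths.
        raise IndexError("shift must provide one number per residue")
    # Option 2: explicit per-residue numbering, which can encode chain breaks.
    return numbers


# Option 1: a 5-residue chain whose first residue is number 3.
print(residue_numbers(3, 5))                 # [3, 4, 5, 6, 7]

# Option 2: the same chain with a break between residues 5 and 9.
print(residue_numbers([3, 4, 5, 9, 10], 5))  # [3, 4, 5, 9, 10]
```

With an integer shift a break cannot be represented, which is why the list form is needed for PDB files with missing residues.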
Parameters:
  • seqID (str) – Identifier of the sequence of interest.
  • shift (Union[int, list() of int]) – Starting residue number or per-residue number assignment.
  • shift_labels (bool) – Whether the shift should also be applied automatically to any label present in the data container when it is added (Default is False).
Raises:
TypeError: If the data container is not a DataFrame or Series.
KeyError: If there is no reference structure or sequence for seqID.
KeyError: If shift is a list and the data container does not contain structure or sequence data for the given seqID.
IndexError: If shift is a list and its length differs from that of the reference sequence/structure.

Example

In [1]: from rstoolbox.io import parse_rosetta_file
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: pd.set_option('display.max_columns', 500)
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_ssebig.minisilent.gz",
   ...:                         {'sequence': 'C', 'structure': 'C'})
   ...: df.add_reference_structure('C', df.iloc[0].get_structure('C'))
   ...: df.add_reference_shift('C', 3)
   ...: