rstoolbox.analysis.binary_overlap

rstoolbox.analysis.binary_overlap(df, seqID, key_residues=None, matrix='IDENTITY')

Overlap the binary similarity representations of all decoys in a DesignFrame into a single per-position vector.

Parameters:
  • df (Union[DesignFrame, DataFrame]) – Data container.
  • seqID (str) – Identifier of the sequence of interest.
  • key_residues (Union[int, list of int, str, Selection]) – Residues of interest. Default is None.
  • matrix (str) – Identifier of the matrix used to evaluate similarity. Default is IDENTITY.
Returns:

list of int – a one or a zero for each position along the length of the sequence.
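
To illustrate what "overlapping" binary representations can mean, here is a minimal sketch that does not use rstoolbox. It assumes that each decoy is described by a string of ones and zeros (1 where the decoy matches the reference under the chosen matrix) and that the overlap marks a position with 1 when at least one decoy has a 1 there. Both the helper name overlap_binary_strings and that position-wise OR rule are assumptions made for illustration, not a statement of the library's implementation.

```python
from typing import List, Sequence


def overlap_binary_strings(binaries: Sequence[str]) -> List[int]:
    """Position-wise OR over equal-length strings of '0' and '1'.

    Hypothetical helper; the OR rule is an assumption about how the
    per-decoy representations are combined.
    """
    if not binaries:
        return []
    length = len(binaries[0])
    if any(len(b) != length for b in binaries):
        raise ValueError("all binary strings must have the same length")
    # zip(*binaries) yields one tuple per position, holding that
    # position's character from every decoy.
    return [1 if "1" in column else 0 for column in zip(*binaries)]


# Made-up per-decoy binary strings, not taken from any real design set.
decoys = ["1100", "0100", "0001"]
overlap = overlap_binary_strings(decoys)
print("".join(str(v) for v in overlap))  # 1101
```

Under this reading, a zero in the result means that no decoy in the set matches the reference at that position.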

Example

In [1]: from rstoolbox.io import parse_rosetta_file
   ...: from rstoolbox.analysis import binary_overlap
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: pd.set_option('display.max_columns', 500)
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_2seq.minisilent.gz",
   ...:                         {'scores': ['score'], 'sequence': 'B'})
   ...: df.add_reference_sequence('B', df.get_sequence('B').values[0])
   ...: binoverlap = binary_overlap(df.iloc[1:], 'B')
   ...: "".join([str(_) for _ in binoverlap])
   ...: 
Out[1]: '11111100010110001101111101111110110000011111011111111111111111101100111111110111111000101111011000010110000000111011'
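
The returned list can be summarised with plain Python. The sketch below reports the fraction of positions set to 1 and the positions left at 0. summarize_overlap is a hypothetical helper, not part of rstoolbox, and the 1-based position numbering is an assumption that may not match the numbering of a given structure.

```python
from typing import Dict, List, Union


def summarize_overlap(overlap: List[int]) -> Dict[str, Union[float, List[int]]]:
    """Return the fraction of ones and the 1-based positions that are zero.

    Hypothetical helper; 1-based numbering is an assumption.
    """
    zeros = [i for i, v in enumerate(overlap, start=1) if v == 0]
    covered = (len(overlap) - len(zeros)) / len(overlap) if overlap else 0.0
    return {"covered_fraction": covered, "zero_positions": zeros}


# Made-up overlap vector used only to show the shape of the result.
summary = summarize_overlap([1, 1, 0, 1, 0])
print(summary["zero_positions"])  # [3, 5]
```

With a real result, pass the list returned by binary_overlap directly, for example summarize_overlap(binoverlap).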