rstoolbox.analysis.binary_similarity(df, seqID, key_residues=None, matrix='IDENTITY')
Binary profile for each design sequence against the reference_sequence.
Makes a DesignFrame with a new column that maps binary identity (0/1) against the reference_sequence. If a matrix other than IDENTITY is provided, the binary sequence is set to 1 at every position whose value is positive.
New Column | Data Content |
---|---|
<matrix>_<seqID>_binary | Binary representation of the match with the reference_sequence. |
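The mapping described above can be illustrated with a minimal, standalone sketch. It does not use rstoolbox and is not the library's implementation: the helper `binary_profile`, the `TOY_SCORES` dictionary, and the equal-length check are all hypothetical, introduced only to show how an identity comparison and a "positive score" comparison each yield a 0/1 string.

```python
def binary_profile(sequence, reference, scores=None):
    """Return a 0/1 string marking positions that match the reference.

    With scores=None, a position is 1 only when both residues are identical.
    Otherwise, scores maps (residue, reference_residue) pairs to a number and
    a position is 1 only when that number is positive.
    """
    # Assumption for this sketch: both sequences are already aligned.
    if len(sequence) != len(reference):
        raise ValueError("sequence and reference must have the same length")
    bits = []
    for res, ref in zip(sequence, reference):
        if scores is None:
            hit = res == ref
        else:
            # Pairs missing from the toy table are treated as score 0.
            hit = scores.get((res, ref), 0) > 0
        bits.append("1" if hit else "0")
    return "".join(bits)


# Toy, hand-written scores (NOT a real substitution matrix).
TOY_SCORES = {
    ("K", "K"): 5, ("K", "R"): 2, ("R", "K"): 2, ("R", "R"): 5,
    ("E", "E"): 5, ("E", "D"): 2, ("D", "E"): 2, ("D", "D"): 5,
    ("K", "E"): -1, ("E", "K"): -1,
}

identity_bits = binary_profile("KRED", "KKEE")            # identity only
toy_bits = binary_profile("KRED", "KKEE", TOY_SCORES)     # positive scores
print(identity_bits, toy_bits)
```

With the identity comparison only exact matches are marked, while the toy scores also mark the conservative K/R and E/D substitutions because their scores are positive.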
Example
In [1]: from rstoolbox.io import parse_rosetta_file
...: from rstoolbox.analysis import binary_similarity
...: import pandas as pd
...: pd.set_option('display.width', 1000)
...: pd.set_option('display.max_columns', 500)
...: df = parse_rosetta_file("../rstoolbox/tests/data/input_2seq.minisilent.gz",
...: {'scores': ['score'], 'sequence': 'B'})
...: df.add_reference_sequence('B', df.get_sequence('B').values[0])
...: df = binary_similarity(df.iloc[1:], 'B')
...: df.head()
...:
Out[1]:
score sequence_B identity_B_binary
0 -214.362 PKPEEAMREAYKLIKKYMLKAQKEAQEEWERMRRTDGTKEEKDMFPEKMIAQALRAIGEIFNAYYWAFLKLQEFKKYPSVRWEEQEEARKRLKIMMKIGAEWAREIAREMKERIKR 00111100010010000101000100011100010000011011011011101111111111100000100000010100001000100100000000000010000000010000
1 -203.582 TKPEEMAREAYKRMLKALKQGEEEMKRMYEQMKKGVDSKEERDMEPEKMIAIALRAIGELFNAWMKALRHMKELRKLGTSGPKEEEKHWRWIFELHRWAGEEIQRAAEIQERKARW 10111000010000001000101100100100100000011111011011101111111011100000001000110000100000000000000000010000000000001010
2 -213.779 TKPEEWARWAYKEHLKMAEKHRKEMEIEWEELKRRDGKEEEKDMWPERMIAMALRAIGELFNHHMYAEMRAKEEKKKPEAKTEEARRARREIMKYHHEAGRLIEEAMRRLMERHKK 10111000010000000001000101011100110000011011011111101111111011000000000001010101001000000010010000010000000000010001
3 -213.972 KKWEEMMREAERQGKEYAQKAWKEALLEWKWMRKRPVTEEMKDMAPEWMIAAALRAIGEHFNIYWQQKLEHEKLRKIPNVPEEELEKGKEELKRIEEEAARMAEKYMQELRKKMES 00011000010100000001010100011010000000010011011011111111111011001100110110110110011000001010001000000110000000101000
4 -195.138 PRPEEMARFAKEEMHKHEEKAYREFLLEYELAIRKNPTEEPKDMQPEWAIAAALRAIGEIFNQWMYHLLEIRKENGSSHTRYEEREKYRKLAKRLHEEAAKEIWKFMHEAMRRFES 01111000010000000001000100010100010000010011011001111111111111000000110011000000001000000101000000000000000000000000
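One possible follow-up, sketched here with plain pandas, is to reduce each binary string to the fraction of matching positions. The frame below is toy data that only borrows the column name `identity_B_binary` from the output above; the shortened strings and the `identity_B_fraction` column name are made up for illustration.

```python
import pandas as pd

# Toy frame: shortened, made-up binary strings under the real column name.
toy = pd.DataFrame({"identity_B_binary": ["0011", "1011", "0000"]})

# Fraction of positions flagged as 1 in each binary profile.
toy["identity_B_fraction"] = toy["identity_B_binary"].map(
    lambda bits: bits.count("1") / len(bits)
)
print(toy)
```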