DesignFrame.score_by_pssm(seqID, matrix)
Score sequences according to a provided PSSM matrix.
Generates a new column holding the score obtained by applying the PSSM to each position of the requested sequences:
New Column | Data Content |
---|---|
pssm_score_<seqID> | Score obtained by applying matrix |
Parameters: | |
---|---|
Returns: | Union[ |
Raises: | |
Example
In [1]: from rstoolbox.io import parse_rosetta_file
   ...: from rstoolbox.tests.helper import random_frequency_matrix
   ...: import pandas as pd
   ...: pd.set_option('display.width', 1000)
   ...: df = parse_rosetta_file("../rstoolbox/tests/data/input_2seq.minisilent.gz",
   ...:                         {'scores': ['score', 'description'], 'sequence': 'B'})
   ...: matrix = random_frequency_matrix(len(df.get_sequence('B')[0]), 0)
   ...: df.score_by_pssm('B', matrix)
   ...:
Out[1]:
score description sequence_B pssm_score_B
0 -206.678 test_3lhp_binder_labeled_00001 TRPEEARERAWRLAEIAMRKGWEEHEREWEWWKRASKGREERDMLPERMIAAALRAIGEIFNAEWQMRLEMEKERKNPNAGEEKMKEQKKEAWKIAYYWGLMAAYWIKQHREKERK 6.080453
1 -214.362 test_3lhp_binder_labeled_00002 PKPEEAMREAYKLIKKYMLKAQKEAQEEWERMRRTDGTKEEKDMFPEKMIAQALRAIGEIFNAYYWAFLKLQEFKKYPSVRWEEQEEARKRLKIMMKIGAEWAREIAREMKERIKR 6.086099
2 -203.582 test_3lhp_binder_labeled_00003 TKPEEMAREAYKRMLKALKQGEEEMKRMYEQMKKGVDSKEERDMEPEKMIAIALRAIGELFNAWMKALRHMKELRKLGTSGPKEEEKHWRWIFELHRWAGEEIQRAAEIQERKARW 5.296613
3 -213.779 test_3lhp_binder_labeled_00004 TKPEEWARWAYKEHLKMAEKHRKEMEIEWEELKRRDGKEEEKDMWPERMIAMALRAIGELFNHHMYAEMRAKEEKKKPEAKTEEARRARREIMKYHHEAGRLIEEAMRRLMERHKK 6.766995
4 -213.972 test_3lhp_binder_labeled_00005 KKWEEMMREAERQGKEYAQKAWKEALLEWKWMRKRPVTEEMKDMAPEWMIAAALRAIGEHFNIYWQQKLEHEKLRKIPNVPEEELEKGKEELKRIEEEAARMAEKYMQELRKKMES 5.507625
5 -195.138 test_3lhp_binder_labeled_00006 PRPEEMARFAKEEMHKHEEKAYREFLLEYELAIRKNPTEEPKDMQPEWAIAAALRAIGEIFNQWMYHLLEIRKENGSSHTRYEEREKYRKLAKRLHEEAAKEIWKFMHEAMRRFES 4.778006
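
To make the idea of "applying the PSSM score to each position" concrete, the following is a minimal standalone sketch. It does not use rstoolbox, and it assumes the score is the plain sum of the matrix value for the residue found at each position; the library's actual calculation may differ. The helper `score_sequence`, the list-of-dicts matrix layout and the toy values are all hypothetical and used only for illustration.

```python
# Hypothetical helper, NOT part of rstoolbox.
# Assumption: the score is the sum of per-position matrix values.
def score_sequence(sequence, matrix):
    """Sum the matrix value of the residue observed at each position.

    `matrix` holds one dict per sequence position, mapping residue -> value.
    """
    if len(sequence) != len(matrix):
        raise ValueError("sequence and matrix must have the same length")
    return sum(row[residue] for residue, row in zip(sequence, matrix))


# Toy 3-position matrix over a reduced alphabet (made-up values).
toy_matrix = [
    {"A": 0.5, "K": 0.25, "E": 0.25},
    {"A": 0.125, "K": 0.75, "E": 0.125},
    {"A": 0.25, "K": 0.25, "E": 0.5},
]

# One score per sequence, analogous to one pssm_score_<seqID> value per row.
scores = {seq: score_sequence(seq, toy_matrix) for seq in ["AKE", "EAK", "KKK"]}
print(scores)
```

Under this assumption, a sequence that matches the highest-valued residue at every position ("AKE" here) gets the highest score, which is how the `pssm_score_<seqID>` column can be read when ranking designs.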