S. Romero‐Molina, Y. B. Ruiz‐Blanco, M. Harms, J. Münch, E. Sanchez‐Garcia. PPI‐Detect: A support vector machine model for sequence‐based prediction of protein–protein interactions. J. Comput. Chem. 2019, 1‐10. DOI: 10.1002/jcc.25780
The input of the PPI-Detect web server:
To run PPI-Detect, at least two sequences (in FASTA format) must be provided, each in a separate input set.
For example:
Sequences A
You can either "Enter a sequence(s)" or "Upload a file" with the lines:
Sequences B
You must provide a file with the sequences to combine, here PB and PC:
Then, the interaction likelihood will be computed for all the combinatorial pairs between the two sets of sequences:
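The pairing step can be pictured as a Cartesian product of the two input sets. The following is a minimal sketch of that idea only, not the server's code: the helper names `parse_fasta` and `all_pairs`, the header `PA` for set A, and the short residue strings are all hypothetical placeholders chosen for illustration.

```python
from itertools import product


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def all_pairs(set_a, set_b):
    """Return every (record from A, record from B) combination."""
    return list(product(set_a, set_b))


# Placeholder sequences (assumption): not taken from the example files.
set_a = parse_fasta(">PA\nMKTAYIAK\n")
set_b = parse_fasta(">PB\nGSHMLEDP\n>PC\nMADEEKLPPG\n")

pairs = all_pairs(set_a, set_b)
pair_names = [(a[0], b[0]) for a, b in pairs]
for name_a, name_b in pair_names:
    print(name_a, name_b)
```

With one sequence in set A and two in set B, this yields two pairs. In general, a set A of m sequences and a set B of n sequences produce m × n pairs.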
The output of the PPI-Detect web server:
A table with the following information for each protein-protein pair:
Example files:
sequences_A.fasta
sequences_B.fasta
Example output:
The server shows the following table, which summarizes all the information provided in the output files and includes a link to download them:
# | Instance | Prediction | Score | AD 1st & 99th | AD 100th |
---|---|---|---|---|---|
0 | PF00189PF00163 | Interaction | 0.578 | Out | Out |
1 | PF01248PF01599 | Not interaction | 0.158 | Out | Out |
2 | PF01248PF01246 | Not interaction | 0.183 | Out | Out |
3 | PF00163PF00281 | Not interaction | 0.181 | Out | Out |
4 | PF00163PF01479 | Not interaction | 0.379 | Out | Out |
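For downstream processing, rows like those above can be read into records. The sketch below is an illustration under stated assumptions: it parses the pipe-delimited text as displayed here (the layout of the downloadable output files may differ), and it assumes each instance name is the concatenation of two Pfam accessions of the form `PF` plus five digits, as in this example. The names `parse_row`, `PFAM_ID`, and the dictionary keys are hypothetical.

```python
import re

# Rows copied from the example table above.
TABLE = """\
0 | PF00189PF00163 | Interaction | 0.578 | Out | Out |
1 | PF01248PF01599 | Not interaction | 0.158 | Out | Out |
2 | PF01248PF01246 | Not interaction | 0.183 | Out | Out |
3 | PF00163PF00281 | Not interaction | 0.181 | Out | Out |
4 | PF00163PF01479 | Not interaction | 0.379 | Out | Out |
"""

# Assumption: instance names are two concatenated Pfam accessions.
PFAM_ID = re.compile(r"PF\d{5}")


def parse_row(line):
    """Split one pipe-delimited row into a dictionary of typed fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    index, instance, prediction, score, ad_1_99, ad_100 = cells
    return {
        "index": int(index),
        "domains": tuple(PFAM_ID.findall(instance)),
        "prediction": prediction,
        "score": float(score),
        "ad_1st_99th": ad_1_99,
        "ad_100th": ad_100,
    }


rows = [parse_row(line) for line in TABLE.splitlines() if line.strip()]

# Keep only the pairs that the table labels as interacting.
interacting = [r["domains"] for r in rows if r["prediction"] == "Interaction"]
print(interacting)
```

The filter relies on the `Prediction` column rather than on a score cutoff, because the decision threshold is not stated on this page.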
Notes:
PPI-Detect was built with a nonredundant benchmarking dataset of PPIs gathered from three comprehensive, curated, and publicly available databases. These databases contain pairs of protein domains with proven interactions (3did and iPfam) and domain pairs that are very unlikely to be involved in an interaction (Negatome 2.0).
We split the dataset into training and test sets. The interacting domains are the positive cases and the noninteracting domains are the negative cases.
Training: This subset includes 3491 pairs of domains (1613 positive and 1878 negative).
Testing: This subset includes 836 pairs of domains (309 positive and 527 negative).
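As a quick consistency check on the counts above, the short sketch below verifies that the positive and negative cases add up to each subset total and reports the share of positive cases. The dictionary layout and variable names are illustrative only.

```python
# Counts as stated for the training and testing subsets.
splits = {
    "training": {"positive": 1613, "negative": 1878, "total": 3491},
    "testing": {"positive": 309, "negative": 527, "total": 836},
}

positive_fraction = {}
for name, counts in splits.items():
    # The two classes should account for every pair in the subset.
    assert counts["positive"] + counts["negative"] == counts["total"]
    positive_fraction[name] = counts["positive"] / counts["total"]
    print(f"{name}: {positive_fraction[name]:.1%} positive")
```

The positive cases make up roughly 46% of the training subset and 37% of the testing subset, so the two subsets are not balanced in the same way.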
To estimate the performance of the final model, we grouped the test data by degrees of difficulty:
The files contain only the pairs of domains; to obtain the sequences, click here.