A tool for easily generating profiles of Percent Identical Positions (PIP) from a FASTA file (nucleotide or peptide sequences). An optional annotation file makes the results simpler to explore.

Why?

PIP profiles are widely used by virologists to detect recombination events and can be found in most publications concerning the origins of SARS-CoV-2.

How to use it?

1- First, select a demo dataset, or upload a FASTA file or an RData file (from a previous analysis).

2- [Optional] Add an annotation file to enrich the exploration of the PIP profile or to refine the search during alignment.

3- Align the sequences (if you imported a FASTA file) and explore the PIP profile.

Use cases

- Virology researchers who wish to compare genomic or protein sequences from different viral strains. An example is available at https://doi.org/10.1051/medsci/2020123

- Teachers and trainers running practical sessions on biological sequence analysis. An example is available at https://jvanheld.github.io/shnc-origines-sars-cov-2/


Sallard, E., Halloy, J., Casane, D., van Helden, J. & Decroly, É. 2020. Retrouver les origines du SARS-CoV-2 dans les phylogénies de coronavirus. Med Sci (Paris) 36: 783–796. https://doi.org/10.1051/medsci/2020123

Sequence filters

Your parameters

1- Select a reference sequence

Select the reference sequence from all available sequences in the data.

2- Select query sequences

Select the sequences to be aligned against the reference sequence.

3- Pairwise alignment type

Alignment type passed to the pairwiseAlignment function (Biostrings package).

Documentation:
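To illustrate what a "global" alignment type means, here is a minimal Python sketch of a Needleman-Wunsch global alignment. It is not the Biostrings implementation: pairwiseAlignment works with substitution matrices and separate gap opening and extension penalties, whereas this sketch assumes a single linear gap penalty. The function name global_align and the scoring values (match, mismatch, gap) are hypothetical and chosen only for illustration.

```python
def global_align(ref, query, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences with a linear gap penalty (illustrative only)."""
    n, m = len(ref), len(query)

    def sub(i, j):
        # Substitution score for ref[i-1] against query[j-1].
        return match if ref[i - 1] == query[j - 1] else mismatch

    # score[i][j] = best score for aligning ref[:i] with query[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in the query
                score[i][j - 1] + gap,            # gap in the reference
            )

    # Trace back from the bottom-right corner to rebuild the alignment.
    aligned_ref, aligned_query = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            aligned_ref.append(ref[i - 1])
            aligned_query.append(query[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_ref.append(ref[i - 1])
            aligned_query.append("-")
            i -= 1
        else:
            aligned_ref.append("-")
            aligned_query.append(query[j - 1])
            j -= 1
    return "".join(reversed(aligned_ref)), "".join(reversed(aligned_query)), score[n][m]


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
```

A global alignment forces both sequences to be aligned end to end, which is why the two returned strings always have the same length. The other alignment types relax this constraint at one or both ends.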

4- Select a window size

Size of the sliding window used to calculate the PIP profiles.
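The effect of the window size can be sketched in a few lines of Python. This is a minimal illustration, not the application's own code: the function name pip_profile, the step parameter, the choice to report each window by its 1-based start position, and the rule that a column containing a gap never counts as identical are all assumptions made for the example.

```python
def pip_profile(aligned_ref, aligned_query, window=100, step=1):
    """Percent identical positions per sliding window over two aligned sequences."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive integers")

    # 1 where both sequences carry the same residue, 0 otherwise.
    # Assumption: a column containing a gap ("-") is never counted as identical.
    identical = [
        1 if r == q and r != "-" else 0
        for r, q in zip(aligned_ref, aligned_query)
    ]

    profile = []
    for start in range(0, len(identical) - window + 1, step):
        count = sum(identical[start:start + window])
        # Report the 1-based start of the window and its percent identity.
        profile.append((start + 1, 100.0 * count / window))
    return profile


if __name__ == "__main__":
    for position, pip in pip_profile("ACGTACGT", "ACGTTCGT", window=4):
        print(position, pip)
```

A larger window smooths the profile and hides short divergent regions, while a smaller window reveals local drops in identity at the cost of a noisier curve.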


PIP exploration

The table below is dynamic. Hover over the graph or select an area to update it.

Show the strains / species displayed in the table below


Hover over the plot to get the annotation information. Blue arrows indicate the + strand and orange arrows the - strand.





Sub-selection of the annotation file for exploration

All annotations

Session Information

This section gathers the information needed to reproduce the analyses: the working environment, the versions of the packages used, and so on.

R session information and parameters

The versions of R and of the Bioconductor packages used for this analysis are listed below. Save them if you want to rerun the analysis under the same conditions.



To guide you in the use of the application, documentation is available here

Contact us

Questions, problems or comments? Contact us!

Thomas Denecker

Lead Developer

Jacques van Helden

Project lead

Hélène Chiapello

Project lead