
IMPI

Researchers

Susanne Schaller MMSc.

Dr. Julia Vetter MSc.

Prof.(FH) PD DI Dr. Stephan Winkler


Duration

2019 – present

Research Areas

Genomics

Partners

University of Applied Sciences Upper Austria, Hagenberg Campus, RG Bioinformatics

Ordensklinikum Linz GmbH, Barmherzige Schwestern

The interface for point mutation identification (IMPI) is a stand-alone software tool with a graphical user interface (GUI), implemented in Python 3.9. It is designed for processing small-scale next-generation sequencing (NGS) data tagged with unique molecular identifiers (UMIs) in order to identify point mutations. IMPI was designed, implemented, and tested using BCR-ABL1 fusion gene kinase domain sequencing data, but it can also be applied to UMI-tagged NGS data from similar experiments investigating other genes. UMI-tagged NGS data provided by the Laboratory for Molecular Genetic Diagnostics at the Ordensklinikum Linz GmbH, Barmherzige Schwestern, were used for algorithm development and UI design.

IMPI takes NGS data in FASTQ format as input, together with additional details such as primer and reference sequences. Data pre-processing, including data cleaning and quality control, is performed first. The main analysis then calculates the MAF three times, because the study's major aim was to investigate how the MAF differs before and after UMI clustering and how different clustering strategies affect it. To this end, a primary clustering (by identical UMI) and a secondary clustering (which also groups UMIs that differ by a single substitution) were implemented. Parameters such as the minimum quality a read must have to be considered can be set. As output, IMPI provides allele frequency matrices containing the MAF values for the analyzed samples (at each position and for clinically relevant mutation loci), as well as batch reports in which multiple frequency matrices are concatenated for easier comparison across samples.
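The two clustering strategies can be illustrated with a minimal Python sketch. It is not IMPI's actual code: the function names, the toy reads, the majority-vote consensus per cluster, and the rule of folding smaller clusters into larger ones are all assumptions made for illustration. Reads are assumed to be already quality-filtered, trimmed to the reference, and reduced to (UMI, sequence) pairs. The sketch computes the fraction of non-reference bases per position on the raw reads, after primary clustering, and after secondary clustering.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def primary_clusters(reads):
    """Primary clustering: group read sequences by identical UMI."""
    clusters = defaultdict(list)
    for umi, seq in reads:
        clusters[umi].append(seq)
    return dict(clusters)


def secondary_clusters(clusters):
    """Secondary clustering: also merge UMIs that differ by one substitution.

    Assumption: larger clusters are visited first, so a smaller cluster is
    folded into an already accepted UMI at Hamming distance 1.
    """
    merged = {}
    for umi in sorted(clusters, key=lambda u: (-len(clusters[u]), u)):
        target = next(
            (m for m in merged if len(m) == len(umi) and hamming(m, umi) == 1),
            None,
        )
        merged.setdefault(target or umi, []).extend(clusters[umi])
    return merged


def consensus(seqs):
    """Majority base at each position (assumes equal-length sequences)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def variant_fraction(seqs, reference):
    """Per-position fraction of sequences that differ from the reference."""
    return [
        sum(s[i] != ref for s in seqs) / len(seqs)
        for i, ref in enumerate(reference)
    ]


# Hypothetical toy data: (UMI, read sequence) pairs and a 4-base reference.
reference = "ACGT"
reads = [
    ("AAAA", "ACGT"),
    ("AAAA", "ACGT"),
    ("AAAA", "ACGT"),
    ("AAAT", "ACTT"),  # UMI one substitution away from AAAA
    ("CCCC", "ACTT"),
    ("CCCC", "ACTT"),
    ("GGGG", "ACGT"),
]

primary = primary_clusters(reads)
secondary = secondary_clusters(primary)

freq_raw = variant_fraction([seq for _, seq in reads], reference)
freq_primary = variant_fraction([consensus(s) for s in primary.values()], reference)
freq_secondary = variant_fraction([consensus(s) for s in secondary.values()], reference)

# Compare the fraction at the third reference position across the three views.
print(freq_raw[2], freq_primary[2], freq_secondary[2])
```

In this toy data, the read with UMI AAAT forms its own cluster under primary clustering but is absorbed into the AAAA cluster under secondary clustering, where the majority vote outweighs its variant base. This is the kind of difference between strategies that the three MAF calculations are meant to expose.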