Preprocessing methods

The preprocessing methods provided in IMEX enable splitting *.fasta (or text) files. Currently, the IMGT/HighV-QUEST information system accepts *.fasta files containing at most 500,000 sequences. If your file contains more sequences, use the Split NGS Data tool of IMEX to prepare the data for the IMGT upload. (Of course, you can also use the splitter on *.fasta files with fewer than 500,000 sequences for your own preprocessing purposes.) After splitting the original *.fasta file and processing the parts with IMGT/HighV-QUEST, you can use the file merger offered by IMEX to merge the parts again.
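To illustrate what splitting at the 500,000-sequence limit involves, here is a minimal Python sketch. It is not the implementation used by the Split NGS Data tool; the function name `split_fasta` and the output naming `<name>_PART_<n>.fasta` are assumptions made for this example only.

```python
from pathlib import Path

MAX_SEQUENCES = 500_000  # upload limit stated above


def split_fasta(src, out_dir, max_sequences=MAX_SEQUENCES):
    """Split a FASTA file into parts of at most max_sequences records.

    Hypothetical helper; the part naming is an assumption, not IMEX behavior.
    Returns the list of part paths in order.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    out = None
    count = 0
    with src.open() as handle:
        for line in handle:
            if line.startswith(">"):
                # A header line starts a new record. Open a new part when
                # none is open yet or the current part is full.
                if out is None or count == max_sequences:
                    if out is not None:
                        out.close()
                    part_path = out_dir / f"{src.stem}_PART_{len(parts) + 1}.fasta"
                    parts.append(part_path)
                    out = part_path.open("w")
                    count = 0
                count += 1
            # Lines before the first header are skipped.
            if out is not None:
                out.write(line)
    if out is not None:
        out.close()
    return parts
```

Because a new part is only opened on a header line, a record is never cut in the middle, even when its sequence spans several lines.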

Note

It is important that all parts to be merged were created using the same parameter settings and have the same filename, followed by the ending “_PART_X.zip” (e.g., PatientX890_PART_1.zip and PatientX890_PART_2.zip and so on).
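The naming convention can be checked before merging with a short Python sketch such as the one below. The helper `order_parts` is hypothetical and not part of IMEX; it only inspects filenames and cannot verify that the parts were created with the same parameter settings.

```python
import re

# Matches names such as "PatientX890_PART_1.zip".
PART_PATTERN = re.compile(r"^(?P<base>.+)_PART_(?P<num>\d+)\.zip$")


def order_parts(filenames):
    """Return (base_name, filenames sorted by part number).

    Hypothetical helper. Raises ValueError if a name does not end in
    "_PART_<number>.zip" or if the parts do not share one base name.
    """
    parsed = []
    for name in filenames:
        match = PART_PATTERN.match(name)
        if match is None:
            raise ValueError(f"not a part archive: {name}")
        parsed.append((match["base"], int(match["num"]), name))
    bases = {base for base, _, _ in parsed}
    if len(bases) != 1:
        raise ValueError(f"parts must share one base name, found: {sorted(bases)}")
    # Sort numerically so that PART_10 comes after PART_2.
    parsed.sort(key=lambda item: item[1])
    return bases.pop(), [name for _, _, name in parsed]
```

For example, `order_parts(["PatientX890_PART_2.zip", "PatientX890_PART_1.zip"])` returns the base name `PatientX890` together with the two filenames in part order.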

_images/preprocessingMethods.png

Figure 4: IMEX preprocessing methods