
A Method for Detection of Chimerical Sequences and Correction of Errors in Flowgrams during Genome Sequencing

Authors: V.V. Galatenko, A.E. Lebedev, V.K. Nikolaev, E.A. Tonevitsky, and A.G. Tonevitsky

Pages: 78-90

 

Biotekhnologiya, 2013, N 1, P. 78-90

UDC 57.087+51.76

Section:  “System Analysis, Mathematical Modelling, and Information Systems”

 

V.V. Galatenko 1, *,  A.E. Lebedev 2,  V.K. Nikolaev 3,  E.A. Tonevitsky,  and  A.G. Tonevitsky 4

1  The Lomonosov Moscow State University, Faculty of Mechanics and Mathematics,  119991 Moscow, Russia

2  The Research-and-Technique Center BioClinicum,  115088 Moscow, Russia

3  The Belozersky Institute of Physicochemical Biology, Moscow State University,  119991 Moscow, Russia

4  The Research Institute for General Pathology and Pathophysiology, Russ. Acad. Med. Sci.,  125315 Moscow, Russia

e-mail:  vgalat@msu.ru

 

A novel method is proposed for correcting errors in reads/flowgrams obtained during genome sequencing and for detecting so-called chimerical reads/flowgrams. The method relies on the fact that each genome fragment is generally covered by multiple reads (flowgrams). It can potentially be applied to various sequencing technologies; however, it was developed primarily for flowgram-generating technologies, in particular pyrosequencing and semiconductor sequencing. Applying the algorithm significantly decreases the number of errors in reads/flowgrams before the initial genome assembly and hence improves the quality of the assembly. The method was tested on data obtained by model simulation of sequencing; the tests showed that it corrects 97.7-99.3% of errors. Most uncorrected errors are located within long homopolymers and/or at a small number of positions in the genome. The method was also applied to a set of flowgrams obtained by pyrosequencing. The results proved to be self-consistent and were supported by contigs assembled from the same set of flowgrams with the Newbler assembler (454 Life Sciences).
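The coverage idea behind the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes that the flowgrams covering one genome fragment have already been aligned flow by flow, and it uses a per-flow median as a stand-in consensus rule. The function names `consensus_flowgram` and `find_disagreements` are hypothetical, and chimera detection is not covered.

```python
from statistics import median


def round_signal(signal):
    """Round a non-negative flow signal to the nearest homopolymer length."""
    return int(signal + 0.5)


def consensus_flowgram(flowgrams):
    """Per-flow consensus homopolymer lengths.

    Assumption: all flowgrams cover the same fragment and are aligned,
    so position i refers to the same nucleotide flow in every read.
    """
    if not flowgrams:
        raise ValueError("need at least one flowgram")
    n_flows = len(flowgrams[0])
    if any(len(f) != n_flows for f in flowgrams):
        raise ValueError("flowgrams must have the same number of flows")
    # zip(*flowgrams) yields one tuple of signals per flow position.
    return [round_signal(median(column)) for column in zip(*flowgrams)]


def find_disagreements(flowgrams):
    """Return (read index, flow index) pairs that differ from the consensus."""
    consensus = consensus_flowgram(flowgrams)
    return [
        (read_idx, flow_idx)
        for read_idx, flowgram in enumerate(flowgrams)
        for flow_idx, signal in enumerate(flowgram)
        if round_signal(signal) != consensus[flow_idx]
    ]


# Three illustrative (made-up) flowgrams over the same four flows.
reads = [
    [1.02, 0.05, 2.10, 0.98],
    [0.97, 0.10, 1.95, 1.05],
    [1.10, 0.02, 2.60, 0.91],  # third flow would be called as 3, an overcall
]
print(consensus_flowgram(reads))   # [1, 0, 2, 1]
print(find_disagreements(reads))   # [(2, 2)]
```

With higher coverage, a single noisy signal has less influence on the median, which is consistent with the abstract's remark that the remaining errors concentrate in long homopolymers, where signals are harder to separate.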

 

Key words:  error correction,  flowgram,  read,  sequencing.

 

The full English version of the article was published in “Biotechnology in Russia”, 2013, Issue 1, pp. 78-90, as V.V. Galatenko, A.E. Lebedev, V.K. Nikolaev, E.A. Tonevitsky, and A.G. Tonevitsky, “A Method for Detection of Chimerical Sequences and Correction of Errors in Flowgrams during Genome Sequencing”.

It is available at the Russian Scientific Electronic Library website:  http://elibrary.ru/item.asp?id=21987108

 

