
Filtration of DNA Nucleotide Gene Expression Profiles in the Systems of Biological Objects Clustering



This article presents research on optimizing the filtration of DNA nucleotide gene expression profiles. The lung cancer patient dataset E-GEOD-68571 from the ArrayExpress database was used as experimental data. Profiles were filtered according to whether expression of the corresponding gene was detected, using the variance of gene expression, the absolute value of expression, and the Shannon entropy as criteria. The value of the thresholding coefficient was estimated from the average proximity measure of objects within a homogeneous group and between groups. Filtering removed 470 columns, changing the dimension of the test data matrix from (96 × 7129) to (96 × 6659). The quality of the processing was assessed by comparing the clustering results for the processed and unprocessed data.
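The three criteria named in the abstract can be illustrated with a minimal sketch, assuming a samples-by-genes matrix in which each column is one expression profile. The function names, the histogram-based entropy estimate, the bin count, and the threshold values are all hypothetical and are not taken from the article. The direction of each test (dropping profiles with low variance, low absolute expression, or high entropy) is likewise an assumption, and the article's procedure for estimating thresholds from within-group and between-group proximity is not reproduced here.

```python
import numpy as np


def shannon_entropy(values, bins=10):
    """Histogram-based Shannon entropy (in bits) of one expression profile.

    The histogram estimate and the bin count are illustrative assumptions.
    """
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())


def filter_profiles(X, min_var, min_abs, max_entropy, bins=10):
    """Return a boolean mask over columns (genes) that pass all three tests.

    X is a (samples x genes) array. All thresholds are hypothetical
    parameters supplied by the caller.
    """
    X = np.asarray(X, dtype=float)
    variance = X.var(axis=0)            # spread of each profile
    peak = np.abs(X).max(axis=0)        # largest absolute expression value
    entropy = np.array(
        [shannon_entropy(X[:, j], bins) for j in range(X.shape[1])]
    )
    return (variance >= min_var) & (peak >= min_abs) & (entropy <= max_entropy)


# Toy data: 4 samples x 4 genes (made-up numbers, not from E-GEOD-68571).
X = np.array([
    [5.0,  0.0, 0.1,  1.0],
    [5.0,  0.0, 0.2,  4.0],
    [5.0,  0.0, 0.1,  7.0],
    [5.0, 10.0, 0.2, 10.0],
])
mask = filter_profiles(X, min_var=1.0, min_abs=1.0, max_entropy=1.5)
X_filtered = X[:, mask]
```

On the toy matrix, the first column fails the variance test, the third fails the absolute-value test, and the fourth fails the entropy test, so only the second column is kept. On real data, the same mask would reduce the number of columns while leaving the number of samples unchanged.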


International Frontier Science Letters (Volume 8)
S. Babichev et al., "Filtration of DNA Nucleotide Gene Expression Profiles in the Systems of Biological Objects Clustering", International Frontier Science Letters, Vol. 8, pp. 1-8, 2016
Online since: June 2016


Cited By:

[1] S. Babichev, V. Lytvynenko, M. Korobchynskyi, M. Taiff, Beyond Databases, Architectures and Structures. Towards Efficient Solutions for Data Analysis and Knowledge Representation, Vol. 716, p. 359, 2017


[2] S. Babichev, "Technology of Wavelet-Filtration of the Gene Expression Profiles in Order to Remove the Background Noise", Upravlâûŝie sistemy i mašiny, p. 25, 2017


[3] V. Osypenko, V. Osypenko, "Application of Inductive Modeling Principles to Solve the Double Clustering Problems", 2021 IEEE 16th International Conference on Computer Sciences and Information Technologies (CSIT), p. 398, 2021