This article presents research on optimizing the filtration of DNA nucleotide gene expression profiles. The lung cancer patient dataset E-GEOD-68571 from the ArrayExpress database was used as experimental data. Filtration was based on whether expression of the corresponding gene was detected, with the variance of gene expression, the absolute value of expression, and the Shannon entropy used as criteria. The thresholding coefficient was estimated from the average proximity measure of objects within a homogeneous group and between groups. Filtering removed 470 columns, reducing the test data matrix from (96 × 7129) to (96 × 6659). The quality of information processing was assessed by comparing the clustering results for processed and unprocessed data.
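The three filtering criteria named above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `shannon_entropy` and `filter_genes`, the histogram-based entropy estimate, the quantile parameter `q`, and the direction of each cut (dropping low-variance, low-amplitude, and high-entropy profiles) are all assumptions made for illustration. In particular, the abstract derives its thresholding coefficient from within-group and between-group proximity, which this sketch replaces with a fixed quantile.

```python
import numpy as np


def shannon_entropy(profile, bins=10):
    """Histogram-based Shannon entropy (in bits) of one gene's expression values."""
    counts, _ = np.histogram(profile, bins=bins)
    p = counts[counts > 0] / counts.sum()  # empirical bin probabilities
    return float(-(p * np.log2(p)).sum())


def filter_genes(X, q=0.05):
    """Filter the columns (genes) of a samples-by-genes matrix X.

    Assumed rule (hypothetical, not taken from the article): keep a gene if
    its variance and maximum absolute expression lie above the q-quantile
    and its entropy lies below the (1 - q)-quantile.
    Returns the filtered matrix and the boolean mask of kept columns.
    """
    variance = X.var(axis=0)
    max_abs = np.abs(X).max(axis=0)
    entropy = np.apply_along_axis(shannon_entropy, 0, X)

    keep = (
        (variance > np.quantile(variance, q))
        & (max_abs > np.quantile(max_abs, q))
        & (entropy < np.quantile(entropy, 1 - q))
    )
    return X[:, keep], keep


if __name__ == "__main__":
    # Synthetic stand-in for a (samples x genes) expression matrix.
    rng = np.random.default_rng(0)
    X = rng.lognormal(mean=5.0, sigma=1.0, size=(96, 500))
    X_filtered, keep = filter_genes(X, q=0.05)
    print(X.shape, "->", X_filtered.shape)
```

Samples stay in rows and only gene columns are dropped, which matches the abstract's description of the row count staying at 96 while the column count falls.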
International Frontier Science Letters (Volume 8)
S. Babichev et al., "Filtration of DNA Nucleotide Gene Expression Profiles in the Systems of Biological Objects Clustering", International Frontier Science Letters, Vol. 8, pp. 1-8, 2016