Surprisal Analysis of Transcripts Expression Levels in the Presence of Noise: A Reliable Determination of the Onset of a Tumor Phenotype

Date Published:

APR 23

Abstract:

Towards a reliable identification of the onset in time of a cancer phenotype, changes in transcription levels in cell models were tested. Surprisal analysis, an information-theoretic approach grounded in thermodynamics, was used to characterize the expression level of mRNAs as time changed. Surprisal Analysis provides a very compact representation for the measured expression levels of many thousands of mRNAs in terms of very few - three, four - transcription patterns. The patterns, that are a collection of transcripts that respond together, can be assigned definite biological phenotypic role. We identify a transcription pattern that is a clear marker of eventual malignancy. The weight of each transcription pattern is determined by surprisal analysis. The weight of this pattern changes with time; it is never strictly zero but it is very low at early times and then rises rather suddenly. We suggest that the low weights at early time points are primarily due to experimental noise. We develop the necessary formalism to determine at what point in time the value of that pattern becomes reliable. Beyond the point in time when a pattern is deemed reliable the data shows that the pattern remain reliable. We suggest that this allows a determination of the presence of a cancer forewarning. We apply the same formalism to the weight of the transcription patterns that account for healthy cell pathways, such as apoptosis, that need to be switched off in cancer cells. We show that their weight eventually falls below the threshold. Lastly we discuss patient heterogeneity as an additional source of fluctuation and show how to incorporate it within the developed formalism.