Gene Chips, Cell Tables
and a Thought Experiment
Van Warren
2/19/2001
The Science Genome Chart of gene/protein types is interesting. Out of 32,000 (± 6000) putative genes, there are 1543 receptor types occupying five percent of the genome. Imagine taking a driver's test and having to learn 1543 traffic signs! From language learning we know it is possible to recognize 1543 signs. Shapes are memorable. Legends never die.
According to "The Way Life Works" there are 350 types of human cells. We will use this as a working figure for the sake of argument. Imagine a table with 350 rows and a corresponding set of 32,000 ± columns, indicating which genes are turned on and off at any given time.
            | gene1 | gene2 | gene3 | ... | gene32,000 ± |
nerveType3  | ...   |       |       |     |              |
muscleType3 | ...   |       |       |     |              |
B-Cell      | ...   |       |       |     |              |
T-Cell      | ...   |       |       |     |              |
...         | ...   | ...   | ...   | ... | ...          |
tissue 350  | ...   |       |       |     |              |
The table would be about 90 times wider than it is tall, depending on how you count genes. The table represents the gene expression, frozen in time in black and white. The expression in each cell type is smeared over the entire life cycle and cell clock. In a living cell, gene expression varies continuously over time subject to metabolism, growth and signaling via 2308 DNA binding proteins.
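As a rough sketch of the frozen, black-and-white table, the on/off grid can be held as one byte per entry. The row and column counts come from the text; the function name, the variable names and the sample row and column indices are illustrative assumptions only.

```python
# Sketch of the frozen cell table: one row per cell type, one column per
# gene, each entry 1 (gene on) or 0 (gene off).
N_CELL_TYPES = 350   # working figure from "The Way Life Works"
N_GENES = 32_000     # putative gene count, give or take 6000


def make_cell_table(n_rows=N_CELL_TYPES, n_cols=N_GENES):
    """Return a table with every gene switched off (bytearrays start zeroed)."""
    return [bytearray(n_cols) for _ in range(n_rows)]


table = make_cell_table()
table[2][0] = 1  # hypothetical: the cell type in row 2 expresses gene1

aspect_ratio = N_GENES / N_CELL_TYPES
print(f"{N_CELL_TYPES} rows x {N_GENES} columns, "
      f"about {aspect_ratio:.0f} times wider than tall")
```

One byte per entry wastes seven bits, but it keeps the sketch simple; the whole table still fits in roughly 11 megabytes.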
Cell tables form the basis of an interesting set of thought experiments, especially when one adds the time dimension. A more accurate table changes over time to reflect the birth, mitosis and death of each cell in the organism. As cells differentiate in the body plan, each responds to different local conditions and different growth rates. We might want to track the gene expression in a single cell of a single type, to watch its life cycle at time 0, at time 1 and so on:
TIME 0       | gene1 | gene2 | gene3 | ... | gene32,000 ± |
tissueType32 | ...   |       |       |     |              |

TIME 1       | gene1 | gene2 | gene3 | ... | gene32,000 ± |
tissueType32 | ...   |       |       |     |              |
We could mark and watch several cell table entries at once. They would each vary continuously, indicating the degree of gene expression.
TIME T     | gene1 | gene2 | gene3 | ... | gene32,000 ± |
nerve      | ...   |       |       |     |              |
muscle     | ...   |       |       |     |              |
B-Cell     | ...   |       |       |     |              |
T-Cell     | ...   |       |       |     |              |
...        | ...   | ...   | ...   | ... | ...          |
tissue 350 | ...   |       |       |     |              |
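A time-varying table with continuous entries might be sketched as follows. The three genes and four cell types stand in for the full table, and the sine-wave `expression` function is a pure placeholder assumption chosen only because it varies smoothly; it models no real biology.

```python
# Sketch: a cell table whose entries vary continuously over time.
# Each entry is a degree of expression between 0.0 (off) and 1.0 (fully on).
import math

GENES = ["gene1", "gene2", "gene3"]               # stand-ins for 32,000 +/- genes
CELL_TYPES = ["nerve", "muscle", "B-Cell", "T-Cell"]


def expression(cell_index, gene_index, t):
    """Hypothetical smooth expression level; a placeholder, not real biology."""
    phase = 0.7 * cell_index + 1.3 * gene_index
    return 0.5 * (1.0 + math.sin(0.1 * t + phase))


def table_at(t):
    """Snapshot of the whole table at time t."""
    return {
        cell: [expression(i, j, t) for j in range(len(GENES))]
        for i, cell in enumerate(CELL_TYPES)
    }


# Watch one marked entry (B-Cell, gene2) at time 0, time 1 and time 2.
watched = [table_at(t)["B-Cell"][1] for t in range(3)]
print([round(x, 3) for x in watched])
```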
The trouble is, we cannot examine the total gene expression of a cell without destroying the cell, extracting its mRNAs, putting them on a gene chip and observing the corresponding hybridization.
When a cell is in an undifferentiated state, as in a stem cell, it is more difficult to assign it a cell type, because it can mature to be one of many different kinds of cells under the influence of various signals, for example cytokine signaling in hematopoiesis. Which cell it currently "is" depends on which cytokine it has just been exposed to. Since our original table involved classes, or categories, of cells instead of specific instances of cells, it is hard to assign such a cell to a specific category. The category changes during development.
It is the net pattern of gene expression that determines the cell type, not our initial characterization! If we sort cells by the gene expression pattern that they manifest, then they will be typed by the sorting! As the cell matures into a given type, the subset of its genes that are morphogenic will be inactivated. Eventually this gives rise to a mature cell type that follows the clocks of cellular metabolism in addition to hormone and signal factor induced clockings and transformations.
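The idea of typing by sorting can be sketched with a handful of made-up cells. Everything here (the cell names, the four-gene patterns, the function name) is hypothetical, and exact pattern matching is a simplifying assumption; measured expression is continuous and noisy, so a real analysis would more likely cluster similar patterns than match identical ones.

```python
# Sketch: let the expression pattern define the type, rather than a label
# assigned beforehand. Cells with identical on/off patterns land in one group.
from collections import defaultdict

# Hypothetical cells, each a tuple of on/off states for four genes.
cells = {
    "cell_a": (1, 0, 1, 0),
    "cell_b": (0, 1, 1, 0),
    "cell_c": (1, 0, 1, 0),
    "cell_d": (0, 1, 1, 0),
    "cell_e": (1, 1, 0, 0),
}


def type_by_pattern(cells):
    """Group cell names by expression pattern, in sorted pattern order."""
    groups = defaultdict(list)
    for name, pattern in cells.items():
        groups[pattern].append(name)
    return {pattern: sorted(names) for pattern, names in sorted(groups.items())}


types = type_by_pattern(cells)
for pattern, members in types.items():
    print(pattern, members)
```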
If one were to track every cell from birth to adulthood in 100 kg of biomass (corresponding to a 220 pound person), one would have to make an entry for each of approximately 100 trillion cells. The table that tracks specific instances of cells is much taller than it is wide: 100 trillion entries tall, 32,000 ± wide. A large table, 3.2 million trillion entries, or roughly 400 million gigabytes at one bit per entry, to first order.
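The size estimate can be checked with a few lines of arithmetic. The cell and gene counts are the figures used in the text; the storage cost per entry is not specified there, so the one-bit and one-byte cases below are assumptions, as is the decimal gigabyte.

```python
# Back-of-envelope size of the per-cell table, using the essay's figures.
N_CELLS = 100 * 10**12   # approximately 100 trillion cells
N_GENES = 32_000         # putative genes, give or take 6000

entries = N_CELLS * N_GENES          # 3.2 million trillion entries
bytes_at_one_bit = entries // 8      # assumption: one on/off bit per entry
bytes_at_one_byte = entries          # assumption: one byte per entry

GIGABYTE = 10**9                     # decimal gigabyte
print(f"entries:     {entries:.1e}")
print(f"1 bit each:  {bytes_at_one_bit / GIGABYTE:.1e} GB")
print(f"1 byte each: {bytes_at_one_byte / GIGABYTE:.1e} GB")
```

Either way the table is hundreds of millions of gigabytes or more, before any time dimension is added.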
In a fine needle biopsy of a suspected tumor, a micro-coring penetrates consecutive layers. These consecutive layers consist of several tissue types, only one of which may be of concern. It is necessary to microdissect from the coring those cells that are histologically significant from a tumorigenesis point of view. Even after microdissection, any subsequent gene expression analysis is typically made from an ensemble of cells, not necessarily from the specific cancer clone that gave rise to the tumor. This implies that the gene expression studies may be diluted with the expression of normal cells whose mRNAs are inadvertently thrown into the batch with those of the cancer clone.
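The dilution effect can be illustrated with a toy mixing model. The assumption, which is not from the text, is that the measured level is a simple weighted average of the tumor and normal populations; the function name and the 8x overexpression figure are hypothetical.

```python
# Sketch: how normal cells in a biopsy sample dilute the tumor signal.
# Assumption: the measured level is a weighted average of the two
# populations, weighted by their share of the sample.


def measured_expression(tumor_level, normal_level, tumor_fraction):
    """Ensemble-average expression for a sample that is part tumor, part normal."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    return tumor_fraction * tumor_level + (1.0 - tumor_fraction) * normal_level


# Hypothetical gene: 8x overexpressed in the cancer clone relative to normal.
tumor_level, normal_level = 8.0, 1.0
for fraction in (1.0, 0.5, 0.1):
    level = measured_expression(tumor_level, normal_level, fraction)
    print(f"tumor fraction {fraction:.0%}: "
          f"measured fold change {level / normal_level:.2f}")
```

Under this model, a sample that is only one tenth cancer clone shows the 8x gene at well under twice the normal level, which is easy to mistake for noise.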
One of the problems with cancer cells is that they are, immunologically speaking, cells of self. Thus it is difficult to devise immune-based strategies that preferentially attack the cancer cells without also causing problems in the patient's normal tissues. There are tumor markers that are unique to cancer cells, but these are not universally expressed in all cancers or at all times. Cancer is nothing more (or less!) than DNA damage. Further, it is "nothing more" than six to ten accumulated mutations in growth regulators, cell cycle checkpoints, oncogenes and the like. This makes the problem of specifically targeting the cancer cell peculiarly difficult, even in the face of a growing understanding of the human genome sequence. Your cancer belongs to you and no one else. Remember that.
I had held out the hope that cell simulation would open the door to understanding cancer to the point of treating it more effectively. The cancer clone is the cell that gives rise to a specific line of cancer cells. Simulating the cancer clone is akin to simulating one of the patient's own cells, with a few important differences. Cancer cells are a little different from the patient's normal cells, but not different enough to be easily recognized. Cancer cells typically divide rapidly without going through the DNA quality control checkpoints (apologies for the anthropomorphic POV) and without tripping the growth arrest triggers of the apoptotic/caspase pathways. These cells are slightly "weaker". They incur progressively more damage as they multiply, else therapies which exploit their weakness would not succeed. Cancer cells are more primal and express a broader range of gene products. But alas, these gene products are most often products that you would find in the normal patient, albeit in overexpressed quantities.
Without being excessively pessimistic, my concern is that even if one could completely simulate the cancerous clone, one would simply have recreated the tragic portrait of a broken system, without providing the critically important handle on fixing it. Fixes, if they do exist, depend on exploiting very specific mutations in the cancer clone that would belong to the individual. Cancer is personal. Thus, you will not find me buying shares in any new "cancer wonder drug", since any plausible drug would have to be patient specific.
This tells us that for near-term, practical simulations, we want to identify those cells that are known to conspire in the cancer process, such as fibroblasts and ductal epithelial cells in breast cancer. The direct simulation of the normal and abnormal gene expression states of these selected cells will produce the highest benefit per unit of cost, whatever that benefit might be.