This is a preliminary and incomplete version of the glossary for a series of articles on epigenetics. Readers are invited to submit suggestions for improvement.
acetylation. Attachment of an acetyl group to a molecule, which is then said to be "acetylated". Certain enzymes (known as acetyltransferases) can do this. See also histone modifications. Deacetylation is the corresponding removal of an acetyl group, accomplished by histone deacetylases.
acetyl group. A small chemical group with the formula -COCH3. See acetylation.
acetyltransferase. An enzyme that can attach an acetyl group to another molecule. In an epigenetic context, the term generally refers to the attachment of an acetyl group to a histone (by a histone acetyltransferase).
actin. A very common protein that forms filaments. It provides a kind of cellular "skeleton" in the cytoplasm of all cells, and also plays a major role in muscle contraction. It has been found to be present in the cell nucleus as well, and to be required for certain chromosomal movements.
activator. A protein transcription factor with a positive effect on gene expression. (Compare repressor.) An activator may work in conjunction with one or more co-activators. With or without the help of co-activators, the activator commonly plays a role in bringing RNA polymerase (the transcribing enzyme) to the gene promoter. The DNA sequence bound by the activator is called an enhancer, and it may lie in the promoter region immediately adjacent to the gene being activated, but it can also be far distant on the chromosome - or even on an altogether different chromosome. It must, however, be brought near to the promoter in order to recruit RNA polymerase. Note: the term "activate" can be used more generally to refer to any process that helps to bring a gene to expression.
adenine. See nucleotide base.
allele. Human genes occur in pairs, one on each chromosome of a chromosome pair. The two members of a pair are called "alleles" and can have a differing form and significance, as when (in the case of some flowers) the allele on one chromosome tends to produce, say, a red petal color and the other allele tends to produce a white petal color. (The actual color will depend on the relative effects of the two alleles.)
alternative splicing. Precursor mRNA can be spliced in different ways, so that a single gene may lead, via differently spliced RNAs, to different proteins. It is thought that a high percentage of human genes are subject to the alternative splicing of their RNA transcripts, and the transcripts of some genes can be spliced in hundreds of different ways. See also RNA splicing and trans-splicing.
amino acid. Amino acids are, among other things, constituent elements of protein. There are twenty different kinds of amino acids in protein, and any number of them -- up to many thousands -- are arranged in sequence to form the main body of a protein.
antigen. A substance that stimulates the generation of antibodies as part of an immune response.
ATP. Adenosine triphosphate, a molecule playing a central role in the storage and transfer of energy within the cell. It is used, for example, by ATP-dependent chromatin remodeling complexes, which apply energy derived from ATP to the restructuring of nucleosomes.
basal transcription factor. See general transcription factor.
base pair. Two bonded nucleotide bases joined to opposite strands of the DNA double helix. These paired bases form the "rungs" on the spiraling DNA "ladder", and the bases in each pair are nearly always complementary to each other. See also nucleotide base.
base pair complementarity. The four DNA nucleotide bases are cytosine (indicated by the letter "C"), guanine (G), adenine (A), and thymine (T). It happens that these four bases normally form base pairs in only two ways: cytosine paired with guanine and adenine paired with thymine (C-G and A-T). The members of each pair are said to be "complementary" to each other. This means that a complete, double-stranded DNA molecule can always be formed from a single strand, because added nucleotide bases will pair up with the bases of the single strand in the "correct" way. Actually this is a great simplification, since there are other constituents of DNA besides the nucleotide bases. But because the nucleotide bases are thought of as containing the essential genetic code, one can picture the complementarity of bases as a means of preserving the fidelity of the code. RNA, while normally single-stranded, also preserves this code: in the formation of an RNA molecule from the template of a DNA strand, the nucleotide bases of the forming RNA are added sequentially in the same complementary fashion as when a DNA strand is replicated, except that the base uracil in RNA takes the place of thymine in DNA.
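The pairing rules in the entry on base pair complementarity can be sketched in a few lines of Python. This is only an illustration: the function names and the example sequence are invented here, and the sketch ignores the fact that real strands have a direction.

```python
# Pairing rules from the entry above: C-G and A-T in DNA.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
# When RNA is formed on a DNA template, uracil (U) takes the place of thymine.
RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}

def complementary_dna(strand):
    """Return the DNA bases that pair, position by position, with a DNA strand."""
    return "".join(DNA_PAIRS[base] for base in strand)

def transcribe(template):
    """Return the RNA bases that pair, position by position, with a DNA template."""
    return "".join(RNA_PAIRS[base] for base in template)

template = "TACGGCAT"  # hypothetical example sequence
print(complementary_dna(template))  # ATGCCGTA
print(transcribe(template))         # AUGCCGUA
```

Applying complementary_dna twice returns the original strand, which is one way of picturing how a single strand suffices to rebuild the double-stranded molecule.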
bind. To attach chemically; form a chemical bond with. See also binding site.
binding site. Typically refers to the particular sequence of nucleotide bases on a DNA or RNA molecule that a protein or RNA molecule can "target" and bind to -- that is, can attach to. The affinity of a protein for such a binding site is given by the folded shape, distribution of electrical charges, and perhaps other characteristics of the protein molecule. The binding affinity of an RNA molecule for another RNA or DNA is a matter of sequence base pair complementarity.
branching. The separation of the two strands of DNA. It occurs, for example, during the processes of transcription and replication, but can also occur locally as a result of various chromosome dynamics.
cell nucleus. A membrane-bound organelle in the cells of all higher organisms. It contains the cell's DNA.
chaperone. A molecule that assists in the folding or assembly of other molecules or complexes without itself becoming a part of the end product.
chromatin. The complex of DNA, proteins, and RNA that constitutes chromosomes. The histones that form nucleosome "spools" are the most abundant proteins in chromatin, but many other proteins -- transcription factors, activators, repressors, chromatin remodeling complexes, and other sorts of architectural proteins -- play a role. Many of these proteins transiently associate with, and dissociate from, chromatin, which is highly dynamic in form and structure.
chromatin remodeling. The architectural re-structuring of chromatin. This re-structuring can take a number of forms: compaction or opening-up of the chromatin fiber; sliding nucleosomes along the DNA; making histone modifications; contributing to the assembly or disassembly of nucleosomes; and loosening or tightening the binding of DNA to nucleosome spools. All of these changes play a substantial role in gene regulation.
chromatin remodeling complex. Various classes of protein that modify and restructure chromatin. For examples of their action, see under chromatin remodeling.
chromosome. A long, continuous length of DNA "packaged" by means of histones and other proteins and containing many chemical sequences known as "genes". Humans have 46 chromosomes, which come in 23 pairs - one member of each pair being inherited maternally and one paternally. The "same" genes occur in both members of a pair; any two such corresponding genes are known as alleles of a single gene.
chromosome territory. A particular region of the cell nucleus characteristically occupied by a chromosome in a given tissue type at a given stage of development and under a particular set of conditions. The spatial organization of chromosome territories within a nucleus has a bearing on gene regulation.
co-activator. A protein or protein complex that, like an activator, encourages expression of a gene. But whereas the activator, like all transcription factors, recognizes and binds to a specific DNA sequence, a co-activator is not sequence-specific. Rather than binding directly to DNA, it binds to the activator. It may thereby aid, for example, in recruiting RNA polymerase to a gene promoter. See also co-repressor.
codon. "Words" of the genetic code consisting of three successive nucleotide bases, or "letters". Each codon of a protein-coding gene is supposed to correspond to one amino acid in the protein coded for by the gene. See also synonymous codon.
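How a coding sequence is read three letters at a time can be sketched as follows. The table below is only a small excerpt of the standard codon table (the full table has 64 entries), and the function names and example sequence are made up for this illustration.

```python
# A small excerpt of the standard codon table, written in RNA letters.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "AAA": "Lys", "UAA": "STOP",
}

def read_codons(mrna):
    """Split a coding sequence into successive three-letter codons."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

def translate(mrna):
    """Map each codon to an amino acid, stopping at a stop codon."""
    amino_acids = []
    for codon in read_codons(mrna):
        residue = CODON_TABLE[codon]  # KeyError for codons outside the excerpt
        if residue == "STOP":
            break
        amino_acids.append(residue)
    return amino_acids

print(translate("AUGUUUGGCAAAUAA"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Note that GGU, GGC, GGA, and GGG all map to the same amino acid; these are examples of synonymous codons.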
co-factor. A general term referring either to a co-activator or co-repressor.
complementarity. See base pair complementarity.
co-repressor. A protein or protein complex that, like a repressor, discourages expression of a gene. But whereas the repressor, like all transcription factors, recognizes and binds to a specific DNA sequence, a co-repressor is not sequence-specific. Rather than binding directly to DNA, it binds to (and is often said to be recruited by) the repressor. It may thereby aid, for example, in blocking access of RNA polymerase to a gene promoter. See also co-activator.
cytoplasm. All the contents of a cell outside the nucleus.
cytosine. See nucleotide base.
demethylation. Removal of the marks of DNA methylation.
development. In an epigenetic context, "development" refers most narrowly to the process by which originally undifferentiated cells (for example, stem cells) progressively become specialized or differentiated, or else produce more specialized offspring through cell division. In a broader sense, the term refers to all the processes of growth and maturation.
differentiation. The movement from less specialized cellular forms to more specialized ones. We can also speak of "organ differentiation", referring to the way that organs, with their specialized cell types, develop from an earlier organism lacking those specializations. These developments typically occur without any changes in the genome - that is, in the genetic sequence of the cells' DNA. Understanding of differentiation therefore requires a reckoning with epigenetic processes.
diploid. Possessing two sets of chromosomes -- that is, possessing a pair of each type of chromosome, with one member of the pair inherited maternally and one inherited paternally. In mammals, all cells except the gametes are normally diploid. Compare haploid.
DNA. Deoxyribonucleic acid, a molecule that figures centrally in inheritance. Constituting part of the material of chromosomes, it is commonly double-stranded in the form of a double helix. Connecting the two strands are base pairs consisting of nucleotide bases. Here you will find a conventional animated stick figure of DNA; it schematically represents a few isolated features abstracted from whatever reality the actual material chromosome presents in the cell.
DNA breathing (1). The rhythmic unwrapping and rewrapping of DNA from nucleosomal spools. This takes place at the entry and exit sites -- that is, where the DNA meets or leaves the spool. The breathing takes place rapidly, on the order of milliseconds. Not to be confused with DNA breathing (2).
DNA breathing (2). The dynamic opening and closing of "bubbles" between the two strands of the DNA double helix. That is, for a certain length the two strands of the double helix become disconnected, and then later they reconnect. This is thought to be important for, among other things, the initiation of transcription, because RNA polymerase can only begin transcribing once the double helix has begun to be "unzipped".
DNA methylation. The attachment of a methyl chemical group to particular nucleotide bases (usually cytosine) of the DNA molecule. Methylation is recognized by various regulatory factors and therefore plays a major role in gene regulation. In general, DNA methylation tends to have a repressive effect on gene expression, but this generality is qualified by many subtleties.
DNA replication. The process by which both strands of a double-stranded DNA molecule serve as templates for strand reproduction. The result is two double-stranded DNA molecules, each containing one strand from the original molecule and one newly synthesized strand.
double helix. The form taken by DNA (and also by double-stranded RNA). Speaking very generally, it's the form you get when you take two cords and twist them together, so that each one spirals around the other.
double-stranded RNA (dsRNA). RNA that, like normal DNA, has two complementary strands joined by nucleotide base pairs. dsRNA can be brought into cells by viruses, and it can also be produced natively. This can happen, for example, when a length of RNA happens to contain two adjacent, complementary, and probably rather short sequences of nucleotide bases. That is, when the RNA folds sharply (into a hairpin shape) at the point between the two sequences, it brings a series of complementary bases together, allowing them to form base pairs that hold the two strands together.
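The hairpin case described under double-stranded RNA can be checked with a short sketch. Two stretches of one RNA can pair when the second, read backwards, is complementary to the first. The function names, the example sequence, and the idea of allowing a few unpaired "loop" bases at the fold are assumptions made for illustration.

```python
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

def reverse_complement(rna):
    """Return the sequence that pairs with `rna` when read in the opposite direction."""
    return "".join(RNA_PAIRS[base] for base in reversed(rna))

def can_form_hairpin(rna, stem_len, loop_len=0):
    """True if the first `stem_len` bases can pair with the `stem_len` bases
    that follow an unpaired loop of `loop_len` bases."""
    first = rna[:stem_len]
    start = stem_len + loop_len
    second = rna[start:start + stem_len]
    return len(second) == stem_len and second == reverse_complement(first)

# Hypothetical sequence: stem GGCAC, loop UUCG, then the stretch that folds back.
print(can_form_hairpin("GGCACUUCGGUGCC", stem_len=5, loop_len=4))  # True
```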
downstream. See upstream/downstream.
enhancer. A DNA sequence that transcription factors known as activators can recognize and bind to and thereby aid in recruiting RNA polymerase to a gene's promoter to increase transcription.
epigenetic inheritance. Depending on context this can refer either to inheritance between generations of an organism, or between cell generations within an organism. In both cases the reference is to inherited traits that are mediated, not by the DNA sequence, but by epigenetic processes or conditions. Thus, something in the parents' activity or environment may lead to epigenetic changes in their cells -- and particularly in their germ cells -- that are passed on to their offspring, producing traits in the offspring that cannot be accounted for by the parents' DNA or any mutation of it.
epigenetics. More or less literally: that which is "added to" genetics. The term is most commonly taken to refer to heritable changes in gene expression that do not result from changes in actual gene sequences. ("Heritable" here can refer not only to inheritance between parents and offspring, but also between parent and daughter cells.) The changes result from the way the larger cellular context interacts with the genes. Nearly all the transformations involved in cellular differentiation fall under the heading of epigenetics.
epigenome. All the structures and processes of the cell bearing on gene expression. The term gains its main force from the (largely false) analogy with "genome". The latter was classically (and now rather disreputably) thought of as the sum total of the genetic code -- a digitally precise "database" containing all the "information" needed to fashion a human being. At least the genome does contain a more or less exact DNA sequence reliably passed (with various recombinations) from one generation to the next. The epigenome, by contrast, manifests nothing like the same sort of fixity.
euchromatin. Chromatin in its less condensed, more open and accessible, and (often) more actively transcribed state, typically richer in genes. Compare heterochromatin.
exon. A segment of the DNA sequence of a gene; more specifically: a segment whose corresponding segment in the gene's RNA transcript is retained until translation rather than deleted as part of the RNA splicing process. Or, in the case of noncoding DNA and its RNA transcripts: exons are those segments retained in the final functional form of the RNA. "Exon" can refer either to the DNA sequence or the corresponding transcribed RNA sequence. Compare intron.
expression. The production of RNA using a DNA sequence as a template. The DNA sequence is then said to have been "expressed" or "transcribed". The DNA sequence may represent a protein-coding gene (see also gene expression), or else it may be noncoding, in which case the expressed RNA is not translated into protein, but may have any of countless regulatory functions within the cell.
gamete. A haploid reproductive cell -- an egg cell in the female or sperm in the male.
gene. Sorry, but you won't pin me down on this one. "A gene is anything a competent biologist has chosen to call a gene" (philosopher of science Philip Kitcher, 1992). "Our knowledge of the structure and function of the genetic material has outgrown the terminology traditionally used to describe it. It is arguable that the old term gene, essential at an earlier stage of the analysis, is no longer useful, except as a handy and versatile expression, the meaning of which is determined by the context" (geneticist Peter Portin, 1993). For a brief overview of the history of the concept of the gene, see this article by biologist Craig Holdrege.
gene activation. Generally, this can refer to any process leading to increased expression of a gene. More specifically, it refers to the role of an activator in increasing expression.
gene expression. Most commonly this term is applied to protein-coding genes, where it refers to the production of messenger RNA (mRNA) using a DNA gene sequence as a template. The gene is then said to have been "expressed" or "transcribed". The mRNA will (after various sorts of processing, such as splicing) be translated into a protein. Expression also has a more general meaning.
general transcription factor. A protein that is like a transcription factor but without being specific to particular genes; rather it enables the actual process of transcription as such, regardless of the (protein-coding) genes being transcribed. Many of these factors are part of the pre-initiation complex. However, more recent research is showing that the word "general" is a misnomer; these factors can be more or less specific, playing different roles with different genes, or different classes of genes. General transcription factors are also known as "basal transcription factors".
gene regulation. The management (by the cell as a whole) of gene expression. This involves gene activation, gene silencing, the timing and extent of gene transcription, the "editing" of the resultant transcripts, the regulation of translation, and so on - everything that affects what the cell ultimately makes of the gene. Particular regions of DNA that participate in regulation - for example, by being targets for transcription factors - are known as "regulatory regions" or "regulatory sites". Gene regulation is sometimes more narrowly referred to as "transcription regulation".
gene repression. Generally, this can refer to the reduction of expression of a gene, from whatever cause. More specifically, it refers to the role of a repressor in blocking gene expression.
gene silencing. Blocking of the processes that lead from a gene to its possible protein end products. This blocking can occur at many different points. Most generally: it can involve the prevention of gene transcription ("transcriptional silencing"), or the modification or destruction of the gene's mRNA transcript so as to prevent translation ("post-transcriptional silencing"). More particularly, gene silencing can refer to the action of a DNA sequence known as a silencer.
genetic code. This term has many meanings both legitimate and illegitimate. Most basically, it refers to the sequence of nucleotide bases, or "letters", in DNA, and to the way that successive groups of three such bases in a protein-coding gene can (with various complications) correspond to the successive amino acids making up a protein. The gene is then said to "code for" that protein. Each three-letter group of a coding sequence is a codon.
genome. All the DNA in an organism or cell, especially with reference to the total sequence of base pairs, or "letters" of the genetic code.
germ cells. The cells (male or female, sperm or egg) that come together in reproduction. Each such cell has only half the normal complement of chromosomes -- one chromosome from each chromosome pair.
germ line. The "line" or succession of cells that leads from one generation to the next through the germ cells ("sex cells" or "gametes").
guanine. See nucleotide base.
haploid. Possessing only a single set of unpaired chromosomes. Gametes are normally haploid. Compare diploid.
helical axis. If you imagine the two strands of a double helix wrapped around a wire core, this wire would represent the helical axis.
heterochromatin. Chromatin in its more tightly packed, less accessible, and less actively transcribed state, often containing fewer genes. Nucleosomes and various chromatin-associated proteins play a major role in the compact structuring of heterochromatin. Compare euchromatin.
heterozygous. An organism is said to be heterozygous with respect to a particular gene if the two alleles of the gene are different, as when a pea plant has one allele for a white flower color and one allele for violet-colored flowers. (The actual trait in such cases depends upon the dominance relations between the two alleles.) Compare homozygous.
histone. A family of simple proteins, abundant in the cell nucleus and constituting a substantial part of the (mostly) protein-and-DNA complex known as chromatin - the physical substance of chromosomes. A group of eight histones - two each of four different kinds - makes up the "spool" of a nucleosome. Linker histones also participate in chromatin.
histone code. The code presumed to be found in the collection of histone modifications. The idea is this: for any given nucleosome there are many possible (covalent) modifications of its constituent histones, leading to countless possible combinations of such modifications. It could be, then, that for each distinct combination -- or, at least, for many of them -- there is a specific gene-regulatory implication. For example, a particular combination might be a signal for the binding of a specific chromatin remodeling complex. This mapping from specific combinations of histone modifications to specific effects would be the "code". However, the idea that these modifications not only have regulatory significance but have it in a fixed, precise, combinatorially encoded fashion now looks as if it is being increasingly discredited.
histone modification. Often referred to as "histone post-translational modification", because the changes occur after the translation that produces the histone protein. A histone modification consists of the addition of any one of several chemical groups to, or its removal from, an individual amino acid of a histone - especially a histone belonging to a nucleosome. The modified amino acid might be on either the histone tail or the main body of the histone. Depending on the chemical group involved, the modification is called methylation (addition of a methyl group), acetylation (addition of an acetyl group), phosphorylation, ubiquitination, sumoylation, and so on. These modifications can dramatically affect the electrical and other properties of nucleosomes, and they play a major role in gene regulation.
histone tail. A thin, filamentary "tail" typically extending from each of the eight histone proteins constituting the core particle, or spool, of a nucleosome.
homozygous. An organism is said to be homozygous with respect to a particular gene if the two alleles of the gene are essentially the same, as when a pea plant has two alleles specifying a white color for flowers. Compare heterozygous.
hormone. A substance produced in particular cells (for example, in a gland) that can travel to other parts of the body and (often in very small quantities) influence those other parts. The hormone, which may be recognized by receptor molecules, is often said to carry a signal.
initiator (Inr). One of the components of a gene promoter. In the absence of the TATA box -- or in conjunction with it or with other promoter elements -- Inr can provide a base for the constellation of the pre-initiation complex.
in vitro. "In glass" -- that is, in an artificial environment such as a test tube or laboratory dish.
in vivo. In a living context -- more specifically, in the living cell.
inheritance of acquired characteristics (Lamarckism). The idea that traits can be passed from an organism to its offspring, not only as those traits are determined in a fixed way by genes, but also as they are altered by the activity of the parent organism during its life. The classic example for ridiculing this notion is the giraffe's neck: no matter how much a giraffe stretches its neck during its lifetime in order to browse on higher leaves, this will not affect the inherited neck length of the giraffe's offspring. But researchers today are exploring a rapidly increasing number of cases where acquired characteristics are passed on to offspring quite independently of genetic inheritance. This inheritance is achieved by epigenetic means. "Lamarckism" refers to Jean-Baptiste Pierre Antoine de Monet, Chevalier de la Marck (1744-1829), who argued for the inheritance of acquired characteristics.
insulator. A DNA sequence that acts as a kind of boundary element, blocking the effects of certain regulatory elements. In particular, an insulator can block the role of an enhancer, or, more broadly, it can prevent the spread of highly condensed chromatin into neighboring regions, where the condensed chromatin might have the effect of suppressing gene expression. Insulators help make possible the independent regulation of nearby chromosome regions.
intron. A segment of the DNA sequence of a gene; more specifically: a segment whose corresponding segment in the gene's RNA transcript is deleted from the transcript before translation. The deletion occurs as a result of the RNA splicing process. Or, in the case of noncoding DNA and its RNA transcripts: introns are those segments deleted before the final functional form of the RNA is achieved. "Intron" can refer either to the DNA sequence or to the corresponding transcribed RNA sequence. Compare exon.
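The relation between exons, introns, and splicing can be pictured with a toy example. Everything here is hypothetical: the transcript is invented, upper case marks exon segments and lower case marks intron segments, and the positions are simply given rather than recognized by any real splicing machinery.

```python
def splice(transcript, exons):
    """Join the exon segments of a transcript, discarding the introns between them.

    `exons` is a list of (start, end) index pairs, end exclusive, in transcript order.
    """
    return "".join(transcript[start:end] for start, end in exons)

# Hypothetical precursor transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCCguaaguuucagGAUUACguaucuagCUGUAA"
exons = [(0, 6), (17, 23), (31, 37)]

print(splice(pre_mrna, exons))  # AUGGCCGAUUACCUGUAA
```

Passing a different list of exons, for example one that leaves out the middle pair, gives a different spliced product from the same transcript, which is the basic idea behind alternative splicing.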
LCR. See locus control region.
linker DNA. The relatively short length of DNA extending between successive nucleosomes. It is typically a few tens of base pairs long.
linker histone. A histone (most often the histone known as "H1") that binds the DNA entering a nucleosome spool to the exiting DNA, thereby stabilizing the nucleosome and conducing to the formation of more regular arrays of compact chromatin.
locus control region (LCR). A DNA sequence that helps to regulate a cluster of related genes. These genes may be both nearby and far away on the same chromosome, or even on different chromosomes. The LCR plays a role in organizing the chromatin sections containing the genes and coordinating their expression.
major groove. If you wrap two cords around each other in the manner of a double helix, there will be two grooves between the cords. Each groove winds along with the cords. However, in this case, you would not see a difference in the "width" of the two grooves -- or any width at all. But if the cords are separated by bulky material attached to both of them, then -- depending on the shape of that material -- the distance in going from one cord to the other (passing around the bulky material in the middle) may be greater in one direction than in the other. This is the case when the "filler" material consists of asymmetric nucleotide bases (the "letters" of the genetic code). The wider of the two grooves is the major groove, and the narrower one is the minor groove.
MAR (matrix attachment region). A DNA sequence particularly well suited to serve as an anchoring site for tethering DNA to the nuclear matrix. The constellation of many such tetherings contributes to the looping structure of chromosomes.
mark. Geneticists commonly refer to the attached chemical groups resulting from DNA methylation or histone modification as "marks". Using the word verbally, one can say, for example, that an enzyme "marks" DNA with methyl groups, while another enzyme removes such marks -- that is, removes the methyl groups.
membranome. A rather vague term referring to the collection of biological membranes in a cell or organism -- particularly with reference to their informational role. In all likelihood, this is simply to say: with reference to their biological significance and functioning. "Membranome" may (not by accident) include in its connotations something like "digital code" (à la the genome), but that is presumably only for a certain feel-good effect.
Mendel's Laws. The first law -- the Law of Segregation -- states that germ cells (gametes) receive only one member of each parental chromosome pair. So a single gamete contains only one allele of any particular gene. The second law -- the Law of Independent Assortment (also known as the Law of Inheritance) -- states that distribution of alleles to germ cells occurs independently for each gene. That is, if gene X has alleles x' and x", and if gene Y has alleles y' and y", then the following combinations in germ cells are equally possible: x'y', x'y", x"y', and x"y". This second law, as it happens, is not true in general; it is valid only where genes are not linked, as they generally are when they reside on the same chromosome -- in which latter case the alleles of the two genes on that chromosome will commonly be passed along to the germ cells together rather than independently.
messenger RNA (mRNA). A kind of RNA. Different mRNAs result from transcription of protein-coding genes and can lead, via translation, to the formation of proteins. See also RNA.
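The combinations named under the Law of Independent Assortment can be enumerated mechanically. In this sketch the labels x1, x2, y1, and y2 stand in for the alleles of two genes, the function name is invented, and the equal probabilities hold only under the assumption that the genes are not linked.

```python
from fractions import Fraction
from itertools import product

def gamete_combinations(*genes):
    """List every combination that takes exactly one allele from each gene."""
    return ["".join(combo) for combo in product(*genes)]

gene_x = ["x1", "x2"]  # stand-ins for the two alleles of gene X
gene_y = ["y1", "y2"]  # stand-ins for the two alleles of gene Y

combos = gamete_combinations(gene_x, gene_y)
print(combos)  # ['x1y1', 'x1y2', 'x2y1', 'x2y2']

# With independent assortment, each combination is equally likely.
print(Fraction(1, len(combos)))  # 1/4
```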
methylation. Attachment of a methyl group to a molecule, which is then said to be "methylated". Certain enzymes (known as "methylases" or methyltransferases) can do this. See also histone modifications and DNA methylation. "Demethylation" is the corresponding removal of a methyl group, accomplished by demethylases.
methyl group. A small chemical group with the formula -CH3. See methylation.
methylome. The sum total or overall pattern of DNA methylation in a genome.
methyltransferase (or methylase). An enzyme that can attach a methyl group to another molecule. In an epigenetic context, the term generally refers to the attachment of a methyl group to DNA (by a DNA methyltransferase) or to a histone (by a histone methyltransferase).
micro-RNA (miRNA). A small RNA, 21-23 nucleotide bases in length. Like the siRNA involved in RNA interference, miRNA is derived from double-stranded RNA (although not double-stranded RNA that originates from viruses). And it, too, becomes associated with a protein complex known as a "RISC", in cooperation with which it disables messenger RNA molecules containing sequences complementary to its own. One difference between miRNA and the siRNA involved in RNA interference is that the complementarity between the miRNA and the target messenger RNA need not be exact, so that a single miRNA molecule can neutralize many different messenger RNA molecules. This effectively silences, or at least reduces the expression of, the genes producing those messenger RNAs. There are at least several hundred different miRNAs in humans.
minor groove. The narrower of the two grooves running the length of the double helix. For further explanation, see major groove.
multipotent. Capable of developing into two or more closely related types of cell. For example, blood stem cells can develop into red cells, white cells, and platelets. Compare totipotent and pluripotent.
mutation. A change in the DNA sequence of nucleotide bases -- which is to say (in the usual terminology), a change in the genetic code. An organism containing a mutation is (when the mutation is what one has in view) said to be a "mutant".
myosin. A contractile protein, or, rather, a family of proteins. It is the most common protein found in muscles, working together with actin to produce muscle contraction. Myosin consumes energy in driving movements along actin filaments.
naked DNA. DNA that is not wrapped around nucleosomes. Generally, this refers to longer stretches of DNA, not the short linker DNA leading directly from one nucleosome to another.
noncoding. RNA or DNA that does not code for a protein is said to be "noncoding". Noncoding DNA can have many regulatory functions and can even be transcribed into RNA, but the resultant RNA is also noncoding (will not be translated into a protein) and likewise can have many regulatory functions.
nuclear envelope. The membrane (technically, a double lipid bilayer) that encloses the cell nucleus, separating the genetic material and other contents from the rest of the cell. However, there is intimate communication across the envelope, and numerous "nuclear pores" offer passage between the nucleus and the larger cellular environment.
nuclear lamina. A fibrous network, together with associated proteins, located in the periphery of the cell nucleus, at the inner face of the nuclear envelope. At any given time some chromosome sites can be attached to the nuclear lamina, a situation that tends to correlate with reduced gene expression.
nuclear matrix. A poorly characterized and highly dynamic structural skeleton giving organizational structure to the cell nucleus.
nucleoprotein. Protein contained in a complex with DNA or RNA.
nucleosome. A group of (usually) eight histone proteins that together form a kind of "spool" around which DNA is commonly wrapped about two turns. (The length of DNA wrapped around a "standard" nucleosome is commonly given as 147 base pairs. But many variations upon this standard length are currently being investigated.) There are millions of nucleosomes in the human genome, and they are key elements in the compaction, or condensation, of DNA. They are a focus of many different aspects of gene regulation.
nucleosome free region. A stretch of DNA that is free of nucleosomes, perhaps because they have been disassembled and removed, or else have shifted their position by sliding along the DNA.
nucleosome sliding. The process by which DNA slides around a nucleosome spool. The effect is to displace the spool linearly along the DNA. As a result, some DNA sequences that were wrapped around the nucleosome (and therefore less accessible to regulatory factors) are exposed as free or naked DNA, while other sequences, previously free, are bound to the nucleosome.
nucleotide base. A class of nitrogen-containing chemical groups that are constituents of DNA and RNA. The four main bases in DNA are adenine, guanine, cytosine, and thymine (A, G, C, and T, respectively - "letters" of the genetic code). In RNA, uracil (U) stands in the place of thymine. These bases combine in restricted ways to form complementary base pairs. This complementation is central to DNA replication and gene expression because of the way it allows the strands of DNA to be used as templates for replication or for production of RNA that preserves the sequential information employed by the cell in protein production.
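The pairing rules described under nucleotide base (A with T, G with C, and U in place of T in RNA) can be sketched in a few lines. This is a minimal illustration, not a description of the cellular machinery; the function names and the six-base example sequence are hypothetical.

```python
# Minimal sketch of complementary base pairing; names are hypothetical.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand):
    """Return the complementary DNA strand, written 5' to 3'.

    The two strands of the double helix are oriented oppositely, so the
    input is read backwards while each base is swapped for its partner.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand))


def transcribe(template_strand):
    """Return the RNA complementary to a DNA template strand (5' to 3')."""
    return complementary_strand(template_strand).replace("T", "U")


coding_strand = "ATGGCC"
template_strand = complementary_strand(coding_strand)

print(template_strand)              # GGCCAT
print(transcribe(template_strand))  # AUGGCC
```

The RNA comes out with the same sequence as the non-template strand, apart from U replacing T, which is the sense in which the transcript preserves the sequential information of the DNA.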
open chromatin. See the more technical term, euchromatin.
phosphorylation. Attachment of a phosphate group to a molecule, which is then said to be "phosphorylated". Certain enzymes (known as kinases or phosphotransferases) can do this. See also histone modifications. Dephosphorylation is the corresponding removal of a phosphate group, accomplished by phosphatases.
phosphate group. A small chemical group with the formula, PO4. See phosphorylation.
pluripotent. Capable of developing into a considerable range of different cell types. For example, embryonic stem cells can transform themselves into many, but not all, tissue types during fetal development. Compare totipotent and multipotent.
pre-initiation complex (PIC). A group of multi-subunit protein complexes, including RNA polymerase, that come together on a gene promoter as a preparatory step for gene transcription.
precursor messenger RNA (pre-mRNA). RNA transcripts that have not yet been spliced.
promoter. A regulatory DNA sequence, usually close to, and upstream from, the gene or genes it regulates. It serves as a binding site for transcription factors and for the protein complexes that initiate gene transcription, and it serves to identify the start site for transcription.
protein coding gene. A gene whose DNA sequence can lead, via transcription and translation, to production of one or many different proteins. The gene is said to code for the protein that eventuates from it.
receptor. A protein (residing in cytoplasm or embedded in a cell membrane) to which a signaling molecule, such as a hormone, can attach. The result is typically a change in conformation of the protein, which in turn may lead to changes, sometimes dramatic, in the surrounding milieu of the protein.
regulatory factor. See gene regulation.
regulatory region. See gene regulation.
repressor. A protein transcription factor with a negative effect on gene expression. (Compare activator.) A repressor may work in conjunction with one or more co-repressors. With or without co-repressors, the repressor commonly blocks access to the gene promoter by RNA polymerase (the transcribing enzyme). The DNA sequence bound by the repressor is called a silencer.
ribosomal RNA (rRNA). See under RNA.
RISC (RNA-induced silencing complex). A protein complex that plays a central part in RNA interference. The complex consists of several proteins together with a small interfering RNA (siRNA). The complex locates mRNA molecules containing sequences complementary to the siRNA, after which a protein in the complex cleaves the mRNA or otherwise damages it so as to prevent translation.
RNA. Ribonucleic acid. Like DNA, it contains a series of nucleotide bases (often thought of as "letters" encoding for, or specifying, the amino acid constituents of protein). However, in RNA the uracil base occurs instead of the thymine of DNA. RNA is classically thought of as existing in three primary forms. (1) mRNA (messenger RNA), produced by RNA polymerase from a protein-coding gene-template, preserves the gene's code and is an intermediary between the gene and the protein it specifies. mRNA is normally single-stranded. (2) rRNA (ribosomal RNA), which belongs to the protein-producing ribosomes of the cell, interprets the mRNA sequence as a set of "codes" specifying the series of amino acids from which the protein is to be constellated and engages in the actual production of the protein. (3) tRNA (transfer RNA) then brings the actual amino acids for adding to the growing protein molecule. More recently, a great variety of RNA types, both small and large, both protein-coding and noncoding, have been discovered. They play a major role in many epigenetic processes.
RNA editing. The process by which particular nucleotide bases ("letters") are removed from a precursor RNA and replaced with different bases.
RNA interference (RNAi). Regulation of gene expression -- and especially the silencing of genes -- by processes involving small RNA molecules about 21 - 25 nucleotide bases long. This RNA is known as "small interfering RNA" or siRNA. A protein complex incorporating an siRNA and called a RISC locates mRNA molecules with a sequence complementary to that of the siRNA and proceeds to cleave the mRNA or otherwise prevent it from being translated. This is known as "post-transcriptional silencing" because it effectively silences genes (preventing the production of protein from them) only after the genes have been transcribed. However, more and more other roles are being discovered for siRNAs -- for example, in DNA methylation, chromatin remodeling and small RNA-induced gene activation.
RNAi (RNA interference) code. A vague term meaning little more than "the sum total of what we know about the role of RNA interference" in gene expression.
RNA polymerase. The enzyme (protein) that transcribes DNA (protein-coding genes, but also various noncoding sequences) into RNA. In humans, different RNA polymerases (I, II, and III) transcribe different sorts of DNA sequences.
RNA splicing. The process by which introns are removed from a pre-mRNA transcript and the remaining exons are joined together. Splicing occurs preliminary to translation of the transcript or, in the case of noncoding transcripts, preliminary to the achievement of the functional RNA end-product. The nature of the splicing will determine what sort of protein a protein-coding RNA can produce. The splicing is usually carried out by a large RNA-protein complex known as a "spliceosome". See also alternative splicing and trans-splicing.
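The removal of introns and joining of exons described under RNA splicing can be pictured as simple string surgery. The sketch below is only an analogy under stated assumptions: the function name, the made-up sequence, and the idea of supplying intron positions as index pairs are all hypothetical, and nothing here models how a spliceosome actually locates introns.

```python
# Toy analogy for splicing; names, sequence, and coordinates are hypothetical.
from typing import List, Tuple


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove the given introns and join the remaining exons.

    Each intron is a (start, end) pair of string indices, with start
    included and end excluded. Introns are assumed not to overlap.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)


# Two 6-base "exons" separated by a 12-base "intron".
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"

print(splice(pre_mrna, [(6, 18)]))  # AUGGCCUUUUAA
```

Passing a different set of intron coordinates for the same precursor yields a different final sequence, which is one way to picture why the nature of the splicing determines what a protein-coding RNA can produce.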
sequence. A contiguous group of nucleotide bases ("letters") in a DNA or RNA molecule. Particular sequences may be significant in many different respects. For example: (1) they can define specific locations recognizable by transcription factors and other regulatory molecules; (2) they can influence the structure and stability of the double helix and the associated chromatin; (3) they can influence the positioning of nucleosomes and in general the chromatin structure; and (4) they can code for proteins. "Sequence" can also refer to the linear chain of amino acids constituting the main structure of a protein. Such protein sequences correlate (more or less) with DNA and RNA sequences by means of the genetic code.
sequencing. The process of ascertaining the sequence of a DNA or RNA molecule or of a particular region of the molecule.
signaling. A broad term referring to various aspects of complex molecular communication within the cell and organism. See, for example, hormone and receptor. A coherent sequence of transactions whereby information is carried from one place to another via molecular signals is called a "signaling pathway".
silencer. A DNA sequence that transcription factors known as repressors can recognize and bind to, thereby (more or less) blocking access to a gene and preventing its transcription.
small interfering RNA (siRNA). A small RNA 21-25 nucleotide bases in length that plays a key role in RNA interference. siRNAs are derived from double-stranded RNA molecules, which are often brought into cells by viruses. The double-stranded RNA is cleaved into small lengths, and a product of this cleavage is assimilated to a RISC protein complex, at which time the two strands of the RNA are separated and one is discarded. See further under RNA interference.
small RNA-induced gene activation. ??
stem cell. A more or less undifferentiated (nonspecialized) cell capable of dividing indefinitely as a stem cell, or else of differentiating into more specialized cell types. Embryonic stem cells are primitive stem cells in the embryo capable of differentiating into most of the cell types of the body. Adult stem cells, normally found in adult tissues, can differentiate into at least a few different cell types. And induced pluripotent stem cells result from the reversion of a differentiated cell to a pluripotent form through human engineering.
supercoil. If you twist two strands around a linear wire core, you will have a double helix that coils, or spirals, around an axis represented by the wire. (The wire is invoked here only to identify the axis of the double helix.) If now you twist that whole arrangement so that the axis coils on itself, you have what is called a supercoil. Further, there are two directions in which you can perform this second level of twisting. One is "with" the original twist of the double helix (which yields positive supercoiling), and the other is against this original twist (negative supercoiling). If the ends of the two strands are fastened together so that they are not free to slide around each other, then negative supercoiling will tend to force the strands apart, or "open" them up, while positive supercoiling will have the opposite effect. (It's best to try this with real cords!)
synonymous codon. Two codons are said to be synonymous if they both code for the same amino acid. Because the genetic code is redundant, several codons can map to the same amino acid.
synonymous mutation. The alteration of a codon into a different form that is synonymous with the original.
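The two entries above (synonymous codon and synonymous mutation) amount to a lookup in the genetic code. The sketch below uses only a small excerpt of the standard codon table, written as messenger RNA codons; the function names are hypothetical, and codons outside the excerpt will raise a KeyError.

```python
# Small excerpt of the standard genetic code (mRNA codons); names hypothetical.
CODON_TABLE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "GAU": "Asp", "GAC": "Asp",
    "GAA": "Glu", "GAG": "Glu",
    "AUG": "Met",
    "UGG": "Trp",
}


def are_synonymous(codon_a, codon_b):
    """Return True if both codons code for the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


def is_synonymous_mutation(original, mutated):
    """Return True if the codon changed but the amino acid did not."""
    return original != mutated and are_synonymous(original, mutated)


print(are_synonymous("GCU", "GCC"))          # True: both specify alanine
print(is_synonymous_mutation("GAA", "GAG"))  # True: glutamate either way
print(is_synonymous_mutation("GAC", "GAA"))  # False: aspartate becomes glutamate
```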
TATA box. A DNA sequence having these nucleotide bases ("letters") as its core: TATAAA. The TATA box is one of the several elements contained in gene promoters. Whereas it was once thought to be a more or less canonical element of promoters, it is now believed to be present in less than 25 percent of human promoters. Recognition of the TATA box by the TATA-binding protein is an initial step in the formation of the pre-initiation complex.
TATA-binding protein. A general transcription factor that binds to the TATA box promoter element, typically to begin constellation of the pre-initiation complex on the promoter.
TFIIB. A general transcription factor contributing to the formation of the pre-initiation complex. It helps to stabilize the binding of TBP (the TATA-binding protein) and to prepare the way for binding of
TFIID. A general transcription factor contributing to the formation of the pre-initiation complex.
TFIIE. A general transcription factor contributing to the formation of the pre-initiation complex.
TFIIF. A general transcription factor contributing to the formation of the pre-initiation complex.
TFIIH. A general transcription factor contributing to the formation of the pre-initiation complex.
thymine. See nucleotide base.
topoisomerase. Any of a class of enzymes (proteins) that cut the strands of a DNA molecule and then reconnect the strands. The effect may be to release the tension of supercoiling or to untangle knots. Some topoisomerases cut just one of the strands of the double helix, allow it to wind or unwind around the other strand, and then reconnect the severed ends. Other topoisomerases cut both strands, pass a loop of the chromosome through the gap thus created, and then seal the gap again.
totipotent. Capable of developing into every cell of the body. A zygote is totipotent. Compare pluripotent and multipotent.
trans-splicing. The splicing together of entirely different gene transcripts to form a translation-ready mRNA. The genes may reside on the same or on different chromosomes. See also RNA splicing and alternative splicing.
transcribing enzyme. See RNA polymerase.
transcript. The RNA molecule that is the product of gene transcription. Transcripts begin as "primary" or "precursor" transcripts, which then can be spliced, edited, or otherwise transformed before (in the case of many RNAs) being translated into a protein.
transcription. The process by which an RNA polymerase (in cooperation with many other cellular elements) uses a DNA gene template to form an RNA molecule such as a messenger RNA (mRNA). The gene is said to have been "transcribed", and the RNA is a "transcript".
transcription complex. The aggregate of numerous proteins (typically scores of them) that must bind to a gene's promoter region before actual transcription of the gene can begin.
transcription factor. A protein that binds directly to a recognized DNA sequence, thereby playing a role in gene regulation. Transcription factors called activators may increase a gene's expression, while repressors may decrease expression.
transcription start site. The nucleotide base at the upstream end of a gene where actual transcription of the gene begins.
translation. The production of a protein from mRNA. This protein was formerly said to be "coded for" by the gene from which the mRNA was transcribed, but it is now known that many different activities of the cell help to determine the protein end-product.
tRNA (transfer RNA). See under RNA.
tumor suppressor gene. A gene from which a protein is derived that helps to protect a cell against cancer. The protein may do this, for example, by preventing or damping cell division (cells that are becoming cancerous tend to divide without proper restraint) or by promoting cell death in the event of DNA damage.
ubiquitin. A small protein molecule consisting of 76 amino acids. See ubiquitination.
ubiquitination. Attachment of ubiquitin to a molecule, which is then said to be "ubiquitinated". Certain enzymes can do this. See also histone modifications.
upstream/downstream. DNA consists of the two strands of a double helix. The orientation of the chemical constituents of these strands gives a directionality to the strands and enables one to distinguish the two ends, which are referred to as the 5' and the 3' ends. The two strands of a double helix are oriented oppositely, so that the 5' end of one strand is adjacent to the 3' end of the other strand. Gene transcription typically proceeds from the 5' end of the gene (that is, from the end of the gene closer to the 5' end of the chromosome) toward the 3' end. The stretches of DNA lying beyond the gene and toward the 5' end of the chromosome are said to be "upstream" from the gene, while the stretches lying toward the 3' end of the chromosome are "downstream" - in the direction of usual transcription. Promoters lie adjacent to their genes on the upstream side, where transcription begins.
uracil. See nucleotide base.
variant histone. A "non-standard" form of any one of the four different types of histone making up a nucleosome spool. For example, the H2A.Z histone can substitute for the canonical H2A, with the effect of destabilizing the spool and making it more susceptible to sliding. There are also variant linker histones.
zygote. A fertilized, diploid egg cell resulting from the union of two haploid gametes.