A gene stores the information for making an RNA molecule in its sequence of nucleotide bases. The RNA is transcribed in a sequence complementary to the sequence of bases in the DNA of the gene. In certain instances the RNA itself is the final product, but most often the RNA serves as a template for translation into a protein molecule. This flow of information from DNA to RNA to protein, through transcription and translation, is the central dogma of molecular biology. The events involved in the transcription of DNA into RNA and the subsequent translation of RNA into protein make up the subject of gene expression.
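The two steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codon table is a deliberately partial subset chosen for the demo, the example template strand is made up, and the function names `transcribe` and `translate` are hypothetical rather than taken from any library.

```python
# Minimal sketch: transcription as base-by-base complementation,
# translation as a codon lookup. Not a full genetic code.

# DNA template base -> complementary RNA base
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table (demo subset only; the real code has 64 codons)
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

def translate(mrna):
    """Read codons from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # Codons missing from the demo table are reported as "Xaa" (unknown)
        residue = CODON_TABLE.get(mrna[i:i + 3], "Xaa")
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

template = "TACAAACCGTTTATT"  # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Lys']
```

Writing the template strand 3'->5' lets the complement come out directly as 5'->3' mRNA, which keeps the sketch free of a separate reversal step.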
The operon concept underlying prokaryotic gene expression is the starting point for discussing the regulation of gene expression. Prokaryotic gene expression is easily comprehensible, but eukaryotic gene expression is a much more wide-ranging and complex subject. Although there is quite a bit you can understand and need to know for the MCAT, eukaryotic gene expression is not something a person ever really masters. As you grow in fluency with gene expression in a complex organism like a human being, the topic becomes a kind of cultivated disposition towards emergent complexity, and the mechanisms it highlights overlap with subject matter in cell signaling, development and physiology. The goal of study is to become able to participate in the discussion through mastery of a conceptual vocabulary.
Conceptual Vocabulary for Gene Expression
Gene Expression
A DNA sequence is a succession of letters representing the primary structure of a DNA molecule or strand.
The central dogma of molecular biology is a framework for understanding the transfer of sequence information between sequential information-carrying biopolymers in living organisms.
Transcription is the process by which genetic information from DNA is transferred into RNA.
A gene is a locatable region of genomic sequence, corresponding to a unit of inheritance.
Messenger ribonucleic acid is a molecule of RNA encoding a chemical blueprint for a protein product.
Messenger RNA is decoded in the process of translation to produce a specific polypeptide according to the rules specified by the genetic code.
A ribosome is a small, dense structure found in most known cells that assembles proteins in a process called translation.
The genetic code is the set of rules by which information encoded in genetic material is translated into proteins by living cells.
A base pair consists of two nucleotides on opposite complementary DNA or RNA strands connected via hydrogen bonds.
A ligase is an enzyme that can catalyse the joining of two large molecules by forming a new chemical bond.
Transfer RNA is a small RNA chain that plays a role during translation in shuttling a specific amino acid to a growing polypeptide chain at the ribosomal site of protein synthesis.
A protein precursor, also called a pro-protein or pro-peptide, is an inactive protein that can be turned into an active form by posttranslational modification.
A signal peptide is a short portion of a protein dedicated to directing the post-translational transport of a protein.
Protein targeting or sorting is the set of mechanisms by which a cell transports proteins to the appropriate positions in the cell or outside of it.
Directionality refers to the end-to-end chemical orientation of a single strand of nucleic acid.
A promoter is a regulatory region of DNA located upstream of a gene, providing a control point for regulated gene transcription.
A terminator is a section of genetic sequence that marks the end of a gene or operon on genomic DNA for transcription.
A primary transcript is an RNA molecule that has not yet undergone any modification after its synthesis.
Introns are non-coding sections of DNA which are spliced out once a DNA sequence has been transcribed as an hnRNA strand.
An exon is any region of DNA within a gene that is transcribed to the final messenger RNA molecule, rather than being spliced out from the transcribed RNA molecule.
A gene product is the biochemical material, either RNA or protein, resulting from expression of a gene.
An operon is a functioning unit of key nucleotide sequences including an operator, a common promoter, and one or more structural genes, which are controlled as a unit to produce messenger RNA.
Enzyme induction is a process in which a molecule, such as a drug, induces the expression of an enzyme.
The non-coding or template strand is the DNA strand that is read by the RNA polymerase.
A structural gene is a gene that codes for any RNA or protein product other than a regulatory element.
Synexpression is a type of eukaryotic gene organization in which genes may not be physically linked, but are involved in the same process and are coordinately expressed.
The signal recognition particle is a protein-RNA complex that recognizes and transports specific proteins to the endoplasmic reticulum in eukaryotes and the plasma membrane in prokaryotes.
A reading frame is a contiguous and non-overlapping set of three-nucleotide codons in DNA or RNA.
The codon ATG in DNA, which corresponds to AUG in RNA, is the start codon or initiation codon, which codes for the amino acid methionine in eukaryotes and a modified methionine in prokaryotes.
A ribosomal protein is any of the proteins that, in conjunction with rRNA, make up the subunits of the ribosome.
Exonucleases are enzymes that cleave nucleotides one at a time from an end of a polynucleotide chain.
Initiation factors are proteins that bind to the small subunit of the ribosome during the initiation of protein synthesis.
A transcription factor is a protein that binds to specific parts of DNA using DNA binding domains as part of the system that controls the transfer of genetic information from DNA to RNA.
A TATA box (also called Goldberg-Hogness box) is a DNA sequence found in the promoter region of most genes in eukaryotes, which is considered to be the core promoter sequence.
A cis-regulatory element is a region of DNA or RNA that regulates the expression of genes located on that same strand.
The coding region of a gene is the portion of DNA that is transcribed into mRNA and translated into proteins.
An enhancer is a short region of DNA that can be bound with proteins to enhance transcription levels of genes in a gene-cluster.
Alternative splicing is the variation mechanism in which the exons of the primary gene transcript, the pre-mRNA, are separated and reconnected so as to produce alternative ribonucleotide arrangements.
Junk DNA is a collective label for the portions of the DNA sequence of a chromosome or a genome for which no function has yet been identified.
A ribozyme is an RNA molecule that catalyzes a chemical reaction.
A repressor is a DNA-binding protein that regulates the expression of one or more genes by decreasing the rate of transcription.
The lac operon is a functional unit of nucleotide sequences controlling the production of gene products required for the transport and metabolism of lactose in Escherichia coli and some other enteric bacteria.
An inducer is a molecule that starts gene expression.
The Trp operon is a functional unit in certain bacteria that controls the production of the gene products needed to synthesize tryptophan when tryptophan is absent from the environment.
Tryptophan repressor is a DNA binding protein which silences a set of genes involved in tryptophan production.
N-Formylmethionine, often abbreviated as fMet, is a modified form of methionine in which a formyl group has been added to methionine's amino group.
RNA polymerase is an enzyme that makes an RNA copy of a DNA or RNA template.
A stop codon, or termination codon, is a nucleotide triplet within messenger RNA that signals a termination of translation.
Post-translational modification is the chemical modification of a protein after its translation.
The release factor is a protein that recognises the termination codon or stop codon in an mRNA sequence on the ribosome.
An activator is a DNA-binding protein that regulates one or more genes by increasing the rate of transcription by recruiting RNA polymerase to the promoter region.
Expressome refers to the whole set of gene expression in a cell, tissue, organ, organism, or species.
The transcriptome is the set of all messenger RNA molecules produced in one or a population of cells.
A gene regulatory network is a collection of DNA segments in a cell which interact with each other and with other substances in the cell, to govern the rates at which the associated genes are transcribed.
The term RNA editing describes those molecular processes in which the information content of an RNA molecule is altered through a chemical change in the bases themselves.
RNA interference is a mechanism for RNA-guided regulation of gene expression in which double-stranded ribonucleic acid inhibits the expression of genes with complementary nucleotide sequences.
Spatiotemporal gene expression is the activation of genes within specific tissues of an organism at specific times during development.
A transcription bubble is a molecular structure that occurs during the transcription of DNA when RNA polymerase locally unwinds the DNA double strand so that the template strand can be read.
Elongation factors are a set of proteins that facilitate the events of protein synthesis from the formation of the first peptide bond to the formation of the last one.
A polyribosome, or polysome, is a cluster of ribosomes bound to a single mRNA molecule.
The SRP receptor, also called docking protein, is a dimer composed of two different subunits that are associated exclusively with the rough ER in mammalian cells.
An intergenic region is a stretch of DNA sequence located between clusters of genes. Intergenic regions comprise a large percentage of the human genome but contain few or no genes.
RNA polymerase I transcribes DNA to synthesize ribosomal RNA.
Present in eukaryotic cells, RNA polymerase II catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA and microRNA.
RNA polymerase III transcribes DNA to synthesize ribosomal 5S rRNA, tRNA and other small RNAs.
A single nucleotide polymorphism is a DNA sequence variation occurring when a single nucleotide in the genome differs between members of a species.
An open reading frame is a portion of an organism's genome which contains a sequence of bases that could potentially encode a protein.
The Pribnow box is the sequence TATAAT of six nucleotides that is an essential part of a promoter site on DNA for transcription to occur in prokaryotes.
Stem-loop intramolecular base pairing is a pattern that can occur in single-stranded DNA or, more commonly, in RNA. The structure is also known as a hairpin or hairpin loop.
Polyadenylation is the covalent linkage of a poly(A) tail to a messenger RNA molecule. It is part of the route to producing mature messenger RNA for translation.
A regulatory sequence is a promoter, enhancer or other segment of DNA where proteins such as transcription factors bind preferentially.
RNA-binding proteins are typically cytoplasmic and nuclear proteins that associate with and facilitate the translation of RNAs.
DNA methylation involves the addition of a methyl group to DNA.
Epigenetics refers to features such as chromatin and DNA modifications that are stable over rounds of cell division but do not involve changes in the underlying DNA sequence of the organism.
The lac repressor is a DNA-binding protein which inhibits the expression of genes coding for proteins involved in the metabolism of lactose in bacteria.
A regulon is a collection of genes under regulation by the same regulatory protein.
A stimulon is a collection of genes under regulation by the same stimulus.
In prokaryotic cells, the attenuator refers to a specific regulatory sequence that, when transcribed into RNA, forms hairpin structures to stop transcription when certain conditions are not met.
Wobble base pairing is a process of using modified base pairs in the first base of the anticodon. It describes how the genetic code makes up for the disparity in the number of codons and tRNA molecules.
An aminoacyl tRNA synthetase is an enzyme that catalyzes the esterification of a specific amino acid or its precursor to one of its compatible cognate tRNAs to form an aminoacyl-tRNA.
A nuclear localizing sequence is an amino acid sequence which acts like a 'tag' on the exposed surface of a protein to target the protein to the cell nucleus through the nuclear pore complex.
A coactivator is a protein that increases gene expression by binding to an activator or transcription factor which contains a DNA binding domain.
A corepressor is a protein that decreases gene expression by binding to a transcription factor which contains a DNA binding domain.
5S ribosomal RNA is a component of the large ribosomal subunit in both prokaryotes and eukaryotes.
The five prime untranslated region, also known as the leader sequence, is a particular section of messenger RNA or corresponding DNA which starts where transcription begins and ends just before the start codon.
A genetic pathway is the set of interactions occurring between a group of genes that depend on each other's individual functions in order to make the aggregate function of the network available to the cell.
A hormone response element is a short sequence of DNA within the promoter of a gene that is able to bind a specific hormone receptor complex and therefore regulate transcription.
Polyadenine polymerase is an enzyme responsible for the addition of the three prime polyadenine tail to a newly synthesized pre-messenger RNA molecule during the process of gene transcription.
RNA-induced transcriptional silencing is a form of RNA interference by which short RNA molecules - microRNA or small interfering RNA - trigger the downregulation of transcription of a particular gene or genomic region.
The Signal Transducers and Activators of Transcription proteins, or STAT proteins, are transcription factors which regulate many aspects of cell growth, survival and differentiation.
In genetics a silencer is a DNA sequence capable of binding transcription regulation factors termed repressors.
Transcription coregulators are proteins that interact with transcription factors either to activate or repress the transcription of specific genes.
UTR, which stands for Untranslated Region, refers to either of two sections on each side of a coding sequence on a strand of mRNA.
A sigma factor is a prokaryotic transcription initiation factor that must be part of RNA polymerase for specific binding to promoter sites on DNA.
The Initiator motif is a DNA transcription promoter that is similar in function to the Pribnow box in prokaryotes or the TATA box in eukaryotes.
A consensus sequence is a way of representing the results of a multiple sequence alignment, where related sequences are compared to each other, and similar functional sequence motifs are found.
A rho factor is a protein found in prokaryotes such as E. coli, involved in the termination of transcription by dissociating the ternary transcription complex at the termination of a gene.
Transcription-coupled repair is a DNA repair mechanism which operates in tandem with transcription.
Mature messenger RNA is a eukaryotic RNA transcript that has been spliced and processed and is ready for translation in the course of protein synthesis.
Precursor mRNA, more correctly termed heterogeneous nuclear RNA, is an immature single strand of messenger RNA.
In the field of molecular biology, trans-acting generally means acting from a different molecule. It may be considered the opposite of cis-acting which generally means acting from the same molecule.
The enhanceosome is a protein complex that binds to the enhancer region of a gene, found upstream or downstream of the promoter or within the gene, accelerating the gene's transcription.
A spliceosome is a complex of specialized RNA and protein subunits that removes introns from a transcribed hnRNA segment.
snRNPs (pronounced snurps) are particles that combine with pre-mRNA and various proteins to form spliceosomes.
Group I catalytic introns are large self-splicing ribozymes which catalyse their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms.
CpG islands are genomic regions that contain a high frequency of CG dinucleotides.
Histone acetyltransferases are enzymes that acetylate conserved lysine amino acids on histone proteins, a process linked to transcriptional activation.
Gene silencing is a general term describing epigenetic processes which switch off a gene by a mechanism other than genetic modification.
Aminoacylation is the process of adding an aminoacyl group to a compound.
The Kozak consensus sequence, which occurs on eukaryotic mRNA, is recognized by the ribosome as the translational start site.
Ribosome recycling factor is a protein found in bacterial cells as well as eukaryotic organelles which functions to recycle ribosomes after completion of protein synthesis.
The antiterminator is the prokaryotic cell's means of preventing premature termination of RNA synthesis during the transcription of DNA.
In derepression the repressor is inactivated so that the operator gene becomes active again.
Heterogeneous ribonucleoprotein particles are protein RNA complexes which, bound to a pre-mRNA molecule, serve as a signal that the pre-mRNA is not yet fully processed and ready for export to the cytoplasm.
The mediator functions as a coactivator in eukaryotes, binding to the C-terminal domain of RNA polymerase II to act as a bridge between this enzyme and transcription factors.
A modulon is a group of operons concerned with multiple pathways or functions, in which the operons may be under individual controls as well as under a common, pleiotropic regulatory protein.
Dicer is a ribonuclease in the RNase III family that cleaves double-stranded RNA and pre-microRNA into short double-stranded RNA fragments called small interfering RNA.
Protein splicing is an intramolecular reaction of a particular protein in which an internal protein segment is removed from a precursor protein.
Ribosome shunting is a mechanism of translation initiation in which ribosomes physically bypass parts of the five prime untranslated region to reach the initiation codon.
Rho-independent transcription termination is a mechanism in bacteria whereby mRNA transcription is stopped by means of a stem-loop followed by several uracil residues.
The five prime cap is a specially altered nucleotide on the five prime end of precursor messenger RNA in eukaryotes which ensures the messenger RNA's stability while it undergoes translation.
Twintrons are introns-within-introns excised by sequential splicing reactions.
Genomic imprinting is a genetic phenomenon by which certain genes are expressed in a parent of origin-specific manner.
Dihydrouridine is a pyrimidine which is the result of adding two hydrogen atoms to a uridine. It is found in tRNA and rRNA molecules.
Pseudouridine is the C-glycoside isomer of the nucleoside uridine. This is the most prevalent of the over one hundred different modified nucleosides found in RNA.
The aryl hydrocarbon receptor is a member of the family of basic-helix-loop-helix transcription factors.
A capping enzyme is a guanylyl transferase enzyme that catalyzes the attachment of the five prime cap to messenger RNA molecules that are in the process of being synthesized in the cell nucleus.
Cleavage factors are two closely related proteins involved in the cleavage of the three prime signaling region from a newly synthesized pre-messenger RNA molecule in the process of gene transcription.
Exon trapping is a molecular biology technique to identify potential exons in a fragment of eukaryote DNA of unknown intron-exon structure.
Guanosine pentaphosphate is an alarmone which is involved in the stringent response in bacteria, causing the inhibition of RNA synthesis when there is a shortage of amino acids present.
High mobility group is a group of chromosomal proteins that help with transcription, replication, recombination, and DNA repair.
An internal ribosome entry site is a nucleotide sequence that allows for translation initiation in the middle of a messenger RNA sequence as part of the greater process of protein synthesis.
The JAK-STAT signaling pathway takes part in the regulation of cellular responses to cytokines and growth factors.
Myogenin is a basic-helix-loop-helix transcription factor expressed during the development, maintenance, and repair of skeletal muscle.
Non-stop decay is a recently identified cellular mechanism of mRNA surveillance to detect mRNA molecules lacking a stop codon and prevent these mRNAs from translation.
Nonsense mediated decay is a cellular mechanism of mRNA surveillance to detect nonsense mutations and prevent the expression of truncated or erroneous proteins.
The polypyrimidine tract is a region of messenger RNA that promotes the assembly of the spliceosome, the protein complex specialized for carrying out RNA splicing.
Protein arginine N-methyltransferase-4 methylation of arginine residues within proteins, known as the PRMT4 pathway, plays a critical role in transcriptional regulation.
A core enzyme is an RNA polymerase enzyme without the sigma factor.
The nuclear cap-binding protein complex is an RNA-binding protein complex which binds to the five prime cap.
SR proteins are Serine/Arginine-rich proteins which are involved in regulating and selecting splice sites in eukaryotic mRNA.
The minor spliceosome is a ribonucleoprotein complex that catalyses the removal of an atypical class of spliceosomal introns, U12-type, from eukaryotic messenger RNAs.
Trans-splicing is a special form of RNA processing in eukaryotes where exons from two different primary RNA transcripts are ligated end to end.
An intein is a segment of a protein that is able to excise itself and rejoin the remaining portions with a peptide bond.
p300 and CBP are transcriptional coactivators which interact with numerous transcription factors to increase the expression of their target genes.
Inosine is a nucleoside that is formed when hypoxanthine is attached to a ribose ring. This modified nucleoside is commonly found in tRNAs.
The Rossmann fold is a protein structural motif found in proteins that bind nucleotides, especially the cofactor NAD.
30S is the smaller subunit of the 70S ribosome of prokaryotes.
50S is the larger subunit of the 70S ribosome of prokaryotes.
60S is the large ribosomal subunit in eukaryotes.
The Shine-Dalgarno sequence is a ribosomal binding site in prokaryotes generally located 6-7 nucleotides upstream of the start codon AUG which helps recruit the ribosome to the mRNA to initiate protein synthesis.
Activating Protein 2 is a family of closely related transcription factors which plays a critical role in regulating gene expression during early development.
In the regulation of gene expression in prokaryotes, anti-sigma factors bind to sigma factors and inhibit their transcriptional activity.
A basic-helix-loop-helix is a protein structural motif that characterizes a family of transcription factors.
CCAAT-enhancer-binding proteins are a family of transcription factors that interact with the CCAAT box motif which is present in several gene promoters.
Cleavage and polyadenylation specificity factor is involved in the cleavage of the three prime signaling region from a newly synthesized pre-messenger RNA molecule in the process of gene transcription.
Eukaryotic initiation factor 2 is a heterotrimer of subunits alpha, beta, and gamma that mediates the binding of methionyl-tRNA to the ribosome in a GTP-dependent manner.
An expressed sequence tag is a short sub-sequence of a transcribed spliced nucleotide sequence which may be used to identify gene transcripts.
Geranylgeranylation is a post-translational modification of proteins that involves the attachment of the isoprene geranylgeranyl diphosphate to the C-terminus at the cysteine residue.
Homoserine lactones are signaling chemicals involved in microbiological quorum sensing.
NANOG is a transcription factor critically involved with self-renewal of undifferentiated embryonic stem cells.
A member of the Signal Transducers and Activators of Transcription family of transcription factors, STAT1 is involved in upregulating genes due to a signal by either type I or type II interferons.
A member of the Signal Transducers and Activators of Transcription family of transcription factors, STAT3 is tyrosine-phosphorylated and activated by a number of kinases.
The Ski complex is a multi-protein complex involved in the three prime end degradation of messenger RNAs.
RLI is an essential and highly conserved protein that is required for both eukaryotic translation initiation and ribosome biogenesis.
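Several of the terms above (reading frame, start codon, stop codon, open reading frame) fit together in one short sketch. The Python below scans the three forward reading frames of a DNA string for stretches that begin with ATG and end at the first in-frame stop codon. It is a simplified illustration, not a gene finder: the function name `find_orfs`, the `min_codons` parameter and the example sequence are all hypothetical, and the reverse strand is ignored.

```python
# Minimal sketch: find open reading frames on the forward strand only.

STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (frame, start, end, sequence) for each ATG...stop stretch.

    `start` and `end` are 0-based slice indices into `dna`; the stop codon
    is included in the returned sequence. `min_codons` counts the codons
    before the stop codon, including the ATG itself.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ATG in this frame
    return orfs

seq = "CCATGAAATGGTAACC"  # hypothetical example sequence
for orf in find_orfs(seq):
    print(orf)  # (2, 2, 14, 'ATGAAATGGTAA')
```

In this example, frame 0 reads TGA at positions 3 to 5 but no ATG precedes it, and the ATG in frame 1 never reaches a stop codon, so only the frame 2 stretch is reported. The same bases can therefore be read in more than one way depending on the reading frame.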