Genetics Chapter 21 (not 20)

Genome

-the complete set of DNA in a single cell of an organism

Genomics

-Sequencing the genome
-Compiling the Sequence
-Annotating the Sequence
-Classifying Genes, candidates (ORFs)
-a subdiscipline of genetics created by the union of classical and molecular biology with the goal of sequencing and understanding genes, gene in

Positional cloning

-the identification and subsequent cloning of a gene without knowledge of its polypeptide product or function
-the process uses cosegregation of mutant phenotypes w/ DNA markers to identify the chromosome containing the gene; the position of the gene is

Structural Genomics

-focuses on sequencing genomes and analyzing nucleotide sequences to identify genes and other important sequences such as gene-regulatory regions

Whole-genome sequencing/ shot-gun sequencing

-1) First, an entire chromosome is cut into short, overlapping fragments, either by mechanically shearing DNA in various ways or by using restriction enzymes to cleave the DNA at different locations. This creates a series of continuous fragments, or "cont

Alignment

-one of the earliest bioinformatics applications to be developed for genomic purposes was the use of algorithm-based software programs for creating a DNA sequence _______, in which similar sequences of bases, such as contigs, are lined up for comparis

High-throughput sequencing

-computer-automated sequencing
-a collection of DNA sequencing methods that outperform the standard (Sanger) method of DNA sequencing by a factor of 100-1000 and reduce sequencing costs by more than 99%

Contiguous fragments ("contig")

-a continuous DNA sequence reconstructed from overlapping DNA sequences derived by cloning or sequence analysis
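
A minimal sketch of how overlapping fragments can be merged into a contig. It uses a simple greedy strategy (always merge the pair with the longest suffix/prefix overlap), which is an assumption made for illustration and not the method used by real assemblers. The function names, the `min_len` parameter, and the example fragments are all hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge the best-overlapping pair until no overlaps remain."""
    contigs = list(fragments)
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        # Join the two fragments, keeping the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical overlapping fragments of one short sequence.
print(assemble(["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"]))  # ['ATGGCGTACGTTAGC']
```

Real genomes contain repeats and sequencing errors, so exact-match greedy merging like this is only a teaching aid.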

Clone-by-clone approach/ map-based cloning

-individual DNA fragments from restriction digests of chromosomes are aligned to create the restriction maps of a chromosome
--these restriction fragments are then ligated into vectors such as bacterial artificial chromosomes (BACs) or yeast artificial ch

Compiling (the sequence)

-the assembly of a final genomic sequence from multiple sequencing runs is known as ________

Annotation

-after a genome has been sequenced and compiled, scientists are faced with the task of identifying gene-regulatory sequences and other sequences of interest in the genome so that gene maps can be developed
-this process, called ______, relies heavily on bi

Analyzing Sequence

-AAAAGAAAAGGTTAGAAAGATGAGAGA.......
-Genetic code organized in triplets; to find coding sequence, analyze in triplets.
1. Assume first base is first in triplet:
AAA AGA AAA GGT TAG....
2. Assume second base is first in triplet:
AAA GAA AAG GTT AGA....
3.
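
The triplet analysis above can be sketched in a few lines. This is an illustrative helper (the name `reading_frames` is hypothetical) that splits the forward strand into codons starting at the first, second, and third base; trailing bases that do not fill a triplet are dropped. The complementary strand would give three more frames and is not handled here.

```python
def reading_frames(seq: str) -> list[list[str]]:
    """Split seq into triplets for each of the three forward reading frames."""
    frames = []
    for offset in range(3):
        # Step through the sequence three bases at a time from this offset.
        codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        frames.append(codons)
    return frames


# The visible part of the example sequence from the notes.
seq = "AAAAGAAAAGGTTAGAAAGATGAGAGA"
for number, codons in enumerate(reading_frames(seq), start=1):
    print(number, " ".join(codons))
```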

ORFs (open reading frames)

-sequences of triplet nucleotides that, after transcription and mRNA splicing, are translated into the amino acid sequence of a protein, including an initiation and termination codon
-typically begin with an initiation sequence, usually ATG, which transcr
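
A rough sketch of how software might flag candidate ORFs: scan each forward reading frame for an ATG and report the stretch up to the next in-frame stop codon (TAA, TAG, or TGA). This ignores introns, the reverse strand, and every other signal real gene finders use. The function name, the `min_codons` cutoff, and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ATG...stop stretch on the forward strand."""
    orfs = []
    for offset in range(3):
        start = None
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first initiation codon in this frame
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # include the stop codon
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None  # look for the next ATG
    return orfs


# Hypothetical sequence with one short ORF in the third frame.
print(find_orfs("CCATGGCTTAAGG"))  # [(2, 11, 'ATGGCTTAA')]
```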

Human Genome Project

-a coordinated international effort to determine the sequence of the human genome and to identify all the genes it contains
-most valuable contribution will perhaps be the identification of disease genes and the development of new treatment strateg

Classifying Genes

-Functional Groups - based on sequence
-Confidence Levels
1. Known gene with known function
2. Strong homolog with known function from other organisms
3. Gene with proposed function
4. Gene reported in another species, no function known
5. No homology to any

Confidence

same gene has same function

The Human Genome

Size: 3.2 Gb (Haploid genome)
Chromosomes: 24
% Coding for Genes: 5%
Intergenic Spacer: 75%
Repetitive Sequences: >50%
Genes: 25,000 - 30,000
Gene Density: 1 gene/100,000 bp
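
As a quick check on how the gene density figure relates to the other numbers on this card, dividing the genome size by the gene count gives roughly one gene per 100,000 to 130,000 bp. The helper name `bp_per_gene` is hypothetical; the inputs are the values listed above.

```python
def bp_per_gene(genome_size_bp: float, gene_count: int) -> float:
    """Average number of base pairs per gene."""
    return genome_size_bp / gene_count


GENOME_SIZE_BP = 3.2e9  # 3.2 Gb haploid genome, from the card above

for gene_count in (25_000, 30_000):
    print(f"{gene_count:,} genes: 1 gene per {bp_per_gene(GENOME_SIZE_BP, gene_count):,.0f} bp")
```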

Single-nucleotide polymorphisms (SNPs) and copy number variations (CNVs)

Most genetic differences b/w humans result from:

Single-nucleotide polymorphisms (SNPs)

-Most genetic differences/variations b/w humans result from _________
-single-base changes in the genome and variations of many _____ are associated with disease conditions

Copy number variations (CNVs)

-duplications or deletions of relatively large sections of DNA on the order of several hundred or several thousand base pairs
--results in variations in the number of copies of a DNA segment inherited by individuals

Proteomics

-the analysis of all the proteins in a cell or tissue
-Analysis of proteins present in a cell under a given set of circumstances
-Proteome - the set of proteins present or all the possible proteins encoded by a genome
-Humans - 25,000 -30,000 genes, but a

Nutrigenomics

-focuses on understanding the interactions b/w diet and genes

Stone-age genomics

-generating fascinating data from minuscule amounts of ancient DNA obtained from bone and other tissues such as hair that are tens of thousands to about 700,000 years old, and often involve samples from extinct species

Automated Sanger Sequencing

-at the peak of this technique, a single machine could produce hundreds of thousands of base pairs in a single run

Exome Sequencing

-a DNA sequencing method in which only the protein-coding regions (exons) of the genome are sequenced
-reveals mutations involved in disease by focusing only on exons as protein-coding segments of the genome

Encyclopedia of DNA Elements (ENCODE) project

-goal was to use both experimental approaches, including chip-based methods, and bioinformatics to identify and analyze functional elements of the genome, such as transcriptional start sites
-found that:
--80% of human genome is considered functional
---th

Human Microbiome Project

-a $170 million project to complete the genomes of an estimated 600-1000 microorganisms, bacteria, etc.
-goals were:
--determining if individuals share a core human microbiome
--understanding whether changes in the microbiome can be correlated with changes

Transcriptome Analysis/transcriptomics

-studies the expression of genes by a genome both qualitatively--by identifying which genes are expressed and which genes are not expressed--and quantitatively--by measuring varying levels of expression for different genes
-reveals gene-expression profiles

DNA Microarray analysis

-an ordered arrangement of DNA sequences or oligonucleotides on a substrate (often glass)
-used in quantitative assays of DNA-DNA or DNA-RNA binding to measure profiles of gene expression
-enables researchers to analyze all of a sample's expressed genes

Gene Chips/DNA Microarray

-Diagnosis can be made by screening all genes in the genome at once.
-This method uses DNA microarrays, also called ______
-consist of a glass microscope slide onto which single-stranded DNA molecules are attached, or "spotted" using a computer robot arm

Cluster Algorithm

-programs that can be used to retrieve spot-intensity data from different locations on a microarray and to group gene-expression data from one or multiple microarrays into cluster images incorporating results from many experiments
-groups genes according to whether
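
A toy sketch of the grouping idea: genes whose expression profiles rise and fall together across experiments are placed in the same cluster. The similarity measure (Pearson correlation), the 0.9 threshold, the single-pass grouping, and the gene names and values are all assumptions for illustration; the card does not say which criterion real cluster programs use.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length profiles (undefined for constant profiles)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cluster_genes(profiles: dict[str, list[float]], threshold: float = 0.9) -> list[list[str]]:
    """Add each gene to the first cluster whose founding gene it correlates with."""
    clusters: list[list[str]] = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            if pearson(profile, profiles[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])  # no similar cluster found, start a new one
    return clusters


# Hypothetical spot intensities for four genes across four experiments.
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [8, 6, 4, 2],
}
print(cluster_genes(profiles))  # [['geneA', 'geneB'], ['geneC', 'geneD']]
```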

Circadian Rhythm

-oscillations in biological activity that occur on a regular cycle of time, such as 24 hours

Microarray Process

-1) Isolate mRNA
-2) Make cDNA by reverse transcription, using fluorescently labeled nucleotides
-3) Hybridization: apply the cDNA mixture to a DNA microarray
-4) Rinse off excess cDNA, put the microarray in a scanner to measure fluorescence of each spot.

Functional Genomics

seeks to understand functional components within the genome and similarities of genomes across phylogenetic and evolutionary distances

Comparative genomics

analyzes the arrangement and organization of families of genes within and among genomes

BLAST

-software which directs searches through databanks of DNA and protein sequences
-will compute a similarity score or identity value to indicate the degree to which two sequences are similar
-one of the many sequence alignment algorithms that may sacrifice
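
To make the idea of an identity value concrete, here is a minimal sketch that computes the percentage of matching positions between two sequences that are already aligned to the same length. This is not BLAST's scoring scheme, only a simplified illustration; the function name and the example sequences are hypothetical, and "-" is assumed to mark an alignment gap.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions where both sequences carry the same base."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    # A gap never counts as a match, even against another gap.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


print(percent_identity("ATGC", "ATGA"))  # 75.0
```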

Hallmarks of annotation

-begin by annotating sequence by comparing it using BLAST to sequences already stored in various databases
-________ to annotation are the identification of gene regulatory sequences found upstream of genes (promoters, enhancers, and silencers), down stre

Genome 10K plan

-given the speed, efficiency and recent cost reductions associated with modern sequencing technologies, some scientists believe it is reasonable to expect that 10,000 (10K) vertebrate genomes can be sequenced in five years
-they believe that such a massiv