DNA Fingerprinting
by Prof. Siddharth Sanghvi
1. Introduction: The Uniqueness of DNA
As revealed by the Human Genome Project, approximately 99.9 percent of the nucleotide base sequence is identical among all humans. Even so, the remaining 0.1 percent of the roughly 3 × 10^9 base pairs in the human genome still amounts to millions of positions that can differ between any two individuals. It is these small differences in DNA sequence that contribute to the unique phenotypic appearance and characteristics of every individual.
If one were to identify genetic differences between individuals or within a population by sequencing their entire DNA every time, it would be a daunting and expensive task. Imagine comparing two sets of 3 × 10^6 base pairs (the 0.1% difference). DNA Fingerprinting, also known as DNA Profiling, provides a quick and efficient way to compare specific DNA sequences between individuals.
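The scale of the comparison can be checked with a few lines of arithmetic. This is only a sketch using the two figures quoted above (a genome of about 3 × 10^9 base pairs and 99.9 percent identity); the variable names are illustrative.

```python
# Back-of-the-envelope arithmetic using only the figures quoted above.
GENOME_SIZE_BP = 3 * 10**9     # approximate size of the human genome
DIFFERING_PER_THOUSAND = 1     # 100% - 99.9% = 0.1%, i.e. 1 base in 1000

differing_bp = GENOME_SIZE_BP * DIFFERING_PER_THOUSAND // 1000
print(f"{differing_bp:,} base pairs can differ between two people")
# prints: 3,000,000 base pairs can differ between two people
```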
2. Discovery and Pioneers
- The technique of DNA Fingerprinting was initially developed by Alec Jeffreys in 1984 at the University of Leicester, UK.
- He used a specific type of repetitive DNA sequence, known as Variable Number of Tandem Repeats (VNTR), as a probe due to its high degree of polymorphism.
Father of DNA Fingerprinting in India:
- In India, Dr. Lalji Singh is widely recognized as the "Father of DNA Fingerprinting". He developed DNA fingerprinting technology at the Centre for Cellular and Molecular Biology (CCMB) in Hyderabad, which pioneered its use in the country, and later founded the Centre for DNA Fingerprinting and Diagnostics (CDFD), also in Hyderabad.
3. The Basis of DNA Fingerprinting: Repetitive DNA
The fundamental principle behind DNA fingerprinting lies in identifying differences in specific regions of the DNA sequence known as repetitive DNA. These are stretches of DNA where a small sequence is repeated many times in tandem.
3.1. Genomic DNA Composition: Coding vs. Non-coding
It's important to understand the overall composition of the human genome. While the Human Genome Project revealed that 99.9% of DNA is identical between individuals, it also showed that:
- Less than 2 percent of the human genome actually codes for proteins (these are the 'genes' that produce functional products).
- A very large portion of the human genome is made up of non-coding DNA sequences, which include a significant amount of repetitive DNA. This non-coding DNA is often referred to as "junk DNA" because its direct function in protein synthesis is not yet fully understood, though it plays crucial roles in chromosome structure, regulation, and evolution.
Therefore, the vast majority of our genome, including much of what forms the "bulk DNA" (explained next), consists of non-coding and repetitive sequences.
3.2. Bulk DNA vs. Satellite DNA (Separation by Density)
During density gradient centrifugation of total genomic DNA, the DNA separates into distinct fractions based on their buoyant density. This process reveals two main components:
- Bulk DNA: This forms the major peak during centrifugation and represents the vast majority of the genomic DNA. Crucially, "bulk DNA" is not entirely non-repetitive: it contains unique sequences (including the coding genes) as well as many types of repetitive sequence that do not form distinct satellite bands.
- Satellite DNA: This forms the other small, distinct peaks that are separated from the main "bulk DNA" peak. Satellite DNA sequences are a specific type of repetitive DNA characterized by:
  - Being highly repetitive (short sequences repeated many times).
  - Having a different base composition (unusually AT-rich or GC-rich) compared to the average genomic DNA. This difference in base composition leads to a different buoyant density, causing them to separate as distinct "satellite" bands during centrifugation.
In essence, while the genome as a whole contains a large amount of repetitive DNA (most of the "junk" DNA), only a subset of highly repetitive sequences differ enough in base composition, and therefore in buoyant density, to form separate "satellite" peaks during centrifugation. An AT-rich satellite is lighter than bulk DNA and a GC-rich satellite is heavier. The "bulk DNA" still contains a significant amount of repetitive DNA, just not the kind that separates into distinct satellite bands by this method.
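To see why base composition changes where a fraction settles in the gradient, the sketch below estimates buoyant density from GC content. It assumes a commonly quoted empirical approximation for DNA in a caesium chloride gradient, density = 1.660 + 0.098 × (GC fraction) g/cm^3, which is not part of these notes. The three sequences and the function names are hypothetical and chosen only to give contrasting compositions.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def buoyant_density_cscl(seq: str) -> float:
    """Approximate buoyant density in g/cm^3.

    Assumption: the empirical linear rule 1.660 + 0.098 * GC fraction,
    quoted for DNA in CsCl gradients; treat the result as a rough estimate.
    """
    return 1.660 + 0.098 * gc_fraction(seq)


# Hypothetical sequences, for illustration only.
bulk_like = "ATGC" * 25            # 50% GC, stands in for average DNA
at_rich_satellite = "AATAT" * 20   # 0% GC, lighter than the bulk
gc_rich_satellite = "GGCGC" * 20   # 100% GC, heavier than the bulk

for name, seq in [("bulk-like", bulk_like),
                  ("AT-rich satellite", at_rich_satellite),
                  ("GC-rich satellite", gc_rich_satellite)]:
    print(f"{name}: {buoyant_density_cscl(seq):.3f} g/cm^3")
```

Under this approximation the AT-rich repeat settles above the main band and the GC-rich repeat below it, which is the sense in which satellites appear as separate peaks.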
3.3. Micro-satellites and Mini-satellites
Satellite DNA is further classified into various categories based on their base composition, the length of the repetitive segment, and the number of repetitive units:
- Mini-satellites: These are also known as Variable Number of Tandem Repeats (VNTRs). They consist of a short repeating unit (typically 10-100 base pairs long) arranged in tandem (one copy after another) in many copies. The total length of a VNTR locus (i.e., the entire region containing these repeats) ranges from approximately 0.1 to 20 kilobases (kb). The copy number at a given VNTR locus varies from chromosome to chromosome within an individual, and even more so between different individuals. This variation in copy number leads to a very high degree of polymorphism.
- Micro-satellites: These are even shorter repetitive sequences, typically 1-6 base pairs long, repeated multiple times. They are also highly polymorphic and are increasingly used in modern DNA fingerprinting techniques, often amplified by PCR, due to their smaller size and ease of amplification.
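The idea of "copy number at a locus" can be made concrete with a small sketch that counts how many times a repeat unit occurs back to back in a sequence. The repeat unit, the flanking bases, and the function name are hypothetical; the same counting applies to a longer mini-satellite unit.

```python
def max_tandem_copies(sequence: str, unit: str) -> int:
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        copies = 0
        pos = start
        # Walk forward one unit at a time while the repeat continues.
        while sequence.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best


# Hypothetical micro-satellite locus: a 4-bp unit between fixed flanks.
UNIT = "GATA"
allele_1 = "CCTG" + UNIT * 7 + "TTCA"    # 7 repeats
allele_2 = "CCTG" + UNIT * 11 + "TTCA"   # 11 repeats

print(max_tandem_copies(allele_1, UNIT))  # 7
print(max_tandem_copies(allele_2, UNIT))  # 11
```

The two alleles share the same flanking sequence and differ only in repeat count, which is exactly the kind of length difference that fingerprinting detects.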
3.4. DNA Polymorphism
Polymorphism: In simple terms, it refers to variation at the genetic level. These variations arise due to mutations.
- An inheritable mutation is considered a DNA polymorphism if it is observed in a population at a frequency greater than 0.01 (that is, 1 percent).
- Mutations can occur in somatic cells or germ cells (cells that produce gametes). If a germ cell mutation does not severely impair an individual's ability to reproduce, it can be transmitted to offspring and spread through the population.
- Polymorphisms are more frequently observed in non-coding DNA sequences because mutations in these regions are less likely to have an immediate negative effect on an individual's reproductive ability (as they don't alter protein products). This allows these mutations to accumulate over generations, forming the basis of genetic variability and polymorphism.
- The high degree of polymorphism in repetitive DNA sequences (like VNTRs and microsatellites) makes them ideal markers for individual identification because the probability of two unrelated individuals having the exact same pattern of these repeats is extremely low.
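A rough sense of why a match between unrelated people is so unlikely comes from multiplying per-locus frequencies. The sketch below assumes that the loci are inherited independently, so the frequencies can simply be multiplied, and it uses made-up frequencies for five loci; neither the numbers nor the variable names come from real population data.

```python
from math import prod

# Hypothetical: the fraction of the population that shares one person's
# pattern at each of five independently inherited loci.
pattern_frequencies = [0.10, 0.05, 0.08, 0.12, 0.06]

# Assumption: independence between loci, so the frequencies multiply.
random_match_probability = prod(pattern_frequencies)

print(f"chance of a coincidental match: {random_match_probability:.2e}")
print(f"about 1 in {round(1 / random_match_probability):,}")
```

Even though each single pattern is fairly common in this example, the combined probability falls to a few in a million, and every additional locus shrinks it further.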
4. Methodology of DNA Fingerprinting (RFLP-based)
The traditional DNA fingerprinting technique, as developed by Alec Jeffreys, involves several key steps. Modern methods often incorporate PCR for increased sensitivity, but the fundamental principles remain.
- Isolation of DNA:
  - DNA is extracted from any biological sample containing cells, such as blood, hair follicles, skin, bone, saliva, or sperm. Since DNA from every tissue of an individual shows the same degree of polymorphism, any cellular sample can be used.
- Digestion of DNA by Restriction Endonucleases:
  - The isolated DNA is cut into fragments at specific recognition sites by restriction enzymes (also known as 'molecular scissors'). These enzymes recognize and cleave DNA at particular nucleotide sequences. Because the number of tandem repeats at a VNTR locus varies, the restriction sites flanking the repeats lie at different distances from each other in different individuals, producing fragments of different lengths.
- Separation of DNA Fragments by Gel Electrophoresis:
  - The DNA fragments generated by restriction digestion are separated by size using agarose gel electrophoresis. Smaller fragments migrate faster and farther through the gel than larger ones. At this stage the fragments are only spread out by length; the individual-specific band pattern becomes visible after the probe-based steps that follow.
- Southern Blotting (Transferring):
  - The separated DNA fragments are transferred from the gel to a synthetic membrane, such as nitrocellulose or nylon. This process is called Southern blotting. The DNA fragments are denatured (separated into single strands) before transfer to allow for probe hybridization.
- Hybridization with Labelled VNTR Probe:
  - The membrane with the transferred DNA is then incubated with a radiolabelled VNTR probe. A probe is a single-stranded DNA or RNA molecule that is complementary to a specific target sequence. The VNTR probe binds (hybridizes) only to the repetitive DNA sequences on the membrane that are complementary to it. Because the number of repeats varies, the probe binds to fragments of different lengths in different individuals, creating a unique pattern.
- Detection by Autoradiography:
  - After hybridization, the membrane is washed to remove unbound probe. The hybridized probe (which is radioactive) is then detected by autoradiography: the membrane is placed against an X-ray film, and the radioactive probe exposes the film, creating dark bands at the positions where the VNTR sequences are located.
  - The resulting pattern of bands is unique for each individual (except identical twins) and is known as the DNA fingerprint.
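The steps above can be sketched as a toy simulation: digest, sort by size, and keep only the fragments that a probe would light up. This is a simplified illustration, not a model of the laboratory procedure. GAATTC is the recognition sequence of the enzyme EcoRI, which cuts after the first G; everything else (the repeat unit, the flanking bases, the helper names, and the use of plain substring search in place of hybridization to a complementary strand) is an assumption made for the example.

```python
SITE = "GAATTC"              # EcoRI recognition sequence, cut as G^AATTC
REPEAT = "GGAGGTGGGCAGGAAG"  # hypothetical 16-bp mini-satellite unit


def make_locus(copies: int) -> str:
    """Hypothetical chromosome region: a VNTR between two restriction sites."""
    return "ACGT" * 3 + SITE + "TTAC" + REPEAT * copies + "CATT" + SITE + "TGCA" * 3


def digest(dna: str, site: str = SITE, cut_offset: int = 1) -> list[str]:
    """Cut `dna` at every occurrence of `site`, `cut_offset` bases into the site."""
    fragments, start = [], 0
    pos = dna.find(site)
    while pos != -1:
        fragments.append(dna[start:pos + cut_offset])
        start = pos + cut_offset
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])
    return fragments


def fingerprint(chromosomes: list[str], probe: str) -> list[int]:
    """Band sizes (bp) of probe-containing fragments, largest first.

    Largest first mirrors the gel: big fragments stay nearest the wells.
    Fragments of equal length co-migrate, so they count as one band.
    """
    bands = {len(f) for dna in chromosomes for f in digest(dna) if probe in f}
    return sorted(bands, reverse=True)


# Each person carries two copies of the locus, one per homologous chromosome.
person_a = [make_locus(3), make_locus(7)]
person_b = [make_locus(5), make_locus(9)]

print(fingerprint(person_a, REPEAT))  # [126, 62]
print(fingerprint(person_b, REPEAT))  # [158, 94]
```

The flanking fragments are produced by the digest as well, but they contain no repeat, so the probe ignores them. Only the fragments whose length depends on the repeat count appear as bands.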
The sensitivity of this technique has been significantly increased by the use of Polymerase Chain Reaction (PCR). PCR allows for the amplification of specific repetitive regions (like microsatellites or VNTRs) from even minute amounts of DNA, making it possible to perform DNA fingerprinting analysis from a single cell.
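The reason a minute sample is enough can be illustrated with idealized doubling arithmetic. The sketch assumes that every PCR cycle exactly doubles the number of target copies; real reactions are less efficient and eventually plateau, so the result is an upper bound. The function name is illustrative.

```python
def ideal_pcr_copies(initial_copies: int, cycles: int) -> int:
    """Copies after `cycles` rounds, assuming perfect doubling each cycle."""
    return initial_copies * 2 ** cycles


# A single starting template after 30 idealized cycles.
print(f"{ideal_pcr_copies(1, 30):,}")  # 1,073,741,824
```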
5. Applications of DNA Fingerprinting
DNA fingerprinting has revolutionized various fields due to its high accuracy and ability to identify individuals based on their unique genetic profiles.
5.1. Forensic Science and Crime Investigation
- DNA fingerprinting is an indispensable tool in crime investigation. DNA samples collected from a crime scene (e.g., blood, semen, hair, skin cells, or saliva left on items such as cigarette butts) can be compared with DNA samples from suspects.
- The unique banding pattern (DNA fingerprint) generated from the crime scene evidence is compared to the patterns from suspects. If the patterns match, it provides strong evidence linking the suspect to the crime. If they don't match, it can exonerate a suspect.
- This method is highly reliable because the probability of two unrelated individuals having identical DNA fingerprints is astronomically low.
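The comparison described above can be sketched as a check of band patterns. Band sizes are given in base pairs and are entirely made up, as are the suspect labels and the function name. The sketch also assumes exact size matches, whereas real gel measurements carry some error and would need a tolerance.

```python
def same_pattern(bands_a, bands_b) -> bool:
    """True when two profiles contain exactly the same set of band sizes."""
    return set(bands_a) == set(bands_b)


# Hypothetical band sizes in base pairs.
evidence = [12100, 8400, 5000, 3200]
suspects = {
    "suspect_1": [11000, 8400, 4100, 3200],
    "suspect_2": [12100, 8400, 5000, 3200],
    "suspect_3": [12100, 7600, 5000, 2900],
}

matches = [name for name, bands in suspects.items()
           if same_pattern(evidence, bands)]
print(matches)  # ['suspect_2']
```

Sharing one or two bands, as suspect_1 and suspect_3 do here, is not a match; the whole pattern has to agree.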
5.2. Paternity and Maternity Disputes
- Since polymorphisms (variations in DNA sequences) are inheritable from parents to children, DNA fingerprinting is the definitive basis for paternity testing and resolving maternity disputes.
- A child inherits half of its genetic material (and thus half of its DNA fingerprinting pattern) from the biological mother and the other half from the biological father.
- By comparing the DNA fingerprint of the child with that of the alleged parents, biological relationships can be established with very high certainty. Specifically, every band in the child's DNA fingerprint must be present in either the mother's or the father's pattern. If a band in the child's profile cannot be accounted for by either alleged parent, then biological parentage is excluded.
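The exclusion rule above translates directly into a set operation: any band in the child's profile that appears in neither parent's profile is unexplained. The band sizes and names below are hypothetical, and the sketch ignores rare complications such as new mutations and measurement error.

```python
def unexplained_bands(child, mother, alleged_father) -> set:
    """Child bands that appear in neither the mother's nor the father's profile."""
    return set(child) - set(mother) - set(alleged_father)


# Hypothetical band sizes in base pairs.
child = [9000, 6500, 4000, 2500]
mother = [9000, 7000, 4000, 3000]
alleged_father_1 = [8000, 6500, 2500, 1500]
alleged_father_2 = [8000, 6000, 2500, 1000]

print(unexplained_bands(child, mother, alleged_father_1))  # set()
print(unexplained_bands(child, mother, alleged_father_2))  # {6500}
```

An empty result means the alleged father cannot be excluded; a non-empty result, as for the second man, means at least one of the child's bands has no source among the alleged parents.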
5.3. Genetic Diversity and Evolutionary Biology
- DNA fingerprinting is also used to assess genetic diversity within and between populations and species.
- It helps in understanding evolutionary relationships, tracking migration patterns of populations, identifying endangered species, and assessing the genetic health of populations, which is crucial for conservation efforts.
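One way to put a number on the genetic diversity of a population is expected heterozygosity (also called gene diversity), computed as 1 minus the sum of squared allele frequencies at a locus. This measure is a standard one from population genetics rather than something defined in these notes. The alleles below are hypothetical repeat counts at a single locus, and the function name is illustrative.

```python
from collections import Counter


def expected_heterozygosity(alleles) -> float:
    """1 - sum(p_i^2), where p_i is the frequency of each distinct allele."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical repeat counts sampled at one locus in two populations.
low_diversity = [7] * 9 + [8]          # one allele dominates
high_diversity = [5, 6, 7, 8, 9] * 2   # five equally common alleles

print(f"{expected_heterozygosity(low_diversity):.2f}")   # 0.18
print(f"{expected_heterozygosity(high_diversity):.2f}")  # 0.80
```

A value near 0 indicates that almost everyone carries the same allele, which is the kind of signal that raises concern for small or inbred populations in conservation work.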