Sunday 19 June 2011

STRs

This will be presented in some detail because STRs are important in current forensic DNA testing.  The abbreviation STR stands for Short Tandem Repeat.  STRs are the type of DNA used in most of the currently popular forensic DNA tests.  STR is a generic term that describes any short, repeating DNA sequence.  For example, the DNA sequence ATATATATATAT is an STR that has a repeating motif consisting of two bases, A and T.  It turns out that our DNA has a variety of STRs scattered among DNA sequences that encode cellular functions.  For reasons that are not entirely understood, people vary from one another in the number of repeats they have, at least for some STR loci.  For example, person #1 may have ATATAT at a particular locus while person #2 may have ATATATATATAT.  Thus, STRs are often variable (polymorphic), and these variations are used to try to distinguish people.
The term STR doesn't necessarily imply PCR.  PCR is one of many methods that might be used to help analyze STRs; STRs have also been analyzed by DNA sequencing, for example.  To understand PCR-assisted STR typing, it is useful to briefly consider how such PCRs are designed.
Suppose that laboratory data revealed the following DNA sequence:
 --ATGCTAGTATTTGGATAGATAGATAGATAGATAGATAGATAAAAAAATTTTTTTT--
The STR in this sequence is GATA repeated 7 times.  The dashes at the beginning and end of the overall sequence shown indicate that there is more sequence available both upstream and downstream of the region shown.  Remember, DNA is a very long, linear molecule and we are just going to look at a small region of it.
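As a rough sketch of how the repeat count at such a locus could be read off a sequence, the Python below scans for the longest uninterrupted run of a motif.  The function name longest_tandem_run is made up for this illustration, and the locus is the example sequence above with the dashes left out; real typing software is considerably more involved.

```python
def longest_tandem_run(sequence, motif):
    """Return (start_index, repeat_count) for the longest uninterrupted
    run of `motif` in `sequence`, or (-1, 0) if the motif never occurs."""
    if not motif:
        raise ValueError("motif must not be empty")
    best_start, best_count = -1, 0
    step = len(motif)
    # Try every possible starting position and count back-to-back copies.
    for start in range(len(sequence) - step + 1):
        count = 0
        pos = start
        while sequence.startswith(motif, pos):
            count += 1
            pos += step
        if count > best_count:
            best_start, best_count = start, count
    return best_start, best_count


# The example locus from the text: left flank, GATA x 7, right flank.
locus = "ATGCTAGTATTTG" + "GATA" * 7 + "AAAAAATTTTTTTT"
print(longest_tandem_run(locus, "GATA"))   # (13, 7)

# Two people with different numbers of AT repeats at the same locus.
print(longest_tandem_run("ATATAT", "AT"))  # (0, 3)
```

The run is found by brute force, which is fine for short illustrative sequences like these.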
Now, let's say we want to design a PCR to examine this same locus in other people.  To design the PCR, we need two primers, short synthetic DNA molecules that recognize the region.  One primer might be ATGCTAGTA (the first nine bases of the above sequence), a sequence that would recognize the DNA flanking the left side of the STR.  The second primer might be AAAAAAAATTTTTT.  This is called the downstream primer, and it might be difficult to recognize in the sequence at first because it is the complement of the sequence AAAAAATTTTTTTT (the last fourteen bases on the right in the longer sequence above).  See "General Considerations" for a more detailed discussion.
What is the complement of a DNA sequence?  This might be more information than you would like, but to really understand PCR primers, try to walk through this:
The complement of a DNA sequence, as the term is used here (strictly speaking, the reverse complement), is the sequence written backwards exchanging all A's for T's, all T's for A's, all G's for C's and all C's for G's.  For example, the complement of the sequence AGTA is TACT.  An easy way to get the complement of a DNA sequence is to write another line below the original sequence, remembering that A pairs with T and G pairs with C.  Then read the lower line backwards:
So, for the sequence:
 GATCTTAGCTTTAAAGCCC
 write the complementary line below it giving:
GATCTTAGCTTTAAAGCCC
CTAGAATCGAAATTTCGGG
 Then, just read the lower line backwards (from right to left) giving the complement: 
GGGCTTTAAAGCTAAGATC
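The two-step recipe above (write the pairing bases on a lower line, then read that line backwards) can be sketched in a few lines of Python.  The names PAIRS, complement_line and reverse_complement are chosen for this illustration only, and the sketch assumes the sequence contains nothing but the upper-case letters A, T, G and C.

```python
# Base-pairing rules: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_line(sequence):
    """Write the pairing base under each base (the 'lower line' in the text)."""
    return "".join(PAIRS[base] for base in sequence)


def reverse_complement(sequence):
    """Read the lower line backwards, giving what the text calls the complement."""
    return complement_line(sequence)[::-1]


original = "GATCTTAGCTTTAAAGCCC"
print(complement_line(original))     # CTAGAATCGAAATTTCGGG
print(reverse_complement(original))  # GGGCTTTAAAGCTAAGATC
print(reverse_complement("AGTA"))    # TACT
```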
In practical terms, the upstream (left) primer can be a direct reading of the target sequence, while the downstream (right) primer must be the complement of the directly read sequence.
If the above is confusing, it may suffice to think of the primers as two arrows that point at one another with the STR located between them.  This is how the PCR targets the locus and the STR.
In practice, PCR primers are usually at least 17 bases in length, longer than the short primers used for illustration here.  The point here is that to use PCR to target an STR, the primers recognize constant, conserved sequences that flank the actual STR.  This means that the actual length of the target sequence depends on where the primers are placed in the flanking sequence.  This matters in practice: the Promega and PE Applied Biosystems test kits, for example, use mostly different primers.  The upstream primer could, for instance, be designed to recognize DNA 100 bases upstream of the sequence shown.  Similarly, the downstream primer could be designed to recognize DNA further downstream.  Placing the primers further upstream and downstream would make all alleles (variations) of the STR appear larger than if the primers were placed close to the STR itself.  Wherever the primers are placed, that defines the region we will examine.  That region will then vary among individuals due to changes in the STR itself, as explained above for the simple STR based on the repeating AT motif.
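To make the effect of primer placement concrete, here is a small "PCR on paper" sketch in Python.  It is only an illustration under stated assumptions: the helper names pcr_product and make_locus are invented for this example, the locus is the GATA example from earlier, and FAR_UPSTREAM is a made-up 20-base stretch standing in for the additional upstream DNA (20 bases rather than 100, to keep the example short).

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence):
    """The 'complement' in the sense used in the text: pair bases, read backwards."""
    return "".join(PAIRS[base] for base in reversed(sequence))


def pcr_product(template, upstream_primer, downstream_primer):
    """Return the part of the template bounded by the two primers, or None."""
    start = template.find(upstream_primer)
    if start == -1:
        return None
    # The downstream primer is the complement of the sequence as written,
    # so search the template for the complement of the primer.
    site = reverse_complement(downstream_primer)
    end = template.find(site, start + len(upstream_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Hypothetical extra upstream sequence (not from the original example).
FAR_UPSTREAM = "CGGTCACCTGAAGTCCTGAA"


def make_locus(repeats):
    """The example locus with a chosen number of GATA repeats."""
    return FAR_UPSTREAM + "ATGCTAGTATTTG" + "GATA" * repeats + "AAAAAATTTTTTTT"


DOWNSTREAM = "AAAAAAAATTTTTT"
NEAR_PRIMER = "ATGCTAGTA"        # close to the STR, as in the text
FAR_PRIMER = FAR_UPSTREAM[:10]   # placed further upstream

for repeats in (5, 7):
    locus = make_locus(repeats)
    near = pcr_product(locus, NEAR_PRIMER, DOWNSTREAM)
    far = pcr_product(locus, FAR_PRIMER, DOWNSTREAM)
    print(repeats, "repeats:", len(near), "bases vs", len(far), "bases")
```

With either primer placement the two alleles still differ by the same number of bases (two repeats of four bases each); moving the upstream primer only shifts every product by the same fixed amount.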

After PCR is used to provide many copies of a given person's STR, the products (copies) are separated according to size on an electrophoretic gel (see RFLP above for more details about gels).  The gel can be flat, as for RFLP, or it can be in a thin tube, called a capillary, with a detector at the end of it.  Typical flat gel STR results show a reference ladder alongside the unknown samples, each appearing as a column of black bars.
The black bars are called bands.  Each band is made up of many identical-size DNA molecules that were produced by PCR.  The gel separates smaller bands (DNA molecules) from larger ones.  The bands near the lower end of the gel are smaller (i.e. the DNA fragments are shorter in length) than those near the top.  For example, looking at the reference ladder, the first band near the lower end of the gel is the smallest STR allele.  For simplicity, let's say this smallest band contains a single repeat such as CATG, flanked by other DNA that the primers actually recognize in everyone's DNA.  The next higher band in the ladder would then contain 2 repeats, CATGCATG; the next 3 repeats, and so on.  By comparing the positions of bands in the unknown samples with the reference ladder, the allele sizes are deduced.  In this example, Sample A had bands at the 2-repeat position and the 5-repeat position.  Common terminology would call this sample a 2,5 type.  Sample B would be called 2,4.  For a single person, each locus normally has two alleles and these can be different (heterozygous) or the same (homozygous).
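The ladder comparison can be sketched as a simple lookup.  The Python below is a minimal illustration, not how any particular laboratory software works: FLANK_LENGTH (the combined length of the flanking DNA in each product) is an invented number, the band sizes for Samples A and B are chosen to match the 2,5 and 2,4 types described above, and a single band is assumed to mean that both alleles are the same.

```python
MOTIF_LENGTH = 4    # the CATG repeat from the example
FLANK_LENGTH = 40   # hypothetical combined length of the flanking DNA


def ladder_sizes(max_repeats):
    """Map fragment size in bases to repeat count for a reference ladder."""
    return {FLANK_LENGTH + MOTIF_LENGTH * n: n for n in range(1, max_repeats + 1)}


def call_genotype(band_sizes, ladder):
    """Translate a sample's band sizes into an allele type such as '2,5'."""
    alleles = sorted(ladder[size] for size in band_sizes)
    if len(alleles) == 1:
        # Assumption: one band means the same allele is present twice.
        alleles = alleles * 2
    zygosity = "homozygous" if alleles[0] == alleles[1] else "heterozygous"
    return ",".join(str(a) for a in alleles), zygosity


ladder = ladder_sizes(6)
print(call_genotype([60, 48], ladder))  # Sample A: ('2,5', 'heterozygous')
print(call_genotype([48, 56], ladder))  # Sample B: ('2,4', 'heterozygous')
print(call_genotype([52], ladder))      # one band: ('3,3', 'homozygous')
```

A band whose size does not match any ladder position raises a KeyError in this sketch; real gels are read by band position, and measured sizes carry some uncertainty.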
