Target Selection by IS1111 Insertion Sequences
Ruth M. Hall, Alexander Wu, Christopher J. Harmer, Stephanie J. Ambrose, Sandro F. Ataide.
School of Life and Environmental Sciences, The University of Sydney
The IS1111 and IS110 families include an unusual group of insertion sequences (IS) that were known to differ in several ways from well-studied IS but were poorly characterised. Although both families use a transposase with two catalytic domains, a RuvC-like domain and a novel domain, other features of members of the two families differ significantly. In both families, each specific IS is always found surrounded by the same or a similar sequence, indicating that the IS targets a specific sequence motif. We recently showed that an RNA, the seekRNA, transcribed from a region internal to the IS provides the recognition of the specific target and guides the IS both from it and to it. However, much remains to be understood about these IS. IS in the IS1111 family are distinguished by subterminal inverted repeats with extensions that are longer on one side. Using a specific IS1111 family IS, we have shown that the transposition frequency is affected by the strength of a promoter located outside the IS but upstream of the tnp gene and the seekRNA determining region.
We have also used bioinformatic approaches to detect known and new IS1111 family members in Escherichia coli genomes and to identify the precise IS ends, which were then confirmed experimentally. This revealed that IS in different subgroups have terminal extensions of different lengths. To assess the accuracy of target recognition, we also examined the target site consensus and variation for a group of related IS found predominantly in E. coli. Manual curation revealed an unexpected tolerance for insertion of bases at the centre of the target that obscured the target consensus when a purely bioinformatic approach was used. Examination of the location of the target for several different IS revealed that many targets are in mobile elements from other families.
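
The effect of a central insertion on an automated consensus can be illustrated with a minimal Python sketch. It is not the analysis used in this work: the sequences are invented toy strings rather than real IS1111 family target sites, and the function names, the 75% agreement threshold and the half-site length of 5 are all assumptions made for illustration. A naive column-by-column consensus loses the right-hand half of the target as soon as some sites carry an extra central base, whereas anchoring each half-site to its own end recovers both halves.

```python
from collections import Counter

def column_consensus(seqs, min_fraction=0.75):
    """Column-wise consensus; 'N' where no base reaches min_fraction.

    Sequences are compared position by position from the left, and
    columns beyond the shortest sequence are ignored.
    """
    consensus = []
    for column in zip(*seqs):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)

def end_anchored_consensus(seqs, half, min_fraction=0.75):
    """Consensus of the left and right half-sites, each anchored to its own end."""
    left = column_consensus([s[:half] for s in seqs], min_fraction)
    right = column_consensus([s[-half:] for s in seqs], min_fraction)
    return left, right

# Hypothetical target sites (toy data): two carry an extra base at the centre.
targets = [
    "ACGTTGCAAG",
    "ACGTTGCAAG",
    "ACGTTAGCAAG",  # extra A between the two half-sites
    "ACGTTCGCAAG",  # extra C between the two half-sites
]

naive = column_consensus(targets)
anchored = end_anchored_consensus(targets, half=5)

print(naive)     # ACGTTNNNAN
print(anchored)  # ('ACGTT', 'GCAAG')
```

In this toy case the left-aligned consensus is ambiguous across most of the right half, while the end-anchored version shows that both half-sites are conserved and only the spacing between them varies.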