Suzanne Sindi
Research Interests

Since coming to the University of Maryland I have been interested in a number of research areas. As part of the Chaos Group I have been exposed to varied problems in applied nonlinear dynamics and have become particularly interested in computational biology and genomics, as well as in the measure-theoretic properties of dynamical systems. I am currently involved in research in two different areas.
Brief outlines of my research areas follow. If you are interested in pre-prints of papers in progress, please email me at ssindi(at)math.umd.edu.

Current Research Projects

Repetitive Regions in DNA: The genome, or DNA sequence, of an organism is a sequence of nucleotides represented by letters from the alphabet {A,C,G,T}. The length of this sequence varies from several million letters in some bacteria to several billion in a mammalian genome. While genomes have been obtained for a variety of organisms, including fruit fly, worm and rat, there is still much to be learned about the properties of the genome itself. Repeat regions, sequences that occur repeatedly throughout the genome, are an area of significant interest.

Since DNA sequences come from a small alphabet, we expect to see subsequences that occur more than once. A sequence of L letters contains L - k + 1 windows of length k, and there are only 4^k distinct k-letter words; so, for example, if a DNA sequence is longer than 4^k + k + 1 letters for some positive integer k, then by the Pigeonhole Principle at least one k-letter subsequence is repeated. However, DNA contains substantially longer sequences that appear far more often than chance would allow. These repeated regions of DNA have been the focus of my research.

The concept of a repetitive region in a genome is difficult to define precisely. Repeat regions can occur many times in a genome, and distinct copies differ from one another. Papers on repetitive DNA tend to be vague about what constitutes a repeat region. I began my research by addressing this problem, providing a quantification of repeat regions that is well defined for an arbitrary DNA sequence. Define a repeat string S to be a subsequence of DNA in which, for some fixed n, every n-letter word in S occurs at least twice in the genome. I have investigated the structure of repeat strings in the genomes of C. elegans (a worm) and Arabidopsis (a plant), and I have found a surprising power law structure in the distribution of lengths of repeat strings.
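The repeat-string definition above can be made concrete with a short Python sketch. This is only an illustration under stated assumptions, not the code used in the research: it treats a repeat string as a maximal contiguous substring, counts overlapping occurrences on the forward strand only, and the function names `repeat_strings` and `length_histogram` are hypothetical.

```python
from collections import Counter


def repeat_strings(genome: str, n: int) -> list[str]:
    """Return maximal substrings in which every n-letter word occurs
    at least twice in `genome` (assumption: forward strand only,
    overlapping occurrences counted)."""
    last = len(genome) - n + 1  # number of n-letter windows
    counts = Counter(genome[i:i + n] for i in range(last))
    found = []
    start = None  # start index of the current run of repeated words
    for i in range(last):
        repeated = counts[genome[i:i + n]] >= 2
        if repeated and start is None:
            start = i
        elif not repeated and start is not None:
            # The run covers windows start..i-1, i.e. letters start..i-2+n.
            found.append(genome[start:i - 1 + n])
            start = None
    if start is not None:
        found.append(genome[start:])  # run extends to the end of the genome
    return found


def length_histogram(strings: list[str]) -> Counter:
    """Count how many repeat strings there are of each length."""
    return Counter(len(s) for s in strings)
```

For the toy sequence `"ACGTTTACGTAA"` with `n = 3`, only the words `ACG` and `CGT` occur twice, so the sketch returns two repeat strings, both `"ACGT"`. Plotting the histogram on log-log axes is one way to look for power law behavior in real data.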
I have studied several simple models of the evolution of repeat strings in the genome and found that the underlying power law structure emerges as the stationary distribution of this evolutionary process.

Symbolic Dynamical System for Reconstructing Repetitive DNA: The process of determining the DNA sequence of an organism, called genome assembly, remains extremely expensive and is far from perfected. The initial draft of the Human Genome took a decade to develop and cost billions of dollars. For sequencing the genomes of individual organisms to be viable, the process of genome assembly must be substantially improved. The main algorithmic complication in genome assembly is the presence of highly repetitive DNA. Unfortunately, repetitive DNA can make up a significant portion of the genome of an organism. For example, nearly 50% of the human genome is expected to be repetitive.

The goal of my work in genome assembly was to use the available information to construct what we call consensus copies of repetitive regions. I built a symbolic dynamical systems algorithm on a graph whose nodes are k-letter words from the DNA alphabet. A trajectory of this dynamical system is a string that ideally corresponds to a repeat string in the genome. I have tested my method for constructing consensus copies of repeat strings using data from several species, including fruit fly, rat, chicken and mosquito. I am currently looking at developing software to assemble these regions that could be provided to the genome assembly community.

Wada Basins in One-Dimensional Maps: In dynamical systems with more than one coexisting attractor, the basins of attraction can become entangled in such a way that the boundaries of all basins coincide. A special type of entangled basin boundary is that of a Wada basin, where every point on the boundary of the basin is simultaneously a boundary point of the basins of at least three distinct attractors.
Such phenomena have been shown to occur in many physical systems, such as the forced damped pendulum and the forced Duffing oscillator. Additionally, under certain conditions Wada basins also exist in one-dimensional maps. Wada basins have been shown to emerge from a tangent bifurcation when there are three coexisting attracting fixed points and particular conditions are met. We look to extend these results on the emergence of Wada basins to periodic points of arbitrary period and to higher-dimensional systems.
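A numerical check of the Wada property can be sketched in Python: classify initial conditions by the attractor they reach, then ask which basins appear in a small neighborhood of a candidate boundary point. Everything here is an assumption made for illustration. The map is a stand-in (Newton's iteration for x^3 - x, which has attracting fixed points at -1, 0 and 1) and is not one of the maps studied in this project, nor is it claimed to have Wada basins. The names `basin_of` and `basins_near` are hypothetical.

```python
ATTRACTORS = (-1.0, 0.0, 1.0)  # attracting fixed points of the stand-in map


def basin_of(x0: float, max_iter: int = 200, tol: float = 1e-9):
    """Return the index in ATTRACTORS of the fixed point reached from x0,
    or None if the orbit has not converged within max_iter steps."""
    x = x0
    for _ in range(max_iter):
        for idx, a in enumerate(ATTRACTORS):
            if abs(x - a) < tol:
                return idx
        denom = 3 * x**2 - 1
        if denom == 0:
            return None  # the map is undefined at this point
        x = x - (x**3 - x) / denom  # one Newton step for x^3 - x
    return None


def basins_near(x: float, radius: float, samples: int = 201) -> set:
    """Return the set of basin indices found on an evenly spaced grid
    over the interval [x - radius, x + radius]."""
    found = set()
    for j in range(samples):
        y = x - radius + 2 * radius * j / (samples - 1)
        b = basin_of(y)
        if b is not None:
            found.add(b)
    return found
```

A boundary point passes this test only if all three basins show up at every radius tried, and a basin is a Wada candidate only if every one of its boundary points passes. A grid can miss very thin intervals, so a result from this sketch is suggestive rather than a proof.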
Suzanne Sindi, Graduate Student, Applied Mathematics, ssindi (at) math.umd.edu. Last Updated: May 5, 2005