Browse Active Research Projects

Undergraduates can participate in projects for credit by registering for CS 4974 or CS 4994. Consult the Faculty Advisor or Research Supervisor before you register for either course.

Participation in a VTURCS project could also lead to an honors thesis for CS majors interested in graduating with honors.

Can't find anything that piques your curiosity? Don't be afraid to check the Computer Science faculty list for someone whose research interests you'd like to know more about. They might just have something for you.

T. M. Murali

Automatic Class Discovery in Biological Data

Faculty Advisor
T. M. Murali
Research Supervisor
T. M. Murali
Description of Work
Despite hundreds, if not thousands, of years of development in medicine, identifying complex diseases such as cancer (e.g., by viewing diseased cells under a microscope) remains, to some extent, an art. Over the last 10 years, DNA microarrays have opened up a promising avenue for this problem. A DNA microarray measures the expression levels (activities) of all the genes in a cell. Therefore, by taking DNA microarray samples from patients diagnosed with various diseases and comparing these measurements, it may be possible to put disease diagnosis on a solid molecular footing, where gene expression patterns define a molecular signature for a disease.

This project will explore the use of biclustering algorithms to automatically discover disease types and sub-classes. A bicluster isolates a subset of genes and a subset of samples with very coherent gene expression patterns. Thus, it is possible that these genes form a molecular signature for the disease associated with the samples. If no single disease is associated with the samples, the bicluster may point to a new disease or a new sub-class of an existing disease.

There are three main aspects to this project: (i) Implement different biclustering algorithms already published in the literature. (ii) Develop and implement an automatic class discovery framework that uses the biclusters computed by the algorithms implemented in step (i). There is tremendous scope for innovation and new ideas in this aspect. (iii) Validate the class discovery methodology on actual gene expression and disease datasets. This project can involve two students.
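To make the idea of a bicluster concrete, here is a minimal Python sketch. It assumes one simple notion of "coherent": every chosen gene varies by no more than a tolerance across the chosen samples. That criterion, the function names, and the toy expression values are all illustrative assumptions, not the definition or data used in this project; published biclustering algorithms use a variety of coherence measures.

```python
from typing import Dict, List, Sequence

# Toy expression matrix (hypothetical values): gene name -> level per sample.
EXPRESSION: Dict[str, List[float]] = {
    "geneA": [5.0, 5.1, 4.9, 1.0],
    "geneB": [2.0, 2.1, 2.0, 7.5],
    "geneC": [0.5, 6.0, 3.2, 3.3],
}


def gene_spread(expression: Dict[str, List[float]], gene: str,
                samples: Sequence[int]) -> float:
    """Range (max minus min) of one gene's expression over the chosen samples."""
    values = [expression[gene][s] for s in samples]
    return max(values) - min(values)


def is_coherent_bicluster(expression: Dict[str, List[float]],
                          genes: Sequence[str], samples: Sequence[int],
                          tolerance: float) -> bool:
    """Assumed criterion: each gene stays within `tolerance` across the samples."""
    return all(gene_spread(expression, g, samples) <= tolerance for g in genes)


if __name__ == "__main__":
    # geneA and geneB are nearly constant over samples 0-2, so under this
    # criterion they form a bicluster; geneC and sample 3 do not fit.
    print(is_coherent_bicluster(EXPRESSION, ["geneA", "geneB"], [0, 1, 2], 0.3))
    print(is_coherent_bicluster(EXPRESSION, ["geneA", "geneB", "geneC"], [0, 1, 2], 0.3))
```

A real algorithm has to search for such gene and sample subsets rather than check a given pair, which is where most of the computational difficulty lies.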
Application Instructions
Send CV to murali@cs.vt.edu
Project URL
http://bioinformatics.cs.vt.edu/~murali/papers/xmotif-classifier/
Area(s) of Research
Computational Biology, Data Mining
Compensation
Work for Credit
Contact
murali@cs.vt.edu
Lenwood S. Heath

Computational Biology and Bioinformatics

Faculty Advisor
Lenwood S. Heath
Research Supervisor
Various
Description of Work
The Department of Computer Science has a number of faculty members involved in computational biology and bioinformatics (CBB) research. Such research often employs CS skills involving Perl programming, relational databases, web service development, and mathematical or statistical analysis of biological data. The field of bioinformatics changes rapidly and offers many opportunities, so it is not possible to list all specific projects in VTURCS. In short, if you know Perl, databases, web development, or algorithms, or have other relevant skills, consider CBB.
Application Instructions
See Dr. Heath's web site for his current office hours. Stop by during office hours for a chat. He can direct you to faculty members who might be able to use your skills.
Project URL
http://people.cs.vt.edu/~heath/
Area(s) of Research
Bioinformatics, Theory, Computational Biology
Compensation
Work for Credit
Contact
heath@vt.edu
Wu Feng

High-Performance Biological Sequence Search

Faculty Advisor
Wu Feng
Research Supervisor
Jeremy Archuleta
Description of Work
Biological sequence searching has become a fundamental aspect of all bioinformatics. It can help in tasks such as sequencing the human genome, designing pathogen signatures for pathogen detection, identifying unknown viruses (e.g., the virus now known as SARS), and so on. In this project, you will code modules for part of a much larger project, mpiBLAST (http://www.mpiblast.org), in order to improve functionality, maintainability, and performance.
Application Instructions
E-mail a resume to feng@cs.vt.edu. Optional, but preferred, materials include an unofficial undergraduate transcript and a brief one-paragraph statement of what interests you about this project.
Project URL
http://www.mpiblast.org/
Area(s) of Research
Bioinformatics, Parallel Computation, Software Engineering, Systems, Theory, Computational Biology, Databases, Data Mining, Artificial Intelligence
Compensation
Negotiable
Contact
feng@cs.vt.edu
Wu Feng

Parallel Programming with Video Cards and More ...

Faculty Advisor
Wu Feng
Research Supervisor
Description of Work
The world of computing is now irrevocably parallel. CPU clock speeds have topped out at roughly 3.0 GHz. So, while performance in the past has doubled roughly every 2 years due to increases in clock frequency, future performance increases will come from doubling the number of cores in a system every 2 years. As such, we are looking at programming models, environments, and applications on multicore and manycore architectures. Of particular relevance and accessibility for VTURCS students is mapping applications onto traditional multicore (Intel and AMD), hybrid multicore (Cell and PlayStation3), manycore (video cards), and reconfigurable multicore (Tilera TILE64) architectures.
Application Instructions
E-mail a resume to feng@cs.vt.edu. Optional, but preferred, materials include an unofficial undergraduate transcript and a brief one-paragraph statement of what interests you about this project.
Project URL
http://synergy.cs.vt.edu/
Area(s) of Research
Bioinformatics, Computational Biology, Data Mining, Human-Computer Interaction, Parallel Computation, Systems, Theory
Compensation
Negotiable
Contact
feng@cs.vt.edu
Alexey Onufriev, Lenwood Heath

Protein Completion (A Structural Biology Web Server)

Faculty Advisor
Alexey Onufriev, Lenwood Heath
Research Supervisor
Jon Myers, Alexey Onufriev
Description of Work
The structure of a biological molecule is a key determinant of its biological function. However, experimentally available structures (from X-ray crystallography) are missing the hydrogen atoms. Without them, structures are seriously incomplete. We have developed a (first in the world) prototype web application that uses theoretical methods to add the missing hydrogens. Lots of work is still to be done, and we need help in virtually every aspect of the project: PHP, web design, C++/Perl programming, core algorithm development, and testing. We are also planning to use the server to address some important biological questions. This is an "instant gratification" project: your contribution becomes immediately accessible to researchers worldwide, and you get your name on the project's credits page (good for your resume).
Application Instructions
Contact Alexey Onufriev at alexey@cs.vt.edu.
Project URL
http://chekhov.cs.vt.edu/completion
Area(s) of Research
Bioinformatics, Human-Computer Interaction, Software Engineering, Computational Biology
Compensation
Negotiable
Contact
alexey@cs.vt.edu
Alexey Onufriev

Protein folding on a PC.

Faculty Advisor
Alexey Onufriev
Research Supervisor
Alexey Onufriev and grad. students
Description of Work
Have you heard of the famous "protein folding" problem, which people call the "grand challenge of computational science"? We are working on an algorithm that has the potential to solve the problem on a single PC. If you want to be a part of the team and have a chance to publish in prestigious journals, join us. No prior knowledge of biology or physics is required, only enthusiasm for solving hard problems. However, excellent programming skills and a solid math background are a must.
Application Instructions
email me.
Project URL
http://
Area(s) of Research
Bioinformatics, Software Engineering, Theory, Computational Biology
Compensation
Negotiable
Contact
alexey@cs.vt.edu
Liqing Zhang

Revealing the mystery of the evolution of overlapping genes

Faculty Advisor
Liqing Zhang
Research Supervisor
Liqing Zhang
Description of Work
A growing number of studies have shown that overlapping genes are an important phenomenon in many eukaryotic genomes, such as those of humans, mice, flies, and plants. These genes play an important role in the regulation of gene expression at the levels of transcription, mRNA processing, splicing, or translation. However, functional studies of these genes are still in their infancy. Little is known about the exact functional role of these genes and their evolutionary dynamics in the genome. In this project, we will perform a large-scale analysis of these genes in several animal and plant genomes and analyse the birth and death of these genes using bioinformatics approaches.
Application Instructions
Send an email to me.
Project URL
http://
Area(s) of Research
Bioinformatics, Computational Biology
Compensation
Work for Credit or Volunteer
Contact
lqzhang@vt.edu
Lenwood S. Heath

XcisClique

Faculty Advisor
Lenwood S. Heath
Research Supervisor
Lenwood S. Heath
Description of Work
The genome of an organism consists of DNA molecules (chromosomes) in every cell that encode information for the functioning of the cell. The genome is typically thought of as sequences over the chemical alphabet {A,C,G,T}. These sequences encode, among other things, the genes of the organism. In turn, genes carry the genetic codes for proteins. For a genetic code to result in a protein, the gene must be transcribed (copied) to a messenger RNA (mRNA) molecule, which later forms the template to translate into a protein. The transcription step is controlled by regulatory sequences embedded in the genomic sequence. If the gene is actually transcribed into mRNA, then the gene is said to be expressed.
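The path from gene to protein described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not part of XcisClique: it assumes the input is the coding strand of a gene (so transcription reduces to replacing T with U), it ignores regulation, splicing, and start-codon search, and the function names and example sequence are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# for each of the three codon positions; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Simplified transcription: copy the coding strand, writing U for T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA three bases at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGGCCTAA"           # hypothetical toy gene
    mrna = transcribe(gene)      # AUGGCCUAA
    print(mrna, translate(mrna)) # one-letter amino acid codes
```

In a cell, whether this process happens at all depends on the regulatory sequences near the gene, which is the part of the picture that XcisClique analyzes.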

XcisClique is a system that combines the analysis of genomic sequence, known regulatory sequences, and experimental data on gene expression to analyze the statistical significance of combinations (bicliques) of regulatory sequences and gene expression. It consists of local data resources in a relational database together with tools for analyzing sequences and bicliques. Currently, it only has the genome of a small model plant called Arabidopsis thaliana. Amrita Pati completed the current version in 2005, and she is still part of the research group.

Opportunities for Enhancements

(1) A very important genome that recently became available is that of rice. In addition, other organisms will become available over time that can benefit from the capabilities of XcisClique. Every organism has unique challenges related to putting it into a relational database. In other words, there are no standards for what must be included in a genome and in what format. The rice genome will be highly valuable to add to XcisClique, but it will take some effort.

(2) There are some time-consuming analyses that take too long to be done through the web interface. Instead, they are precomputed for a limited set of parameters and stored in a database. A research task is to develop and implement methods that eliminate precomputation and to enhance the web interface to support greater user capabilities.

(3) Certain functionalities of the XcisClique system could be made more efficient with appropriate enhancements to the code. Improving the running time for an analysis in the current system is another research task.

(4) The computational biology and bioinformatics (CBB) group is acquiring a database server so we can expand the size of the databases that are available through our web services. The rice genome is much larger than the Arabidopsis genome. And there is more gene expression data available on the web that could be integrated with the rest of the data.

(5) With enough data, one can imagine mining the database for biologically meaningful patterns. Tools available from Amrita and others can be used, or new mining tools based on specific needs can be developed.

Background Required

Knowledge of Perl and MATLAB is required. Knowledge of C++ is desirable but not essential. The current database is built on the Postgres platform, so knowledge of SQL will be helpful. The existing web interface was built using PHP and Perl.

Application Instructions
Visit Dr. Heath during his office hours to discuss your interest.
Project URL
https://bioinformatics.cs.vt.edu/xcisclique/
Area(s) of Research
Bioinformatics, Computational Biology, Databases
Compensation
Work for Credit
Contact
heath@vt.edu