Virus Discovery and Functional Viral Metagenomics

Our laboratory uses viral proteins to uncover and explore cellular regulatory pathways. All viruses encode gene products that evade or suppress host antiviral defense systems or alter the cellular environment to make it more conducive to infection. These products act by directly associating with cellular targets, usually proteins that are components of signaling pathways, and redirecting or altering their functions. We are developing computational methods that facilitate the identification of proteins from a broad array of viruses that target cellular systems. The panviral proteome currently consists of over 600,000 virus-encoded proteins and is growing rapidly. These viral proteins serve as a highly selective toolbox for identifying and studying regulatory circuits that govern key cellular processes such as proliferation, death, metabolism, and immunity. The ultimate goal is to harness these proteins and use them to explore cellular systems important for viral infection.

Searching next-generation sequencing (NGS) databases for novel viruses. Thousands of specimens encompassing tissues, organisms, and metagenomes are being subjected to NGS, and the resulting data are deposited in public databases. In addition to sequences of the target species, these databases also contain sequences of microbes and viruses present in the specimen. We have developed new computational tools for detecting viruses in sequence databases. We are applying these tools to search for known and novel viruses in human microbiome, environmental metagenomic, and The Cancer Genome Atlas (TCGA) databases.

Developing high-throughput microfluidics for virus discovery. The inability to isolate rare viral nucleic acids from complex samples such as raw sewage or human tissues is a major obstacle to virus discovery. We have been working with Dr. David Weitz and colleagues (Applied Physics, Harvard) to develop high-throughput droplet-based microfluidics for virus isolation. Microfluidics enables the production of millions of small “test tubes” that can be screened by immunoassay or PCR at remarkable speeds. Currently, we are optimizing the sequencing of single viral nucleic acids from droplets.

Surveilling viruses in the environment: Project Premonition. Project Premonition is a large multi-institutional collaboration spearheaded by Microsoft Research (MSR). The Premonition team consists of computer scientists and engineers at MSR and scientists and engineers at a number of academic institutions. Our role in the project is to sequence mosquito genomes, detect the viruses and other pathogens they harbor, and work with MSR to develop new software for pathogen detection.

The goal of Premonition is to detect pathogens in the environment and track their movement across different geographies and between different host species. The strategy is to use mosquitoes as devices that capture blood samples from many different vertebrate species (rodents, wild and domestic animals, and humans). Mosquitoes will be captured and subjected to NGS. In collaboration with MSR, we are developing a novel computational pipeline that simultaneously identifies (a) the species of mosquito; (b) the species of animal from which the mosquito obtained its last blood meal; and (c) viral, bacterial, and parasitic agents present in the mosquito or its blood meal.
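
To make the idea of assigning reads from one mosquito sample to a mosquito species, a blood-meal host, and a pathogen more concrete, the following is a minimal sketch of k-mer voting against a small set of labeled reference sequences. It is an illustration only and not the Premonition pipeline: the function names, the reference labels, the toy sequences, and the choice of k = 8 are all assumptions made for this example, and a real pipeline would use full reference databases and dedicated classification or alignment software.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of a sequence (upper-cased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(label)
    return index


def classify_read(read, index, k):
    """Return the label sharing the most k-mers with the read.

    Returns None when no k-mer matches or when the top two labels tie.
    """
    votes = Counter()
    for km in kmers(read, k):
        for label in index.get(km, ()):
            votes[label] += 1
    if not votes:
        return None
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous read
    return ranked[0][0]


def summarize(reads, index, k):
    """Count how many reads were assigned to each label (None = unassigned)."""
    return Counter(classify_read(read, index, k) for read in reads)


# Hypothetical toy references; labels are (category, name) pairs.
K = 8
REFERENCES = {
    ("mosquito", "mosquito_A"): "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    ("host", "host_B"): "TTGACCGGTAAACCCGGGTTTAAACGCGCGT",
    ("pathogen", "virus_C"): "GGCATCATCGATGCTAGCTAGGCTTAACGGA",
}
INDEX = build_index(REFERENCES, K)

# Simulated reads: one fragment from each reference plus one unrelated read.
READS = [seq[3:23] for seq in REFERENCES.values()] + ["AAAAAAAAAAAA"]
SUMMARY = summarize(READS, INDEX, K)
```

In this sketch, each of the three simulated fragments is assigned to the reference it was cut from, and the unrelated read is left unassigned. Exact k-mer matching is used here only because it is easy to follow; it does not tolerate sequencing errors or divergent novel viruses the way alignment-based or protein-level searches can.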


Role of Viral Proteins in Infection and Cancer

One of the major goals of virology is to determine how each viral protein contributes to viral infection and pathology. In many cases disease results from the destruction of tissue by infection itself, or from the consequences of immune action in response to the infection. In other cases, pathology is the unintended consequence of an infection gone wrong. Such is the case in many virus-associated cancers. Currently we are focused on learning the mechanisms by which polyomaviruses and papillomaviruses manipulate cellular pathways, and how these actions contribute to cancer.

Characterizing New Jersey Polyomavirus (NJPyV) and Human Polyomavirus 9 (HuPyV9). Recently, in collaboration with Dr. Ian Lipkin at Columbia University, we described a new polyomavirus, NJPyV, isolated from a pancreatic transplant patient. In addition, we have been studying HuPyV9, a relatively uncharacterized human polyomavirus. Like all polyomaviruses, NJPyV and HuPyV9 encode a collection of proteins called T antigens. However, the T antigens of these two viruses have a number of unique characteristics that distinguish them from previously studied T antigens. We are studying how their T antigens alter cell biology.

Exploring BKV infection with single-cell transcriptomics. BKV is a human polyomavirus that is an important pathogen of kidney transplant patients. BKV establishes lifelong persistent infections in humans, and most infections are harmless. However, in individuals who are immunosuppressed, such as transplant patients, the virus can undergo a rampant infection that destroys their kidneys. We are examining the effects of BKV on the global gene expression patterns of infected cells. To accomplish this, we are using a newly developed strategy that merges droplet microfluidics with single-cell transcriptomics. This work is in collaboration with the laboratory of Dr. David Weitz (Harvard, Applied Physics).

Determining how human papillomavirus (HPV) contributes to head and neck squamous cell carcinomas (HNSCC). Approximately 20% of HNSCC contain HPV, and there is strong evidence that the virus directly contributes to tumorigenesis in this subset of cancers. In many tumors the HPV genome is integrated into the chromosomes of the tumor cells, ensuring that the viral genome is transmitted to both daughter cells every time a tumor cell divides. We have developed computational methods for mapping these integration sites and assessing their effect on gene structure and expression. We have also examined global patterns of tumor gene expression and correlated them with viral gene expression.
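
One common way to think about mapping a viral integration site is to look for chimeric sequencing reads, in which one part of the read matches the host genome and the rest matches the viral genome. The sketch below illustrates that idea with exact string matching on toy sequences. It is not the method developed in our laboratory: the function name `find_junction`, the `min_anchor` parameter, and the sequences are hypothetical, only the forward strand is considered, and real analyses rely on aligners that report split or soft-clipped reads.

```python
def find_junction(read, host, virus, min_anchor=10):
    """Locate a host/virus junction in a read by exact matching.

    Every split point that leaves at least `min_anchor` bases on each side
    is tried in order. The first split whose two halves match the host and
    the virus (in either order) is returned as a dict; otherwise None.
    Positions are 0-based. For the genome on the left of the junction the
    position is the index just past its last base in the read; for the
    genome on the right it is the index of its first base in the read.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]

        # Host sequence followed by viral sequence.
        h, v = host.find(left), virus.find(right)
        if h != -1 and v != -1:
            return {"left": "host", "host_pos": h + len(left),
                    "virus_pos": v, "split": split}

        # Viral sequence followed by host sequence.
        v, h = virus.find(left), host.find(right)
        if v != -1 and h != -1:
            return {"left": "virus", "host_pos": h,
                    "virus_pos": v + len(left), "split": split}
    return None


# Hypothetical toy genomes (not real host or HPV sequence).
HOST = "ACGTTGCAAGCTTAGGCCATCGATCGGATTACAGGCTTAACCGGTTAGCATGCAAT"
VIRUS = "TTGACATGCCGTACCTGAATGGCTATCCGATTGCAC"

# A simulated chimeric read: 15 host bases followed by 15 viral bases.
chimeric_read = HOST[20:35] + VIRUS[5:20]
junction = find_junction(chimeric_read, HOST, VIRUS)
```

Because the function returns the first split point that works, short stretches of sequence shared by host and virus at the junction (microhomology) can shift the reported breakpoint by a few bases. A production workflow would also need to handle reverse-complement matches, mismatches, and repeated sequence, all of which are ignored here.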