Decoding the Tree of Life: How Research Computing Enables Large-Scale Genomics at URI
At the University of Rhode Island, the push to answer fundamental questions about evolutionary history increasingly relies on massive computational power. For Dr. Rachel Schwartz, an Associate Professor who runs a “dry lab” focused on computational evolutionary genetics, access to URI’s research computing resources is not just helpful; it is critical to her work analyzing some of the largest genomics datasets in science today.
The Computational Core of Evolution
Dr. Schwartz’s research centers on developing methods and software to interpret the vast amounts of data produced by next-generation sequencing. Her lab, which draws students from fields spanning computer science, math, and evolutionary biology, seeks to extract meaningful, phylogenetically informative data to map the relationships among organisms, from strains of pathogens to the entire tree of life.
One key challenge is sifting through raw genomic data to find the relevant evolutionary signals. Dr. Schwartz developed software called SISRS (Site Identification from Short Read Sequences), which has revealed that non-coding loci often contain more robust phylogenetic signal than traditionally studied coding loci.
“Being able to analyze large-scale genomic data across many species to determine phylogenetic relationships requires substantial compute power,” Dr. Schwartz explains. “Data are getting larger, more complex, and analyses are getting larger and more complex. Research computing is becoming even more critical.”
HPC: The Essential Engine for Big Data Genomics
For Dr. Schwartz, the ability to pursue cutting-edge computational work hinges entirely on URI’s high-performance computing (HPC) resources, particularly the Unity cluster.
The scale of modern genomics data simply exceeds the capacity of local machines. The Unity cluster provides the storage and computational muscle necessary for:
- Massive Data Storage: Accommodating terabytes of raw genomic sequences.
- Complex Analyses: Running intricate analyses that require many compute nodes and large amounts of memory simultaneously.
- Specialized Tools: Accessing specialized hardware like GPUs (Graphics Processing Units), which is essential as her work evolves toward machine learning and AI applications in genomics.
As a researcher in a fundamentally computational field, Dr. Schwartz emphasizes that both the resources and the team supporting them are indispensable.
“Any sort of GPU work requires this research infrastructure. Team support—troubleshooting, package installation, discussing upgrades, and maintenance—has been crucial. I don’t have to deal with it; it all just happens in the background. We just ask, and it shows up.”
Mentorship and Reproducible Science
Beyond the computational lift, the resources and supporting team play a crucial role in enabling student mentorship and fostering the lab’s commitment to reproducible research.
When onboarding new students, the quality of the technical documentation makes a significant difference. Dr. Schwartz praises the Unity cluster’s documentation and user interface, noting that they are exceptionally well done compared with those of other academic HPC systems.
“The Unity documentation is really helpful. For example, it explains exactly how to set up scratch space and use Slurm scripts,” she notes, “which helps make the system broadly accessible.”
This robust support structure lets students dive quickly into complex computational work and focus on the science rather than struggle with system configuration. It is particularly valuable for students coming from empirical labs whose PIs may not be computational experts.
Looking Ahead: The AI-Driven Future of Evolution
Dr. Schwartz sees the field of evolutionary genetics pushing rapidly toward machine learning and AI, which will make the need for powerful computing resources even more pronounced.
“I think we’re moving toward a place where doing this work on a laptop will simply not be possible,” she says.
She is optimistic that URI’s investment in integrated HPC, AI, and advanced data analytics through the Institute for Artificial Intelligence & Computational Research (IACR) will keep her lab competitive in the next decade of genomics research.
Advice to the URI Community
When asked what advice she would give to faculty and students considering using URI’s research computing resources, Dr. Schwartz strongly recommends embracing them:
“Take advantage of the fact that this is here. It’s well-documented and supported. It’s simply the easiest option. You don’t want to run something small in-house because you have to support it yourself. Cloud services can also be challenging to administer on your own and are often expensive. Here, you have massive resources and professional support. What more could you want?”
Interviewed by Leann Biancani
Biology Department, URI
