High Performance Computer Cluster Advances
A Wide Range of Pursuits at the Garden
Over the past 15 years, the rapid advancement of molecular techniques, large-scale data-mining operations, and large-scale studies of plant genes, populations, species, and families, to name just a few, has led to an imbalance between the ability of Garden scientists to generate data and their ability to rigorously analyze that data. That has now changed with the installation of a high-performance computer cluster at the Garden, funded by the National Science Foundation (NSF) and The Ambrose Monell Foundation.
Housed in a central location, the cluster is composed of several computers and related resources linked together and acting as one machine for greater speed, power, performance, efficiency, and cost-effectiveness. It runs analyses twenty-four hours a day, seven days a week, and performs them on a first-come, first-served basis as they are submitted by Garden scientists, post-doctoral researchers, graduate students, interns, and others.
When submitted work reaches 85% of the cluster’s capacity, the cluster automatically shifts from first-come, first-served allocation of its analytical resources to an allocation that maximizes the number of submitted jobs that can be completed in the shortest amount of time.
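The switch in allocation policy described above can be illustrated with a minimal sketch. It is not the cluster's actual scheduler, which the article does not name. It assumes that "maximizing the number of jobs completed in the shortest time" can be approximated by running the shortest jobs first, and the `Job` fields, the `schedule` function, and the runtime estimates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str         # hypothetical job label
    submitted: int    # hypothetical submission sequence number
    est_hours: float  # hypothetical estimated runtime in hours


def schedule(jobs, load, threshold=0.85):
    """Return jobs in run order for a given load (fraction of capacity).

    Below the threshold, jobs run first-come, first-served.
    At or above it, the shortest jobs run first (an assumed stand-in
    for the cluster's throughput-maximizing policy).
    """
    if load >= threshold:
        return sorted(jobs, key=lambda j: (j.est_hours, j.submitted))
    return sorted(jobs, key=lambda j: j.submitted)


def completed_within(order, hours):
    """Count jobs that finish within `hours` when run one after another."""
    done, elapsed = 0, 0.0
    for job in order:
        elapsed += job.est_hours
        if elapsed > hours:
            break
        done += 1
    return done


jobs = [Job("a", 1, 10.0), Job("b", 2, 1.0), Job("c", 3, 2.0)]
light = schedule(jobs, load=0.50)  # submission order: a, b, c
heavy = schedule(jobs, load=0.90)  # shortest first: b, c, a
```

In this toy queue, running the shortest jobs first lets two jobs finish in the first three hours, whereas submission order would leave all three waiting behind the ten-hour job.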
Garden scientist Dr. Fabian Michelangeli is currently using the computer cluster to analyze DNA sequences from 1,800 species in the poorly understood tribe Miconieae (plant family Melastomataceae), in order to understand and map their evolutionary relationships. This tribe is a dominant element of the Neotropical forest understory, where its fruits are an important part of the diet of birds and mammals.
Dr. Michelangeli explains, “The cluster enables my team to carry out large analyses that include several hundred species for multiple genes. Such analyses thoroughly explore data using different techniques and computational methods in a timeframe that is reasonable, a matter of days instead of months. This speeds up our understanding of the natural world.”
Perhaps the greatest impact of the computer cluster will be to dramatically improve the quality of training programs for Garden graduate students as well as undergraduate and high school interns.
Students will be better and more broadly trained in new ways of extracting information from data, and will now be able to address a variety of questions that were previously beyond the Garden's computing capacity.
As just one example, intern Gary Tan from the Bronx High School of Science is assisting Garden scientist Dr. Damon Little in developing a rapid plant DNA extraction technique. Little and Tan are using the computer cluster to find complex patterns in extraction success-rate data and are modifying their technique based on these patterns.