Tanu Malik, an assistant professor in DePaul University’s College of Computing and Digital Media, was recently awarded a Faculty Early Career Development grant, or CAREER grant, from the National Science Foundation.
In order for scientists to make advancements, they must be able to validate and build on each other’s work, the university said in its report.
Now that so much science relies on computations and data, many researchers are struggling to share their computational artifacts in ways that are usable for others, said the Indian American researcher in the report.
“We have results that are generated through computational artifacts but are being presented on PDF papers. As a researcher, there are no easy means for verifying the results being presented,” said Malik. “Emailing and sharing through websites are old methods. We need more efficient and usable methods to verify results from complex scientific experiments.”
Now, the NSF has awarded Malik a CAREER grant to support her work on laying the foundation for reproducibility in real-world computational and data science.
Malik’s project will also increase awareness of the need for computational reproducibility tools through a research and education plan involving scientists, students and instructors, the university said.
The five-year, $498,889 research grant comes from the CAREER program, NSF’s most prestigious award in support of early-career faculty.
Malik knew she was onto something in 2013 when, as a research associate scientist at the University of Chicago, she worked with a group of geoscientists, DePaul noted.
Spread across seven universities, they were trying to collect and run their computations together, but it wasn’t working.
Malik and her colleagues created a product, called the Sciunit container, that could align not just the data but also the programs and environments where the information had been created. The geoscientists had been trying to share data and computation for several years, according to the university news report.
Malik’s system gave them results in 30 minutes, it said.
“They were able to run this tool, and it gathered everything from different machines and made it portable. It became a huge thing,” Malik said. She had discovered that it wasn’t enough to share just program code and data; researchers also need what’s called the “compute environment” to ensure that the computation runs the same way and produces roughly the same outputs. Malik likened the problem to downloading a new program onto a personal computer, only to find that it won’t run. “That’s the kind of situation we’re trying to avoid.”
The solution, said Malik, is to make it all portable (the data, the program, the operating system) so that others can move ahead and reproduce research faster.
Malik’s work will also make it easier for researchers to judge whether their own attempts at an experiment are reproducible or not. Her research aims to define the phases of reproducibility in computational research, the university report said.
The CAREER grant will allow Malik to involve more students in her work, especially in DePaul’s data science program. She also hopes to engage more women, as the representation of women in computer science is still lagging, Malik said.
“The number of women who get funded in this area is abysmally low — so I think it’s a big deal,” said Malik in the report. “I just feel honored to have that opportunity. If I could share somehow that would be fantastic,” she added.
“I have been doing this work for some time now, and the fact that this work is being recognized, that we did make an impact in a few lives by making it simpler, it feels good,” said Malik. “NSF has recognized my work, and is helping us to expand this further to make a greater impact. That’s the ultimate fun, to make a dent in this hard problem.”