WOOSTER, Ohio – Imagine being given a file with millions of data points and told, “Do something interesting with this, something that will create useful insights for this data set’s owners, then come back and present your results. You’ve got about 36 hours.”
That’s DataFest, a program of the American Statistical Association now in its eighth year, which takes place at sites across the country in April and early May and brings together “teams of undergraduates [to] work around the clock to find and share meaning in a large, rich, and complex data set.”
Earlier this month, two teams from The College of Wooster participated for the first time, travelling to Miami University to compete against 150 students from six other schools: Miami, Bowling Green State University, the University of Dayton, Xavier University, the University of Cincinnati, and Northern Kentucky University. The ten Wooster students were a mix of computer science and mathematics majors (or double majors), with a couple of political science double majors among them. Every class year was represented, as was a wide range of hometowns, from Albion, Mich., and Kolkata, India, to Casablanca, Morocco, and North Potomac, Md.
“Friday night we spent a lot of time planning, getting a feel for the data, figuring out what the story was that we wanted to tell with it,” said Joe MacInnes, a junior computer science major. “Around two Saturday afternoon was when the stress really started.”
Once a team has settled on its research question and identified any other available data it might pull in to help tease out insights from the main data set, the members must organize, divide up tasks, and overcome the unforeseen problems that invariably arise. They must do all of this while managing their time against a deadline and maintaining clear communication across the team. (Because this year’s DataFest competitions are still going on, the nature of the data set being used cannot be disclosed.)
The experience is very different from that in a classroom, where students are given a clearly defined assignment. “The first thing they have to do is figure out their research question,” said Drew Pasteur, associate professor of mathematics and computer science. “Then they find that the data is messy, there are pieces missing and they have to adjust, which is what happens in the real world.”
The best teams, Pasteur says, have a diverse set of skills. Before they can even apply their technical expertise to organizing and analyzing the data, they need to be able to zoom out, see the big picture, and brainstorm an interesting research question. They need strong project management and time management skills to keep the team on track, and good presentation skills to convey the significance of their results.
For MacInnes, presenting their results before a panel of industry and academic judges on Sunday was a high point of the weekend. His teammate Avi Vajpeyi, a senior computer science major, agreed. “Wooster prepares students well to do presentations,” he said, citing his own experience presenting at an academic conference earlier this year.
MacInnes honed his skills on an AMRE project team last summer, making presentations to senior executives at Goodyear’s corporate headquarters. “Now that was terrifying.”
Although neither of Wooster’s teams made the cut to be one of the eight finalists this year, the experience has whetted their appetite. Expect to see more Fighting Scots at DataFest 2019.