UW, Berkeley, NYU collaborate on $37.8M data science initiative

UW core team (clockwise from lower left): Tom Daniel (Biology + Computer Science & Engineering), Andy Connolly (Astronomy), Bill Howe (Computer Science & Engineering), Ed Lazowska (Computer Science & Engineering), Randy LeVeque (Applied Mathematics), Tyler McCormick (Statistics + Sociology), Cecilia Aragon (Human-Centered Design & Engineering), Ginger Armbrust (Oceanography), Sarah Loebman (Astronomy). Missing: Magda Balazinska (Computer Science & Engineering), Josh Blumenstock (iSchool), Mark Ellis (Geography), Carlos Guestrin (Computer Science & Engineering), Thomas Richardson (Statistics), Werner Stuetzle (Statistics), John Vidale (Earth & Space Sciences).

The University of Washington, the University of California at Berkeley, and New York University are partners in a new five-year, $37.8 million award from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation whose goal is to dramatically accelerate the growth of data-intensive discovery in a broad range of fields.

UW’s team, which includes more than a dozen faculty from across the campus, is led by Ed Lazowska, Bill & Melinda Gates Chair in Computer Science & Engineering and Director of the UW eScience Institute. Berkeley’s team is led by Nobel laureate astrophysicist Saul Perlmutter, and NYU’s by neuroscientist and computer scientist Yann LeCun.

“All across our campus, the process of discovery will increasingly rely on researchers’ ability to extract knowledge from vast amounts of data,” Lazowska said. “In order to remain at the forefront, UW must be a leader in advancing the methodologies of data science, and in putting these methodologies to work in the broadest imaginable range of fields. This partnership with Berkeley and NYU – which builds on investments by UW in the eScience Institute and in the Center for Statistics and the Social Sciences – puts us in a leadership position.”

The new initiative was announced today (Nov. 12) as the featured talk at a White House Office of Science and Technology Policy event highlighting public-private partnerships that support “big data” analytics and research.

Under the partnership, cross-university teams will organize their efforts around six primary areas: strengthening an ecosystem of tools and software environments; establishing academic careers for data scientists; championing education and training in data science at all levels; promoting and facilitating reproducibility and open science; creating physical and intellectual hubs for data science activities; and measuring programs through directed ethnography and evaluation.

At UW, the grant will fund salaries for new research positions, including five data scientists who specialize in software and will work with researchers across campus, four postdoctoral data science fellows pursuing interdisciplinary research agendas, and four partially funded research scientists stationed in other departments and centers. A dedicated “data science studio” on campus will have meeting areas and drop-in workspaces to encourage collaboration across the UW’s colleges and schools.

This Washington Research Foundation video describes the transformational impact of data science at UW.

To take advantage of these new resources, faculty members can submit short-term project proposals that require data science expertise: “analyzing a large dataset, accessing cloud resources, parallelizing an algorithm, or scaling up a statistical method,” said Bill Howe, co-lead of the new effort and a UW affiliate assistant professor of Computer Science & Engineering. Participants in the program would send a graduate student or research staff member to physically relocate for a period to work directly with the data scientists. The idea behind this embedded approach is to learn techniques, collaborate, then bring that knowledge back to individual labs and departments. “We see enormous potential in the cross-pollination that happens by having participants co-locate in the data-science studio,” Howe said. “These projects will help expose common problems and enable collaboration as we continue to scale up our investment in data science expertise.”

The UW also has received a $2.8 million Integrative Graduate Education and Research Traineeship (IGERT) grant from the National Science Foundation titled “Big Data U.” Together, the two grants will fund several dozen graduate students from a variety of departments to learn how to tackle big data in their research fields. The need to analyze vast amounts of data now touches nearly every department and discipline, and both grants will boost the university’s ability to prepare students.

The UW has been a leader in connecting big data experts with researchers in a variety of departments. This grant directly builds upon the UW’s eScience Institute, created in 2008 to focus on data-intensive discovery across campus, and the Center for Statistics and the Social Sciences, now more than a decade old.

Presentation by Ed Lazowska (UW), Saul Perlmutter (Berkeley), Yann LeCun (NYU), Josh Greenberg (Sloan), and Chris Mentzel (Moore) at the White House “Big Data Partnerships” event on November 12.

Faculty members see this new initiative as advancing the capacity for data-intensive scientific research and boosting Seattle’s leadership in data science, while attracting more top data science talent back to universities at a time when big data is more pervasive than ever before. “These data scientists are coveted in industry as well as academia. One of the missions we have in this effort is to provide competitive career paths that allow these experts the freedom to remain in academia and apply their skills to the most important problems in science,” Howe said.

In addition to Lazowska and Howe, co-PIs on UW’s Moore/Sloan award include Cecilia Aragon (Human-Centered Design & Engineering), Ginger Armbrust (Oceanography), Magda Balazinska (Computer Science & Engineering), Josh Blumenstock (iSchool), Andy Connolly (Astronomy), Tom Daniel (Biology + Computer Science & Engineering), Mark Ellis (Geography), Carlos Guestrin (Computer Science & Engineering), Randy LeVeque (Applied Mathematics), Tyler McCormick (Statistics + Sociology), Thomas Richardson (Statistics), Werner Stuetzle (Statistics), and John Vidale (Earth & Space Sciences). Many additional faculty have contributed significantly.

Guestrin is the PI on the IGERT; co-PIs include Armbrust, Balazinska, David Beck (Chemical Engineering), Connolly, Emily Fox (Statistics), Dan Grossman (Computer Science & Engineering), Jeff Heer (Computer Science & Engineering), Howe, Željko Ivezić (Astronomy), Lazowska, Marina Meila (Statistics), Bill Noble (Genome Sciences), Ben Taskar (Computer Science & Engineering), and LuAnne Thompson (Oceanography).



Press (a selection):