In November 2013, we announced a bold new partnership to harness the potential of data scientists and big data for basic research and scientific discovery. New York University, the University of California, Berkeley and the University of Washington launched this new 5-year, $37.8 million, cross-institutional effort with support from the Gordon and Betty Moore Foundation and the Alfred P. Sloan Foundation.
At a time when the natural, mathematical, computational and social sciences are all producing data with relentlessly increasing volume, variety and velocity, capturing the full potential of a progressively data-rich world has become a daunting hurdle for researchers. At the intersection of domain science, computation and mathematics, data science is already contributing to scientific discovery, yet substantial systemic challenges need to be overcome to maximize its impact on academic research. This ambitious partnership will spur collaborations within and across the three campuses and with other partners pursuing similar data-intensive science goals.
This project seeks to achieve three core goals:
- Develop meaningful and sustained interactions and collaborations between researchers with backgrounds in specific subjects (such as astrophysics, genetics and economics) and researchers in the methodology fields (such as computer science, statistics and applied mathematics), with the specific aim of recognizing what it takes to move each of the sciences forward;
- Establish career paths that are long-term and sustainable, using alternative metrics and reward structures to retain a new generation of scientists whose research focuses on the multi-disciplinary analysis of massive, noisy, and complex scientific data and the development of the tools and techniques that enable this analysis; and
- Build on current academic and industrial efforts to work towards an ecosystem of analytical tools and research practices that is sustainable, reusable, extensible, learnable and easy to translate across research areas, and that enables researchers to spend more time focusing on their science.
The partner universities have pioneered new approaches to discovery in fields as diverse as astronomy, biology, oceanography, and sociology through deep collaborations between researchers in these fields and researchers in data science methodology fields such as computer science, statistics and applied mathematics. This new collaboration – a coordinated, distributed experiment involving researchers at these leading universities – will work with other leaders to develop effective models that dramatically accelerate this data science revolution.
Cross-university teams organize their efforts around six focal areas:
- strengthening an ecosystem of tools and software environments,
- establishing academic careers for data scientists,
- championing education and training in data science at all levels,
- promoting and facilitating efforts that are accessible and reproducible,
- creating physical and intellectual hubs for data science activities, and
- identifying scientists’ data science bottlenecks and needs through directed ethnography.
The partnership will connect with others, practice open science and share the lessons learned along the way.