Drinking from a Fire-Hose

You’ve probably heard the phrase. In the IT world it’s a reference to the vast amounts of data flowing from virtually every field.

Turns out the federal government has the same problem. But unlike the rest of us, the government has a few more resources to meet the challenge.

On March 30, the White House Office of Science and Technology Policy unveiled the start of a new $200 million project involving six federal R&D agencies that aims to make the most of the fast-growing volume of digital data.

The “Big Data Research and Development Initiative” will look to improve the tools and techniques needed to access, organize, and glean discoveries from huge volumes of digital data. The agencies tasked with leading this effort include the National Science Foundation, the National Institutes of Health, the Defense Department (including the Defense Advanced Research Projects Agency), the Energy Department, and the US Geological Survey.

The first wave of this initiative will involve:

National Science Foundation and the National Institutes of Health – Core Techniques and Technologies for Advancing Big Data Science & Engineering

Big Data: a new joint solicitation supported by NSF and NIH will help advance the core scientific and technological means of managing, analyzing, visualizing, and extracting useful information from large and diverse data sets.

National Science Foundation: In addition to Big Data, NSF will push a long-term strategy that includes new methods to derive knowledge from data; infrastructure to manage, curate and serve data to communities; and new approaches to education and workforce development.

Department of Defense: About $250 million will be invested annually (including funding for new research projects) across the Pentagon. To boost innovation, DoD will announce a series of open prize competitions. Also, DARPA is launching the “XDATA program,” investing $25 million annually for four years to develop computational techniques and software tools for analyzing large volumes of data.

National Institutes of Health – 1000 Genomes Project Data Available on Cloud: NIH and Amazon Web Services have teamed up to host the world’s largest data set on human genetic variation – produced by the international 1000 Genomes Project. The data set is now freely available to researchers via the cloud.

Department of Energy: DoE will use $25 million to establish the Scalable Data Management, Analysis and Visualization Institute, led by the agency’s Lawrence Berkeley National Laboratory. Expertise from six national labs and seven universities will come together to develop new tools to help scientists manage and visualize data on the Department’s supercomputers.

US Geological Survey – Big Data for Earth System Science: USGS announced grant awards through its John Wesley Powell Center for Analysis and Synthesis, whose mission is to catalyze innovative thinking in Earth system science.

Chances are the “fire-hose effect” will be around for some time to come, but these efforts should take on some of the most important immediate challenges and allow a wider range of researchers to be part of the movement to find solutions.
