Machine Learning for Science

From Data to Scientific Insights

At Berkeley Lab, computer scientists, mathematicians, and domain scientists from across the Lab are collaborating to turn burgeoning datasets into scientific insights through machine learning.

Machine learning methods make inferences from raw data using sophisticated algorithms and powerful computers. For online shoppers, that means better "you might also like..." suggestions, but for scientists, machine learning tools can reveal profound insights hiding in ballooning datasets.

Thanks to better instruments, including technologies developed at the Lab, we can see things at microscopic and atomic scales. We can measure vibrations imperceptible to the human eye and capture high-resolution images of objects millions of light years away. But those instruments produce vastly larger datasets than ever before. The Large Synoptic Survey Telescope (LSST) will produce 20 terabytes of data every night, about 60 petabytes over its lifetime. The Large Hadron Collider will generate even more, with 50 petabytes in 2018 alone and 500 petabytes by 2024 (not including the 900 petabytes from past experiments). Conventional data analysis alone can't keep up.

With machine learning (ML), models are automatically derived from data and can be used to identify features, reduce complexity, and control experiments. But scientists need to explain their findings, so Berkeley Lab's research into machine learning builds on its foundational work in mathematics to develop methods that are consistent with physical laws, robust in the presence of noisy or biased data, and able to be interpreted and explained in a way that is scientifically meaningful.

Using ML in over 100 different projects, Berkeley Lab scientists have tracked atomic particles, searched for better battery materials, analyzed traffic patterns, improved crop yields, pinpointed extreme weather in exascale climate simulations, and pieced together metagenomic puzzles from billions of DNA fragments. And we're just getting started.

Berkeley Lab: A Nexus for Machine Learning

With five DOE national user facilities (in molecular science, high-performance computing, synchrotron x-ray research, networking and genomics), world-class applied mathematics, computer and computational science, and a pool of scientific talent that has produced 13 Nobel laureates, Berkeley Lab offers fertile ground for scientific machine learning.

As a Department of Energy National Laboratory, we also develop and share the algorithms, software, tools and libraries that are foundational to scientific machine learning. We gather, organize and store huge scientific datasets in areas such as materials, energy, environment, biology, genomics and astronomy. And we develop tools and advanced networking facilities to make these datasets more searchable and accessible using (what else?) machine learning.


Deep Learning for Neutrino Detection

IceCube Research Garners Best Paper Award at IEEE Machine Learning Conference

December 21, 2018

Extreme Weather ML Application Takes Gordon Bell Prize

Identifying extreme weather events at scale using deep learning

November 20, 2018

Optimizing Traffic

Using machine learning in smart and sustainable mobility solutions

October 28, 2018

Sustainable Farming

Harnessing the power of machine learning & microbiology

October 28, 2018