CERN is making the Large Hadron Collider's data more accessible

It's almost impossible for the organization to release raw datasets, however.


Some of the 1232 dipole magnets that bend the path of accelerated protons are pictured in the Large Hadron Collider (LHC) in a tunnel of the European Organisation for Nuclear Research (CERN), during maintenance works on February 6, 2020 in Echenevex, France, near Geneva. - Six years after the historic discovery of the Higgs boson, the world's largest particle accelerator is taking a break to boost its power, hoping to find new particles that would explain, among other things, dark matter, one of the great enigmas of the Universe. (Photo by VALENTIN FLAURAUD/AFP via Getty Images)

The European Organization for Nuclear Research (CERN) will open up access to more data from Large Hadron Collider (LHC) experiments. Under an updated policy, data will be released around five years after it's collected, and CERN hopes to release the full dataset publicly "by the close of the experiment concerned." Core LHC collaborators ALICE, ATLAS, CMS and LHCb all endorsed the move.

CERN will make level 3 data available, which will allow anyone to conduct “high-quality analysis” on information obtained from Large Hadron Collider experiments. Level 3 relates to "calibrated reconstructed data with the level of detail useful for algorithmic, performance and physics studies," according to CERN. 

The organization won't release raw data, however. The open data policy states that it's "not practically possible to make the full raw dataset from the LHC experiments usable in a meaningful way outside its collaborations." That's because of the complexity of the data, software and metadata, as well as the difficulty of accessing the vast troves of stored information, among other factors. LHC collaborators don't have general access to the raw data either. Instead, the assembly of level 3 data "is performed centrally."

Still, CERN suggests level 3 data can bolster particle physics research. It noted the dataset could be used for scientific computing research, too. Researchers may, for instance, tap into the data to "improve reconstruction or analysis methods based on machine learning techniques." The organization notes that this approach needs rich datasets for training and validation.

Samples of level 1 and level 2 data were already available. Level 1 relates to supporting data for results that are published in scientific articles. Level 2, meanwhile, denotes dedicated datasets that are designed for outreach and education purposes. 
