Importance of secure data management system demonstrated again

A wake-up call: the lost (or almost lost) research data stored in Yoda

Data centre. Photo: 123rf

It was early November when the ICT department of the Faculty of Geosciences sounded the alarm for the first time. “Something is going on with Yoda, but we don’t know what exactly.” On the fifth floor of the Vening Meinesz building, where both the ICT department and the board are located, communication with the parties involved went smoothly, but it was not yet clear how to assess the severity of the errors. “We weren’t that worried yet,” Dean Wilco Hazeleger recalls. “We thought that some files might have been lost.” 

But the signs got more serious each day. Not only had the files in question not been found, but it also became clear that the problem could have affected more than just a few files. Hazeleger: “In the event of a fire, we know right away that everything is gone. But here there was just smoke. We discovered little by little that a significant amount of research data might have disappeared. So, we thought: ‘Now it’s time to take action’.” In other words, time to communicate.

Two secure datacenters
Yoda is a research data management system developed by UU. It was built for the strategic theme Dynamics of Youth, for which researchers from different universities would collect considerable amounts of data for a long time and share it with each other. Therefore, the database not only had to accommodate a lot of data, but it also had to be secure because of privacy-sensitive information. Thanks to Yoda, researchers did not have to use USB sticks or external hard drives when travelling, at the risk of losing confidential information. Moreover, the system was set up in such a way that data was stored in not one, but two secure data centres, one in Utrecht and another one in Almere. The data was regularly copied from one data centre to another. 

Combination of human error and malfunction
However, things went wrong in November, when extra storage space had to be added to one of the two datacenters for the Geosciences compartment. An employee was in the process of replacing an old drive with a new one containing specific Geo-data. During the replacement, the path to the sources with ‘live data’ was accidentally deleted. As a result, the geo-data was suddenly no longer accessible. Recovering the lost data turned out not to be easy. “Due to a glitch in the Yoda system, not all large files in Yoda were properly replicated to the other datacenter. As a result, we were unable to retrieve some of the missing information immediately.”

When it became clear within Information & Technology Services (ITS) that the incident had had major consequences, the faculty was immediately contacted. “When we heard, we were shocked,” says Hazeleger. “Data are an important basis for conducting research. If you lose data, it can have huge consequences. It goes to the heart of our work.”

Wilco Hazeleger. Photo: Ivar Pel, UU

To the heart of science
In Geosciences, different types of data are stored. This can include transcripts of conversations, samples of soil types in a particular region, results of surveys and a collection of data from test subjects over the years. “You don’t want to think about such data disappearing. In the worst-case scenario, years of research can disappear overnight. And apart from that, it goes to the heart of science. You want to have a data system that ensures that data are reproducible and findable by the right people. The reliability of the system is the basis of the scientific reputation. All of that seemed to be at stake.”

Crisis teams
So the dean set up a faculty crisis team. He took on the role of chairman. The faculty director, the secretary of the board, someone from IT and a communications officer were also part of the team. “The good thing was that we had done crisis training earlier in the year. That turned out to be fortunate timing. We were able to get to work according to the rules we had learned: plotting, or mapping out the problems, and working through them using the system of perception, assessment and decision-making.”

It was soon decided that a central crisis team would also be set up. This included someone from the Executive Board. The team mapped out the broad plan that the faculty could implement. Among other things, they got in touch with the Dutch Data Protection Authority (although the data had not leaked but disappeared), and the Supervisory Board was informed as well.

Compensating for damage
The faculty was going to inform all the main owners of the affected data. This involved about forty researchers. “All we could really say was that something might be gone. At that time, we were still working very hard to retrieve data.” The following Monday, there was an online crisis meeting with the scientists. “On the one hand, we wanted to indicate what the problem was and we wanted to estimate the consequences for the scientists. On the other hand, we also wanted to make it clear from the start that we wanted to compensate for any damage. Fortunately, we received the full support of the Executive Board on that point.”

Hazeleger was pleased with how the scientists reacted: it wasn’t an emotional meeting. It also turned out that quite a few researchers somehow still had their own backup, just in case. “Actually, a system like Yoda has to be so reliable that a scientist doesn’t need to do this. But in this case, it worked out well. It’s just a pity that people who worked according to the exact rules were hit the hardest.” 

Two PhD students affected
Behind the scenes, the search for the lost data continued day and night. There was a replica that contained all the data in use when the mistake was made. Most of the lost data could be retrieved from it. About 9 percent had actually been lost. The data that could not be found in the replica was largely retrieved with the help of the researchers. “In the end, two PhD students lost two months of work. We will compensate for that generously,” says Hazeleger. 

The damage turned out to be limited: no data was lost that cannot be gathered again. That had been a major fear, because some files contain information about samples that had been collected somewhere in the world. Those could not simply be replaced. “The fact that the damage is limited is of course a good thing. But you can also consider this a wake-up call. Data management is essential for science and its reliability. We had already appointed data stewards, people who specialize in the safe handling of data and support scientists. We now know how important their work is.”

System vulnerability
Another lesson is that we are now aware of the vulnerability of the system. Hazeleger: “You think you have a good system, but things can go wrong. That is why we have made some agreements: always consult with several colleagues first before you make a change. And we have learned to be transparent, even if the message is painful.”

“Since the incident, data storage procedures have been tightened. When changes like these are made, more than one person should be watching so that you can correct each other. We immediately introduced that. From now on, both Yoda datacenters will work strictly according to the ‘four eyes principle’,” Jan-Paul van Staalduinen, director of Information & Technology Services, says. “We carry out continuous checks to ensure that all newly uploaded data is properly stored in both data centres. In addition, deleted large files are now placed in a recovery queue for at least 96 hours. So, if a file is accidentally deleted, there are four days to recover the data. Additional copies have also been made in the replica environment in one of the two data centres.”
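
For readers who want a concrete picture of the two safeguards described here, the following Python sketch shows one way such checks could work in principle. It is an illustration under stated assumptions, not Yoda's actual implementation: the function names, the use of SHA-256 checksums and the in-memory dictionaries standing in for the two data centres are all hypothetical. Only the 96-hour window comes from the article.

```python
import hashlib
from datetime import datetime, timedelta

# Retention window for deleted large files ("at least 96 hours" per the article).
RETENTION = timedelta(hours=96)


def checksum(data: bytes) -> str:
    """Return a SHA-256 fingerprint of a file's contents (assumed check method)."""
    return hashlib.sha256(data).hexdigest()


def unreplicated(primary: dict[str, bytes], replica: dict[str, bytes]) -> list[str]:
    """List files in the primary store that are missing or different in the replica."""
    return sorted(
        name
        for name, data in primary.items()
        if name not in replica or checksum(replica[name]) != checksum(data)
    )


def still_recoverable(deleted_at: datetime, now: datetime) -> bool:
    """True while a deleted file is still inside the recovery window."""
    return now - deleted_at <= RETENTION


# Hypothetical example: one file never reached the second data centre.
primary = {"a.dat": b"soil samples", "b.dat": b"survey results"}
replica = {"a.dat": b"soil samples"}
print(unreplicated(primary, replica))  # ['b.dat']
```

A check of this kind, run continuously, would flag a file that was never copied to the second location before anyone needs it, which is the gap the glitch described earlier left open.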

The power of Yoda
As strange as it sounds, the crisis with Yoda also shows this data system’s power. Hazeleger: “We developed this system ourselves and manage it ourselves too. As a result, we were not dependent on a third party when things went wrong. The IT department felt very responsible and worked very hard. If you have a system like that, but it’s run by a large international company, that involvement is often less significant.”

Hazeleger hopes that scientists will continue to have faith in Yoda. “Dealing with data is an essential part of research. We should not underestimate that. Yoda offers the possibility to share data securely and make it accessible when needed. It’s a part of science that we shouldn’t underestimate. We need to be able to count on the experts.”