One of the biggest hurdles to applying machine learning in medicine has been the difficulty of obtaining high-quality data with which to train algorithms. Simply handing confidential patient records over to a tech company is not acceptable. But AI researchers have found a way around that particular hurdle: the split neural network.

Researchers claim that a split neural network not only keeps patient data safe; the method also demands fewer computational resources while producing more accurate models.

More information about split neural networks can be found in MIT Technology Review:

AI researchers have been advancing new techniques for training machine-learning models while keeping the data confidential. The latest method, out of MIT, is called a split neural network: it allows one person to start training a deep-learning model and another person to finish.

The idea is that hospitals and other medical institutions would train their models partway on their patients’ data locally, then each send their half-trained model to a centralized location to complete the final stages of training together. The centralized location, whether that be the cloud services of Google or another company, would never see the raw patient data; it would see only the output of the half-trained model plus the model itself. But the hospitals would benefit from a final model trained on a combination of every participating institution’s data.
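The division of labor described above can be sketched in miniature. The toy network below is purely illustrative and is not the MIT implementation: the "hospital" runs the first layer locally and shares only the resulting activation with the "server," which finishes the forward pass, then returns only the gradient of that activation so the hospital can update its own layer. The raw record never leaves the hospital side. All names, the one-neuron architecture, and the learning rate are assumptions made for the sketch.

```python
import random

random.seed(0)

# --- Hospital side: first layer, kept on premises ---------------------
w_client = [random.uniform(-0.5, 0.5) for _ in range(3)]

def client_forward(record):
    # Compute the intermediate activation that is sent to the server.
    # The raw patient record itself never leaves the hospital.
    z = sum(w * x for w, x in zip(w_client, record))
    return max(0.0, z)  # ReLU

def client_backward(record, grad_activation, lr=0.1):
    # The hospital finishes backpropagation locally, using only the
    # gradient of the shared activation returned by the server.
    z = sum(w * x for w, x in zip(w_client, record))
    if z > 0:  # ReLU passes the gradient through only when active
        for i, x in enumerate(record):
            w_client[i] -= lr * grad_activation * x

# --- Server side: final layer, sees only activations -------------------
w_server = 0.1

def server_step(activation, label, lr=0.1):
    # Finish the forward pass, update the server layer, and return
    # the gradient w.r.t. the received activation (not the raw data).
    global w_server
    pred = w_server * activation
    err = pred - label
    grad_activation = err * w_server
    w_server -= lr * err * activation
    return grad_activation, err ** 2

# One toy "patient record" with a target label; squared-error training.
record, label = [0.2, 0.5, 0.1], 1.0
losses = []
for _ in range(100):
    a = client_forward(record)            # hospital -> server: activation only
    g, loss = server_step(a, label)       # server -> hospital: gradient only
    client_backward(record, g)
    losses.append(loss)
```

In a multi-hospital setting, each institution would hold its own copy of the client layers while the shared server layers are trained against activations from all of them, which is how every participant benefits from the combined data without pooling it.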

Ramesh Raskar, an associate professor at the MIT Media Lab and a coauthor of the paper, likens this process to data encryption. “Only because of encryption do I feel comfortable sending my credit card data to another entity,” he says. Obfuscating medical data through the first few stages of a neural network protects the data in the same way.