Today’s DNA synthesis methods are a great help to biologists in the lab and the clinic. Yet the same cheap access to DNA synthesis could, in principle, allow terrorists to build dangerous viruses and toxins.

The US government is backing researchers who are building AI-based tools to detect whether a DNA sequence submitted for synthesis contains the blueprint for a biological weapon.

Sara Reardon writes about the effort to secure DNA research in this article from Scientific American:

In 2009, several of the largest DNA-synthesis firms formed a consortium to create standardized procedures for checking sequences submitted by their customers against databases of known pathogens. If the automated screening flags up a sequence, the company can check whether the customer is a legitimate researcher before synthesizing the DNA.

But these existing programs pick out only the parts of sequences that exactly match those of known pathogens. A smart terrorist could fool the system by changing a few bases in DNA from a virus or a gene that produces a toxin, or even by designing an entirely new pathogen that does not exist in nature. Compounding the problem, the databases themselves are often riddled with errors, owing to differences in the way that DNA is sequenced.
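The gap between exact matching and similarity-based matching can be illustrated with a small Python sketch. It is not the consortium's screening software or any IARPA team's algorithm. The sequences are made-up toy strings, and the function names, the k-mer length of 8 and the 0.5 threshold are all assumptions chosen for illustration. The sketch shows only that a single changed base defeats an exact substring check, while a check based on shared short subsequences (k-mers) still flags the order for human review.

```python
# Toy illustration only: the "reference" below is an arbitrary made-up string,
# not a real gene, and the names and thresholds are hypothetical.

def kmers(seq, k):
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def exact_match(query, reference):
    """Exact screening: flag only if the reference appears verbatim in the query."""
    return reference.upper() in query.upper()


def kmer_containment(query, reference, k=8):
    """Fraction of the reference's k-mers that also occur in the query."""
    ref_kmers = kmers(reference, k)
    if not ref_kmers:
        return 0.0
    return len(ref_kmers & kmers(query, k)) / len(ref_kmers)


def flag_for_review(query, watchlist, k=8, threshold=0.5):
    """Return the names of watchlist entries whose containment meets the threshold."""
    return [
        name
        for name, reference in watchlist.items()
        if kmer_containment(query, reference, k) >= threshold
    ]


# Arbitrary 40-base toy sequence standing in for a database entry.
REFERENCE = "ATGGCTAGCTTAGGCATCGATCGGATCCTAGGCTAACGTA"
# The same sequence with one base changed (index 19, A -> T).
VARIANT = REFERENCE[:19] + "T" + REFERENCE[20:]
# A submitted order that embeds the variant between short flanks.
ORDER = "CCCC" + VARIANT + "GGGG"
WATCHLIST = {"toy_sequence_of_concern": REFERENCE}

if __name__ == "__main__":
    print("exact match:", exact_match(ORDER, REFERENCE))
    print("k-mer containment:", round(kmer_containment(ORDER, REFERENCE), 2))
    print("flagged:", flag_for_review(ORDER, WATCHLIST))
```

In this toy case the exact check misses the altered order, while about three quarters of the reference's 8-mers are still present, so the similarity check flags it. Real screening has to go much further: it must cope with heavier edits, with sequences that encode the same protein using different bases, and with designs that resemble nothing in the database.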

With this in mind, in 2016, the US Intelligence Advanced Research Projects Activity (IARPA) launched an initiative to design better algorithms for spotting potentially threatening sequences. Five teams from industry and academia are competing in the programme, says its manager, John Julias. The agency declined to disclose the programme’s budget.

By 2020, the teams are expected to have developed a way to determine, in less than two weeks, whether an unknown sequence poses a threat. That will be a difficult task, says Andrew Warren, a software engineer at the University of Virginia in Charlottesville. “We have to be able to recognize any organism on the planet and also its molecular function.”

Warren’s team is designing a program that compares 40 million records of sequences from 90,000 microbial species. The algorithm learns to recognize the DNA sequences of known toxins and pathogens, identifies their common characteristics and then searches for similar sequences in other organisms.