In February 2018, Northrop Grumman Corporation announced that it had been awarded a $95 million contract to develop increments one and two of the Department of Homeland Security (DHS) Homeland Advanced Recognition Technology (HART) system.
The announcement said very little about HART, except that it is a “multi-modal processing and matching technology that uses a combination of face, finger and iris biometrics meeting DHS accuracy requirements.” It is a database and system designed to incorporate, expand and replace the existing Automated Biometric Identification System (IDENT) built in the 1990s.
Last week the Electronic Frontier Foundation (EFF) provided more information on HART. In a Deeplinks blog, senior staff attorney Jennifer Lynch explained, “The agency’s new Homeland Advanced Recognition Technology (HART) database will include multiple forms of biometrics — from face recognition to DNA, data from questionable sources, and highly personal data on innocent people. It will be shared with federal agencies outside of DHS as well as state and local law enforcement and foreign governments.”
HART will support, she expands, “at least seven types of biometric identifiers, including face and voice data, DNA, scars and tattoos, and a blanket category for ‘other modalities’. It will also include biographic information, like name, date of birth, physical descriptors, country of origin, and government ID numbers. And it will include data we know to be highly subjective, including information collected from officer ‘encounters’ with the public and information about people’s ‘relationship patterns’.”
EFF’s primary concern over this vast new database of DNA, physical biometrics and social behavior is what it describes as the chilling effect on people exercising their First Amendment-protected rights to speak, assemble and associate. “Data like face recognition makes it possible to identify and track people in real time, including at lawful political protests and other gatherings,” she writes.
Through EFF’s understanding of the HART project and its concern over civil liberties, we now know more about the DHS biometric database. But there are other concerns beyond civil liberties. Security for this vast trove of the nation’s most personal information is barely mentioned. Indeed, Northrop Grumman’s contract announcement merely states, “A keen focus on safeguarding personally identifiable information as well as ensuring the critical sharing of data across interagency partners underpins the technology.”
But government does not have a good track record in securing the data it holds. In 2015, the Office of Personnel Management lost personal information on 21.5 million people to what is generally believed to be Chinese government-sponsored hackers.
In 2010, Chelsea Manning (born Bradley Manning) leaked 750,000 classified or sensitive military and diplomatic documents to WikiLeaks, including the infamous ‘collateral murder’ Baghdad airstrike video.
In 2013, Edward Snowden exfiltrated and leaked thousands of classified NSA documents exposing NSA and GCHQ clandestine global surveillance programs.
In 2016, the hacking group known as The Shadow Brokers leaked a series of exploits stolen from the Equation Group – believed to be the Tailored Access Operations (TAO) unit of the NSA. One of these exploits, EternalBlue, was used in both the WannaCry ransomware and NotPetya cyberattacks of 2017.
In March 2017, WikiLeaks began publishing a series of CIA classified documents and cybersecurity exploits under the name Vault 7.
These incidents demonstrate that government databases have historically been susceptible to both external hacks and insider breaches. However, the extent to which the HART database will become a magnet for hackers is a matter of conjecture, and opinions differ.
Joseph Carson, chief security scientist at Thycotic, doesn’t believe the database will be very attractive to hackers. “The only reason this would be attractive to cybercriminals,” he told SecurityWeek, “would be to sell it onwards to nation states who would use such data for intelligence or economic advantages. However, the data alone would not be as valuable without the technology that analyzes the metadata for matches and relationships. So, cybercriminals and nation states would need to compromise both to make value of the stolen data.”
Others take a different view. “This massive, aggregated database will represent an incomparable trove of intelligence about US citizens. You can be sure it will be a target,” said Rick Moy, CMO at Acalvio.
Migo Kedem, director of product management at SentinelOne, adds, “There will be many criminals and states who would like to get their hands on this type of information, ranging from commercial and marketing, through business espionage to state level.”
Protecting this database from external hackers, whether organized crime or nation states, is going to be a challenge. But it will be equally difficult to protect it from insiders. According to the EFF’s figures, the IDENT fingerprint database already holds data on 220 million individuals, and processes 350,000 fingerprint transactions every day. The full HART database will go far beyond just fingerprints, and will be shared with federal agencies outside of DHS, with state and local law enforcement, and even with foreign governments.
Controlling everyone with access to the database will consequently be another challenge – health workers and police officers already covertly query their own databases to provide information for worried friends and relatives. The temptation to check on the relationship patterns of a daughter’s new boyfriend – if possible – is just one danger. Looking at private industry, High-Tech Bridge CEO Ilia Kolochenko told SecurityWeek, “Data protection is certainly a high priority in large companies such as Google or Apple, but as we recently saw with Facebook – authorized third-parties are the uncontrollable Achilles’ heel.”
The subversion of authorized users through bribery, blackmail or stolen credentials is another difficulty. “When human interactions are involved, it is generally the easiest link to compromise,” says SentinelOne’s Kedem.
Just as securing access to the HART database will be difficult, so too will be securing the use of the database. While it can provide value to its users manually, there is little doubt that machine learning and artificial intelligence will be used to help locate the needles in this massive haystack. This is particularly concerning because of the intention to include ‘relationship patterns’, which will be more easily sifted with AI than with manual searches.
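To see why machine-assisted queries over relationship data raise the stakes, consider a toy sketch. Nothing below reflects HART’s actual design – the encounter graph, names and traversal are invented for illustration – but it shows how trivially a simple automated search expands from one person of interest to everyone connected to them.

```python
# Hypothetical illustration only: how a machine query over "relationship
# pattern" data sweeps in associates of a flagged person. The data and
# logic are invented; they do not describe HART.
from collections import deque

def within_hops(graph, start, max_hops):
    """Return everyone reachable from `start` in at most `max_hops` edges."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if seen[person] == max_hops:
            continue
        for contact in graph.get(person, ()):
            if contact not in seen:
                seen[contact] = seen[person] + 1
                queue.append(contact)
    return set(seen) - {start}

# Toy "encounter" graph: an edge means two people were recorded together.
encounters = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}

# A two-hop query from a single flagged person reaches most of the graph.
print(sorted(within_hops(encounters, "A", 2)))  # ['B', 'C', 'D']
```

A search that an analyst would have to justify case-by-case becomes, with automation, a one-line sweep over friends-of-friends – which is precisely the “chilling effect” on association that the EFF warns about.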
Indeed, it is tempting to wonder if HART will become the basis for the FBI’s often-promised move into ‘predictive policing’. Thycotic’s Carson believes this is probable. “This goes way back,” he said. “‘Trapwire’ was exposed by Wikileaks back in 2012 resulting from the Stratfor hacks. It reportedly used CCTV surveillance to recognize people from their facial biometrics, how they walked and even from the clothing they wear. The purpose of such technology was prioritized for national security and it has been known that such technology had existed; but this was a clear indication that it was formerly in use. However, it is now clear that such data is being used beyond national security in both government and commercial use for profit and control.”
Acalvio’s Rick Moy simply said, “Predictive models need tons of data, so it would certainly be an enabler.”
But this brings us to the next problem: false positives potentially generated by built-in bias in the artificial intelligence algorithms. Carson is not too concerned: “I would assume the results would have to be verified by a human. The AI and machine learning is typically to find the needle in a haystack and a human is used to validate the results.”
Moy, however, does have concerns. “False positives come with any algorithm based on diverse data inputs. Bias is a human trait, and humans are still writing the algorithms. But it’s worth noting that there’s quite a difference between searching for known features of a past incident versus asking a system what the most relevant features of an incident were, versus predicting who will commit a future crime.”
The implication is that use of the HART database to identify suspects is likely to be very accurate; but its use to predict criminal, terrorist or simply anti-social behavior would be worrying. If there is a bias against certain ethnic groups for, say, criminal or terrorist activity within society and existing records, that bias can be transferred to the AI algorithms, resulting in damaging and far-reaching false positives.
“US Congress needs to look at the old adage of ‘we could, but should we?’ while going forward with the DHS HART database,” comments Abhishek Iyer, Technical Marketing Manager at Demisto. “AI and ML algorithms often mirror and amplify the biases of the data collected. If DHS investigation will be based on biometric recognition whose accuracy is already compromised by bias, it can lead to wrongful arrests, distress for US travelers, and lost government resources.”
There is little doubt that a national biometric database could help law enforcement. But at what cost? The Electronic Frontier Foundation fears it will damage freedom of speech and association, and massively impinge upon personal privacy. But the challenges posed by HART go beyond civil liberties. Securing both access to and use of the data is going to be very difficult.