Data privacy issues arising from use of training data sets in artificial intelligence technologies - Nixon Peabody blog

The intersection of artificial intelligence and data privacy creates numerous legal issues, which must be understood prior to developing, using, acquiring, or otherwise transacting with AI technologies and businesses. At the highest level, most AI technologies rely on training data to “teach” their algorithms how to approach any given problem. For instance, a facial recognition technology may require training data composed of millions of faces from publicly available data sets; a medical diagnostic technology may rely on millions of prior medical records and related data to arrive at a conclusive diagnosis or treatment plan. In either situation, the developer and users of the AI technology must understand the data privacy issues arising from use of these training data sets as they commercialize any technology.

Presently, the US government has not enacted any comprehensive legislation concerning the data privacy issues arising from training data sets in AI systems. Rather, the present statutory and regulatory regime consists of a patchwork of state and federal privacy laws, guidelines from industry practice and academia, and guidance from various executive agencies at both the state and federal levels.

For instance, the GAO recently published “Artificial Intelligence: An Accountability Framework for Federal Agencies and Other Entities” (along with highlights), and government contractors should heed the insights from the Comptroller General’s Forum on the Oversight of Artificial Intelligence. Congress has had numerous AI-related bills pending in both houses since 2019 and 2020, but none of those bills has progressed meaningfully beyond initial committee consideration.

Accordingly, individual states have sought to fill the gap on particular issues. Many states, such as California and New Jersey, have pending legislation intended to curb the use of data in discriminatory or predatory ways through the implementation of AI technologies. Similarly, other states, including Illinois, Michigan, and Missouri, tackle the use of AI and data in areas such as credit decisions, employment decision-making, education, and government benefits determinations.

Both federal and state courts are beginning to see new cases addressing the use of data in AI systems and determining how to apportion liability among developers, users, and other parties.

Finally, other countries and regions have their own protocols for dealing with data use in systems such as AI technologies. Many European countries, including the United Kingdom, have signed a Declaration of Cooperation on Artificial Intelligence, with a legislative proposal expected in 2021. However, the EU already has the General Data Protection Regulation (GDPR) in place, which provides a comprehensive privacy regime covering all personal data, regardless of type or context. Accordingly, any AI regulation will need to fit within the GDPR’s existing strict guidelines, and it remains to be seen how that will eventually play out.

While navigating these issues may lead to internal business hurdles, it is surely better to address those hurdles in advance of any purchase, use, or development of AI technologies. As always, Nixon Peabody is tracking these developments and will provide additional updates as these issues continue to evolve.