Privacy Law




Written by Jonathan Tam*

MCLE Self-Study Article

The past couple of years have seen significant technological advancements in artificial intelligence (“AI”) and legal developments applicable to organizations that develop and deploy (i.e., adopt and use) AI. This article outlines examples of developments related to privacy law and AI at the U.S. federal and California state level and examines at a high level some privacy issues that organizations should consider before developing or deploying generative AI (“GenAI”) tools, which are a subset of AI technologies that generate new content in response to a user instruction or prompt.


On October 30, 2023, President Biden issued the “Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence” (“EO 14110”).[1] The order defines “AI” as “a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Artificial intelligence systems use machine and human-based inputs to perceive real and virtual environments; abstract such perceptions into models through analysis in an automated manner; and use model inference to formulate options for information or action.”[2] EO 14110 also defines “generative AI” as “the class of AI models that emulate the structure and characteristics of input data in order to generate derived synthetic content,” and “synthetic content” as “information, such as images, videos, audio clips, and text, that has been significantly modified or generated by algorithms, including by AI.”[3]

As a side note, these definitions arguably do not cover all systems that many people would consider to be GenAI systems. For example, many people would consider chatbots and tools that can create text responses, images, audio clips or videos based on user prompts to constitute GenAI. But if such a tool can only be used to create artwork, summarize other works, or develop harmful materials such as misinformation, the tool arguably falls outside EO 14110’s definition of “AI” because such output does not constitute “predictions, recommendations, or decisions.” Another argument is that the definition of “AI” is too broad. If one interprets its elements expansively—for example, by construing the word “decision” to mean any algorithmic output—then the definition arguably covers any software that runs on a machine, was designed or used by a human, and generates algorithmic output. The point of this side note is that the terms “AI” and “GenAI” are not easily defined, and there may be competing theories on how they should be defined.

EO 14110 calls out the need for the Federal Government to protect Americans’ privacy. The order pursues this objective in various ways, including by: (i) ordering the Office of Management and Budget to develop guidance on how federal agencies should procure and process “commercially available information” in a privacy-protective way; (ii) promoting the adoption of “differential-privacy guarantees” so that a dataset about a group of entities that one organization shares with another cannot easily be used to identify specific entities within that dataset; and (iii) ordering the creation of a government-funded body called the Research Coordination Network dedicated to advancing privacy research and developing privacy-enhancing technologies.[4]
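For technically inclined readers, the “differential-privacy guarantees” the order promotes can be illustrated with a minimal sketch of the classic Laplace mechanism: calibrated noise is added to an aggregate statistic before it is shared, so that any single individual’s presence in the dataset has only a bounded effect on the released figure. The function name and parameters below are illustrative, not drawn from the order or any standard.

```python
import math
import random

def dp_count(true_count: int, epsilon: float) -> float:
    """Release a count under epsilon-differential privacy by adding
    Laplace noise with scale 1/epsilon (a count has sensitivity 1)."""
    scale = 1.0 / epsilon
    # Sample Laplace(0, scale) noise via inverse transform sampling.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    noise = -scale * sign * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise
```

A smaller epsilon produces more noise and therefore a stronger privacy guarantee, at the cost of less accurate released statistics.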

On October 4, 2022, the White House issued the Blueprint for an AI Bill of Rights (“Blueprint”),[5] which sets forth five non-legally-binding principles intended to protect people from the harms of automated systems. One of these principles is centered on data privacy.[6] The Blueprint notes, among other things, that designers, developers and deployers of automated systems should: (i) set privacy defaults so that they conform with users’ reasonable expectations; (ii) only collect personal information that is strictly necessary for the specific context; (iii) seek permission to process personal information where appropriate; (iv) provide privacy notices and consent requests in a plain language; (v) implement special protections for sensitive data; and (vi) avoid unchecked surveillance.

The Federal Trade Commission (“FTC”) has also published various guidance documents focused on AI issues,[7] including one that describes GenAI as follows:[8]

“Generative AI” is a category of AI that empowers machines to generate new content rather than simply analyze or manipulate existing data. By using models trained on vast amounts of data, generative AI can generate content—such as text, photos, audio, or video—that is sometimes indistinguishable from content crafted directly by humans. Large language models (LLMs), which power chatbots and other text-based AI tools, represent one common type of generative AI. Many generative AI models are developed using a multi-step process: a pre-training step, a fine-tuning step, and potential customization steps. These steps may all be performed by the same company, or each step may be performed by a different company.

The FTC has the authority to take privacy-related enforcement actions against companies, including under the Children’s Online Privacy Protection Act and its regulations (“COPPA”), and Section 5 of the Federal Trade Commission Act (“FTC Act”), which prohibits unfair or deceptive acts or practices in or affecting commerce. The FTC has warned that AI, including GenAI, can be used to engage in privacy infringements,[9] and published statements focused on the intersection of AI and biometric information.[10]

At the state level, California Governor Newsom published an executive order on GenAI on September 6, 2023.[11] The order requires, among other things, that a handful of state government agencies issue general guidelines for public-sector procurement, use and training of GenAI that address applicable privacy risks. The order does not enumerate new privacy risks but refers to risks already outlined in the White House’s Blueprint.[12]

On November 16, 2023, the California Bar’s Board of Trustees approved Practical Guidance for the Use of Generative Artificial Intelligence in the Practice of Law.[13] The guidance does not categorically prohibit lawyers from using AI, but identifies a number of ways in which lawyers’ ethical and professional obligations apply to the use of GenAI. For example, the guidance reminds lawyers that GenAI tools raise privacy law issues and that lawyers cannot counsel a client to engage in a violation of law, or assist in any such violation, when using GenAI tools.[14]

On August 29, 2023, the California Privacy Protection Agency (“CPPA”) published draft regulations regarding risk assessments.[15] By way of background, the California Consumer Privacy Act of 2018 (“CCPA”)[16] contemplates that businesses (i.e., entities that do business in California, determine the means and purposes of processing personal information and meet certain quantitative thresholds) must regularly submit risk assessments to the CPPA when they process California residents’ personal information in ways that present significant risks to their privacy or security. The CPPA’s draft regulations include a definition of “Artificial Intelligence” that is similar to, but arguably broader than, the definition in EO 14110.[17] The draft regulations also state, among other things, that businesses that process California residents’ personal information to train such technologies automatically engage in processing activities that present significant risks to privacy, thereby triggering the duty to complete a risk assessment.[18] The draft regulations enumerate various elements that a risk assessment must incorporate, including the benefits resulting from the processing, the negative impacts to California residents’ privacy associated with the processing, the planned safeguards to address the negative impacts, and whether the negative impacts, as mitigated by the planned safeguards, outweigh the benefits resulting from the processing.[19] The CPPA’s draft risk assessment regulations may undergo further revision prior to finalization.

New legal developments governing the privacy dimensions of AI continue to unfold at a rapid rate, and privacy practitioners should continue to monitor the space for new laws, regulations, cases and regulatory guidance materials.


As discussed above, companies that develop GenAI tools typically procure large sets of raw data, modify the data to produce training data (such as by deleting duplicate data), feed the training data into an AI model to train it to recognize patterns, and fine-tune the AI model until it meets certain standards, such as to ensure the output is sufficiently responsive, intelligible and accurate. The datasets that developers procure and process to train GenAI tools, and the output that the tools generate, may contain personal information. In addition, companies that deploy GenAI tools may include personal information in the prompts or other datasets that they want the tools to take into account when generating output.
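The data-preparation step described above (such as deleting duplicate records before training) can be sketched in a few lines. This is a simplified illustration of one common preprocessing step, not any particular developer’s pipeline, and the helper name is hypothetical.

```python
import hashlib

def dedupe_records(raw_records: list[str]) -> list[str]:
    """Drop exact and trivially re-formatted duplicates from a raw text
    corpus before training, keeping first occurrences in order."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for record in raw_records:
        # Hash a whitespace- and case-normalized form so records that
        # differ only in spacing or capitalization collapse together.
        normalized = " ".join(record.split()).lower()
        digest = hashlib.sha256(normalized.encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            cleaned.append(record)
    return cleaned
```

From a privacy standpoint, steps like this matter because each transformation of the raw data is still “processing” of any personal information the records contain.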

A number of privacy compliance considerations and requirements may apply to developers of GenAI tools, such that they may wish to:

• Consider whether and the extent to which the CCPA’s exemption for “publicly available information” may apply to the developer’s proposed processing activities, such as the procurement of data for training purposes;[20]

• Provide notices (such as notices at collection, which must be provided at or before the point at which a business subject to the CCPA collects a California resident’s personal information, unless an exception applies) and obtain consents (which a business subject to the CCPA must do in certain situations, such as before selling the personal information of a minor under the age of 16, unless an exception applies), as required or appropriate, to individuals whose personal information is collected and used to train the model or operate the tool, or whose personal information may be included in the tool’s output;[21]

• Avoid retaining personal information for longer than reasonably necessary to discharge the disclosed purposes for which it was collected, which may require deleting training data once it is no longer reasonably necessary to train the model;[22]

• Comply with the CCPA’s necessity, proportionality and purpose limitation requirements, which may require an overall examination of what types of data and processing are necessary and proportional to the development of the AI model;[23]

• Determine whether the developer sells personal information, as the CCPA defines “sell”,[24] in connection with developing or operating the model (which may be the case if a third party can use personal information from the developer for its own purposes, even if the third party did not pay for the information) and, if there is selling and the CCPA applies, comply with various related obligations, including obtaining opt-in consent for minors under the age of 16 and giving other individuals the ability to opt out of sales;[25]

• Evaluate whether the developer uses California residents’ “sensitive personal information”, as the CCPA defines this term,[26] for purposes not subject to an exception or exemption and, if so, comply with requirements related to allowing them to opt out of such uses of their sensitive personal information (note that the CCPA’s exceptions include performing, improving, upgrading or enhancing services, which may apply to training an AI model);[27]

• Honor requests from California residents to know, delete or correct their personal information, which may oblige the developer to maintain granular control over how the tool generates output about a particular individual (for example, the tool may need to “relearn” facts about an individual if its prior configuration generated incorrect information about the individual and the individual submitted a correction request);[28] and

• Implement security measures as required by applicable laws to protect personal information from unauthorized or illegal processing, which may include red-teaming the model (i.e., intentionally acting as an adversarial party to test the system’s vulnerabilities) to minimize the risk of the model revealing personal information about individuals unless there is a lawful basis to do so.[29]
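As a simplified illustration of the red-teaming idea in the last bullet, a reviewer might scan model outputs for strings that look like personal information before the outputs are released or logged. The patterns below are deliberately naive examples, not a production detector, and the function name is hypothetical.

```python
import re

# Illustrative patterns only; real red-team suites use far broader detectors.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scan_output(text: str) -> dict[str, list[str]]:
    """Flag substrings in model output that look like personal information."""
    hits: dict[str, list[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = found
    return hits
```

A scan like this would typically run against outputs elicited by adversarial prompts designed to make the model regurgitate training data.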

A number of privacy compliance considerations and requirements may also apply to deployers of GenAI tools, such that they may wish to:

• Consider whether the deployer is permitted to disclose personal information to developers and operators of GenAI tools, such as in prompts, or whether it is required to issue any notices or obtain any consents before doing so;[30]

• Determine whether it is necessary to conduct a risk assessment, privacy impact assessment or similar assessment to evaluate the risks versus the benefits of deploying the GenAI tool;[31]

• Enter into appropriate data processing or protection clauses with the provider of the tool, depending on whether the provider serves as the deployer’s “service provider” or “contractor”, or as a “third party”, as the CCPA defines these terms;[32]

• Evaluate whether the provider’s data security measures are adequate;[33]

• Implement policies, protocols and training to ensure that personnel who use the tool do so only in compliance with applicable legal obligations, which may include compliance with necessity, proportionality and purpose limitation requirements;[34] and

• Ensure that any output generated by the tool is covered by the deployer’s protocols and policies related to honoring California residents’ CCPA rights, including access, deletion, correction and opt-out rights (e.g., prompts and outputs containing an individual’s personal information will be deleted upon request unless an exception applies).[35]

The above considerations are some examples of privacy-related points that companies may wish to take into account when developing, providing and deploying GenAI tools. Legal issues outside of privacy may also apply to the development and deployment of GenAI, including under intellectual property, anti-discrimination, product safety, contract, tort and other laws.


Jonathan Tam is a partner in Baker McKenzie’s San Francisco office specializing in privacy, cybersecurity, consumer protection and tech transactions. He is dually licensed in Canada and the U.S. and has helped numerous companies with implementing privacy and security compliance programs, leading data incident responses, negotiating data processing and other agreements, and representing companies in the context of data-related regulatory investigations. Jonathan is incoming Chair of the Executive Committee of the Cybersecurity & Privacy Section of the San Francisco Bar Association, and CIPP/C and CIPP/US certified. He regularly publishes and speaks on privacy and tech topics. He is a graduate of the University of Toronto Faculty of Law and Harvard College.

  1. “Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.” White House. October 30, 2023. Available at https:// 2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/.
  2. Ibid., at Subsection 3(b).
  3. Ibid., at Subsections 3(p) and (ee).
  4. Ibid., at Section 9.
  5. “Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People.” The White House Office of Science and Technology Policy. October 4, 2022.
  6. The others are: (i) safe and effective systems; (ii) algorithmic discrimination protections; (iii) notice and explanation; and (iv) human alternatives, consideration and fallback.
  7. See, e.g., “Consumers are Voicing Concerns about AI”. Federal Trade Commission. October 3, 2023. Available at: 2023/10/consumers-are-voicing-concerns-about-ai.
  8. “Generative AI Raises Competition Concerns”. Federal Trade Commission. June 29, 2023. Available at: https:// 2023/06/generative-ai-raises-competition-concerns.
  9. “FTC Authorizes Compulsory Process for AI-related Products and Services”. Federal Trade Commission. November 21, 2023. Available at: news-events/news/press-releases/2023/11/ftc-authorizes-compulsory-process-ai-related-products-services.
  10. See, e.g., “Preventing the Harms of AI-enabled Voice Cloning”. Federal Trade Commission. November 16, 2023. Available at: tech-at-ftc/2023/11/preventing-harms-ai-enabled-voice-cloning.
  11. Executive Order N-12-23. Office of Governor Gavin Newsom. September 6, 2023. Available at: https://www. GGN-Signed.pdf.
  12. Ibid., at Subsection 3(a).
  13. “Practical Guidance for the Use of Generative Artificial Intelligence in the Practice of Law”. The State Bar of California. November 16, 2023. Available at: https://www. Guidance.pdf.
  14. Ibid., at page 3.
  15. “Draft Risk Assessment Regulations for California Privacy Protection Agency September 8, 2023 Board Meeting” (“Draft Risk Assessment Regulations”). California Privacy Protection Agency. August 29, 2023. Available at: https://
  16. As amended by the California Privacy Rights Act of 2020.
  17. The draft regulations defined “Artificial Intelligence” as “an engineered or machine-based system that is designed to operate with varying levels of autonomy and that can, for explicit or implicit objectives, generate outputs such as predictions, recommendations, or decisions that influence physical or virtual environments. Artificial intelligence includes generative models, such as large language models, that can learn from inputs and create new outputs, such as text, images, audio, or video; and facial or speech recognition or detection technology.”
  18. Draft Risk Assessment Regulations at page 4.
  19. Ibid., at pages 6-11.
  20. The CCPA excludes “publicly available information” from the scope of protected personal information, and defines publicly available information as “information that is lawfully made available from federal, state, or local government records, or information that a business has a reasonable basis to believe is lawfully made available to the general public by the consumer [i.e., the California resident who is the subject of the information] or from widely distributed media; or information made available by a person to whom the consumer has disclosed the information if the consumer has not restricted the information to a specific audience.” The statute further provides that “publicly available” does not mean biometric information collected by a business about a consumer without the consumer’s knowledge. Cal. Civ. Code § 1798.140(v)(2).
  21. See, e.g., Cal. Civ. Code §§ 1798.100, 1798.120(c) and 1798.125.
  22. See, e.g., ibid at § 1798.100(a)(3).
  23. See, e.g., ibid at § 1798.100(c).
  24. The CCPA defines “selling” to mean the disclosure of personal information for monetary or other valuable consideration unless an exception applies. Cal. Civ. Code § 1798.140(ad).
  25. See, e.g., ibid at § 1798.120.
  26. The CCPA defines “sensitive personal information” to include certain categories of personal information including a California resident’s racial or ethnic origin, genetic data, passport number, and other categories. The definition was recently amended to include citizenship and immigration status as well (AB 947). For the full definition of “sensitive personal information”, see Cal. Civ. Code § 1798.140(ae).
  27. See, e.g., ibid at § 1798.121.
  28. See, e.g., ibid at § 1798.130.
  29. See, e.g., ibid at § 1798.100(e).
  30. See, e.g., Cal. Civ. Code §§ 1798.100, 1798.120(c) and 1798.125.
  31. See, e.g., ibid at § 1798.185(a)(15)(B) and supra at note 14.
  32. See, e.g., ibid at §§ 1798.100(d) and 1798.145(j), (ag) and (ai).
  33. See, e.g., ibid at § 1798.100(e).
  34. See, e.g., ibid at § 1798.100(c).
  35. See, e.g., ibid at § 1798.130.
