The Future of Legal Artificial Intelligence (A.I.)—A Discourse on A.I. Components, Levels, and Biases1

by

Michael Andrew Iseri

Michael Iseri is a Technology Law and Disability Rights Attorney, Cyber Security Professional, Legal Tech Accessibility Adviser, and Software Engineer. Drawing on this unique combination of skills, he works to ensure that legal accessibility is available to all.

A. Introduction

The legal field—as with other professions—is undergoing a transformative phase, integrating more advanced technology into its legal services. Technology adoption rates accelerated in 2020 due to COVID-19 restrictions on in-person meetings and legal hearings. As at a trial where inadmissible evidence "opens the door," the door is now wide open to bring in technology.

This article serves as a primer on the current state of A.I. and its application to the legal field. To note, there are few resources that clearly define legal technologies—especially legal A.I.—without misleading marketing terms, grandiose claims and gimmicks, or inapplicable real-world examples. Most importantly, there are different classes and use cases of A.I. in the real world, such as search engine A.I., content generation A.I., navigation A.I. (such as self-driving cars), auto-response A.I., and more. The knowledge in this article is based on the author's unique perspective as an attorney and software engineer.2 The information has also been vetted through numerous dialogues with various software engineers from Google and Uber in San Francisco and Silicon Valley.3

To begin, this article provides an overview of the three components of an A.I. to better equip the reader to understand and characterize the different A.I. programs in the real world. Second, there is a brief discussion of how the three A.I. components create different levels of A.I. complexity in the real world. Lastly, this article provides an overview of the different types of biases that can "corrupt" an A.I. program during its development and implementation stages and that would likely affect any development of a legal A.I.

B. The Three A.I. Components—(1) Human Interfaces, (2) Intelligent Automation (IA), and (3) Machine Learning (ML)

In the programming world, there is no such thing as a "true A.I.," a program that codes itself to evolve and adapt. Hollywood and films have tainted the public's perception of what A.I. truly represents for numerous professions.

There are three main components of A.I. that can characterize an A.I. program in different professions. They are the following:

(1) Human Interface: This component is the main means by which an A.I. program communicates with humans, whether through sight, sound, touch (haptic feedback), or other means. Without this component, a program would not be able to receive input from humans or report its operations back to them. Main examples include dialogue boxes and webpages, chatbots, voice interfaces such as Amazon's Alexa and Apple's Siri, vibrations, sirens and alarms, and other means.
(2) Intelligent Automation (IA) Tools: This component essentially defines an A.I. program's identity and core functions by establishing its tools and operations. IA tools are coded instructions that provide the necessary means for an A.I. program to do what it is programmed to do. They are analogous to the tools a human would use to accomplish a particular goal, such as using a saw and hammer to build a table. Most importantly, these IA tools have been programmed by humans, and no A.I. program has been able to truly build its own IA tools outside of a controlled environment. Currently, IA tools are the limiting factor in the evolution of A.I. programs, since they require humans to program new parameters and functions. For example, a human can easily play chess or a similar game on a board with more rows and columns than a conventional gameboard; an A.I. program, however, cannot handle these extra rules unless it has been programmed to anticipate that possibility within its existing IA tool parameters.

(3) Machine Learning (ML): This component allows an A.I. program to adjust its own parameters to optimize itself or to find alternate solutions. The main benefit of ML is that it allows finer optimization at far greater speed by removing the programmer's need to continually fine-tune the program. For example, an A.I. program that identifies a particular fruit in images can do so without a programmer needing to continuously refine the parameters for that fruit at different angles and in different lighting. This is often accomplished by "feeding" the A.I. program libraries of existing images or information, such as giving an image A.I. program numerous different pictures of a single fruit so the program can refine its parameters. Additionally, the better ML tools incorporate multiple layers of checks that run different analysis and judgment protocols on a particular task before coming to a consensus and a conclusion. Imagine numerous panels of appellate judges at different levels, with different backgrounds, trying to decide the outcome of a single case. To note, some people have misconstrued ML tools as A.I. itself, since they appear to perform the other two A.I. components (human interface and IA tools). This is a misconception: ML merely optimizes and refines the other two components. At present, ML cannot truly create its own IA tools outside controlled environments. An example would be a user asking a voice app for the outside temperature in the mornings: the human interface receives and replies via voice, while the IA tools check an online database for the outside temperature. If built into the program, the ML tools would provide a more customized human interface response (such as using the user's name and making the response shorter) and could run the IA tools for temperatures based on the user's likely location in anticipation of the request, as illustrated in the sketch below.
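To make the three components concrete, below is a minimal, purely illustrative Python sketch of the morning-temperature voice app. Every name in it (the fake weather database, lookup_temperature, UsageModel, handle_request) is hypothetical and not drawn from any real product, the "voice" interface is reduced to plain text, and the "ML" layer is reduced to a simple usage counter standing in for genuine parameter optimization.

```python
# Illustrative sketch only: the three A.I. components in the voice-app example.
# All names, data, and behavior are hypothetical.

# (2) IA tool: a fixed, human-programmed operation -- look up a temperature.
FAKE_WEATHER_DATABASE = {"san francisco": 58, "los angeles": 72}  # stand-in for an online database

def lookup_temperature(city):
    """IA tool: return the current outside temperature for a known city."""
    return FAKE_WEATHER_DATABASE[city.lower()]

# (3) ML-style personalization: adjust a parameter (the user's likely city)
# from past usage instead of a programmer hard-coding it.
class UsageModel:
    def __init__(self):
        self.city_counts = {}

    def record(self, city):
        self.city_counts[city] = self.city_counts.get(city, 0) + 1

    def likely_city(self, default="san francisco"):
        if not self.city_counts:
            return default
        return max(self.city_counts, key=self.city_counts.get)

# (1) Human interface: receive a request and reply in plain language.
def handle_request(user_name, request, model):
    if "temperature" not in request.lower():
        return "Sorry, I can only report the outside temperature."  # outside the IA tool's parameters
    city = model.likely_city()               # ML layer anticipates the user's location
    temperature = lookup_temperature(city)   # IA tool does the actual work
    model.record(city)
    return f"{user_name}, it is {temperature} degrees in {city.title()}."  # personalized, shorter reply

if __name__ == "__main__":
    model = UsageModel()
    print(handle_request("Alex", "What is the temperature outside?", model))
    print(handle_request("Alex", "Play some jazz", model))
```

Even this toy version shows the division of labor: the interface talks to the human, the IA tool does only the fixed work it was programmed to do, and the ML layer merely tunes how the other two behave.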

C. A Brief Overview of the Levels of A.I.—"Simple A.I.," "Sophisticated A.I.," and "True A.I."

When only two or three A.I. components are present, you have a "simple A.I." Contrary to its name, a "simple A.I." can be extremely complex and is often extremely efficient at accomplishing what it needs to accomplish. From the author's experience, only the human interface and IA tools are necessary for A.I. programs to operate in professions, especially the legal profession. Although there are numerous definitions of A.I., the mere appearance of a program performing a complex or repetitive task quickly often demonstrates "intelligence." The ML component is not necessary when an A.I. program does not need to refine itself after numerous uses or is easily adjustable by a programmer (which is a different topic on technology sustainability and deprecation). Examples of "simple A.I." in the legal profession would be basic document automation programs and websites and e-discovery search tools that try to find patterns based on inputted search terms, as in the sketch below.
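As a rough illustration of how little machinery a "simple A.I." can involve, the following hypothetical Python sketch mimics a bare-bones e-discovery search tool: a human interface (search terms in, ranked documents out) plus an IA tool (keyword matching and scoring), with no ML component at all. The document set and scoring rule are invented for the example.

```python
# Illustrative "simple A.I.": a human interface plus an IA tool, with no machine learning.
# The documents and the scoring rule are hypothetical.

DOCUMENTS = {
    "email_001.txt": "Please shred the Q3 invoices before the audit.",
    "email_002.txt": "Lunch on Friday? The audit team arrives Monday.",
    "memo_014.txt": "Updated invoice template attached for Q3 billing.",
}

def search(terms):
    """IA tool: score each document by how many times the search terms appear."""
    results = []
    for name, text in DOCUMENTS.items():
        lowered = text.lower()
        score = sum(lowered.count(term.lower()) for term in terms)
        if score > 0:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    # Human interface: search terms in, ranked documents out.
    for name, score in search(["invoice", "audit"]):
        print(f"{name}: {score} hit(s)")
```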

"Sophisticated A.I." occurs when you have multiple A.I. components or numerous "simple A.I." working in conjunction to accomplish numerous tasks. The main difference between a "simple A.I." and a "sophisticated A.I." is not its complexity (though it could be a factor), but rather the enormity of separate A.I. components working along each other to accomplish an A.I. program task.

Finally, "true A.I." is when a program creates its own IA tools without any human programming it to learn them. To the best of the author's knowledge, no "true A.I." exists outside of controlled environments in which humans have already guided the program's development. For example, you and I could try to play a new musical instrument tomorrow on a whim; an A.I. program, however, would not have that ability unless learning a new musical instrument were already within its existing IA tool parameters. A program cannot "try" unless preprogrammed to do so. Until an A.I. program can learn a new skill of its own accord, without any human intervention or guidance, "true A.I." is just a tale best told in cinema.

D. An Overview of the Program's Bias Problem—(1) Programmer's Bias, (2) Data Bias, and (3) Application Bias

Now that you know the components and levels of A.I. programs, consider why creating A.I. programs for the legal field could be problematic: the "Program's Bias Problem." Programs have multiple stages of development at which biases can be introduced and affect an A.I. program. In current diversity and inclusion research, the Program's Bias Problem is analogous to "implicit bias" (the belief that every individual carries unconscious biases that affect value judgments). It is the author's belief that, in the programming world, biases show up at three stages of a program's life cycle: (1) programmer's bias, (2) data bias, and (3) application bias.

The first stage of bias is introduced at the development and programming level of a program. A company's development committee and its programmers must make binary cutoffs throughout various parts of a program's code for the code to function. Often, in programming, the program's responses and observations are reflected as binary inputs and outputs of strictly 1s or 0s (e.g., on/off, yes/no, white/black). Even when variable decision making is implemented (e.g., shades of gray), there are cutoff points or thresholds at the code level, such as a 50/50 cutoff. These cutoff points are often set implicitly by the development committee and/or programmers, or through machine learning (ML) algorithms that adjust the thresholds up and down; either way, they reflect the biases of the development committee and/or programmers and the goals they want the A.I. program to accomplish, as the sketch below illustrates.
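The mechanics can be shown in a few lines of hypothetical Python: the cutoff below is chosen by a human (or nudged by an ML procedure), and moving it from one "reasonable" value to another changes who is labeled qualified. The scores and threshold values are invented solely for illustration.

```python
# Hypothetical illustration of programmer's bias: the cutoff point is a human choice.

def is_qualified(score, cutoff):
    """Binary decision: a continuous score is forced into a yes/no answer at the cutoff."""
    return score >= cutoff

applicants = {"A": 0.49, "B": 0.51, "C": 0.62, "D": 0.71}  # invented scores

for cutoff in (0.50, 0.60):  # two equally "reasonable" thresholds, two different outcomes
    accepted = [name for name, score in applicants.items() if is_qualified(score, cutoff)]
    print(f"cutoff {cutoff}: accepted {accepted}")

# cutoff 0.5: accepted ['B', 'C', 'D']
# cutoff 0.6: accepted ['C', 'D']
```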

The second stage is data bias. For programs to function correctly with their intelligent automation (IA) tools and machine learning, they must be "fed" vast amounts of data. The source of the data can be biased, and biased data will make the program biased. An example is a college implementing an admissions A.I. program meant to accept the best of the best candidates for the next school year.4 If the data is drawn from the college's hundred-plus-year history, especially from the pre-Civil Rights era, then the program would likely incorporate racist biases into its admissions decisions, as the sketch below illustrates.
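The carry-through of historical bias can be sketched with a toy example. In the hypothetical Python below, an admissions "model" is trained on nothing more than past acceptance rates per group; because the fabricated record favors one group, the model reproduces that preference for new, equally qualified applicants.

```python
# Hypothetical illustration of data bias: a toy "model" learns historical acceptance rates.
# The records below are fabricated to mimic a biased admissions history.

historical_records = [
    # (applicant_group, admitted)
    ("group_x", True), ("group_x", True), ("group_x", True),
    ("group_y", False), ("group_y", False), ("group_y", True),
]

def learn_admission_rates(records):
    """Toy training step: compute the historical admission rate for each group."""
    totals, admits = {}, {}
    for group, admitted in records:
        totals[group] = totals.get(group, 0) + 1
        admits[group] = admits.get(group, 0) + int(admitted)
    return {group: admits[group] / totals[group] for group in totals}

rates = learn_admission_rates(historical_records)

# Two new, equally qualified applicants are scored only by their group's historical rate.
for group in ("group_x", "group_y"):
    print(group, "predicted admission chance:", round(rates[group], 2))
# group_x predicted admission chance: 1.0
# group_y predicted admission chance: 0.33
```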

The third stage is application bias. This stage concerns how the program is used in the real world and how that use can feed back into the other two stages. The best way to describe this bias is through an example. Imagine that you have the best program to detect drug usage. Amazing, right? However, what if that drug detection program is used only at traffic stops and only on specific groups, such as Hispanics and African Americans? The use of the program in and of itself creates a bias, and this bias creates an unintended feedback loop that affects programmer's bias and data bias, as the sketch below illustrates.
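The feedback loop can be made concrete with a small, entirely hypothetical simulation: a detector that is equally accurate for every group is deployed almost exclusively against one group, so the records it generates make that group appear to account for nearly all detections, and those skewed records become the data for the next round of development.

```python
# Hypothetical simulation of application bias: equal underlying rates, unequal deployment.
import random

random.seed(0)

TRUE_RATE = 0.05  # the behavior occurs at the same rate in every group
DEPLOYMENT = {"group_a": 1000, "group_b": 50}  # but the program is used far more on group_a

detections = {group: 0 for group in DEPLOYMENT}
for group, num_checks in DEPLOYMENT.items():
    for _ in range(num_checks):
        if random.random() < TRUE_RATE:  # an otherwise accurate detector
            detections[group] += 1

print(detections)
# Both groups behave identically, yet group_a dominates the recorded detections,
# and that skewed record becomes the "data" on which future versions are trained.
```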

The three stages of bias have actual real-world consequences. A popular story shared among diversity and inclusion advisers involves image recognition A.I. programs trying to distinguish pictures of chihuahuas' faces from blueberry muffins.5 A real-world controversy arose in 2015, when a popular Google photo app tagged images of an African American couple with the photo tag "Gorilla."6 Three years later, Google "fixed" the problem simply by removing the label "Gorilla" so that no images would be labeled as such.7 This is a prime example of insufficient oversight in developing the A.I. program's image recognition, of not testing it against sufficiently varied data, and of not testing the program's application before deployment. The result was an image recognition program with underlying racist problems stemming from the program's bias.

Another example involves the recent firing/resignation of a prominent African American Google A.I. scholar, Timnit Gebru, over a soon-to-be-published paper on the risks of large-scale human language processing A.I. programs.8 She warned that such large-scale A.I. could drown out smaller and more nuanced diction and linguistic cultural developments in favor of the larger and more textually vocal majority.9 MIT Technology Review describes one of the major conclusions of the paper as follows:

It [large-scale language processing A.I. programs] will also fail to capture the language and the norms of countries and peoples that have less access to the internet and thus a smaller linguistic footprint online. The result is that AI-generated language will be homogenized, reflecting the practices of the richest countries and communities.

She also highlights other problems, such as the massive energy costs (carbon footprint and electricity) of training such an A.I. program, the fact that such a program would not actually understand human language but would merely manipulate data to give the appearance of understanding, and the potential for such a program, if successful, to generate misinformation through an illusion of meaning.10 Gebru's paper and warning further highlight the importance of oversight with diverse perspectives at the inception of any A.I. program, of appreciating the problems with existing and future data sources, and of anticipating the outcomes the program could produce that could further perpetuate biases in its application.

E. Conclusion

Considering that legal A.I. would require sources that are often not the best sources of diversity and perspective, the future of legal A.I. appears bleak with respect to truly unbiased development and application. One of the biggest problems for future legal A.I. programs stems from numerous statistical findings on gender, race, LGBT+, and disability representation at law firms and courts. For example, representation of attorneys with disabilities in law firms was 0.54% in 2017, compared with U.S. Census Bureau data showing that roughly 20% of the general population had a disability in 2010.11 These problems for legal A.I. will only become more prevalent as the legal profession embraces technology in the post-COVID world.

——–

Notes:

1. ©2021 by the American Bar Association. Reprinted with permission. All rights reserved. This information or any portion thereof may not be copied or disseminated in any form or by any means or stored in an electronic database or retrieval system without the express written consent of the American Bar Association.

2. For a quick overview, the author has personally developed from scratch numerous legal "A.I." in forty-five-plus fully voiced languages that dynamically complete legal services, as well as fully voiced homeless/COVID-19 resource map systems for California, fully voiced Constitution and Miranda rights programs, and legal guides on the Google Play Store. The author also built automated, fully voiced bar exam flash card study programs, but they were decommissioned when the California Supreme Court appointed the author to the California Committee of Bar Examiners to create the July/February California bar exams for a four-year term.

3. In the technology and programming fields, the best sources of information are local meetups, since technology moves extremely fast (at times faster than articles can be written and published). New technology concepts and best practices were easier to share and disseminate at pre-COVID meetups.

4. Oscar Schwartz, Untold History of AI: Algorithmic Bias Was Born in the 1980s, IEEE SPECTRUM (Apr. 15, 2019), https://spectrum.ieee.org/tech-talk/tech-history/dawn-of-electronics/untold-history-of-ai-the-birth-of-machine-bias.

5. Mariya Yao, Chihuahua or Muffin? My Search for the Best Computer Vision API, FREECODECAMP (Oct. 12, 2017), https://www.freecodecamp.org/news/chihuahua-or-muffin-my-search-for-the-best-computer-vision-api-cbda4d6b425d/.

6. Loren Grush, Google Engineer Apologizes After Photos App Tags Two Black People as Gorillas, THE VERGE (July 1, 2015), https://www.theverge.com/2015/7/1/8880363/google-apologizes-photos-app-tags-two-black-people-gorillas.

7. James Vincent, Google ‘Fixed’ Its Racist Algorithm by Removing Gorillas from Its Image-labeling Tech, THE VERGE (Jan. 12, 2018), https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai.

8. Matt O’Brien, Google AI Researcher’s Exit Sparks Ethics, Bias Concerns, AP (Dec. 4, 2020), https://apnews.com/article/business-apple-inc-artificial-intelligence-00c1dab0a727456df9e5ef9c6160c792.

9. Karen Ho, We Read the Paper that Forced Timnit Gebru Out of Google. Here’s What It Says, MIT TECH. REV. (Dec. 4, 2020), https://www.technologyreview.com/2020/12/04/1013294/google-ai-ethics-research-paper-forced-out-timnit-gebru/.

10. Id.

11. Angela Morris, Are Law Firms Committed to Disability Diversity? A Handful of Firms Have Taken Action, ABA J. (Oct. 24, 2018), https://www.abajournal.com/news/article/law_firms_disability_diversity.