The Bureaucratic Challenge to AI Governance: An Empirical Assessment of Implementation at U.S. Federal Agencies

Can government govern artificial intelligence (AI)? One of the central questions of AI governance surrounds state capacity, namely whether government has the ability to accomplish its policy goals. We study this question by assessing how well the U.S. federal government has implemented three binding laws around AI governance: two executive orders—concerning trustworthy AI in the public sector (E.O. 13,960) and AI leadership (E.O. 13,859)—and the AI in Government Act. We conduct the first systematic empirical assessment of the implementation status of these three laws, each of which has been described as central to U.S. AI innovation. First, we track, through extensive research, line-level adoption of each mandated action. Based on publicly available information, we find that fewer than 40 percent of 45 legal requirements could be verified as having been implemented. Second, we examine the implementation of specific transparency requirements at up to 220 federal agencies. We find that nearly half of agencies failed to publicly issue AI use case inventories—even when these agencies have demonstrable use cases of machine learning. Even among agencies that have complied with these requirements, efforts are inconsistent. Our work highlights the weakness of U.S. state capacity to carry out AI governance mandates, and we discuss implications for how to address bureaucratic capacity challenges.


INTRODUCTION
Can government govern AI? Many commentators have discussed the normative question of government intervention into the market [80,86,93,98,103]. We address a distinct, but related, empirical question that highlights the bureaucratic challenge to AI governance: Is there sufficient state capacity to achieve the goals of AI governance when such goals have already been set in law?
Many scholars, policymakers, and commentators point to the transformative potential of AI [23,125]. Seeking to capture the benefits of the "Fourth Industrial Revolution" or "third wave of the digital revolution," countries are prioritizing efforts to reorganize their public and private sectors, fund research and development (R&D), and establish structures and policies that unleash AI innovation [37,73,112,143]. In the United States, the White House and Congress have promoted AI innovation and its trustworthy deployment by increasing R&D investments, exploring mechanisms to increase equitable access to AI-related resources through a National Artificial Intelligence Research Resource, funding National AI Research Institutes throughout the country, dedicating $280 billion, through the CHIPS and Science Act, to domestic semiconductor manufacturing and "industries of tomorrow," and coordinating AI policy in the National AI Initiative Office within the White House [10,32,40,49,53,54,69,72]. While many have rightly applauded the Blueprint for an AI Bill of Rights and the associated actions across the federal government [44,48], implementing that framework ultimately requires that government agencies convert guidance and principles into practice. 1

Federal AI initiatives raise at least three interrelated questions that are relevant to the academic literature and that implicate questions about policy effectiveness. First, as a question of regulatory paradigm, we might ask about the proper role of the state vis-à-vis industry and civil society actors, especially given the deep information asymmetries that plague state-based regulatory initiatives [98]. Second, conditional on the chosen paradigm, we might also ask about the proper policy instrument, implicating familiar debates over the specificity of rules in comparison to standards [82,96] or the proper target of rules [90] given the policy context. Third, after policy instruments are determined, we could assess the capacity of bureaucracies to effectuate those actions' purpose [81,115].
We contribute to this scholarship through a systematic assessment of the federal government's progress in implementing three important binding laws that are seen as central to U.S. leadership in trustworthy AI. 2 Through extensive research, we study (i) the AI in Government Act of 2020 [19,28], which aimed to provide resources and guidance to federal agencies on AI; (ii) the Executive Order on AI Leadership (E.O. 13,859) [12], which mandated government-wide efforts to promote AI R&D, AI competitiveness, and public trust; and (iii) the Executive Order on Trustworthy AI in Government (E.O. 13,960) [25], which encouraged government adoption of AI to benefit the public and promulgated trustworthy AI principles. 3 Collectively, the AI in Government Act, the AI Leadership Order, and the Trustworthy AI Order are critical pillars of the U.S. strategy on AI 4 and of its vision of an ecosystem in which the U.S. government leads in AI and promotes trustworthy AI [71].
While much progress has been made, our findings, from a systematic examination conducted between late October and mid-November 2022, are sobering and highlight longstanding concerns about bureaucratic capacity. These laws' goal of fostering a responsible AI innovation ecosystem is threatened by weak and inconsistent implementation across the administrative state. First, fewer than 40 percent of all 45 requirements across the three pillars could be publicly verified as implemented at the time of our examination, including major requirements to advance AI innovation and trustworthy AI. Second, the implementation of Agency AI Plans, which are intended to provide information about an agency's approach to AI regulatory activities and to foster the agency's strategic planning around AI, has been poor. Around 88 percent of agencies that are likely subject to the requirement to submit Agency AI Plans under the AI Leadership Order had failed to do so by late 2022. 5 Third, roughly half or more of agencies had not published an inventory of AI use cases, as required under the Trustworthy AI Order and in contradiction with public transparency efforts. Given that Congress has since made disclosing AI use case inventories a statutory requirement under the 2023 National Defense Authorization Act [77], the lack of implementation is especially concerning.
These findings suggest a lack of bureaucratic capacity compounded by issues of policy ambiguity: agencies lack the expertise, committed leadership, and sheer personnel to strategically plan for and prioritize AI, and compliance is hindered by vague mandates and reporting lines. We thus offer three policy recommendations. First, centralized mandates must delineate (1) which agencies and sub-agencies must comply, (2) what "AI" applications are covered, and (3) how to interpret non-responses. This places agencies on notice about their obligations and facilitates public accountability. Second, if bureaucratic capacity is to blame, Congress must provide more resources for agencies to obtain adequate technical expertise. Third, committed senior leadership at the White House and at agencies is needed, and senior personnel at agencies should treat these requirements not as boxes to tick but as opportunities for strategic planning around AI.

3 We do not focus on the National AI Initiative Act of 2020, as the National AI Advisory Committee is statutorily tasked with tracking the Initiative's progress, nor on the Artificial Intelligence Training for the Acquisition Workforce Act, as its passage in October 2022 precludes meaningful assessment of its implementation. For more on the National AI Advisory Committee, tasked with "advising the President and the National AI Initiative Office on topics related to the National AI Initiative," the creation of which was called for by the National AI Initiative Act of 2020, see [56,95].
4 The U.S. government does not currently have a "National AI Strategy" per se, but instead has a number of documents, including the three assessed in this Paper, that collectively provide strategic guidance. The National AI Initiative Office maintains a list of related legislation, executive orders, and strategy documents. See [71].
5 The requirement is in Section 6(c) of the AI Leadership Order [12], and OMB's guidance was published in a memorandum known as "OMB M-21-06" [138].
Our paper proceeds as follows. Section 2 discusses related scholarship in public administration, bureaucratic politics, and transparency initiatives for public sector AI. Section 3 provides background on the three binding laws we assessed. Section 4 discusses our methodology for systematically assessing the implementation of these laws. Section 5 provides detailed findings on the implementation of the AI Leadership Order, Trustworthy AI Order, and AI in Government Act. Section 6 examines in detail the AI Leadership Order's requirement that agencies publish Agency AI Plans across 41 agencies. Section 7 assesses the Trustworthy AI Order's requirement that agencies publish AI use case inventories across 220 agencies and narrower subsets of agencies. Section 8 discusses implications and limitations, and Section 9 concludes.

RELATED WORK
Our study of bureaucratic implementation of AI governance speaks to four bodies of research. First, our work relates to longstanding scholarship on state and bureaucratic capacity to achieve policy goals [84,109,115,121]. Prior research shows that agency performance and the realization of White House-level political goals are frustrated by organizational capacity constraints, including insufficient leadership, staff, and resources. For example, Bolton, Potter, and Thrower [81] analyzed 22,000 regulations reviewed by the Office of Information and Regulatory Affairs (OIRA) within OMB and found that organizational capacity constraints, including vacant leadership positions, insufficient staff resources, and high workloads, hindered the president's ability to advance priority rules and inhibited OIRA's ability to carry out its mission.
In the AI space, agencies' struggle to attract and retain technical talent is a hurdle to the executive branch's ability to responsibly adopt and govern AI [88,101,102]. By one estimate, roughly 60% of new machine learning PhD graduates in 2020 went into industry and 24% into academia, while less than 2% went into government [144]. Embedded AI expertise, as Engstrom, Ho, Sharkey, and Cuéllar [102] detailed and as other scholars have noted (e.g., [88,101,128]), is critical for agencies' efficacy in designing, developing, and using AI tools to achieve their missions and in subjecting AI tools to meaningful accountability. These concerns about bureaucratic capacity, in turn, can inform broader normative assessments of the federal government's current ability to promote trustworthy AI. 6

Second, our research speaks to the central debate on the role of government in AI policy, where jurisdictions have diverged between taking a more "passive" role that gives space for industry self-regulation and an "active" role through direct regulation (e.g., [98]). These debates imagine diverse roles for the state, whether as an interlocutor with industry to help develop best practices, a research funder, an adopter of responsible and trustworthy AI technologies, a direct regulator, or some combination of the above [86,93,103,136]. Such normative debates can and should be informed by empirical evidence, including about the relative advantages and capabilities of different institutional actors. For example, Black and Murray [80] comment that a central issue about who ought to regulate concerns where "trust and legitimacy" lie: with a transnational standard-setting organization, a corporation engaging in self-regulation, or a state-based regulatory body. Regulation has classically been justified based on the expertise of technocratic government agencies (e.g., [123]), but AI poses extreme information asymmetries between technology developers and policymakers [88,91,108], in addition to concerns about public-private gaps in expertise [88,144]. For those who believe in a robust role for the state in AI governance, our work addresses a core question: whether the government has the capacity to effectively regulate AI.
Third, our work pertains to efforts for transparency around the administrative state [89,101,111]. Principles of transparency and accountability are foundational to administrative law (e.g., [87,101]). In the U.S. context, much scholarship has examined transparency initiatives such as the Freedom of Information Act, sunshine laws and hearing requirements, notice-and-comment rulemaking, and the public availability of agency guidance (e.g., [87,107,119,127]). Calls for greater transparency around the U.S. government's use of AI are therefore situated not only within research about the role of transparency in administrative law but also within discussions about the benefits and risks posed by agencies' use of AI (see, e.g., [88,102]). One major question surrounds how public sector AI challenges administrative law's commitment to transparency. Coglianese and Lehr [92] argue that the opacity of AI does not pose particular barriers to administrative law. Engstrom and Ho [100], on the other hand, argue that existing administrative law doctrines may be insufficient, requiring adaptations of governance. The importance of government transparency about its use of AI necessitates a discussion about the proper lever to achieve such transparency. 7

Last, many efforts have focused on transparency through public registries of AI use cases. Floridi [104] discusses the promise of AI registries in Helsinki and Amsterdam, noting that the "goal is to make the use of urban AI solutions as responsible, transparent, and secure as other local government activities." Other countries, such as the United Kingdom, have adopted these AI registries [116]. At the local government level in the U.S., Bloomberg Philanthropies uses AI registries as one evaluation criterion for its "What Works Cities" Certification, which it claims is the "national standard of excellence for data-driven, well-managed local government" [68]. The City of San Jose, for instance, began an Algorithm Register in January 2023 for transparency of city services [60]. Yet the implementation of such transparency initiatives has not been straightforward. New York City's Automated Decision Systems Task Force fractured in substantial part because of a lack of consensus around what constituted algorithmic decision systems. Cath and Jansen [85] question the efficacy of the Helsinki and Amsterdam model of AI registries as a form of governance. The Administrative Conference of the United States (ACUS) commissioned a report that compiled AI use cases across federal regulatory agencies [102], requiring a large team to determine, for instance, whether the underlying use case met a definition of machine learning. This report preceded the promulgation of the AI Use Case Inventory requirement via executive order. And because requirements differ across jurisdictions, efforts like the Northwestern Computational Journalism Lab's Algorithm Tips have attempted to crowdsource information across the federal, state, and local levels [62]. AI registers have been advocated in other domains as well [132], and remain one of the critical levers for transparency. Our research examines the actual implementation of such AI registries and demonstrates that substantial policy guidance may be required for faithful implementation.

7 Calls for transparency exist not only at the federal level but also at the state level. A proposal in California (A.B. 331), for example, seeks to require AI developers to submit impact assessments annually to the California Civil Rights Department [142].

LEGAL SETTING
We address this core question of bureaucratic capacity for AI governance by assessing three pillars of America's strategy for AI innovation. The two executive orders and the AI in Government Act all carry the force of law, and so the executive branch's ability to implement them serves as an important litmus test for the U.S. government's realization of its AI policy goals. Moreover, these laws are billed as cornerstones of America's AI policy. By enabling America "to coordinate AI strategy" and equipping federal agencies to use AI responsibly, the AI in Government Act sought to ensure America's "competitive edge against the rest of the world in the next decade" [28]. The AI Leadership Order was similarly touted as "critically important to maintaining American leadership in technology and innovation" [16], whereas the Trustworthy AI Order "signal[ed] to the world" America's commitment to "the development and use of AI underpinned by democratic values" [11,24]. To achieve their stated goals, the AI Leadership Order sought to drive technological breakthroughs throughout all sectors of the U.S., while the two other efforts focused on the federal government's use of AI. We describe each of the laws in turn.
Executive Order 13,859 (The AI Leadership Order). The 2019 AI Leadership Order launched the American AI Initiative to "focus the resources of the Federal government to develop AI in order to increase our Nation's prosperity, enhance our national and economic security, and improve quality of life for the American people" [11]. Specifically, it sought to accelerate the federal government's efforts to build the infrastructure, policy foundations, and talent necessary for America's leadership in AI through a multipronged approach emphasizing AI R&D, AI-related data and resources, regulatory guidance and technical standards, the AI workforce, public trust in AI, and international engagement [11,12,117]. Noting that a "coordinated Federal Government strategy" was necessary and that AI "will affect the missions of nearly all executive departments and agencies," the AI Leadership Order further mandated that agencies pursue six related strategic objectives for "promoting and protecting American advancements in AI": (1) investing in AI-related research and development; (2) making AI resources (e.g., data, models, computing resources) available to the public; (3) reducing barriers that prevent the development and use of AI technologies; (4) ensuring that domestic and international technical standards "minimize vulnerability to attacks from malicious actors and reflect Federal priorities"; (5) building the AI workforce; and (6) developing a National Security Presidential Memorandum "to protect the advantage of the United States in AI and technology critical to United States economic and national security interests" [12].
Executive Order 13,960 (The Trustworthy AI Order). The 2020 Trustworthy AI Order directed federal agencies to harness "the potential for AI to improve government operations" [24]. Recognizing that "[t]he ongoing adoption and acceptance of AI will depend significantly on public trust," the Trustworthy AI Order articulated nine principles for federal agencies to implement, according to guidance that would be developed by OMB, when designing, developing, acquiring, and using AI. These principles provide that AI should be (a) lawful, (b) purposeful and performance-driven, (c) accurate, reliable, and effective, (d) safe, secure, and resilient, (e) understandable, (f) responsible and traceable, (g) regularly monitored, (h) transparent, and (i) accountable [25]. To support federal AI adoption, it also mandated several actions intended to increase the number of federal employees with the expertise necessary to implement AI [24]. Like the AI Leadership Order, the Trustworthy AI Order required agencies to publicly disclose certain AI-related information in an attempt to cultivate trust and understanding (see Section 7). The requirement of disclosing AI use cases was also incorporated into the 2023 National Defense Authorization Act [77], meaning Congress, too, has directed federal agencies to take inventory of and disclose their uses of AI, reflecting the perceived importance of this transparency measure.
AI in Government Act of 2020. The AI in Government Act sought to "ensure that the use of AI across the federal government is effective, ethical and accountable by providing resources and guidance to federal agencies" [28]. These resources included the establishment of an AI occupational series; a call for formal guidance on agency usage, procurement, and bias assessment and mitigation of AI; and the creation of a center of excellence within the General Services Administration (GSA) to support government adoption of AI.

METHODOLOGY
These three laws have been in effect for sufficient time to enable us to design a study assessing the implementation status of each line-level provision. The research was based on an extensive manual search protocol, conducted between October and November of 2022, detailed in Appendices A.1, B.1, and C.1; we provide a concise overview of our research approach here. We note at the outset that because these laws impose public transparency and reporting requirements, we rely on public materials to conduct our searches. We undertook extensive efforts to identify relevant documents or notices of actions, but these may not capture all relevant (nonpublic) actions. Our findings nonetheless remain informative about the transparency of national AI efforts, and failures to implement by statutory or regulatory deadlines are particularly informative.
To assess overall implementation, we identified all line-level actions within the three documents (e.g., instructions that a federal entity "shall budget," "shall consider," "shall review," or "shall publish"). Each line-level action was categorized as a time-boxed requirement, where the action was required by a specified date (e.g., publishing a report within 90 days); an open-ended requirement, where the mandated action did not have a specific date for completion; or an ongoing requirement, where the mandate did not include a specific deliverable or concrete outcome and where there was no specified deadline. It was generally straightforward to assess whether the time-boxed requirements were met, whereas other mandated actions were often more ambiguous, whether due to the lack of a deadline, the lack of express public disclosure requirements, or both. We construed ambiguity in favor of the agencies based on an assumption that the agencies were taking the necessary steps (or at least making good-faith efforts) to implement these mandates, as explained in Appendix A.
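To make this coding scheme concrete, the sketch below shows, in Python, how each line-level action can be recorded and tallied. The entries are hypothetical stand-ins rather than items from our tracker (the full tracker is in Appendix E.1).

```python
from dataclasses import dataclass
from collections import Counter

# The three categories used to code each line-level action.
TIME_BOXED = "time-boxed"   # action required by a specified date
OPEN_ENDED = "open-ended"   # concrete deliverable, but no completion date
ONGOING = "ongoing"         # no specific deliverable and no deadline

@dataclass
class Requirement:
    source: str      # law or executive order containing the provision
    section: str     # the provision's section number
    category: str    # one of the three categories above
    status: str      # "implemented", "not implemented", or "unverifiable"

# Hypothetical entries for illustration only.
tracker = [
    Requirement("E.O. 13,859", "sec. 4(a)", ONGOING, "implemented"),
    Requirement("E.O. 13,960", "sec. 5", TIME_BOXED, "not implemented"),
    Requirement("AI in Government Act", "sec. 104", OPEN_ENDED, "unverifiable"),
]

# Tally status overall and by category, as in our summary tables.
print(Counter(r.status for r in tracker))
print(Counter((r.category, r.status) for r in tracker))
```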
In addition, we studied the implementation status of two specific cross-agency mandates: the requirement under the AI Leadership Order for agencies to issue "Agency AI Plans," and the requirement under the Trustworthy AI Order for agencies to post AI use case inventories. For the former, the AI Leadership Order required "implementing agencies" (defined as agencies, determined by the National Science and Technology Council (NSTC) Select Committee on AI, that have regulatory authorities and that "conduct foundational AI R&D, develop and deploy applications of AI technologies, provide educational grants, and regulate and provide guidance for applications of AI technologies") to issue reports discussing their authorities and plans to regulate AI. The Trustworthy AI Order, by contrast, ordered all "agencies" (with exceptions only for military, intelligence, and independent regulatory agencies) to disclose their uses of AI.
Ambiguities in the scope of these executive orders (the agencies they cover and, for the AI use case inventories, the definition of "AI") complicated assessment of their implementation. For the Agency AI Plans, we looked to agencies with regulatory authority and therefore included cabinet-level departments and agencies and the 19 agencies deemed "independent regulatory agencies" under 44 U.S.C. § 3502(5). We included the U.S. Agency for International Development (USAID), as it was the only agency represented at the National Security Council that was not already included as a cabinet-level agency or as an independent regulatory agency. We also inquired with a member of the Select Committee but did not receive an answer as to which agencies are included. The result was 41 agencies.
Because the AI Use Case Inventories requirement applied to agencies generally, we began with the Administrative Conference of the United States' Sourcebook of U.S. Executive Agencies ("ACUS Sourcebook"). From the 278 agencies identified in the ACUS Sourcebook's data spreadsheet, we removed agencies within the Department of Defense, agencies and sub-agencies within the intelligence community, and the 19 independent regulatory agencies defined in 44 U.S.C. § 3502(5), based on the exemptions in Section 8(a) of the Trustworthy AI Order. This left us with a total of 220 agencies.
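This filtering step can be expressed compactly. Below is a minimal sketch, with hypothetical field names and sample records standing in for the ACUS Sourcebook's data spreadsheet (the real starting list contained 278 agencies).

```python
# Hypothetical records standing in for the ACUS Sourcebook spreadsheet.
acus_agencies = [
    {"name": "Federal Aviation Administration", "parent": "Department of Transportation",
     "in_dod": False, "in_intel_community": False, "independent_regulatory": False},
    {"name": "Defense Logistics Agency", "parent": "Department of Defense",
     "in_dod": True, "in_intel_community": False, "independent_regulatory": False},
    {"name": "Federal Trade Commission", "parent": None,
     "in_dod": False, "in_intel_community": False, "independent_regulatory": True},
]

def covered_by_trustworthy_ai_order(agency: dict) -> bool:
    """Apply the exemptions in Section 8(a) of the Trustworthy AI Order:
    DOD components, intelligence-community agencies, and the 19 independent
    regulatory agencies under 44 U.S.C. § 3502(5) are excluded."""
    return not (agency["in_dod"]
                or agency["in_intel_community"]
                or agency["independent_regulatory"])

covered = [a for a in acus_agencies if covered_by_trustworthy_ai_order(a)]
print([a["name"] for a in covered])  # -> ['Federal Aviation Administration']
```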
In searching for AI Plans and Use Case Inventories, we took a systematic approach meant both to maximize the chance of finding each document and to keep the search process clear and simple. For each requirement and agency, we searched in four ways: (1) at the dedicated URL mandated under the respective executive order; (2) a web search for key words closely related to the requirement; (3) a search of the agency's website for those key words; and (4) a search of the publication libraries at AI.gov.
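Step (1) of this protocol is mechanical enough to automate; the sketch below illustrates it, assuming, purely for illustration, an '/ai' URL path and a keyword list. Both are hypothetical stand-ins, as the actual locations and search terms are detailed in Appendices A.1, B.1, and C.1.

```python
import requests

def check_dedicated_url(domain: str) -> bool:
    """Probe the agency's dedicated page (step 1 of the protocol).
    The '/ai' path is a hypothetical stand-in for the URL specified
    in the executive order and CIO Council guidance; a 200 response
    only suggests a page exists, so results were verified manually."""
    url = f"https://{domain}/ai"
    try:
        resp = requests.get(url, timeout=10, allow_redirects=True)
        return resp.status_code == 200
    except requests.RequestException:
        return False

# Steps (2)-(4), web search, agency-site search, and the AI.gov
# publication libraries, were performed manually, guided by a
# keyword list like this hypothetical one.
KEYWORDS = ["AI use case inventory", "agency AI plan", "Executive Order 13960"]

if __name__ == "__main__":
    print(check_dedicated_url("example.gov"))
```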
Full data generated by our research are included in the Appendices.

OVERALL IMPLEMENTATION STATUS
While much progress has been made, we were unable to verify implementation of the majority of the line-level legal requirements. Across both executive orders and the AI in Government Act, we found that 11 of 45 requirements, or roughly 24 percent, were implemented (see Table 1). 8 The implemented requirements spanned a range of topics, including agencies' prioritization of AI R&D in annual budget proposals, 9 recommendations for leveraging cloud computing resources for federally funded AI R&D, 10 guidance on federal engagement in the development of AI-related technical standards, 11 and the establishment of a GSA AI Center of Excellence to facilitate the adoption of AI within the federal government. 12 However, seven of 45 requirements (16 percent) were not implemented by the deadline, and the remaining 26 requirements (58 percent) could not be confirmed as either fully implemented or not implemented (see Appendix A.2). The requirements that remain unfulfilled, including creating an AI occupational series for federal employees, estimating the AI workforce gap in the federal government, policy guidance on federal acquisition and use of AI, 13 and a public roadmap of OMB's intended revisions or new AI policy guidance, 14 are significant for the country's AI ecosystem and the federal government's adoption of AI. Similarly, the implementation status is uncertain for major requirements, including efforts to make data and source code more accessible for AI R&D, 15 to better leverage and create new AI-related education and workforce development programs, 16 and to ensure agencies participate in interagency bodies that further the implementation of trustworthy AI. 17

Requirements in the executive orders with deadlines for specific deliverables were implemented at a higher rate. Conversely, none of the AI in Government Act's four requirements with a deadline were implemented: the Office of Personnel Management (OPM) was to submit to Congress a plan to establish an AI occupational series by May 2021; OMB was required to issue a memorandum on AI procurement, mitigating discriminatory impact or bias, and promoting AI innovation by October 2021, with agencies publicly posting plans to achieve consistency with it by April 2022; and OPM was to create an AI occupational series and estimate AI-related workforce needs in each federal agency by July 2022. Of the implemented requirements across all three laws, many were late. For example, the NSTC Select Committee on AI produced the AI Leadership Order's mandated report to the president on better leveraging cloud computing for AI about 16 months past the deadline. Pursuant to the AI Leadership Order, OMB similarly issued a memorandum to agencies on regulatory approaches to AI about 16 months late, as well as a notice in the Federal Register soliciting public comments on how to improve public access to federal data for AI about two months after the AI Leadership Order's summer 2019 deadline.

We provide detailed findings in Appendix A.2 and a line-level tracker in Appendix E.1.

8 A requirement in Section 5(c)(ii) of the Trustworthy AI Order [25] had not been implemented when we conducted our systematic analysis, but we excluded this requirement from the overall implementation assessment because the deadline for its implementation had not yet passed.
9 Section 4(a) of the AI Leadership Order [12] directed heads of AI R&D agencies to "consider AI as an agency R&D priority" and to take AI "into account when developing budget proposals and planning for the use of funds." Section 4(b) directed the same agencies to "budget an amount of AI R&D that is appropriate for this prioritization," particularly through the Networking and Information Technology Research and Development (NITRD) Program, and to identify "the programs to which the AI R&D priority will apply and estimate the total amount of such funds that will be spent on each program." This ongoing, annual requirement seems to be implemented through an annual NITRD supplement to the president's budget, progress reports on AI R&D, and a NITRD AI R&D dashboard. See [10,13,14,22,35,64].
10 Section 5 of the AI Leadership Order [12] directs the Secretaries of Defense, Commerce, Health and Human Services, and Energy, as well as the Administrator of NASA and the Director of the NSF, to prioritize allocation of high-performance computing resources for AI, and also directs the NSTC Select Committee on AI to work with GSA on a report to the president on leveraging cloud computing resources. The National AI Initiative Office's AI Researchers Portal includes a computing resources overview identifying six "Federally-supported computing infrastructure resources that are useful for AI research." See [59]. The NSTC Select Committee on AI also published, 16 months after the mandated deadline, Recommendations for Leveraging Cloud Computing Resources for Federally Funded Artificial Intelligence Research and Development, as well as a complementary "lessons learned" report in July 2022. See [26,52].
11 Section 6(d) of the AI Leadership Order [12] directs the Secretary of Commerce, through the NIST Director and with participation from relevant agencies, to "issue a plan for Federal engagement in the development of technical standards and related tools in support of reliable, robust, and trustworthy systems that use AI technologies." In August 2019, NIST published the required report. See [17].
12 Section 103 of the AI in Government Act [19] mandates the establishment of this Center and delineates its roles; GSA has established the Center. See [139].
13 The White House announced in May 2023 that OMB would release draft guidance on AI procurement for federal agencies in summer 2023 [114].
14 These are respectively required by Sections 105 and 104 of the AI in Government Act [19] and Section 4(b) of the Trustworthy AI Order [25]. Note that Action 7 in the 2021 Federal Data Strategy Action Plan could arguably be construed as an implementation of the public roadmap requirement because it provides four milestones; however, it does not mention policy guidance documents (e.g., OMB Circulars) as anticipated by the Trustworthy AI Order. See [34, p. 14].
15 Required under Section 5 of the AI Leadership Order [12].
16 Section 7 of the AI Leadership Order [12] mandates that the NSTC Select Committee on AI "shall provide recommendations to NSTC Committee on STEM Education regarding AI-related educational and workforce development considerations" and "provide technical expertise to the National Council for the American Worker." Furthermore, it directs agencies to annually communicate plans to the NSTC Select Committee on AI about AI-related fellowship and service programs. Section 7 of the Trustworthy AI Order [25] mandates that OPM "shall create an inventory of Federal Government rotational programs and determine how these programs can be used to expand the number of employees with AI expertise" and "issue a report with recommendations" for doing so that is "shared with the interagency coordination bodies... enabling agencies to better use these programs for the use of AI..."
17 Section 6 of the Trustworthy AI Order [25] notes that agencies "are expected to participate in interagency bodies for the purpose of advancing the implementation of the Principles and the use of AI consistent with this order" and that the CIO Council "shall publish a list of recommended interagency bodies and forums in which agencies may elect to participate, as appropriate and consistent with their respective authorities and missions" to fulfill that expectation.

AGENCY AI PLANS
As noted above, a significant focus of the AI Leadership Order was "reduc[ing] barriers to the use of AI technologies to promote their innovative application" while also protecting "civil liberties, privacy, American values, and United States economic and national security" [12]. The AI Leadership Order therefore placed significant emphasis on examining the proper approach to regulating AI, noting the desire to "avoid regulatory or non-regulatory actions that needlessly hamper AI innovation and growth" [138].
Two requirements were critical to achieving this objective: (1) OMB was required to publish a memorandum providing guidance on how agencies should approach regulating AI, and (2) agencies with "regulatory authorities" were required to publicly post Agency AI Plans to "achieve consistency" with OMB's guidance. OMB's Memorandum for the Heads of Executive Departments and Agencies on Guidance for Regulation of Artificial Intelligence Applications (OMB M-21-06 [138]), published on November 17, 2020 (about 16 months after the deadline), fulfilled the first requirement and urged a "regulatory approach that fosters innovation and growth and engenders trust, while protecting core American values." This "OMB AI Regulation Memo" described "policy considerations" to guide AI development. It (1) provided ten "principles for the stewardship of AI applications" to guide agencies, 18 (2) identified alternatives to regulation, 19 and (3) proposed actions, such as public communications and supporting voluntary consensus standards, that agencies could take to reduce barriers to the use of AI. 20

The OMB AI Regulation Memo also provided guidance on Agency AI Plans. It required agencies to identify (a) their statutory authorities to regulate AI, (b) AI-related information that they were collecting on regulated entities, (c) statutory restrictions on their ability to collect or share such information, (d) regulatory barriers identified through stakeholder engagement, and (e) potential regulatory actions. Agencies were instructed to use an OMB-provided template, submit the plans by May 2021 (adhering to the AI Leadership Order's deadline), and publicly post their plans on their agency websites [138]. Critically, the memo did not provide guidance on which agencies were required to produce an Agency AI Plan: the AI Leadership Order's requirement applied to agencies with sufficient AI-related activities and "regulatory authorities," neither of which is self-defining or obvious. 21 We requested, but did not receive, information on the applicable agencies and have, as a result, approximated the relevant agencies as spelled out in the detailed methodology in Appendix B.1.
Out of 41 agencies assessed, only five (12 percent) had posted an AI Plan using the template provided by the OMB AI Regulation Memo by November 2022 (see Table 2), even though the OMB AI Regulation Memo ordered agencies to publish their plans by May 2021. These agencies were the Departments of Energy (DOE), Health and Human Services (HHS), and Veterans Affairs (VA), as well as the Environmental Protection Agency (EPA) and USAID. Thirty-six agencies have no publicly posted Agency AI Plan that we could locate. Examination of the five Agency AI Plans also casts doubt on whether all agencies meaningfully attempted to identify relevant regulatory authorities. (We provide a detailed summary of the substance of these five Agency AI Plans in Appendix B.2.) The DOE's AI Plan was completed with "None" written in every section. By contrast, HHS, the VA, and the EPA provided more detail within their Agency AI Plans. Although the USAID plan does not identify any statutory authorities or planned regulatory actions, its publication of an Agency AI Plan demonstrates a commitment to transparency.

18 The principles were: (1) public trust in AI, (2) public participation, (3) scientific integrity and information quality, (4) risk assessment and management, (5) benefits and costs, (6) flexibility, (7) fairness and non-discrimination, (8) disclosure and transparency, (9) safety and security, and (10) interagency coordination. [138, pp. 3-7].
19 OMB M-21-06 provided four example non-regulatory approaches: (1) providing sector-specific policy guidance, statements, and frameworks; (2) using existing authorities to promote pilot programs and experimentation (e.g., through granting waivers or regulatory exemptions); (3) engaging in voluntary consensus standards development; and (4) developing and promoting voluntary frameworks. [138, pp. 7-8].
20 OMB M-21-06 suggested the following four "non-exhaustive" agency actions: (1) increase public "access to Federal data and models for AI R&D"; (2) pursue public communication through requests for information (RFIs) in the Federal Register, increased transparency about uncertainties regarding outcomes, and making guidance documents widely available; (3) increase agency participation, including through private sector engagement, "in the development and use of voluntary consensus standards and conformity assessment activities" in order to "help agencies develop expertise in AI and identify practical standards for use in regulation"; and (4) increase international cooperation on regulation. See [138, pp. 8-11].
21 The Agency AI Plan requirement applied only to "implementing agencies" that have regulatory authorities, including independent regulatory agencies. "Implementing agencies" were defined in Section 3 of the AI Leadership Order [12] as "agencies that conduct foundational AI R&D, develop and deploy applications of AI technologies, provide educational grants, and regulate and provide guidance for applications of AI technologies, as determined by the co-chairs of the NSTC Select Committee."
HHS is a particularly instructive and exemplary case. HHS identified 11 statutes that directly or indirectly authorized it to regulate AI applications, over 32 active collections of AI-related information, 12 AI use case priorities, 10 AI regulatory barriers, and four planned regulatory actions concerning AI applications [74]. The extent and depth of HHS's response likely stems from substantial efforts within the agency to formulate an AI strategic plan that considers how HHS will "[r]egulat[e] and oversee[] the use of AI in the health industry," as well as from an extensive Trustworthy AI Playbook and an action plan by the Food and Drug Administration (FDA) for regulating AI-based medical devices [29,31,39]. In short, AI Plans reflect, and are meant to foster, strategic planning, forethought, and coordination around AI.

AI USE CASE INVENTORIES
The Trustworthy AI Order mandated that agencies prepare inventories of their uses of AI, share them with the Federal Chief Information Officers Council (CIO Council) and other agencies, and then make them public [25]. The set of covered agencies is much broader than under the AI Leadership Order, exempting only independent regulatory agencies and agencies within the Department of Defense (DOD) or the intelligence community. 22 Agency AI use case inventories must be prepared annually and should identify AI use cases that are inconsistent with the order, including its nine implementing principles. In cases of conflict, agencies are to develop remediation plans. 23

Table 3: Publication of Agency AI Use Case Inventory as of Nov. 2022. "Large" agencies are those with more than 400 employees; "Known AI" are those with known AI use cases as of 2020. "Sub-agency" treats hierarchically related agencies as separate (e.g., separating the FAA and DOT); "Parent" attributes all sub-agency use cases to the parent agency.
Public disclosure of AI use case inventories has been problematic. 24 Roughly half or more of the relevant agencies, at a minimum 47 percent of the agencies examined, have not published an AI use case inventory (see Table 3 and Appendices C.2 and E.3). Because of uncertainty about which agencies are covered, we report the implementation rate for different groups of agencies and at different organizational levels (see Appendix C.1 for more details on the methodology). The Trustworthy AI Order and the CIO Council's guidance for creating the inventories [25,75], for instance, did not explain how sub-agencies and parent agencies should report their inventories (e.g., whether the DOT should include AI use cases from its sub-agency, the FAA, or let the FAA publish a separate inventory). We report use cases first with sub-agencies assessed individually and then rolled up to the parent agency.
Of the 220 agencies identified as potentially subject to this requirement, 168 neither had an independent AI use case inventory nor had their AI use cases included within the inventory of a parent agency. Examining 78 parent-level agencies, only 17 posted AI use case inventories. 25 Thus, 76 percent of all 220 parent and sub-agencies, assessed separately, did not publish an inventory, and 78 percent of agencies assessed at the parent level did not publish an inventory (see Table 3).
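The two aggregation levels reported in Table 3 can be made precise with a short sketch. The agency names and inventory flags below are hypothetical; only the aggregation logic, counting each agency separately and then rolling sub-agencies up to their parents, mirrors our measures.

```python
from collections import defaultdict

# Hypothetical records: (agency, parent agency or None, published/covered flag).
agencies = [
    ("Federal Aviation Administration", "Department of Transportation", False),
    ("Department of Transportation", None, True),
    ("Forest Service", "Department of Agriculture", False),
    ("Department of Agriculture", None, False),
]

# Sub-agency level: every agency counted separately.
sub_rate = sum(1 for _, _, pub in agencies if not pub) / len(agencies)

# Parent level: a parent complies if it or any of its components published.
by_parent = defaultdict(list)
for name, parent, pub in agencies:
    by_parent[parent or name].append(pub)
parent_rate = sum(1 for pubs in by_parent.values() if not any(pubs)) / len(by_parent)

print(f"share without inventory, sub-agency level: {sub_rate:.0%}")  # 75%
print(f"share without inventory, parent level: {parent_rate:.0%}")   # 50%
```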
To address the reality that executive agencies are not all similarly resourced, we also examined "large" agencies (defined as those with over 400 employees). Within this subset of 125 large agencies (with parent and sub-agencies separately assessed), 47 had AI use cases published within an inventory, whereas 78 (62 percent) had not published use cases within an inventory. Assessing 37 large, parent-level agencies, 21 (57 percent) had not published an inventory.

24 We searched for AI use case inventories starting in late October 2022, and the findings reported in the Tracker are current up to at least November 11, 2022, with some spot checks performed throughout early December 2022. Agencies may have posted inventories after our exhaustive search. But they were required to post the inventories by March 2022. Moreover, though it is possible we missed some inventories, we emphasize that they ought to be easily accessible. The CIO Council's guidance "encouraged" agencies to publish their inventories at a specific URL [75], and the NAIIO's repository [57] ostensibly includes all of the published inventories. Even if agencies have published inventories elsewhere, there are shortcomings to their implementation of the order if the inventories are not published according to these methods.
25 Three had zero use cases (HUD, NIST, and NSF), and a fourth (SSA) had only five use cases. These are questionable, but for the purposes of the first two measures, we mark them as compliant solely on the basis of having posted their inventories. In contrast, we count HUD as noncompliant when assessing against the identified AI use cases, i.e., the "Known AI Cases" of Table 3, because while its inventory asserts that the agency has no AI use cases, the ACUS Report identified a non-zero number of use cases. Neither NIST nor NSF was included in the "Known AI Cases" measure, because the ACUS Report did not identify a use case from NIST and NSF is not a "large" agency within the meaning of the report; so neither is counted as compliant for one measure and noncompliant for another, unlike HUD. For further methodological discussion, see Appendix C.1.
The Trustworthy AI Order and the guidance provided by the CIO Council did not specify whether an agency without AI use cases (or whose only use cases were exempted from disclosure) was required to file an inventory, or otherwise notify the public, to indicate that it had completed the requirement. It could be that 76 percent of agencies simply have no AI use cases. We hence examine the subset of agencies for which we can independently confirm the existence of AI use cases. This analysis enables us to distinguish whether the absence of inventories indicates the absence of AI use cases or an agency's failure to fulfill the Trustworthy AI Order's mandate. We rely on the extensive ACUS Report, which "rigorous[ly] canvas[sed] AI use at the 142 most significant federal departments, agencies, and sub-agencies," to identify which agencies already had an AI use case as of 2019; it reported that nearly half of agencies had experimented with AI and machine learning at that time. 26 Of the 49 parent and sub-agencies with a known AI use case, 47 percent (23 parent and sub-agencies) had not published an AI use case inventory. Among the narrowest group of agencies, i.e., the 23 large agencies with a known AI use case assessed at the parent level, only 11 had published an AI inventory. 27 Notably, HUD publicly disclosed that it does "not currently have any relevant AI use cases" [41]. We list these 23 agencies in Table 8. We also include an assessment of the implementation of the AI use case inventories of agencies enumerated in the Chief Financial Officers Act of 1990 and of members of the CIO Council in Appendix C.2 and Section 8.2.
The inventories themselves highlight serious implementation challenges with a signature transparency initiative. First, agencies are not disclosing AI use cases, even when these use cases have already been publicly documented. Customs and Border Protection (CBP), for instance, uses the Traveler Verification Service (TVS), a facial recognition system that "serves as CBP's backend matching service for the collection and processing of facial images in support of biometric entry and exit operations" [45,102]. Acknowledging that "facial recognition poses a unique set of privacy issues" [9], CBP has sought to be "aggressively transparent" [45] in publishing privacy compliance documentation concerning its biometric entry-and-exit operations, including by publishing six Privacy Impact Assessments. 28 Yet the TVS appears in no published AI use case inventory.

Second, inconsistencies in how agencies have implemented the AI use case inventories illustrate three sources of policy ambiguity.
(1) Non-response. For agencies that have not posted inventories, it is unclear whether they are asserting that they have no uses of AI or simply have not fulfilled the requirement. Of the published inventories, three (from HUD, the National Institute of Standards and Technology (NIST), and the National Science Foundation (NSF) [41,50,63]) state that their agencies have no AI use cases that meet the Trustworthy AI Order's requirements.
(2) Agency structure. All inventories except for NIST's were published at the parent-agency level (e.g., by DOC or DOE, rather than by NOAA or the Office of Electricity). But it is unclear whether unlisted sub-agencies within an inventory had no relevant use cases or were instead unresponsive to a presumed request for reporting by the parent agency. In some cases, the latter seems very likely. 29

(3) AI definition. The definition of AI provided in the 2019 National Defense Authorization Act and incorporated into the Trustworthy AI Order is potentially quite broad, reaching, among other things, any "artificial system" that "is designed to approximate a cognitive task" or that can "learn from experience and improve performance when exposed to datasets." 30 The breadth of that definition may make compliance harder for agencies when classifying particular technologies as "AI" for the purposes of an inventory. 31 For example, NOAA identified 36 AI use cases, representing the vast majority of the DOC's 49 AI use cases. The rest of Commerce's AI inventory [46] includes zero uses from the parent agency, five from the International Trade Administration, two from the National Telecommunications and Information Administration (NTIA), one from the Minority Business Development Administration, and five from the U.S. Patent and Trademark Office, with NIST publishing a separate inventory [50]. Ambiguity may result both from the breadth of the definition of covered AI (which includes uses that are new and existing, standalone and embedded, procured and developed in-house by the agency) and from the carve-outs for sensitive or classified uses of AI, AI used for national security purposes, AI "embedded within common commercial products," and AI R&D, as provided in Section 9 of the Trustworthy AI Order [25].

Third, AI use case inventories often incorporate existing transparency initiatives, but with significant variation. Agencies are best positioned to know what records exist regarding each AI use case, and some have provided useful links to published documentation. For example, many use cases in the DHS inventory include links (e.g., to privacy impact assessments); some EPA, HHS, Department of the Interior (INT), DOC, and Department of Agriculture use cases include links to relevant publications; and some Department of Labor (DOL), INT, and Department of Justice (DOJ) use cases reference publicly available code.

29 Consider DOE: In its inventory [58], DOE reports 45 use cases from three sub-offices: Brookhaven National Laboratory (one use case), the Office of Electricity (10 use cases), and Idaho National Laboratory (34 use cases). We think these numbers are implausible as an exhaustive account of AI usage within DOE. For example, a public information sheet published in 2020 by the then-Office of Fossil Energy (now the Office of Fossil Energy and Carbon Management) boasted of having "over 60 AI-enabled projects underway" [18]. Moreover, each DOE office has listed a single individual as its point of contact for all AI use cases from that office. It seems at least plausible that those offices have designated specific employees to serve as point-individuals on AI transparency for the office while other offices have failed to do so, which would explain why no use cases are reported for other DOE sub-agencies. As another example, in the Department of the Interior's inventory [47], the United States Geological Survey (a sub-agency) disclosed 55 of the Department's 65 use cases. Some of those use cases appear to be collaborations with agencies (e.g., the U.S. Fish and Wildlife Service and the Bureau of Ocean Energy Management) that themselves did not disclose use cases. We count such agencies as failing to implement the requirement notwithstanding that other agencies reported some of their AI use cases.
30 The full definition is provided in Section 238(g) of the FY2019 NDAA [8].
31 The CIO Council's 2021 FAQs and "Example AI Use Case Inventory Scenarios" guidance documents [75] provide some details beyond the statutory definition, but much of the work of classifying technologies as "AI" still falls on the agencies.

DISCUSSION
We now discuss broader implications emerging from this study, as well as some limitations. First, empirically, our top-level finding is that implementation has been lacking, which we interpret through the lens of bureaucratic capacity and policy ambiguity. Second, methodologically, we discuss how social scientists can study policy implementation in a rigorous and systematic way based on our case studies.

Broader Implications
Foundational theoretical work in bureaucratic capacity has argued that lower capacity can prevent effective implementation of hierarchically imposed policy obligations. Huber [115] attributed this possibility to inhibitions on the principal's ability to punish failures on the part of the agent when the agent lacks sufficient capacity to implement the directive. Other explanations focus on the multiplicity of tasks and principals that each agency has, which implies that the agency may shirk obligations that lack enforcement mechanisms [97]. Still others might argue that policy directives understood as far from the organization's core "turf" may seem peripheral or unimportant and are thus ignored [141]. All of these different explanations can shed light on the lackluster implementation of these AI directives: Agencies, by and large, lack the technical expertise and committed leadership necessary to effectively implement and prioritize regulatory principles promulgated by the White House or Congress.
Our findings also reveal substantial policy ambiguity that places more decision-making costs on agencies seeking to comply with the directives, thereby further hampering implementation. Central questions pertaining to the scope of transparency obligations, like the AI plans and inventories, were left ambiguous by the executive orders and White House-level guidance. Our findings emphasized two sources of ambiguity: ambiguity in defining "AI" and in defining "agency." On the former, the definition of "AI" used in the Trustworthy AI Order left substantial discretion to the agencies to categorize their use of technology. For example, the Order's exemptions for "AI embedded in common commercial products" and for "AI research and development activities" are ambiguous. 32 We found large inconsistencies in the kinds of use cases disclosed by agencies: compare, for example, NOAA's disclosure of 36 AI use cases pertaining to scientific research with CBP's non-disclosure of facial recognition systems used for biometric entry and exit operations (a system for which CBP has published an independent website, presumably because of the politically sensitive nature of the operation).
Additionally, both orders were written in broad terms, making it hard to know which entities were obligated to publish AI plans or use case inventories (see discussion in Section 4 and Appendices B.1 and C.1). Yet the problem of defining "agency" is not novel. As the authors of the ACUS Sourcebook note, "cataloging administrative agencies is difficult because so many varying definitions abound" [130, p. 11]. The point is not that there is a correct definition; rather, it is that these pillars of America's AI strategy did not even attempt to address the issue, thereby shifting costs onto lower-level executive branch entities to determine whether they ought to comply.
From the perspective of change management, a key problem with such ambiguity is that it inhibits the policymaker from effectively communicating and directing change under conditions of fast-changing technology (e.g., [137]; cf. [79,96]). Tighter rule construction would itself be helpful, so that agencies better understand when and how they must comply. But discretion will inevitably vest with line-level bureaucrats implementing policies on AI (cf. [122]). Though much theoretical literature has discussed bureaucratic resistance to hierarchically imposed requirements (e.g., [118]), evidence from the perspective of "street-level" bureaucrats suggests that implementation failures are more often a result of insufficient capacity than of ideological opposition [83], which accords with our findings.
The poor public availability of the Agency AI Plans, AI use case inventories, and other mandated items supports existing scholarship about the inconsistency with which agencies make guidance documents public [87]. Coglianese found that mandated agency guidance is inconsistently published, which keeps the public in the dark about important agency actions. Agencies, Coglianese argued, need internal management practices to ensure disclosure, because legal requirements without incentives or consequences for non-disclosure will be insufficient to motivate agencies to disclose [87]. Transparency requirements strengthen government accountability efforts while also enabling federal agencies to have meaningful consultations with external stakeholders. But actualizing those policies will require more careful rule construction from the top, closer attention to bureaucratic capacity down the chain, and agency adoption of management strategies to systematically track, index, and publish guidance [99].
Our systematic assessment of agency implementation of these policies provides evidence for the inference that insufficient bureaucratic capacity has hampered the implementation of U.S. AI policy. We do not rule out possible alternative explanations. For example, agencies' incentives to faithfully implement policy directives may be tied to their assessment of those policies' durability [135], where policies promulgated by a president late in her term 33 may be perceived by the agencies as less imperative or even less legitimate. Similarly, agencies may have differential incentives to comply based on how central AI initiatives are to their core functions, especially as it implicates funding. Thus, for example, NOAA's substantial disclosure of AI use cases in its inventory might be understood as a kind of "bureaucratic entrepreneurship" [124], where the agency's work helped demonstrate to the public why it needed greater funding for AI-related initiatives (funding, incidentally, which it received [106]). But while there is more room for theoretical insight from studying variation within our findings, the top-level result is still indicative of a general lack of bureaucratic capacity to implement AI policy.

33 The AI Leadership Order was issued on February 14, 2019, and the Trustworthy AI Order on December 8, 2020, when then-President Trump was a so-called "lame duck" president.
Finally, our methodological contribution is to provide a transparent and systematic means for assessing policy implementation notwithstanding the conceptual ambiguities noted above. Our reliance on a mix of statutes, regulatory provisions, and materials by ACUS can inform subsequent efforts to assess policy that is addressed, generally, toward "agencies," as in the Trustworthy AI Order, or toward agencies with "regulatory authorities," as in the AI Leadership Order (see Section 4 and Appendices B.1 and C.1). Furthermore, for the AI use case inventories, we present findings at different levels of aggregation and with groupings of agencies that correspond to theoretical concerns and practical realities: We considered not only the largest number of agencies to which the executive order might theoretically apply, but also narrowed the sample to "large" agencies and to agencies that had been previously identified by ACUS as using (or considering using) AI. And we assessed each of those measures not only by disaggregating at the lowest "agency" level but also by bundling agencies into their parent departments (for example, including the IRS within the Treasury Department) to reflect what appears to be agencies' understanding of their obligations under the executive order (i.e., most disclosed inventories were housed at the parent-agency level) (see Section 7 and Appendix C). And while much of our work emphasized systematic analysis, we also considered the disclosures qualitatively, so that our findings are sensitive not only to whether agencies ticked a box but also to how meaningfully their disclosures achieved the executive orders' policy goals. While these steps require nuance to implement, they illustrate how policy implementation can be rigorously assessed. As AI governance efforts mature, such assessments will be critical to ensure that legislative and executive directives are not "lost or misdirected in the vast hallways of the federal bureaucracy" [1, p. 1111].

Limitations
We note several limitations of our assessment. First, as we have noted, our assessment is based on publicly available information. Many more implementation efforts may be underway. But the mere fact that so many deadlines have been missed, when the pace of innovation in AI is extremely fast, illustrates the severe limitations of existing governmental efforts. In addition, the difficulty of researching implementation status is itself telling. Existing efforts have delegated to agencies the task of defining and implementing these provisions, and, as a result, efforts have been fragmented and inconsistent.
Second, some might argue that the failure to meet deadlines and implement legal requirements is no different in AI than in other domains [78]. Perhaps that is so, although there are few directly analogous studies in comparable, but non-AI, domains.34 Regardless, our findings suggest bureaucratic capacity challenges in a highly consequential space.
Third, our implementation estimates may be critiqued for weighing all provisions equally. Not all operative provisions in a bill or order matter equally. We agree, and we have provided detailed, line-level tracker results to enable assessment of the implementation of specific items (Appendix E). Our qualitative assessment, however, does not suggest that all important items have been implemented. To the contrary, major items that are critical to preparing the federal government for the AI transition have not been addressed.
Fourth, while AI use case inventories are an important step toward transparency, they remain relatively limited as implemented. Some registries, for instance, include extensive data and model documentation, even though the Trustworthy AI Order did not appear to require such detail. As we show in Appendix C.2, numerous agencies have gone beyond the minimal requirement and documented performance benchmarks and evaluation measures, which are particularly important for assessments of trustworthiness.
Last, we released our findings in December 2022 [120], and some agencies have since posted AI use case inventories or disclosed that they have no use cases.35 To the extent that our research galvanized agency action, we applaud the agencies and the White House for taking initiative, but we also re-emphasize that formal compliance is not itself the goal. Compliance should be a means for strategic planning and action: Publicly verifiable steps, while important from a transparency perspective, are fundamentally proxies for assessing whether agencies are prepared for, and taking concrete steps toward, trustworthy AI. If the tracked metrics become ends in themselves, they are no longer reliable indicators of the underlying issue of interest. Agency responses also lend further support to our conclusion that senior leadership is critical. All but one of the agencies that we know to have published an inventory after our white paper are subject to the Chief Financial Officers Act (see Appendix D). This act required each covered agency to establish a Chief Financial Officer and gave the White House's Office of Management and Budget (OMB) greater authority over agency financial management [67]. This could demonstrate the important role of the White House in shepherding compliance and strategic planning.

35 Agencies that have since published an explanation or use case inventory include the Departments of Education, Housing and Urban Development, the Interior, the Treasury, and Transportation; the General Services Administration; the Small Business Administration; and the U.S. Office of Personnel Management.

CONCLUSION
Our findings have broad implications for the current ability of government to govern AI. We find that three core elements of America's collective AI strategy (the AI Leadership Order, the Trustworthy AI Order, and the AI in Government Act) have not been implemented well, despite an urgent need for the U.S. government to grapple with a technology that is widely seen to have far-reaching, transformative potential.
These findings strongly suggest that there is a resource shortage, a leadership vacuum, and a capacity gap, each exacerbated by policy ambiguity. Leadership will be required from both the White House, including the National AI Initiative Office and OMB, and the agencies to coordinate and drive forward AI innovation and trustworthy adoption. Current requirements may appear to agencies to be "unfunded mandates" and be treated like checklists, when they should instead be seized as opportunities for strategic planning around AI. Some agencies have recognized the urgent need and were able to respond comprehensively and meaningfully to these legal requirements (see, e.g., HHS's Artificial Intelligence Strategy, Trustworthy AI Playbook, and action plan for regulating AI-based medical devices [29, 31, 39]). If our findings are due to limited bureaucratic capacity, Congress should provide resources for agencies to staff and acquire technical expertise, so that they can comply in more than a perfunctory way and develop strategic AI Plans. Failure to provide proper resources and to mandate senior personnel to discharge these responsibilities could otherwise undermine these laws' goal of maintaining U.S. leadership in AI innovation and trustworthy AI.
The public disclosure of AI Plans and AI use case inventories constitutes an important effort to foster transparency and accountability in public sector AI. The executive orders mandated their public disclosure, and senior-level guidance instructed that they be made readily available on specific websites. The fact that it has taken considerable effort for our team to track the implementation of such plans, use cases, and requirements (see efforts detailed in the Appendices) strongly suggests that improvements must be made in the reporting and tracking of these provisions. Our assessment may miss certain use case inventories, for instance, but that is precisely the point. Disclosure must be accessible and legible to be effective.
We close by noting that on paper and in principle, America's strategy for AI innovation and responsible AI, as manifested in the Trustworthy AI Order, the AI Leadership Order, and the AI in Government Act, is highly laudable. But in practice, our assessment suggests severe challenges in the federal government's ability to navigate a rapidly changing and critically important space. Requirements have been converted into perfunctory checklists instead of triggers for strategic planning, and agencies do not appear to have effectively grappled with the opportunities and risks that AI poses.
Bureaucratic capacity is a sine qua non for turning laudable principles into reality.

A IMPLEMENTATION OF LEGAL REQUIREMENTS

A.1 Methodology
To assess the implementation status of the AI Leadership Order, the Trustworthy AI Order, and the AI in Government Act, we first identified all line-level actions that these documents mandate (e.g., instructions that a federal entity "shall budget," "shall consider," "shall review," or "shall publish"). For each requirement, the following information was compiled in a tracker (see Appendix E): (1) the relevant portion of the executive order or legislation, (2) the government stakeholder responsible for its implementation, (3) a summary of the mandated outcome or deliverable, (4) the mandated deadline, if any, (5) the "type" of requirement (see the next paragraph), and (6) the status of implementation. The first four items were drawn from the text of the executive order or legislation itself, the type was assigned by the researchers, and the status of implementation was drawn from publicly available information as of November 23, 2022. Where possible, we provide additional details about the implementation of the requirement and URL links to relevant documents. The tracker therefore represents the publicly verifiable status and may not capture activities executed without public disclosure (whether to protect sensitive or classified information or simply because the federal entities did not prioritize, or lacked an appropriate avenue for, disclosing the activity).
As noted above, the requirements were split into three categories, which facilitated assessment of implementation by the responsible federal government entity (see tables in Appendix E.1). The categories were:
(1) Time-boxed requirements mandated a federal entity, or entities, to produce a document or achieve an outcome by a specified date (e.g., "shall develop" a report within 90 days of the date of the executive order).
(2) Open-ended requirements mandated the production of a document or deliverable, or the achievement of an outcome, without specifying a deadline.
(3) Ongoing requirements were open-ended mandates to agencies that often did not require the production of a specific document or deliverable or the achievement of a concrete outcome (e.g., agencies "shall pursue" an objective, "shall consider" actions, "shall identify opportunities," or "shall provide" expertise). These ongoing requirements also did not have a deadline. This category also includes outcomes that were part of an annual process without a specified date (e.g., the AI Leadership Order's requirements in Section 4(b)-(b)(1) that agencies prioritize AI R&D and "communicate plans for achieving this prioritization to the OMB Director and the OSTP Director").
Although assessing implementation of the time-boxed requirements was often straightforward, compliance with a significant percentage of mandated actions was unknown or hard to determine, either because the mandate required ongoing compliance without producing a specific milestone, because the mandated action did not require public disclosure of its completion or of progress toward completion, or both. Under the assumption that federal entities had taken necessary steps, or at least made good-faith efforts, to meet their legal and statutory requirements, ambiguity was resolved in favor of the federal entities. The researchers therefore applied the following rules for determining implementation status: implemented (or indications of implementation), not implemented (or indications that the requirement was not implemented), and not known.
• Implemented: Time-boxed requirements were marked as successfully implemented where the mandated outcome was achieved, even if achieved after the mandated deadline. Open-ended requirements and ongoing requirements without a defined deliverable were coded green if public information strongly supported the conclusion that federal entities were implementing the requirement.
• Not Implemented: Time-boxed requirements were marked as not implemented if there was no public information, as of November 23, 2022, confirming their implementation by the mandated deadline. Requirements were coded red if public information strongly suggested that they had not been implemented by federal entities. The latter, for instance, occurred for the AI Leadership Order's requirement for a National Security Presidential Memorandum.
• Not Known: Implementation of time-boxed requirements and open-ended requirements was marked as not known where public reporting was nonexistent (often because public reporting was not mandated) or did not clearly indicate the status of implementation. Similarly, the implementation of ongoing requirements was marked as not known where there was no mandated reporting and often no mandated outcome for the researchers to publicly verify.
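To make these coding rules concrete, the following Python sketch models the tracker's decision logic. It is purely illustrative: the field and function names are our own hypothetical constructions, not part of the tracker itself, and the actual coding was performed manually by the researchers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional
import datetime

REVIEW_DATE = datetime.date(2022, 11, 23)  # cutoff for public evidence

class ReqType(Enum):
    TIME_BOXED = "time-boxed"    # deliverable with a specified deadline
    OPEN_ENDED = "open-ended"    # deliverable, but no deadline
    ONGOING = "ongoing"          # no concrete deliverable or deadline

class Status(Enum):
    IMPLEMENTED = "implemented"          # coded green in the tracker
    NOT_IMPLEMENTED = "not implemented"  # coded red in the tracker
    NOT_KNOWN = "not known"

@dataclass
class Requirement:
    source: str                        # e.g., "Trustworthy AI Order Sec. 5(b)"
    stakeholder: str                   # responsible federal entity
    deliverable: str                   # summary of the mandated outcome
    deadline: Optional[datetime.date]  # None for open-ended/ongoing items
    req_type: ReqType
    evidence_of_completion: bool       # public evidence the outcome was achieved
    evidence_of_noncompletion: bool    # strong public signal it was not achieved

def code_status(req: Requirement) -> Optional[Status]:
    """Apply the coding rules, resolving ambiguity in favor of the entity.

    Returns None for a time-boxed requirement whose deadline has not yet
    passed (such items were excluded from the rate calculations).
    """
    if req.evidence_of_completion:
        # Counts as implemented even if achieved after the deadline.
        return Status.IMPLEMENTED
    if req.req_type is ReqType.TIME_BOXED and req.deadline is not None:
        if req.deadline > REVIEW_DATE:
            return None  # deadline pending; not classified either way
        return Status.NOT_IMPLEMENTED  # deadline passed, no public evidence
    # Open-ended and ongoing items need strong public support either way;
    # absent that, they default to "not known."
    if req.evidence_of_noncompletion:
        return Status.NOT_IMPLEMENTED
    return Status.NOT_KNOWN
```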

A.2 Summary of Findings
A summary of the findings for each document is provided in Tables 1, 4, 5, and 6. The detailed methodology is provided in Appendix A.1, but it is important to highlight upfront that methodological constraints may result in our findings underestimating implementation and overestimating the requirements that remain outstanding. Although best efforts were made to identify all relevant documents or notices of actions, the researchers could rely only on federal entities' public disclosures, which may not capture all relevant actions taken by the federal government to achieve the mandates.
• AI Leadership Order: Only 39 percent, or nine of the order's 23 requirements, were implemented. Given a dearth of publicly available information about many of the requirements, the implementation status for a majority of the requirements was not known (57 percent). Requirements with a specified deadline had a higher rate of implementation (45 percent) than open-ended requirements without a deadline (0 percent) or ongoing requirements without a concrete deliverable (40 percent). Critically, the requirement for agencies to publish AI Plans to achieve consistency with OMB guidance on regulating AI was not fulfilled. The implementation of these Agency AI Plans is discussed in Section 6 and Appendix B.2.
• Trustworthy AI Order: Implementation was even lower for the Trustworthy AI Order, with only 13 percent, or two of the requirements, implemented.36 Similar to the AI Leadership Order, implementation for a majority of the requirements (54 percent) could not be conclusively determined. Two of the requirements, or 13 percent, were not implemented, including the requirement for agencies to prepare and publish AI use case inventories. The implementation of these AI use case inventories is discussed in Section 7 and Appendix C.2.
• AI in Government Act of 2020: Compared to the executive orders, the percentage of requirements that were not implemented was much higher, at 67 percent, or four of the six requirements. The only requirement implemented was the establishment of an AI Center of Excellence within GSA; the progress that GSA has made on achieving the Center of Excellence's duties is unknown.

36 There were 17 requirements, but one requirement was excluded from the overall calculations because its deadline had not yet passed (i.e., the rate of implementation assumed 16 instead of 17 requirements).
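As an arithmetic cross-check on these rates, the snippet below reproduces them from the tracker counts. The counts are hardcoded from the summary above, and the labels are our own shorthand.

```python
# Minimal sketch reproducing the headline implementation rates above.
counts = {
    "AI Leadership Order": (9, 23),   # 9 of 23 requirements implemented
    "Trustworthy AI Order": (2, 16),  # 17 requirements; 1 excluded because
                                      # its deadline had not yet passed
    "AI in Government Act": (1, 6),   # only the GSA AI Center of Excellence
}

for law, (implemented, total) in counts.items():
    print(f"{law}: {implemented}/{total} = {implemented / total:.1%}")

# Output (the text above rounds 12.5 percent to 13 percent):
# AI Leadership Order: 9/23 = 39.1%
# Trustworthy AI Order: 2/16 = 12.5%
# AI in Government Act: 1/6 = 16.7%
```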

B IMPLEMENTATION OF AGENCY AI PLANS

B.1 Methodology and Background
B.1.1 Background on the AI Leadership Order's "Agency AI Plan" Requirement. As discussed in Section 6, a significant focus of the AI Leadership Order was addressing concerns about regulatory gaps and hurdles to AI development and deployment. As such, the executive order mandated:
• The White House's Office of Management and Budget (OMB) to issue a guidance memorandum to agencies, after publishing a draft guidance for public comment, within 180 days of the EO (approximately August 2019). Sections 6(a)-(b).
• The heads of "implementing agencies" with regulatory authorities to develop a plan to "achieve consistency" with the OMB memorandum within 180 days of OMB issuing the memorandum. Section 6(c).
OMB fulfilled its requirement more than a year late, publishing a draft memorandum on January 1, 2020 [27], and issuing its final memorandum on November 17, 2020 [138]. OMB M-21-06, Memorandum for the Heads of Executive Departments and Agencies on Guidance for Regulation of Artificial Intelligence Applications (referred to as the "OMB AI Regulation Memo"), provided guidance for agencies on regulatory and non-regulatory approaches to AI. Critically, it noted that "government use of AI" was outside the scope of the memorandum.
The OMB AI Regulation Memo also provided guidance for the Agency AI Plans. Specifically, it stated:

The agency plan must identify any statutory authorities specifically governing agency regulation of AI applications, as well as collections of AI-related information from regulated entities. For these collections, agencies should describe any statutory restrictions on the collection or sharing of information (e.g., confidential business information, personally identifiable information, protected health information, law enforcement information, and classified or other national security information). The agency plan must also report on the outcomes of stakeholder engagements that identify existing regulatory barriers to AI applications and high-priority AI applications that are within an agency's regulatory authorities. OMB also requests agencies to list and describe any planned or considered regulatory actions on AI.
Furthermore, the memorandum included specific instructions for how agencies must submit and publish their plans:

Agency plans are due on May 17, 2021, and should be submitted to OIRA at the following email address: AIplans@omb.eop.gov. To inform the public of each agency's planned and implemented activities, agency plans must be posted on, or be accessed from (through a URL redirect), the following domain on the agency's website: www.[agencyname].gov/guidance.
The May 2021 deadline adhered to the AI Leadership Order's requirement that the plans be completed and submitted within 180 days of the OMB AI Regulation Memo's issuance.
The OMB AI Regulation Memo did not provide guidance on which agencies were subject to the executive order's requirements. The AI Leadership Order stated that the requirement applied to "implementing agencies that also have regulatory authorities." "Implementing agencies" were defined in Section 3 of the AI Leadership Order as "agencies that conduct foundational AI R&D, develop and deploy applications of AI technologies, provide educational grants, and regulate and provide guidance for applications of AI technologies, as determined by the co-chairs of the NSTC Select Committee." This set is potentially quite broad, especially as regulation of applications of AI would include many incumbent regulatory regimes (e.g., approval of medical devices by the Food and Drug Administration, regulation of employment discrimination by the Equal Employment Opportunity Commission). However, the NSTC Select Committee on AI did not publish a list of agencies it determined were "implementing agencies," nor did the OMB AI Regulation Memo provide any additional insight. Although the OMB AI Regulation Memo directed itself to "heads of all Executive Branch departments and agencies, including independent regulatory agencies," neither the memorandum nor the executive order defined "regulatory authorities," a potentially expansive term subsuming most administrative agencies, or delineated which agencies had regulatory authorities.
B.1.2 Methodology for Assessing Implementation. To identify relevant agencies, we first searched online for a list of agencies deemed to be "implementing agencies" by the co-chairs of the NSTC Select Committee on AI. As this list was not publicly available, we instead focused on Cabinet-level departments and agencies and the 19 agencies deemed "independent regulatory agencies" under 44 U.S.C. § 3502(5).37 We also included the U.S. Agency for International Development (USAID), as it was the only agency represented at the National Security Council [105] that was not already included as a Cabinet-level agency or as an independent regulatory agency. The reason for including each agency is identified in the full tracker in Appendix E.2. It is possible that this list is overinclusive or underinclusive of the agencies that were actually required to establish and publish an Agency AI Plan to achieve consistency with the OMB AI Regulation Memo. We also inquired with a member of the Select Committee and did not receive an answer on which agencies are included.

37 The current Cabinet includes the heads of the 15 executive departments (the Secretaries of Agriculture, Commerce, Defense, Education, Energy, Health and Human Services, Homeland Security, Housing and Urban Development, Interior, Labor, State, Transportation, Treasury, and Veterans Affairs, and the Attorney General), the White House Chief of Staff, the U.S. Ambassador to the United Nations, the Director of National Intelligence, and the U.S. Trade Representative, as well as the heads of the Environmental Protection Agency, Office of Management and Budget, Council of Economic Advisers, Office of Science and Technology Policy, and Small Business Administration [66]. We excluded the White House Chief of Staff, U.S. Ambassador to the U.N., and Council of Economic Advisers because they do not have rule-making or regulatory authority [130].

The intended purpose of the OMB AI Regulation Memo's requirement that the Agency AI Plans be available on the respective agency website's guidance page was to increase transparency and "inform the public." Identifying the plans should therefore be intuitive to the public and should not require a significant expenditure of time. To simulate how an individual might seek to access a plan, we implemented four simple approaches to finding them (sketched in code after this list):
• Dedicated Agency URL: Visiting the link under which the OMB AI Regulation Memo expressly requires the Agency AI Plan to be posted: [agency_name].gov/guidance. We noted first whether the agency had a dedicated guidance webpage. If it did, we searched "response artificial intelligence OMB M-21-06" (as "OMB" and "M-21-06" are expressly noted in the template response). If it did not, we marked "no" for this method.
• Web Search: We searched online (using Google) for "[agency name] response artificial intelligence OMB M-21-06". If the agency's full name did not return results, we searched with the agency's acronym (e.g., HHS for the Department of Health and Human Services), where applicable.
• Search Within Agency Website: Searching within an agency's website: "response artificial intelligence OMB M-21-06." If (as noted above) an agency lacked its own website, we searched on its parent agency's website with the agency's name included, e.g., "[agency name] response artificial intelligence OMB M-21-06". If the search engine returned an implausibly large number of results (e.g., on the order of 10,000), phrases were placed in quotation marks (e.g., "artificial intelligence," "use case," and "M-21-06").
• AI.gov: Searching the publication library on AI.gov (the website for the National AI Initiative) for the agency's name (or acronym) and "response artificial intelligence OMB M-21-06". We also reviewed all documents published by that agency and included in the publication library, as there was only a small number of documents per agency, if any, in the publication library.
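The following Python sketch renders this protocol schematically. It is not code we ran: the helper functions are hypothetical stand-ins for the manual browsing and searching that the researchers performed by hand.

```python
# Schematic sketch of the four search methods for locating Agency AI Plans.
QUERY = "response artificial intelligence OMB M-21-06"

def has_guidance_page(agency_domain: str) -> bool:
    """Stand-in: does www.<agency_domain>/guidance exist?"""
    raise NotImplementedError  # performed manually in the study

def page_mentions_plan(url: str, query: str) -> bool:
    """Stand-in: does the page's content match the query?"""
    raise NotImplementedError  # performed manually in the study

def web_search(query: str) -> bool:
    """Stand-in: does a general web (Google) search surface the plan?"""
    raise NotImplementedError  # performed manually in the study

def site_search(domain: str, query: str) -> bool:
    """Stand-in: does a within-site search surface the plan?"""
    raise NotImplementedError  # performed manually in the study

def found_agency_ai_plan(name: str, acronym: str | None,
                         domain: str | None, parent_domain: str | None) -> bool:
    """Apply the four methods in order; 'yes' if any method finds the plan."""
    # Method 1: the dedicated [agency_name].gov/guidance URL.
    if domain and has_guidance_page(domain):
        if page_mentions_plan(f"https://www.{domain}/guidance", QUERY):
            return True
    # Method 2: general web search, falling back to the acronym.
    if web_search(f"{name} {QUERY}"):
        return True
    if acronym and web_search(f"{acronym} {QUERY}"):
        return True
    # Method 3: search within the agency's (or its parent's) website.
    if domain and site_search(domain, QUERY):
        return True
    if parent_domain and site_search(parent_domain, f"{name} {QUERY}"):
        return True
    # Method 4: the AI.gov publication library.
    if site_search("ai.gov", f"{name} {QUERY}"):
        return True
    return False
```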
If an Agency AI Plan was identified using any of these four methods, as of November 23, 2022, the researchers marked "yes" in the "Agency Plan" column (Appendix E.2) and provided the web link to the plan in the "URL" column. If the Agency AI Plan was not identified using any of the four methods, the researchers marked "no" for the presence of an "Agency Plan."

B.2 Summary of Findings
The agencies with an Agency AI Plan are the Departments of Energy, HHS, and VA, the EPA, and USAID (see Tables 2 and 7). Four agencies published AI-related strategic plans, including some that noted the AI Leadership Order, but these plans provided far less than the detail required. The DHS's S&T AI and ML Strategic Plan [38], the VA's AI Strategy [30], and the Department of State's Enterprise Data Strategy [33] mention the AI Leadership Order and identify AI priorities but provide less detail on regulation than required under the AI Leadership Order. The Nuclear Regulatory Commission published an AI Strategic Plan in June 2022 [94], but it similarly does not provide enough detail to classify as an Agency AI Plan consistent with the OMB AI Regulation Memo.

HHS's Agency AI Plan, for example, reported:
• 11 statutes that authorized HHS to regulate AI applications, even noting that two of the statutes do not directly mention AI but might provide indirect authority to regulate AI as it relates to health data or health technology
• 32 active collections of AI-related information, 30 of which were approved by OMB pursuant to the Paperwork Reduction Act and two of which were exempted from OMB clearance as "general requests"
• 12 AI use case priorities, of which 7 were AI applications in the private sector that were under its regulatory authorities (e.g., an AI algorithm for wrist fracture reduction), 4 were opportunities for HHS to "shape the development and production of AI in the private sector," such as creating and improving relevant datasets, and 1 (predicting risk of adult maltreatment) was an internal AI tool that could be adopted by the private sector
• 10 AI regulatory barriers (e.g., data silos, intellectual property, concerns about HIPAA and data sharing)
• 4 planned regulatory actions concerning AI applications (e.g., imposing clinical holds on medical devices)

C IMPLEMENTATION OF AI USE CASE INVENTORIES

C.1 Methodology and Background

C.1.1 Background on the Trustworthy AI Order's AI Use Case Inventory Requirement. Responsible Agencies: Agencies that must comply were defined by the Trustworthy AI Order [25] in Section 8 as "all agencies described in section 3502, subsection (1), of title 44, United States Code, except for the agencies described in section 3502, subsection (5), of title 44." The Department of Defense and "those agencies and agency components with functions that lie wholly within the Intelligence Community" were also exempted.
Scope: The Trustworthy AI Order used the definition of AI "set forth in section 238(g) of the National Defense Authorization Act for Fiscal Year 2019" as a reference point.38 The order further clarified in Section 9 that it applied to "both existing and new uses of AI; both standalone AI and AI embedded within other systems or applications; AI developed both by the agency or by third parties on behalf of agencies for the fulfillment of specific agency missions, including relevant data inputs used to train AI and outputs used in support of decision making; and agencies' procurement of AI applications." However, the order excluded some AI uses from the AI inventory requirement, including "AI used in defense or national security systems (as defined in 44 U.S.C. 3552(b)(6) or as determined by the agency)," "AI embedded within common commercial products, such as word processors or map navigation systems," and "AI research and development (R&D) activities." The CIO's Example AI Use Case Inventory Scenarios provides additional guidance. That guidance adhered to the Trustworthy AI Order, which mandated that agencies share their inventories with other agencies within 60 days of completing them and then make their inventories publicly available within 120 days of completion.
C.1.2 Methodology for Assessing Implementation. To identify relevant agencies, we looked to the ACUS Sourcebook of U.S. Executive Agencies ("ACUS Sourcebook") [130] and included all 278 agencies and sub-agencies identified in the Sourcebook data spreadsheet [131]. Given the Trustworthy AI Order's explicit exclusions, we removed agencies within the Department of Defense, agencies and sub-agencies within the intelligence community as defined by 50 U.S.C. § 3003(4), and the 19 independent regulatory agencies defined in 44 U.S.C. § 3502(5).39 We further made individualized adjustments for agencies that are now defunct or are administered under different names.40 This produced a total of 220 agencies.

Multiple measures were employed to measure implementation; a code sketch at the end of this subsection summarizes the sampling logic. The measurements varied along two major dimensions:

(1) Agencies considered: We measured implementation rates by considering different subsets of agencies. Specifically, we employed three agency groupings: all relevant agencies, large agencies, and agencies with a known AI use case. Appendix E.3 includes the list of all 220 agencies and classifies which agencies are large and which have a known AI use case.
(a) All relevant agencies considers all 220 agencies identified using the methodology described above. This approach does not consider agency size or the likelihood of the agency employing AI.
(b) Large agencies considers 125 "large" agencies. To identify this subset, we benchmarked against the 2020 "Government by Algorithm: Artificial Intelligence in Federal Administrative Agencies" report submitted to ACUS ("ACUS AI Report") [102]. The ACUS AI Report narrowed the agencies listed in the 2018 ACUS Sourcebook by (1) including only agencies with more than 400 employees and (2) removing active military and intelligence-related agencies. The ACUS AI Report thereby identified 142 "large" agencies. For this tracker, the 142 agencies had to be further narrowed by removing the independent regulatory agencies within the meaning of 44 U.S.C. § 3502(5) and the now-defunct agencies.41 The result is a total of 125 agencies.42
(c) Agencies with a known AI use case considers 49 agencies with a non-zero number of AI use cases identified by the ACUS AI Report team.43 The ACUS AI Report identified, through "an agency-by-agency, web-based search protocol, augmented by a range of third-party sources," any use case where an agency "had considered using or had already deployed AI/ML technology to carry out a core function," discounting instances "where agencies demonstrated no intent to operationalize a given tool," such as "a pure research paper using AI/ML." Because the ACUS team focused on whether the agency was deploying AI for a "core function," identifying an AI use case is a decent proxy for presuming that the agency ought to report some inventory pursuant to the Trustworthy AI Order. If an agency did not have an inventory but did have a non-zero number of use cases, we classify that agency as not having implemented the requirement.44

(2) Organizational level: We calculate the compliance and non-compliance rate at both the individual/sub-agency level and the parent level. Appendix E.3 identifies each parent agency and its sub-agencies.
(a) At the individual/sub-agency level, we disaggregate all sub-agencies from their parent agency. Because nearly all inventories were published by the parent-level agency,45 we denoted a sub-agency as having published an inventory if its use cases are described and assigned to that sub-agency within the parent agency's inventory.46
(b) At the parent level, we bundle sub-agencies with their parents: for the Department of Commerce and all of its sub-agencies, for example, we count all of the sub-agencies as part of the Department of Commerce. Whether a DOC sub-agency has an AI use case inventory, therefore, does not affect whether DOC is marked as having implemented an inventory. However, for the assessment among "large agencies with known AI uses," child-agency identified use cases were imputed to the parent agency: for example, while the ACUS Report did not identify any AI use cases by DOC at the department level, DOC was marked as having known AI use cases in the parent-level assessment because its sub-agencies had known AI use cases. A parent-level measure is generally a more conservative measurement because it significantly reduces the number of small agencies assessed for compliance.

Table 8 provides results on the filing of AI use case inventories for large, parent-level agencies that had a known use case as of 2019. The ACUS AI Report is the best available public resource for identifying the agencies likely to have AI use cases. We emphasize that the difficulty of searching for and verifying agency uses of AI against the Trustworthy AI Order's requirements is precisely why disclosure is important; indeed, it would be valuable even for agencies to post empty inventories, so that the public is made aware that the agency believes it has no use cases requiring disclosure.

Use of the ACUS AI Report involves several nuances. First, some of the 142 agencies examined in the Report were not relevant for the use case inventory requirement, given that many were either independent regulatory agencies (exempted by the terms of the Trustworthy AI Order) or no longer functional. Second, the ACUS AI Report's definition of AI deviates in small ways from the Trustworthy AI Order's definition, although the latter appears to be broader.48 Third, the Report included anticipated uses of AI, whereas these have a more ambiguous treatment under the order: The order indicated that AI inventories should include "current and planned uses" in Section 5(b), but it also stated in Section 9(d)(iii) that it applied only to "existing and new uses of AI" and excluded "AI research and development (R&D) activities." That said, agencies that have filed AI use case inventories have commonly included use cases of AI that are under development. Fourth, the Report team searched for AI use from January to August 2019 (see [102, pp. 15-16]), and such use cases may not be operational today. If anything, however, we would expect machine learning to have been more widely adopted over the past three years.

38 The order noted in Section 9(a) that the evolution of AI use in the federal government necessitates that "OMB guidance developed or revised pursuant to section 4 of this order shall include such definitions as are necessary to ensure the application of the Principles in this order to appropriate use cases."
39 One of the named independent regulatory agencies within 44 U.S.C. § 3502(5) is the Interstate Commerce Commission, which is now defunct. We excluded its successor, the Surface Transportation Board.
40 ... Agricultural Marketing Service; (5) the Northern Great Plains Regional Authority, which is now defunct; (6) the Economic and Statistics Administration, which no longer exists; and (7) the Internal Revenue Service Oversight Board, which has been suspended. We further added the Executive Office of the President as a parent agency, though it is probably best regarded as not an "agency" [130, p. 19]. Notably, we did not exclude three agencies in the Department of Agriculture listed by the ACUS Sourcebook (the Rural Business-Cooperative Service, the Rural Housing Service, and the Rural Utilities Service) that seem to be child-agencies of a USDA sub-agency known as "Rural Development." To our knowledge, these three are the only examples of sub-sub-agencies featured in our analysis.
41 One agency that the ACUS AI Report analyzed that we did not include was the Office of Medicare Hearings and Appeals, because it is not listed in the ACUS Sourcebook.
42 These agencies are marked in the second column of the Full Tracker (see Appendix E.3), where the 125 agencies considered by the ACUS AI Report and relevant to the order were marked as "Yes" and agencies not considered in the ACUS AI Report were marked as "No."
43 The ACUS AI Report team identified 157 use cases across 64 agencies, representing around 45% of the agencies that the team canvassed. See [102, pp. 15-16]. However, some of these agencies were not included in our original 220 agencies assessed. For example, the ACUS AI Report identifies multiple AI use cases at the Securities and Exchange Commission, but the SEC is excluded from the Trustworthy AI Order because it is an independent regulatory agency under 44 U.S.C. § 3502(5). Therefore our final number of agencies with AI use cases is 49 instead of 64.
44 The third column of the Full Tracker, presented in Appendix E.3, marks as "Yes" only agencies for which the ACUS AI Report team found an AI use case within the scope of the report. Agencies marked as "No" did not have a use case that the ACUS AI Report identified. Agencies marked "N/A" were excluded from this subset because they were not "large" agencies as defined by the ACUS AI Report.
45 The only exception was NIST's inventory, which was published separately from that of its parent agency (the Department of Commerce).
47 DOJ's other two use cases were by the Justice Management Division and the Tax Division, which were not sub-agencies within our search criteria.
48 As noted above, the Trustworthy AI Order incorporates the FY2019 NDAA's definition of AI "as a reference point," but it anticipates that definition will be updated by subsequent OMB guidance. See [25], Section 9(a). The CIO's 2021 guidance did not displace the NDAA's definition; instead, it stated that agencies "shall assess their use of AI and include criteria that aligns with the definition of AI as described in section 238(g) of the National Defense Authorization Act" [75]. That definition, in full, explains that AI means: [8] (1) Any artificial system that performs tasks under varying and unpredictable circumstances without significant human oversight, or that can learn from experience and improve performance when exposed to datasets. (2) An artificial system developed in computer software, physical hardware, or other context that solves tasks requiring human-like perception, cognition, planning, learning, communication, or physical action. (3) An artificial system designed to think or act like a human, including cognitive architectures and neural networks. (4) A set of techniques, including machine learning, that is designed to approximate a cognitive task. (5) An artificial system designed to act rationally, including an intelligent software agent or embodied robot that achieves goals using perception, planning, reasoning, learning, communicating, decision-making, and acting. In contrast, the ACUS AI Report provides the following discussion of its scope ([102, p. 12]): By "artificial intelligence," we limit our scope to the most recent forms of machine learning, which train models to learn from data. These include a range of methods (e.g., neural networks, random forests) capable of recognizing patterns in a range of types of data (e.g., numbers, text, image)-feats of recognition that, if undertaken by humans, would be generally understood to require intelligence. . . . Conceptually, AI includes a range of analytical techniques, such as rule-based or 'expert' symbolic systems, but we limit our focus to forms of machine learning. Our scope also excludes conventional forms of statistical inference (e.g., focused on causal, as opposed to predictive, inference) and forms of process automation that do not involve machine learning (e.g., an online case management system).
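The sketch below encodes the sampling logic described above in Python. It is illustrative only: the record fields (e.g., in_dod, in_intel_community) are hypothetical labels for judgments the researchers made manually against the ACUS Sourcebook and the statutes cited above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Agency:
    name: str
    parent: Optional[str]         # parent department, if a sub-agency
    in_dod: bool                  # within the Department of Defense
    in_intel_community: bool      # within the IC per 50 U.S.C. § 3003(4)
    independent_regulatory: bool  # listed in 44 U.S.C. § 3502(5)
    defunct: bool                 # defunct or administered under another name
    acus_large: bool              # "large" (>400 employees) per ACUS AI Report
    known_use_cases: int          # AI use cases found by the ACUS AI Report

def relevant(a: Agency) -> bool:
    """The Trustworthy AI Order's exclusions; applied to the 278
    Sourcebook entries, this yields the 220 relevant agencies."""
    return not (a.in_dod or a.in_intel_community
                or a.independent_regulatory or a.defunct)

def groupings(agencies: list[Agency]) -> dict[str, list[Agency]]:
    """The three agency groupings used to measure implementation rates."""
    all_relevant = [a for a in agencies if relevant(a)]       # n = 220
    large = [a for a in all_relevant if a.acus_large]         # n = 125
    known_ai = [a for a in large if a.known_use_cases > 0]    # n = 49
    return {"all": all_relevant, "large": large, "known_ai": known_ai}

def parent_level(agencies: list[Agency]) -> dict[str, list[Agency]]:
    """Bundle each sub-agency under its parent department."""
    families: dict[str, list[Agency]] = {}
    for a in agencies:
        families.setdefault(a.parent or a.name, []).append(a)
    return families

def parent_has_known_use(families: dict[str, list[Agency]]) -> dict[str, bool]:
    """Impute child-agency use cases to the parent, as in the
    parent-level 'known AI use case' assessment."""
    return {parent: any(a.known_use_cases > 0 for a in members)
            for parent, members in families.items()}
```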

C.2 Summary of Findings
To address these concerns, we double-checked the 23 parent agencies' identified use cases against the Trustworthy AI Order's definition and assessed whether those use cases were still plausibly in use today. Where unclear, we identified additional, current use cases that would fall under the Trustworthy AI Order's inventory obligation.49 In two instances, it is less clear whether the agencies have active use cases.50 Regardless of specific agency use cases, what this demonstrates is substantial inconsistency in how agencies have implemented the requirement.
Some of these use cases both touch on core agency functions and have been the subject of public disclosure. Beyond CBP's TVS discussed above, we describe two further examples. First, the Internal Revenue Service's Return Review Program (RRP) uses "cutting-edge machine-learning technologies to detect, resolve, and prevent criminal and civil tax refund fraud and noncompliance" [51,113]. While the IRS has published a privacy impact assessment stating the general purpose and data used by RRP [15], and the system has been critiqued by oversight agencies [2-4, 21], the IRS did not disclose this use case, because neither it nor its parent agency published an AI use case inventory. Second, the Social Security Administration (SSA) uses an Anti-Fraud Enterprise System (AFES), an "industry-proven predictive analytics software to identify high-risk transactions for further review" [5,42]. While SSA does not seem to have fully implemented AFES, it has published a privacy impact assessment for the initiative [6] but did not include it in its AI use case inventory [76].

Use case inventories also vary in the information they provide for each listed AI use case. We highlight here examples of inventories that report performance benchmarks or other methodological details bearing on the trustworthiness of their AI use cases. For example, one of the use cases by the U.S. Citizenship and Immigration Services (USCIS), labeled "BET/FBI Fingerprint Success Maximization," includes a statement estimating its efficacy as well as its costs, noting that a model could "catch 98% of rejected submissions" and potentially have saved "42,763 additional appointments in 2020" at the cost of "forcing recapture during 11% of encounters" [43]. More attention needs to be paid to evaluation and performance assessments to enable the public, Congress, and other oversight bodies to assess the benefits and drawbacks of the use of AI.

49 For DOED, see [126]; for HUD, see [110]; and for SBA, see [140].
50 EEOC's use of AI was only obliquely mentioned in public documentation, preventing a thorough assessment of whether the AI use should be disclosed under the executive order. The original use case cited in the ACUS AI Report was derived from a recommendation about potential improvements to EEOC's "data analysis and predictive analytics activities," including "text analytics." See [102, pp. 30-31]. Other documentation suggests that EEOC's staff should be trained in the use of AI [7]. USITC's use case posed boundary questions about whether the AI use was merely for R&D versus for future operations.
The Department of Labor's use case involving narratives about work-related injuries and illnesses from the Survey of Occupational Injuries and Illnesses (SOII) also illustrates the value of transparency regarding model development. There, employees manually classified qualitative answers to the survey into six categories, and machine-learning algorithms were then adopted to code the surveys using those labeled data as a training set. As detailed in its use case inventory, "[u]se of these autocoders subsequently expanded and coded 85% of all SOII elements for reference year (RY) 2019. This gradual increase occurred by adapting the selection criterion based on careful monitoring of the processes. This monitoring allowed the coding to expand to all six elements coded (occupation, nature, part, event, source, secondary source)" [65]. While the agency has not provided measures of time saved or accuracy, it has provided laudable detail about the development process.
By contrast, the FBI's Threat Intake Processing System (TIPS), which is described as using "artificial intelligence (AI) algorithms to accurately identify, prioritize, and process actionable tips" [61], provides less insight on evaluation. The FBI noted that it can "conduct ongoing testing on the code" and "monitor and/or audit performance," but it provides no other detail on the development of performance measures.51

51 By the terms of the Trustworthy AI Order, agencies must report only "non-classified and non-sensitive use cases of AI" in their inventories, and publication should be "to the extent practicable" in light of, among other things, potential "sensitive law enforcement" information. See [25], Section 5(a), (e). Although providing information about TIPS presumably raises concerns about sensitive law enforcement decisions, we emphasize each agency's obligation to balance these concerns with the imperative of transparency, especially given that the prioritization of law enforcement resources is shaped by the AI use case.
Finally, we note that the implementation rate for the AI use case inventories is higher when focusing on the agencies enumerated in the CFO Act of 199052 or those that are members of the CIO Council.53 Seventeen CFO Act agencies (77 percent) have published an inventory or a public disclosure of no relevant AI use cases. Among CIO Council member agencies, 71 percent published an inventory or disclosed no use cases. Although the ACUS AI Report casts doubt on HUD's public disclosure that it has no AI use cases, we mark it as having implemented an inventory for these calculations. The relatively higher implementation rate for these agencies may illustrate that the CIO Council faces challenges in ensuring that agencies not directly involved with the Council prepare and publish an AI use case inventory. Regardless, neither the Trustworthy AI Order nor the CIO's implementing guidance limited the scope of relevant agencies to those enumerated by the CFO Act or involved with the CIO Council.

Figure 1 details the differences in compliance with the AI use case inventory requirement before and after the publication of our white paper (see [120]) for large, parent-level agencies with known AI use cases. The right-most column is updated through July 3, 2023. The grey row includes the total number of agencies that meet the criteria for each column. The figure excludes some agencies subject to the Chief Financial Officers Act because they are independent regulatory agencies exempted from the AI use case inventory requirement. Compare this figure to Table 8. HUD, SBA, and USITC are marked as non-compliant because there is strong evidence that these agencies have AI use cases.

53 See [20]. Based on the scope of the Trustworthy AI Order, we excluded the Intelligence Community, NRC, and various defense-related agencies.