AI Policy @🤗: Response to the White House AI Action Plan RFI
On March 14, we submitted Hugging Face's response to the White House Office of Science and Technology Policy's request for information on the White House AI Action Plan. We took this opportunity to (re-)assert the fundamental role that open AI systems and open science play in enabling the technology to be more performant and efficient, to be broadly and reliably adopted, and to meet the highest standards of security. This blog post provides a summary of our response; the full text is available here.
Context: Don't Sleep on (Strongly) Open Models' Capabilities
Open approaches to AI development are not only (typically) more transparent, adaptable, and scientifically sound; they have also consistently reproduced or surpassed the performance of widely-used API-only commercial offerings on many tasks, and are increasingly doing so on shorter timelines and with increased resource efficiency. Two of the most recent compelling examples are our team's OlympicCoder, which outperforms Claude 3.7 on complex coding tasks with 7B parameters and an open-source post-training recipe, and AI2's fully open OLMo 2 models (with open training data), which match o1-mini's performance. These successes show that a robust AI strategy must leverage open and collaborative development to best drive performance, adoption, and security of the technology. We make three major recommendations in this direction.
Recommendation 1: Recognize Open Source and Open Science as Fundamental to AI Success
The most advanced AI systems to date all stand on a strong foundation of open research (attention mechanisms, transformer architectures, cheaper post-training algorithms) and open source software (PyTorch, Hugging Face libraries, supercomputer operating systems), which shows the critical value of continued support for openness in sustaining further progress. Investment in systems that can freely be re-used and adapted has also been shown to have a strong multiplier effect on economic impact, driving a significant percentage of countries' GDP. As AI systems with open weights and training techniques become increasingly attractive options for developers in terms of both performance and cost, prioritizing public research infrastructure and broad access to compute, customizable models, and trusted open datasets, especially for smaller developers and researchers, will be essential to the further technical and economic success of AI technology.
Recommendation 2: Prioritize Efficiency and Reliability to Unlock Broad Innovation
Addressing the resource constraints of organizations adopting and adapting AI technology will be essential to supporting its diffusion and fostering innovation from adopters across the entire development chain. Smaller models (that may even be used on edge devices), techniques to reduce computational requirements at inference, and efforts to facilitate mid-scale training for organizations with modest to moderate computational resources all support the development of models that meet the specific needs of their use context, especially in high-risk settings such as healthcare where fully generalist models have proven unreliable. More efficient and purpose-designed AI systems facilitate better in-context evaluation and better resource utilization, and they enable organizations to build technical capacity at all stages of the AI development chain so that all users can leverage the system that best fits their needs.
Recommendation 3: Secure AI through Open, Traceable, and Transparent Systems
Finally, if decades of information security and cybersecurity in open source software are any indication, open and transparent AI systems will have a fundamental role to play in securing AI development and deployment, especially in the most critical settings, with different levels of openness needed for different security requirements. Fully transparent models providing access to their training data and procedures can support the most extensive safety certifications. Open infrastructure and open-source tooling implementing the latest training techniques can empower organizations to train the models they need in fully controlled environments. Open-weight models that can be run in air-gapped environments can be a critical component in managing information risks. Prioritizing adoption of the most transparent systems, supporting the development of the open resources outlined above, and building capacity to leverage them, especially in critical settings, are essential to enabling more secure AI adoption.
Please refer to the full response for our more detailed recommendations!