What is OpenAI's Preparedness Framework?
OpenAI has recently introduced a new safety initiative: the Preparedness Team, along with a Preparedness Framework to guide its work.
The primary objective is to ensure that the company's upcoming advanced AI models are safe to deploy and that potential risks are minimized.
What’s going on here?
In essence, OpenAI is establishing a framework to assess whether its most advanced AI models are ready for safe, real-world use, and whether the people deploying them are prepared for the risks involved.

What does that mean?
OpenAI already has several safety teams. The Superalignment team focuses on existential risks from a future artificial superintelligence that could surpass human capabilities, while model safety teams ensure that production models such as GPT-3.5 and GPT-4 are safe for everyday use.
The newly formed Preparedness Team will concentrate on anticipating risks associated with the most advanced AI models, often referred to as frontier models. Their approach is grounded in factual analysis and a builder mindset.

The framework tracks several risk categories, including hacking (cybersecurity) risks, how persuasive models are with humans, their level of autonomy, and more. Each model receives a safety rating in each category, ranging from low to critical risk. Only models rated low or medium risk will be approved for launch, while high-risk models may continue further development. For additional details on the framework (beta), refer to this link.
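To make that gating logic concrete, here is a minimal Python sketch of how such a scorecard could be modeled. The category names, function names, and thresholds below are illustrative assumptions based on the summary above, not OpenAI's actual categories or tooling.

```python
from enum import IntEnum

# Risk levels as described in the framework summary: low < medium < high < critical.
class RiskLevel(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

# Tracked categories mentioned in the post; these labels are illustrative,
# not OpenAI's official identifiers.
CATEGORIES = ("cybersecurity", "persuasion", "model_autonomy")

def overall_risk(scorecard: dict[str, RiskLevel]) -> RiskLevel:
    """The overall rating is driven by the worst category score."""
    return max(scorecard.values())

def can_deploy(scorecard: dict[str, RiskLevel]) -> bool:
    """Only models rated low or medium overall are cleared for launch."""
    return overall_risk(scorecard) <= RiskLevel.MEDIUM

def can_continue_development(scorecard: dict[str, RiskLevel]) -> bool:
    """High-risk models may keep being developed (with mitigations); critical may not."""
    return overall_risk(scorecard) <= RiskLevel.HIGH

# Example: a hypothetical frontier-model scorecard.
example = {
    "cybersecurity": RiskLevel.MEDIUM,
    "persuasion": RiskLevel.HIGH,
    "model_autonomy": RiskLevel.LOW,
}
print(can_deploy(example))                # False: one category is high risk
print(can_continue_development(example))  # True: no category is critical
```

The key design choice in this sketch is that the worst single category determines the overall rating, which reflects the post's framing that one high-risk capability is enough to block a launch.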
The Preparedness Team will handle the technical evaluation of the models. OpenAI's leadership, with input from external safety advisors, will make the final decisions. Notably, the Board of Directors retains the authority to reverse those decisions if it deems a model unsafe.
Why should I care?
Recent achievements, such as DeepMind's LLM solving previously unsolved math problems and visual models surpassing humans at solving CAPTCHAs, underscore the increasing capabilities of AI models. The practical risks come from models that could help create new toxins, identify vulnerabilities in security systems, or operate computers autonomously; these deserve more attention than hypothetical scenarios of AI bots physically harming humans. Establishing a framework to understand the limits of these models enables proactive development rather than reactive adjustments after launch.
Taken together, OpenAI's recent safety updates suggest the company might be gearing up to release a new, more capable (and potentially riskier) model. However, this should be treated as rumor and speculation rather than confirmed information.