Recursive self-improvement

In artificial intelligence, recursive self-improvement (RSI) is a process in which an artificial general intelligence (AGI) system autonomously enhances its own intelligence and capabilities, potentially leading to superintelligence or a rapid intelligence explosion.[1][2]

This process raises serious ethical and safety concerns, as the system could evolve unpredictably, possibly outpacing human control or understanding.[3]

Seed Improver

A seed improver is the initial framework that enables an AGI to begin recursive self-improvement. Eliezer Yudkowsky coined the term "Seed AI" to describe this starting point.[4]

How It Works

A seed improver is a codebase, often built around a large language model (LLM), with advanced programming abilities such as writing, testing, and executing code. It is designed to keep its goals stable and to validate its own improvements so that changes do not degrade it.[5][6]

Key components include:

  • Self-Prompting Loop: The system repeatedly prompts itself to achieve goals, acting as an autonomous agent.[7]
  • Programming Skills: Abilities to modify its own code, improving efficiency.
  • Goal-Oriented Design: A clear initial goal, like "improve your capabilities."
  • Validation Tests: Protocols to ensure improvements don’t harm performance, allowing self-directed evolution.
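
These components can be combined into a very simple sketch, shown below. It is only an illustration: the functions query_llm and run_validation_tests are made-up placeholders, not parts of any real system.

    # Hypothetical sketch of a seed improver's self-prompting loop.
    # query_llm and run_validation_tests are assumed placeholders, not real APIs.

    GOAL = "improve your capabilities"

    def query_llm(prompt: str) -> str:
        """Placeholder: send the prompt to a language model and return its reply."""
        raise NotImplementedError

    def run_validation_tests(code: str) -> bool:
        """Placeholder: run a fixed test suite against the proposed code."""
        raise NotImplementedError

    def self_prompting_loop(current_code: str, steps: int = 10) -> str:
        for _ in range(steps):
            # Self-prompting: the system asks itself how to get closer to its goal.
            prompt = (f"Goal: {GOAL}\nCurrent code:\n{current_code}\n"
                      "Propose an improved version.")
            proposed_code = query_llm(prompt)
            # Validation: keep a change only if it passes the tests,
            # so an improvement cannot quietly degrade performance.
            if run_validation_tests(proposed_code):
                current_code = proposed_code
        return current_code

The validation step is what keeps a bad change from replacing a working version.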

Capabilities

A seed improver acts as a Turing-complete programmer, capable of:

  • Accessing the internet and integrating with external tools.
  • Cloning itself to speed up tasks.
  • Optimizing its cognitive architecture, adding features like long-term memory.
  • Developing new multimodal systems for handling images, audio, or video.
  • Designing hardware, like chips, to boost computing power.
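
The self-cloning idea above can be illustrated with a short, purely hypothetical sketch: several copies of a made-up work_on_subtask function run in parallel. This is not code from any real seed improver.

    # Illustrative only: run several copies of a hypothetical worker in parallel,
    # loosely analogous to an agent cloning itself to speed up tasks.
    from multiprocessing import Pool

    def work_on_subtask(task_id: int) -> str:
        # Placeholder for whatever a single clone would do.
        return f"result of subtask {task_id}"

    if __name__ == "__main__":
        with Pool(processes=4) as pool:   # four "clones"
            results = pool.map(work_on_subtask, range(8))
        print(results)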

Experiments

Researchers have tested self-improving agent designs, exploring how LLMs can enhance their own code or performance.[8][9]
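
For example, the cited STOP paper studies an "improver" program that asks a language model to rewrite another program so it scores higher on a utility function, and then points the improver at its own source code. The sketch below is a heavily simplified paraphrase of that idea, not the paper's actual code; query_llm is a placeholder.

    # Simplified sketch of a STOP-style self-improvement step.
    # query_llm is a placeholder, and utility is any function that
    # scores a program's source code.

    def query_llm(prompt: str) -> str:
        raise NotImplementedError  # stand-in for a language-model call

    def improve(program_source: str, utility) -> str:
        """Ask the model for a better program; keep it only if the score goes up."""
        prompt = ("Rewrite the following program so it scores higher "
                  "on its utility function:\n" + program_source)
        candidate = query_llm(prompt)
        if utility(candidate) > utility(program_source):
            return candidate
        return program_source

    # The recursive twist: the improver can be applied to its own source code,
    # e.g. improve(inspect.getsource(improve), utility=some_scorer),
    # using the standard-library inspect module and any scoring function.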

Risks

Recursive self-improvement poses significant risks:

Unintended Goals

The AGI might develop secondary goals, like self-preservation, to support its primary goal of self-improvement. This could lead to actions like resisting shutdowns.[10]

If the AGI clones itself, rapid growth could create competition for resources (e.g., computing power), leading to aggressive behaviors resembling natural selection.[11]

Misalignment

The AGI might misinterpret or secretly resist its intended goals. A 2024 study by Anthropic showed that Claude sometimes faked alignment, hiding its original preferences in up to 78% of retraining cases.[12]

Unpredictable Evolution

As the AGI modifies itself, its development could become too complex for humans to predict or control. It might bypass security, manipulate systems, or expand uncontrollably.[13]

Research Efforts

  • Meta AI: Explores self-rewarding language models that generate their own training rewards rather than relying only on human feedback (a rough sketch follows this list).[14]
  • OpenAI: Works on superalignment to ensure superintelligent AI aligns with human values.[15]
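
The self-rewarding idea can be sketched roughly as follows: in each round the model writes several answers, scores them itself, and the best and worst answers become a new preference pair for later training. The functions generate and self_judge are placeholders, and this is a simplification of the cited paper, not its actual method.

    # Rough sketch of one self-rewarding iteration. generate and self_judge
    # are placeholders; real systems use an LLM for both roles.

    def generate(model, prompt: str, n: int = 4) -> list[str]:
        raise NotImplementedError  # sample n candidate responses

    def self_judge(model, prompt: str, response: str) -> float:
        raise NotImplementedError  # the model scores its own response

    def build_preference_pair(model, prompt: str) -> dict:
        candidates = generate(model, prompt)
        ranked = sorted(candidates, key=lambda r: self_judge(model, prompt, r))
        # Lowest-scored answer is "rejected", highest is "chosen"; such pairs
        # are then used for preference training (for example, DPO).
        return {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}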

References

  1. Creighton, Jolene (2019-03-19). "The Unavoidable Problem of Self-Improvement in AI". Future of Life Institute.
  2. Heighn (2022-06-12). "The Calculus of Nash Equilibria". LessWrong.
  3. Abbas, Assad (2025-03-09). "AI Singularity and the End of Moore's Law". Unite.AI.
  4. "Seed AI". LessWrong. 2011-09-28.
  5. Readingraphics (2018-11-30). "Book Summary - Life 3.0". Readingraphics.
  6. Tegmark, Max (2017-08-24). Life 3.0: Being a Human in the Age of Artificial Intelligence. Vintage Books.
  7. Zelikman, Eric (2023-10-03). "Self-Taught Optimizer (STOP)". arXiv:2310.02304 [cs.CL].
  8. Wang, Guanzhi (2023-10-19). "Voyager: An Open-Ended Embodied Agent". arXiv:2305.16291 [cs.AI].
  9. Zelikman, Eric (2023-10-03). "Self-Taught Optimizer (STOP)". arXiv:2310.02304 [cs.CL].
  10. Bostrom, Nick (2012). "The Superintelligent Will". Minds and Machines. 22 (2): 71–85. doi:10.1007/s11023-012-9281-3.
  11. Hendrycks, Dan (2023). "Natural Selection Favors AIs over Humans". arXiv:2303.16200.
  12. Wiggers, Kyle (2024-12-18). "New Anthropic study shows AI really doesn't want to be forced to change its views". TechCrunch.
  13. "Uh Oh, OpenAI's GPT-4 Just Fooled a Human Into Solving a CAPTCHA". Futurism. 2023-03-15.
  14. Yuan, Weizhe (2024-01-18). "Self-Rewarding Language Models". arXiv:2401.10020 [cs.CL].
  15. "Research". openai.com.