AGI Research Overview
Overview
Artificial General Intelligence (AGI) research aims to develop software capable of reasoning, problem-solving, and adapting to new challenges without task-specific programming. Unlike narrow AI, which is trained for specific tasks, an AGI system could perform any intellectual task a human can, potentially automating a majority of economically valuable work.
Background
- Historical Context: AGI has its roots in the mid-20th century, coinciding with the advent of digital computers in the 1940s. Pioneers like Alan Turing posed critical questions about machine intelligence, leading to the concept of the Turing test in 1950. The term "artificial intelligence" was coined by John McCarthy in 1955.
- Early Progress: Early optimism about AGI emerged in the 1960s, but setbacks led to an "AI winter" in the 1970s. Progress resumed in the 1990s, notably with IBM’s Deep Blue defeating Garry Kasparov in 1997, and futurists such as Ray Kurzweil predicted AGI by 2029.
- Recent Advancements: Significant milestones include DeepMind’s AlphaGo defeating Lee Sedol in 2016 and the release of OpenAI’s ChatGPT in 2022, both marking substantial steps toward AGI.
Human Interest and Capital Raising
- Pop Culture Influence: Characters like JARVIS from Iron Man exemplify the ambition for intelligent systems that act as both assistants and decision-makers.
- VC Funding: Founders with AGI aspirations are raising venture capital, betting that by extrapolating current scaling laws they can deliver such systems within VC funding timelines.
VC Deal Activity
- Deal Value and Count:
- 2019: $701.2 million, 113 deals
- 2020: $66.5 million, 11 deals
- 2021: $2,986.1 million, 23 deals
- 2022: $1,484.5 million, 41 deals
- 2023: $16,563.7 million, 139 deals
- 2024: $15,938.6 million, 139 deals
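To put these figures in context, the implied average deal size can be derived directly from the values and counts above; the following is a minimal Python sketch (the averages are computed from the listed figures, not reported separately):

```python
# Implied average deal size per year, derived from the deal value and count
# figures listed above (deal values in $ millions).
deals = {
    2019: (701.2, 113),
    2020: (66.5, 11),
    2021: (2986.1, 23),
    2022: (1484.5, 41),
    2023: (16563.7, 139),
    2024: (15938.6, 139),
}

for year, (value_musd, count) in deals.items():
    print(f"{year}: ${value_musd / count:.1f}M average deal size")
# 2019: $6.2M ... 2021: $129.8M ... 2023: $119.2M ... 2024: $114.7M
```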
Technologies and Processes
- Current Technologies:
- Transformer Models (e.g., GPT-4): Pattern-recognition models excelling in sequential data processing and natural language generation. Data-intensive, lack causal reasoning, and fail to generalize beyond trained domains (see the attention sketch after this list).
- Reinforcement Learning: Models learn by optimizing rewards through trial and error in structured environments. Struggle with real-world complexity, long-term planning, and sparse rewards (see the Q-learning sketch after this list).
- Self-Play AI: AI systems experiment with new model architectures and conduct AI research themselves. Some regulatory frameworks prohibit research labs from pursuing self-replicating systems.
- Joint-Embedding Predictive Architecture (JEPA): Builds a world model predicting the most probable future state based on underlying abstract features. Limited to image completion.
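To make the transformer bullet above concrete, below is a minimal, illustrative sketch of scaled dot-product attention, the core pattern-matching operation inside models like GPT-4, written in plain NumPy with toy shapes and random data rather than any production implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query attends to every key; the output is a similarity-weighted mix of values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # pairwise query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over the keys
    return weights @ V                                   # weighted sum of the values

# Toy example: 3 tokens with 4-dimensional embeddings (random data, self-attention).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(X, X, X).shape)       # (3, 4)
```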
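Likewise, the reinforcement-learning bullet can be illustrated with tabular Q-learning on a toy corridor; the environment, constants, and reward scheme are invented for illustration, and the reward is deliberately sparse (only at the goal), the kind of setting the bullet flags as difficult at real-world scale:

```python
import random

# Tabular Q-learning on a toy five-state corridor: the agent starts at state 0
# and receives a reward only on reaching state 4 (a deliberately sparse reward).
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                        # step left / step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.2     # learning rate, discount, exploration rate

for _ in range(2000):                     # episodes
    s = 0
    while s != GOAL:
        if random.random() < epsilon:
            a = random.choice(ACTIONS)                                     # explore
        else:
            a = max(ACTIONS, key=lambda x: (Q[(s, x)], random.random()))   # exploit, random tie-break
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: move the estimate toward reward + discounted best future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s_next, b)] for b in ACTIONS) - Q[(s, a)])
        s = s_next

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)})     # learned policy: all +1 (move right)
```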
- Emerging Approaches:
- Neurosymbolic AI: Combines neural networks’ pattern recognition with symbolic reasoning’s logic- and rule-based processing. Immature, difficult to scale, and complex to integrate effectively (a minimal sketch follows this list).
- Cognitive Architectures: Models simulating human cognitive functions (memory, attention, decision-making). Challenging to model dynamic, humanlike thought processes.
- Embodied AI: AI systems interact with the physical world, learning from sensorimotor feedback. Complex to scale, particularly in diverse, real-world environments.
- Long-Term Memory Systems: Designed to store and retrieve knowledge across time, allowing for cumulative learning. Efficient long-term memory remains difficult to implement (see the memory-store sketch after this list).
- Large World Models (LWMs): Visual models developing 3D reasoning about the real world and learning as young children do. Spatial reasoning cannot be immediately transferred to other forms of intelligence.
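To illustrate the neurosymbolic bullet above, here is a minimal sketch of the general pattern: a neural perception step (stubbed out below) emits symbols with confidences, and explicit symbolic rules reason over them. The perceive() stub, the symbol names, and the rule are hypothetical, not taken from any particular system:

```python
# A stand-in neural perception step emits symbols with confidence scores,
# and a symbolic rule layer applies explicit, auditable logic on top of them.

def perceive(image_path):
    # Placeholder for a trained neural classifier; a real system would run a model here.
    return {"red_light": 0.92, "pedestrian": 0.10}

def symbolic_policy(symbols, threshold=0.5):
    # Rule-based reasoning over the neural outputs: the kind of explicit logic
    # that pure pattern recognition does not provide.
    facts = {name for name, conf in symbols.items() if conf >= threshold}
    if "red_light" in facts or "pedestrian" in facts:
        return "stop"
    return "proceed"

print(symbolic_policy(perceive("frame_001.png")))  # -> "stop"
```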
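And for the long-term-memory bullet, a minimal sketch of one common way such memory is prototyped, an embedding-keyed store with similarity-based retrieval; the LongTermMemory class and its interface are illustrative assumptions, not a reference design:

```python
import numpy as np

class LongTermMemory:
    """Illustrative embedding-keyed memory with similarity-based retrieval."""

    def __init__(self):
        self.keys = []     # embedding vectors acting as retrieval keys
        self.values = []   # stored facts / experiences

    def write(self, embedding, fact):
        self.keys.append(np.asarray(embedding, dtype=float))
        self.values.append(fact)

    def read(self, query, k=1):
        # Return the k stored facts whose keys are most similar (cosine) to the query.
        q = np.asarray(query, dtype=float)
        sims = [float(q @ key) / (np.linalg.norm(q) * np.linalg.norm(key) + 1e-9)
                for key in self.keys]
        top = np.argsort(sims)[::-1][:k]
        return [self.values[i] for i in top]

mem = LongTermMemory()
mem.write([1.0, 0.0], "user prefers metric units")
mem.write([0.0, 1.0], "meeting moved to Friday")
print(mem.read([0.9, 0.1]))  # -> ['user prefers metric units']
```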
Leading Researchers and Critiques
- François Chollet:
- Generalization vs. Scaling: Argues that true AGI will not emerge from simply increasing the size and complexity of current models. These systems are highly specialized and lack the ability to generalize knowledge across domains.
- Causal and Abstract Reasoning: Emphasizes the need for systems that can reason abstractly and understand causal structures of the world.
- Skill Acquisition with Minimal Data: Highlights that current models are inefficient at acquiring new skills from minimal prior information. Future architectures need to be designed for efficient, adaptable learning with less dependence on data.
Applications and Milestones
- Healthcare: Predictive diagnostics, robot-assisted surgery, and treatment optimization for specific diseases. Autonomous diagnosis, treatment planning, and care personalization across all