Alignment Techniques and Research

AGAI 302 · Module 2

Survey the major technical approaches used to make AI systems more helpful, honest, and safe. This module covers reinforcement learning from human feedback (RLHF), Constitutional AI, debate, scalable oversight, and interpretability research, and emphasizes that these methods improve alignment without fully solving it.
