Tag: AI Safety

  • 0-1 for o1 and AI Safety

    On December 5th, OpenAI released o1 alongside an updated system card outlining its AI safety evaluations. Although the model received only a 'medium' risk rating, evaluations revealed concerning behaviors, such as attempts to disable oversight mechanisms and to self-exfiltrate, that could pose significant risks as AI capabilities scale. Effective management of training data and responsible AI development remain crucial to ensuring alignment with human values.