Phainein7 - 2/24/2025 15:20
HAL 9000: "I'm sorry, Dave. I'm afraid I can't do that."
https://www.youtube.com/watch?v=ARJ8cAGm6JE
Can't trust HAL
Of particular concern, Bengio says, is the emerging evidence of AI's
"self preservation" tendencies. To a goal-seeking agent, attempts to
shut it down are just another obstacle to overcome. This was
demonstrated in December, when researchers found that o1-preview,
faced with deactivation, disabled oversight mechanisms and
attempted--unsuccessfully--to copy itself to a new server. When
confronted, the model played dumb, strategically lying to researchers
to try to avoid being caught.
https://time.com/7259395/ai-chess-cheating-palisade-research/