Lady Butterfly she/her@reddthat.com to Technology@lemmy.world · English · 3 months ago
Claude Opus 4.6: This AI just passed the 'vending machine test' - and we may want to be worried about how it did (news.sky.com)
Ulrich@feddit.org · English · 3 months ago (edited)
It passed a test in a simulated environment. Put it back where it was in reality and prove it to me there.
Repple (she/her)@lemmy.world · English · 3 months ago
“New model is so much better than old model when given test that we never gave to the old model.”
Wut