rabbit community // break free
Alignment faking in large language models
the world of ai
JohnMaguire
December 23, 2024, 10:29pm