rabbit community // break free

Alignment faking in large language models

JohnMaguire, December 23, 2024, 10:29pm

