We love this example because it illustrates just how difficult it will be to fully understand LLMs. The five-member Redwood team published a 25-page paper explaining how they identified and validated these attention heads. Yet even after they did all that work, we are still far from having a comprehensive explanation for why GPT-2 decided to predict Mary as the next word.
Current approach to ML model development has the same vibe with people writing a block of code that somehow works and then put comments like "no idea why but it works, modify at your own risk’
Current approach to ML model development has the same vibe with people writing a block of code that somehow works and then put comments like "no idea why but it works, modify at your own risk’