Welcome to Kunvar's corner of cyberspace
Hello, I'm Kunvar!
My main focus these days is taking trained neural networks and trying to understand the structures they have learnt and how they use those structures to implement algorithms.
This typically involves investigating the learned parameters of individual model components, understanding how those components interact with one another, working out how the model combines seemingly unrelated learned features to implement algorithms, and finding ways to intervene on the internal parameters to change the model's behavior in a desirable way.
If you have thoughts on anything I've written, want to discuss a project, want to give or get feedback, or otherwise want to get in touch, my email is: 1stuserhere@gmail.com
Here are some projects I'm working on these days:
Working on
Mechanistic analysis of meta out-of-context learning in LLMs