Transformers can crack cryptographic primitives like pseudo-random number sequences. We explain how!
ICML 2025

Transformers can solve unseen problems by looking at in-context examples. We uncover the underlying mechanisms!
NeurIPS 2024 (oral)

Deep learning models often acquire abilities in steep jumps, a phenomenon called grokking. We explain how this occurs in modular arithmetic tasks!
ICLR 2024 · BGPT

Deep learning models can only be trained if initialized properly. We explore how to design architectures and initialize parameters for better training.
NeurIPS 2023 (spotlight) · AutoInit

Email: darshil.h.doshi@gmail.com