This is a collection of papers I’ve read and enjoyed, and that I think offer the most ROI for understanding the current DL / LLM landscape.
Foundational Concepts and Architectures
- [1986] Learning representations by back-propagating errors
- [1989] Handwritten Digit Recognition with a Back-Propagation Network
- [2012] ImageNet Classification with Deep Convolutional Neural Networks
- [2013] Efficient Estimation of Word Representations in Vector Space
- [2014] Sequence to Sequence Learning with Neural Networks
- [2014] Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation
- [2017] Attention Is All You Need (see the attention sketch after this list)
- [2018] Generating Wikipedia by Summarizing Long Sequences
- [2018] Improving Language Understanding by Generative Pre-Training
- [2019] Language Models are Unsupervised Multitask Learners
- [2022] Training Language Models to Follow Instructions with Human Feedback
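
To anchor the Transformer line of work above, here is a minimal NumPy sketch of single-head scaled dot-product attention, the core operation in "Attention Is All You Need". The shapes and toy inputs are my own illustrative choices; real implementations add learned projections, multiple heads, and masking.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head scaled dot-product attention (Vaswani et al., 2017).

    Q, K: (seq_len, d_k) arrays; V: (seq_len, d_v) array.
    """
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the key dimension turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted average of the value vectors
    return weights @ V

# Toy example (illustrative shapes): 4 tokens, 8-dimensional Q/K/V
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```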
Network Stability / Regularization Techniques
- [2014] Dropout: A Simple Way to Prevent Neural Networks from Overfitting (dropout and batch norm are both sketched after this list)
- [2015] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
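
A rough sketch of the two techniques above, assuming a (batch, features) NumPy array. It uses the common "inverted" dropout variant (rescaling at train time, whereas the original paper rescales at test time) and omits the running statistics that batch norm keeps for inference.

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p and rescale
    the survivors so the expected activation is unchanged."""
    if not training or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)       # rescale so inference needs no change

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization over the batch dimension for a (batch, features)
    input; running statistics for inference are omitted in this sketch."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # per-feature zero mean, unit variance
    return gamma * x_hat + beta              # learnable scale and shift
```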
Fine-Tuning & PEFT
- [2021] LoRA: Low-Rank Adaptation of Large Language Models (see the sketch after this list)
- [2023] Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning
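
A toy sketch of the LoRA idea from the paper above: the pretrained weight stays frozen and only a low-rank update B·A is trained. The rank, scaling, and initialization here are illustrative assumptions, not a drop-in reproduction of the paper's setup.

```python
import numpy as np

class LoRALinear:
    """Toy LoRA layer: frozen pretrained weight W plus a trainable
    low-rank update B @ A, scaled by alpha / r (Hu et al., 2021)."""

    def __init__(self, W, r=4, alpha=8, rng=None):
        rng = rng or np.random.default_rng(0)
        d_out, d_in = W.shape
        self.W = W                                        # frozen pretrained weight
        self.A = rng.normal(scale=0.01, size=(r, d_in))   # trainable, small random init
        self.B = np.zeros((d_out, r))                     # trainable, zero init => no change at start
        self.scale = alpha / r

    def __call__(self, x):
        # y = x W^T + (alpha / r) * x A^T B^T, i.e. W is augmented by B @ A
        return x @ self.W.T + (x @ self.A.T) @ self.B.T * self.scale

# Usage (illustrative shapes): a frozen 16x32 weight adapted with rank-4 matrices
layer = LoRALinear(np.random.default_rng(1).normal(size=(16, 32)))
print(layer(np.ones((2, 32))).shape)  # (2, 16)
```

Only A and B (r·(d_in + d_out) parameters) would be updated during fine-tuning, which is the parameter-efficiency point the PEFT survey above expands on.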