Stephen Carmody

A place to write about AI topics and ML in production

Categories

Miscellaneous

Hello World

AI

PyTorch model(x) to GPU: The Hidden Journey of Neural Network Execution

Vector Databases Explained: Search in era of AI

A Guide to LLM Inference (Part 4): Speculative Decoding & Batching

A Guide to LLM Inference (Part 3): Model Compression

A Guide to LLM Inference (Part 2): Attention Optimisation

A Guide to LLM Inference (Part 1): Foundations

A brief Introduction to LLMOps

Fine-Tuning Pre Trained Models

An Introduction to the Transformer Architecture (Part 2)

An Introduction to the Transformer Architecture (Part 1)

NLP

An Introduction to the Transformer Architecture (Part 2)

An Introduction to the Transformer Architecture (Part 1)

Infra

Vector Databases Explained: Search in era of AI

PyTorch

PyTorch model(x) to GPU: The Hidden Journey of Neural Network Execution

GPU

PyTorch model(x) to GPU: The Hidden Journey of Neural Network Execution

CUDA

PyTorch model(x) to GPU: The Hidden Journey of Neural Network Execution