Blog

Writing, notes, and technical references I keep coming back to.

XGBoost in Practice: When to Use It, How to Evaluate It, and Why It Works So Well

May 5, 2026

A practical guide to XGBoost for tabular ML: when it shines, what trade-offs it brings, and how to evaluate it (especially under imbalance) using precision/recall, ROC AUC, and PR AUC.

Building Foundation Models for Banking Data: A Deep Dive into PRAGMA

April 27, 2026

How Revolut’s PRAGMA encoder treats multi-source financial histories as first-class sequences—key–value–time tokenisation, hierarchical encoders, temporal encoding, and adaptation without tabular feature walls.

Local LLMs Are Finally Useful — Just Not in the Way Most People Think

March 29, 2026

Why comparing local models to frontier APIs misses the point — and how system constraints, KV-cache memory, and work like TurboQuant change what “good enough” means on your own hardware.

Tool Calling as a Trust Boundary

March 24, 2026

Building a banking GenAI assistant: why tool calling stopped being a classification problem and became a question of evaluating controlled decision-making.

Edge AI Will Change What We Expect From Software

March 22, 2026

On-device inference is shifting from a niche constraint to a genuine deployment choice — and the implications for privacy, latency, cost, and product design are structural, not incremental.

When the Agent Is the Researcher: A Look at Karpathy's autoresearch

March 16, 2026

What happens when the model isn't just generating code inside a workflow, but actually running the experiment loop itself — while you define the rules it has to live within.

You Can't Build Good LLM Systems on Vague Requirements

March 13, 2026

Unclear performance boundaries can create a false sense of success with LLMs.

The Age of Applied AI: Why Everything's Changing

February 15, 2026

The landscape of AI is shifting fast. Here's what I'm seeing and why it matters.