The AI Engineering Stack I Actually Use

A pragmatic look at what tools, models, and patterns are worth the hype in 2026 — from someone building things with them daily.

AI engineering, tools, LLMs, stack

Every week there's a new model or framework claiming to change everything. Most of it is noise. Here's what's actually in my stack after building AI-driven projects for the past two years.

Models

Claude for most generation tasks — coding assistance, content, structured output. The reasoning capabilities on complex, multi-step problems are consistently better than alternatives I've tried.

Gemini Flash for high-volume, latency-sensitive tasks where cost matters. The speed-to-quality ratio is hard to beat for things like classification or summarization at scale.

Local models via Ollama when I need offline capability or privacy guarantees. Mistral 7B and Llama 3 cover most of my local needs. The gap with frontier models is real but narrowing fast.
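As a sketch of what the local path looks like: Ollama exposes a small HTTP API on its default port, so a non-streaming call needs nothing beyond the standard library. This assumes the server is running locally and the model has already been pulled; the helper names are mine, not Ollama's.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port


def build_payload(prompt: str, model: str = "mistral") -> bytes:
    """Request body for Ollama's /api/generate endpoint."""
    return json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()


def generate(prompt: str, model: str = "mistral") -> str:
    """Blocking, non-streaming call against a locally running Ollama server."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because it's just HTTP on localhost, swapping `"mistral"` for `"llama3"` is the whole migration.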

Frameworks

I've tried most of the orchestration frameworks. My current opinions:

Anthropic SDK / OpenAI SDK directly — Just use these. They're good, they're maintained, and you won't spend hours debugging which layer of abstraction swallowed your tool call.
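A minimal sketch of what "just use the SDK directly" means in practice, assuming the `anthropic` package and an `ANTHROPIC_API_KEY` in the environment; the model id is an example, and the payload helper is my own split, kept separate so it can be tested without a network call.

```python
def build_request(prompt: str, model: str = "claude-sonnet-4-20250514") -> dict:
    """Assemble the keyword arguments the Messages API expects."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(prompt: str) -> str:
    """One direct call, no orchestration layer in between."""
    import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from env

    client = anthropic.Anthropic()
    message = client.messages.create(**build_request(prompt))
    return message.content[0].text
```

When a tool call goes missing, there is exactly one place to look: the request you built and the response you got back.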

Structured output — I do almost all structured extraction with JSON mode or tool use rather than parsing unstructured text. Much more reliable.

Dev environment

Claude Code — AI-assisted coding inside the terminal. I use it for most things I'd previously have taken to a search engine. It's most useful for tasks with clear context: refactoring a specific function, generating boilerplate, explaining unfamiliar code.

Antigravity — VS Code-based IDE by Google. Familiar environment, clean integration with the Google ecosystem, fast to set up.

GitLab — Version control and CI/CD in one place. The built-in pipelines handle a lot: testing, building, deploying — no external services needed. For solo projects, often more than enough.

Claude Cowork — For everything around data preparation and research. Structuring, summarizing, making sense of information — before the actual development process starts. Saves a lot of time on groundwork.

Higgsfield.ai — AI-generated background graphics and videos. Get to usable visuals fast without spending hours in image editing.

The key thing I've learned: keep your context tight. A small, focused conversation gets better results than a long one dragging lots of accumulated context.

What I'd change

If I were starting fresh today:

  • I'd spend more time on evals earlier. The hardest part of AI engineering isn't building the first version — it's knowing when it's actually working well enough to ship.
  • I'd be more skeptical of RAG as a default answer. Vector search over embeddings is powerful, but it's also easy to build something that seems to work in demos and fails in production.
  • I'd design for model upgrades from the start. Models improve fast. If your system is tightly coupled to a specific model's quirks, every upgrade is a migration project.
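On the last point, the decoupling can be as small as routing every completion through a thin interface, so a model swap is a config change rather than a codebase migration. The names here are illustrative, not a real library; the fake backend stands in for a vendor SDK call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelConfig:
    """Everything model-specific lives here, not in app code."""
    name: str
    max_tokens: int = 1024


class LLMClient:
    """App code depends on this interface, never on a vendor SDK directly."""

    def __init__(self, config: ModelConfig,
                 backend: Callable[[str, ModelConfig], str]):
        self.config = config
        self.backend = backend

    def complete(self, prompt: str) -> str:
        return self.backend(prompt, self.config)


def echo_backend(prompt: str, config: ModelConfig) -> str:
    """Stand-in for a real SDK call; useful in tests too."""
    return f"[{config.name}] {prompt}"
```

Upgrading a model then means changing one `ModelConfig` and re-running your evals, not hunting down every call site that leaned on the old model's quirks.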

The field moves fast. What I wrote here will be at least partially wrong in six months.