This content originally appeared on HackerNoon and was authored by Shawn Gordon
Introduction
If you have spent any time building with LLMs over the last couple of years, you have probably heard of LangChain. Their open-source framework for building LLM applications is one of the most popular in the space, and their commercial product, LangSmith, is a platform for observing, evaluating, and deploying agents. What you might not have heard about yet is SmithDB, which LangChain announced at their Interrupt conference on May 13, 2026. It is a purpose-built distributed database for agent observability, and it solves a problem that I think a lot of people building production agents are going to recognize immediately.
The problem statement is basically that agent traces have gotten too big and too complex for general-purpose databases to handle well. When LangSmith first launched in 2023, most people were building simple RAG pipelines and prompt chains. The traces generated by those applications were small and predictable. Now in 2026, agents are running for hours, making hundreds of nested tool calls, and generating traces with multi-modal content like images and audio. A single modern agent trace can be megabytes of deeply nested data, and a span might start minutes or even hours before it finishes. Traditional observability stores were never designed for that kind of workload. All the images I used for SmithDB came from LangSmith.
Let's Dive In
SmithDB is built in Rust and uses the Apache DataFusion query engine and the Vortex columnar file format, with heavy customizations for LangSmith's specific workload patterns. If you are not familiar with those tools, DataFusion is an extensible query engine written in Rust that uses Apache Arrow as its in-memory format. Vortex is a newer columnar file format that positions itself as an aspirational successor to Apache Parquet, with significantly faster random access reads and scans. I’ve had Vortex on my list of things to look at specifically, but hadn’t had a chance yet.
At its core, SmithDB is an object-storage-backed log-structured merge tree (LSM). It initially made me think of SlateDB, but on further review, they are pretty different. The architecture has three main pieces: object storage for durable trace data, a small Postgres metastore for segment metadata, and stateless ingestion, query, and compaction services. Because the query and ingestion services are stateless, scaling is just a matter of adding compute rather than managing local disks. That is a big deal for enterprises that need to self-host or deploy across multiple clouds. I borrowed the architecture diagram from LangSmith.
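To make the three pieces a bit more concrete, here is a toy sketch of the general object-storage-backed LSM idea in Python. This is my own illustration, not SmithDB's code: the class name `TinyLsm`, the in-memory dictionaries standing in for object storage and the Postgres metastore, and the segment key format are all assumptions.

```python
import itertools
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (timestamp, run_id); a stand-in for a real trace row


class TinyLsm:
    """Toy LSM: immutable segments in an 'object store', tracked by a 'metastore'."""

    def __init__(self) -> None:
        self.object_store: Dict[str, List[Row]] = {}    # hypothetical bucket
        self.metastore: Dict[str, Tuple[int, int]] = {}  # key -> (min_ts, max_ts)
        self._ids = itertools.count()

    def _write_segment(self, rows: List[Row]) -> str:
        # Segments are written once and never modified afterwards.
        key = f"segments/{next(self._ids):06d}"
        self.object_store[key] = rows
        self.metastore[key] = (rows[0][0], rows[-1][0])
        return key

    def ingest(self, rows: List[Row]) -> str:
        """Ingestion: sort a batch and persist it as a new segment."""
        if not rows:
            raise ValueError("cannot write an empty segment")
        return self._write_segment(sorted(rows))

    def compact(self) -> None:
        """Compaction: merge all live segments into one, then drop the old ones."""
        old = list(self.metastore)
        if len(old) < 2:
            return
        merged = sorted(row for key in old for row in self.object_store[key])
        self._write_segment(merged)
        for key in old:
            del self.metastore[key]
            del self.object_store[key]

    def query(self, lo: int, hi: int) -> List[Row]:
        """Query: use segment metadata to skip segments outside [lo, hi]."""
        hits: List[Row] = []
        for key, (min_ts, max_ts) in self.metastore.items():
            if max_ts < lo or min_ts > hi:
                continue  # pruned without touching the object store
            hits.extend(r for r in self.object_store[key] if lo <= r[0] <= hi)
        return sorted(hits)
```

Note that none of the three methods keeps state of its own between calls. Everything lives in the two stores, which is the property that lets services like these scale by adding compute.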
The performance numbers are worth noting. SmithDB delivers P50 latencies of 92ms for trace tree loads, 71ms for single run loads, 82ms for run filtering, and 400ms for full-text search. LangChain claims this makes core LangSmith experiences 12 to 15 times faster than before, and the early customer feedback from companies like Clay, Vanta, and Cogent Security seems to back that up. Clay, for example, logs hundreds of millions of observability events per day and reported that the performance improvements were immediately noticeable.
What Makes It Different
There are a few engineering decisions here that I think are genuinely interesting and worth understanding.
The first is how SmithDB handles progressive querying over object storage. Most LangSmith queries want the newest data for a given project. A naive approach would scan all candidate files, sort-merge everything, then apply a limit. SmithDB instead walks backward through time, builds a bounded window over the newest segments, and stops as soon as it has enough data. That turns an expensive "sort everything, then limit" operation into a much cheaper bounded scan.
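Here is a minimal sketch of that bounded scan, assuming each segment records its minimum and maximum timestamp. The `Segment` class and the `newest_rows` function are hypothetical names of mine, and the rows are reduced to bare timestamps to keep it short.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    min_ts: int
    max_ts: int
    rows: List[int]  # row timestamps, standing in for full rows


def newest_rows(segments: List[Segment], limit: int) -> Tuple[List[int], int]:
    """Return the `limit` newest timestamps and how many segments were read."""
    if limit <= 0:
        return [], 0
    # Walk backward through time: the newest segments come first.
    ordered = sorted(segments, key=lambda s: s.max_ts, reverse=True)
    window: List[int] = []
    scanned = 0
    for seg in ordered:
        # Once the window is full, its oldest entry is the cutoff. If this
        # segment's newest row is not newer than the cutoff, neither is any
        # row in the remaining (older) segments, so we can stop.
        if len(window) >= limit and seg.max_ts <= window[limit - 1]:
            break
        window.extend(seg.rows)
        window.sort(reverse=True)
        window = window[:limit]  # keep the window bounded
        scanned += 1
    return window, scanned
```

With four segments and a limit of three, the scan in this sketch can stop after two segments when their rows already cover the newest three timestamps, even though the segments overlap in time.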

The second is how it handles the fact that agent spans are not point-in-time events. In traditional request/response applications, a span starts and finishes in milliseconds. Agent spans can stay open for a long time while the agent makes tool calls, retries, or hands off to other agents. SmithDB treats a run as a sequence of events rather than a single immutable row. That sounds simple, but it affects the entire query engine, from how filters are applied to how compaction works.
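One way to picture the event model is as a fold over per-run events, sketched below. The event kinds (`start`, `update`, `end`) and the `RunEvent` and `RunView` names are assumptions I made for illustration. LangChain has not published SmithDB's actual schema as far as I know.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class RunEvent:
    run_id: str
    ts: int
    kind: str  # assumed kinds: "start", "update", "end"
    data: Dict[str, object] = field(default_factory=dict)


@dataclass
class RunView:
    """The current state of a run, derived from its events."""
    run_id: str
    start_ts: Optional[int] = None
    end_ts: Optional[int] = None
    attrs: Dict[str, object] = field(default_factory=dict)

    @property
    def is_open(self) -> bool:
        return self.start_ts is not None and self.end_ts is None


def materialize(events: Iterable[RunEvent]) -> Dict[str, RunView]:
    """Fold events in timestamp order into one view per run."""
    views: Dict[str, RunView] = {}
    for ev in sorted(events, key=lambda e: e.ts):
        view = views.setdefault(ev.run_id, RunView(ev.run_id))
        if ev.kind == "start":
            view.start_ts = ev.ts
        elif ev.kind == "end":
            view.end_ts = ev.ts
        view.attrs.update(ev.data)  # later events override earlier fields
    return views
```

The useful property is that a run with no `end` event yet is still queryable. A filter over `attrs` sees whatever has arrived so far, and the same fold gives a compaction step a natural way to collapse many events into one row.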
The third is late materialization of large fields. Agent traces contain big payloads, sometimes megabytes of JSON from tool outputs and LLM responses. SmithDB separates core run fields from these large fields, keeping only pointers in the main rows. The query engine only fetches the full payload when you actually open a specific run or explicitly ask for those fields. That means loading a list of runs or applying filters stays fast because you are not reading megabytes of data you do not need.
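A rough sketch of late materialization, assuming the large payloads live in a separate blob store keyed by a pointer. The `RunStore` class, its field names, and the pointer format are hypothetical. The `blob_reads` counter exists only to show when a payload is actually fetched.

```python
import json
from typing import Dict, List


class RunStore:
    """Core rows hold small fields plus a pointer; payloads are fetched on demand."""

    def __init__(self) -> None:
        self._core: List[Dict[str, object]] = []  # small, filterable fields
        self._blobs: Dict[str, bytes] = {}        # large payloads, by pointer
        self.blob_reads = 0

    def put(self, run_id: str, name: str, latency_ms: int, payload: dict) -> None:
        pointer = f"blob/{run_id}"  # hypothetical pointer format
        self._blobs[pointer] = json.dumps(payload).encode("utf-8")
        self._core.append({
            "run_id": run_id,
            "name": name,
            "latency_ms": latency_ms,
            "payload_ptr": pointer,
        })

    def list_runs(self, min_latency_ms: int = 0) -> List[Dict[str, object]]:
        # Listing and filtering touch only the core rows, never the blobs.
        return [r for r in self._core if r["latency_ms"] >= min_latency_ms]

    def load_payload(self, run_id: str) -> dict:
        # The payload is read only when a specific run is opened.
        row = next(r for r in self._core if r["run_id"] == run_id)
        self.blob_reads += 1
        return json.loads(self._blobs[str(row["payload_ptr"])])
```

Filtering a list of runs by latency never increments `blob_reads`. Only the explicit `load_payload` call pays the cost of reading the large field.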
There is also a custom inverted index layout optimized for object storage that powers sub-second full-text search and JSON key-path filtering. On local disk, an index can rely on cheap random seeks. On object storage, that pattern falls apart because every unnecessary request adds latency. SmithDB's index layout uses term-sorted row groups with min/max zones so it can prune aggressively before fetching any postings data.
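The pruning idea can be sketched with a tiny term-sorted index. I am assuming row groups that do not overlap in term range and that each carry a min and max term. The `RowGroup`, `build_index`, and `lookup` names are mine, not SmithDB's, and the `fetched` count stands in for object-storage requests.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RowGroup:
    min_term: str
    max_term: str
    postings: Dict[str, List[int]]  # term -> sorted run ids


def build_index(term_to_runs: Dict[str, List[int]], group_size: int) -> List[RowGroup]:
    """Sort terms, then cut them into fixed-size row groups with min/max zones."""
    terms = sorted(term_to_runs)
    groups = []
    for i in range(0, len(terms), group_size):
        chunk = terms[i:i + group_size]
        groups.append(RowGroup(
            min_term=chunk[0],
            max_term=chunk[-1],
            postings={t: sorted(term_to_runs[t]) for t in chunk},
        ))
    return groups


def lookup(groups: List[RowGroup], term: str) -> Tuple[List[int], int]:
    """Return (postings, groups fetched), using only zone metadata to prune."""
    max_terms = [g.max_term for g in groups]
    idx = bisect_left(max_terms, term)  # first group whose max_term >= term
    if idx == len(groups) or groups[idx].min_term > term:
        return [], 0  # no zone can contain the term: nothing is fetched
    # Only now would a real system issue a request for postings data.
    return groups[idx].postings.get(term, []), 1
```

Because the groups are sorted by term, a lookup touches at most one group in this sketch, and a term that falls between two zones costs no postings fetch at all.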
Trying It Out
SmithDB is not a standalone product that you download and install separately. It is the new data layer powering LangSmith. As of the announcement, 100% of US Cloud ingestion and 100% of tracing UI query traffic runs through SmithDB. All major filters, including metadata, feedback, text search, tree filters, and trace filters, are backed by SmithDB. If you are already using LangSmith, you are already using SmithDB, whether you realized it or not.
For self-hosted deployments, SmithDB is not yet available but is supposed to be coming soon. Given the object-storage-backed architecture, self-hosting should be straightforward since there are no local disks to manage and no complex sharding to configure.
If you want to try LangSmith itself, you can sign up at smith.langchain.com. The platform is framework-agnostic, so you do not need to be using LangChain or LangGraph to take advantage of it. It works with the OpenAI SDK, the Anthropic SDK, the Vercel AI SDK, LlamaIndex, and custom implementations via OpenTelemetry.
Summary
The broader trend here is interesting. As AI agents get more complex and longer-running, the infrastructure around them needs to evolve too. Traditional APM and observability tools were built for request/response workloads where a span lasts milliseconds. Agent observability is a fundamentally different problem with deeply nested traces, multi-modal content, and spans that can stay open for hours. The fact that LangChain felt the need to build an entirely new database to solve this shows how far agent workloads have drifted from what general-purpose data tools were designed for.
So, what the heck is SmithDB? It is the purpose-built data layer behind LangSmith, designed from the ground up to handle the unique challenges of agent observability at scale. Built in Rust on top of Apache DataFusion and Vortex, it uses an object-storage-backed LSM architecture that delivers sub-100ms trace loads and sub-second full-text search. It is not something you install on its own, but if you are using LangSmith, it is what makes everything fast. And given how much agent trace data is growing, having a purpose-built engine for it feels less like a luxury and more like a necessity.
Check out my other What the Heck is… articles at the links below:
- What The Heck Is DuckDB?
- What the Heck Is Malloy?
- What the Heck is PRQL?
- What the Heck is GlareDB?
- What the Heck is SeaTunnel?
- What the Heck is LanceDB?
- What the heck is SDF?
- What the Heck is Paimon?
- What the Heck is Proton?
- What the Heck is PuppyGraph?
- What the Heck is GPTScript?
- What the Heck is WarpStream?
- What the Heck is Apache Iggy?
Shawn Gordon | Sciencx (2026-05-27T07:06:37+00:00) What the Heck is SmithDB?. Retrieved from https://www.scien.cx/2026/05/27/what-the-heck-is-smithdb/