Research infrastructure for the age of autonomous research.
ML research is bottlenecked by the parts no one wants to do. Provisioning the right hardware. Wrangling broken open-source code. Tracking which experiment produced which artifact. Reproducing a result from three weeks ago. Forking a training run to try a small change.
These chores eat the calendars of the world's best researchers. They are also exactly the kind of work that LLMs should be doing. Except today's infrastructure was built for humans clicking through dashboards, not agents calling APIs.
We're building the substrate that changes that.
What we're building
An agent-native infrastructure layer.
Compute that agents can provision in seconds. First-class primitives for snapshotting, branching, and rolling back experiment state. Observability designed to be read by an LLM, not a person squinting at a dashboard.
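As a rough illustration of what snapshot, branch, and rollback primitives could look like to an agent, here is a minimal in-memory sketch. Every name in it (ExperimentStore, snapshot, branch, rollback, head_state) is hypothetical and chosen for this example; it is not the hiloop API, and a real system would persist state rather than hold it in a dict.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]  # None for the first snapshot on a lineage
    state: Dict[str, Any]


@dataclass
class ExperimentStore:
    """Hypothetical in-memory store; names are illustrative assumptions."""
    snapshots: Dict[int, Snapshot] = field(default_factory=dict)
    branches: Dict[str, int] = field(default_factory=dict)  # branch -> head snapshot id
    next_id: int = 0

    def snapshot(self, branch: str, state: Dict[str, Any]) -> int:
        """Record an immutable copy of `state` as the new head of `branch`."""
        parent = self.branches.get(branch)
        snap = Snapshot(self.next_id, parent, copy.deepcopy(state))
        self.snapshots[snap.snapshot_id] = snap
        self.branches[branch] = snap.snapshot_id
        self.next_id += 1
        return snap.snapshot_id

    def branch(self, source: str, new_branch: str) -> None:
        """Fork: the new branch starts at the source branch's current head."""
        self.branches[new_branch] = self.branches[source]

    def rollback(self, branch: str) -> Dict[str, Any]:
        """Move the branch head back one snapshot and return that state."""
        head = self.snapshots[self.branches[branch]]
        if head.parent_id is None:
            raise ValueError(f"branch {branch!r} has no earlier snapshot")
        self.branches[branch] = head.parent_id
        return copy.deepcopy(self.snapshots[head.parent_id].state)

    def head_state(self, branch: str) -> Dict[str, Any]:
        """Return a copy of the state at the branch's current head."""
        return copy.deepcopy(self.snapshots[self.branches[branch]].state)


# Usage: snapshot twice, fork to try a small change, then roll main back.
store = ExperimentStore()
store.snapshot("main", {"lr": 1e-3, "step": 0})
store.snapshot("main", {"lr": 1e-3, "step": 100})
store.branch("main", "lower-lr")
store.snapshot("lower-lr", {"lr": 1e-4, "step": 200})
restored = store.rollback("main")
```

Rolling back "main" leaves "lower-lr" untouched, because each branch is only a pointer into a shared, append-only set of snapshots.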
Harness-agnostic by design.
We're not building a harness. We're building the layer underneath. The same primitives serve a homegrown evaluation script, an off-the-shelf RL loop, or anything else that needs to provision compute, fork state, and trace what happened. We co-develop with multiple harnesses to keep the abstractions honest, because the point is for the infrastructure to make any harness better at running ML experiments.
A knowledge base of experiments.
Every run produces a structured trace: code, data lineage, GPU logs, results, the agent's reasoning. Over time these traces become a searchable, verifiable record of what has been tried, and the raw material for the next generation of research agents.
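To make "structured trace" concrete, here is one possible shape for a single run's record, sketched under assumptions. The RunTrace class, its field names, and the sample values are all hypothetical placeholders invented for this example; they do not describe an actual hiloop schema or real results.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class RunTrace:
    """Hypothetical trace record covering the pieces named above."""
    run_id: str
    code_ref: str                # e.g. a commit hash (assumed representation)
    data_lineage: List[str]      # identifiers of the datasets the run consumed
    gpu_log_path: str            # where the raw GPU logs are stored
    results: Dict[str, float]    # metric name -> value
    agent_reasoning: str         # the agent's stated rationale for the run


def to_json_line(trace: RunTrace) -> str:
    """Serialize one trace as a single JSON line, suitable for an append-only log."""
    return json.dumps(asdict(trace), sort_keys=True)


def search_reasoning(traces: List[RunTrace], keyword: str) -> List[str]:
    """Return run ids whose reasoning mentions `keyword` (case-insensitive)."""
    needle = keyword.lower()
    return [t.run_id for t in traces if needle in t.agent_reasoning.lower()]


# Placeholder values for illustration only.
traces = [
    RunTrace(
        run_id="run-001",
        code_ref="abc1234",
        data_lineage=["dataset-v1"],
        gpu_log_path="logs/run-001/gpu.log",
        results={"val_loss": 0.42},
        agent_reasoning="Baseline run to establish a reference loss.",
    ),
    RunTrace(
        run_id="run-002",
        code_ref="abc1234",
        data_lineage=["dataset-v1"],
        gpu_log_path="logs/run-002/gpu.log",
        results={"val_loss": 0.40},
        agent_reasoning="Lowered the learning rate after the baseline loss plateaued.",
    ),
]

line = to_json_line(traces[0])
hits = search_reasoning(traces, "learning rate")
```

A keyword match over the reasoning field is only a stand-in for real search; the point is that results and the reasoning behind them live in the same record.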
The bet
The bottleneck on autonomous research isn't GPUs, and it isn't smarter agents. It's experiment state: trustworthy enough to build on, fast enough for an agent to iterate at speed, and observable enough that an agent can debug what its own system is doing.
Today every team builds this plumbing from scratch, badly. Experiment state lives in one place, artifacts in another, the LLM's chain of thought in a third, and nothing ties them together. We want the infrastructure to give you thorough observability for free, with experiment state and artifacts tied back to the agent reasoning and traces that produced them.
That graph of experiments, branches, failures, verified results, and the reasoning that yielded them is the durable asset. We're building it.
Who we are
Thomas Boser and Karan Brar. Between us: close to a decade of training models and building the infrastructure to serve them at scale, plus over a year of work on getting LLMs to do ML themselves. We met at Reducto.
We're building hiloop full-time, in San Francisco.
Get in touch at founders@hiloop.ai