Sift: Finding Gold in LLM Agent Loop Outputs


When you're running LLM pipelines to tackle challenging problems (in my case, trying to generate beautiful pixel art from natural-language input), the sheer volume of output can be overwhelming. Wading through raw, programmatically generated text is time-consuming and mentally taxing. That's why I built Sift: a tool designed specifically for rapidly reviewing and organizing ideas generated by LLMs.

The Difficulty of Reviewing LLM Agent Loop Generations

As AI capabilities expand, we're increasingly able to create complex systems where multiple LLM agents collaborate to solve problems. These agent loops can generate hundreds or even thousands of ideas, approaches, and code snippets. This wealth of generated content creates a new challenge: how do we efficiently process and identify the truly promising ideas?

In my recent work trying to get LLMs to design algorithms for drawing pixel art, I found myself drowning in agent outputs. Each cycle would generate thoughts, critiques, and solutions. Reviewing these outputs in raw format was becoming a bottleneck in my workflow.

Enter Sift

Sift emerged from this need for a faster, more intuitive way to process LLM-generated content. It's built around a simple metaphor: separating gold from dirt. As you review each piece of content, you can quickly categorize it as either a promising idea worth pursuing (gold) or something to set aside (dirt).

The interface is intentionally minimal, focusing on three key features:

  • A visual grid showing your progress and categorizations at a glance
  • Clean Markdown rendering of content, making it easy to read structured outputs
  • Keyboard-driven workflow for rapid categorization without any mouse movement or clicks

Workflow Integration

Sift is as much about maintaining momentum in your creative process as it is about filtering. Once you've identified the promising ideas, you can export them as a curated set, perfect for feeding into the next stage of your workflow. For instance, you might take these refined outputs and use them as a starting point for a more powerful reasoning model, as sketched below.
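
As a rough sketch, assuming the kept items are exported as a JSON array (the filename, structure, and "content" field below are illustrative assumptions, not a documented format), that hand-off might look like this:

    import json

    # Load the curated "gold" set exported from Sift. The filename and
    # field names here are assumptions for illustration.
    with open("gold.json") as f:
        gold_ideas = json.load(f)

    # Concatenate the keepers into a single prompt for a stronger
    # reasoning model to synthesize or extend.
    prompt = "Here are the most promising ideas from a brainstorming run:\n\n"
    prompt += "\n\n---\n\n".join(item["content"] for item in gold_ideas)
    prompt += "\n\nCombine the strongest elements into one refined approach."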

The tool accepts JSON input, making it easy to integrate with your existing LLM agent infrastructure. Each piece of content can include Markdown formatting, which Sift automatically renders for easy reading. This is particularly valuable when dealing with code snippets, algorithms, or structured data generated by AI agents.
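
For illustration, here's a minimal sketch of preparing agent outputs for review; the field names ("id" and "content") are my assumptions, not Sift's documented schema, so check the tool's actual input format:

    import json

    # Hypothetical results from an agent loop; in practice these come
    # from your own pipeline. The schema here is assumed for illustration.
    agent_outputs = [
        {"id": 1, "content": "## Approach A\nFlood-fill each region, then..."},
        {"id": 2, "content": "## Approach B\nTrace outlines with Bresenham lines..."},
    ]

    # Write the batch to a JSON file to load into Sift for review.
    with open("outputs.json", "w") as f:
        json.dump(agent_outputs, f, indent=2)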

What About The Dirt?

Any outputs you don't find valuable can be marked as "dirt." While these might not seem worth keeping, they can still help you refine your agent ecosystem and the prompts that drive it. For instance, if you notice a pattern across your dirt, such as the same undesired tone showing up again and again, you can use that signal to adjust your prompt and correct the behavior in future generations.
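
One way to mine the dirt, sketched below under the assumption that rejected items can be exported as JSON just like the gold set (an assumption about the tool, not a documented feature), is to tally recurring words across rejections before editing your prompt:

    import json
    from collections import Counter

    # Load the rejected items; the filename and "content" field are
    # illustrative assumptions about the export format.
    with open("dirt.json") as f:
        dirt = json.load(f)

    # Count word frequencies across rejected outputs to surface
    # recurring failure modes, such as a tone you keep rejecting.
    words = Counter()
    for item in dirt:
        words.update(item["content"].lower().split())

    print(words.most_common(20))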

Towards the Future

As LLM agent loops become more sophisticated and generate increasingly complex outputs, tools like Sift will become essential parts of our AI workflows. The ability to quickly process and refine large sets of generated content allows us to leverage AI capabilities more effectively, focusing our human insight on the most promising ideas.

I built Sift to solve a specific pain point in my own workflow, but I'm excited to see how others might use it in their own AI-augmented creative processes. Whether you're working with multiple agents to generate ideas, reviewing AI-generated code, or simply organizing your thoughts, Sift provides a clean, efficient interface for finding the gold in your content.

To your next LLM-powered breakthrough,
James