Running Competitive Agent Loops Locally with Deepseek-R1
Today I started running competitive agent loops using Deepseek-R1 32B locally on an M1 Mac with 32GB of RAM. The experiment is a simple one: two agents with different expertise compete to design the best interface while drawing inspiration from each other's approaches.
There is now a follow-up to this post where I explore divergent thinking in local Deepseek-R1 32B.
How to Get Started
If you want to just go play on your own, I'll start with a quick guide to get you up and running. The rest of this post will cover the why and what I'm trying to accomplish with this experiment.
We'll be using Ollama to run Deepseek-R1 32B locally.
Quick Setup
- Download Ollama from the site, or install it with one command:
curl -fsSL https://ollama.com/install.sh | sh
- Start the Ollama server with Deepseek-R1 32B (you can run a smaller model using 14b or 7b instead of 32b if you want to use less memory):
ollama run deepseek-r1:32b
- Copy/clone the script:
git clone git@github.com:jmatsumura/scripts.git
- Run the script (in the deepseek-playground/ directory if you cloned the whole repo):
python rivals.py --rounds=10
The script will then log each round as it runs and save the output to a JSON file in the directory where it was executed.
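If you're curious what a single call to the locally running model looks like under the hood, here's a minimal sketch against Ollama's REST API (the endpoint and payload come from Ollama's docs; rivals.py may structure its calls differently):

```python
# Minimal sketch of one call to a locally running Deepseek-R1 via Ollama.
# Assumes the Ollama server is up on its default port (11434).
import json
import urllib.request

def ask_model(prompt: str, model: str = "deepseek-r1:32b") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # return the full response as one JSON object
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask_model("In one sentence, what makes a great interface?"))
```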
The Power of Competitive Collaboration
A colleague of mine told me about this technique for having agents compete, and I've been curious ever since, but until now I didn't have a way to experiment with it easily, or rather cheaply. Traditional approaches to AI-assisted design often rely on a single agent working in isolation. But what if you pit multiple agents with opposing philosophies against each other (say, a minimalist and an expressive UI designer) and let them compete while each can see what moves the other is making?
I've been thinking about how to get LLMs to be more than just a super thesaurus for your thoughts. Competitive agent loops might be one way to make the thinking more dynamic and more distant from your initial input. That won't always be a good thing, but lately I've found myself wanting much more contrarian takes on my work, since I'm learning a lot of things I simply don't know much about, like game development. It would be amazing if agents could be another mechanism to help me tease out the unknown unknowns.
Beyond Mixture of Experts
While this might sound similar to a mixture of experts approach, the key difference is competition. In traditional mixture of experts, different specialized models work together harmoniously, each handling their specific domain.
In competitive agent loops, the idea is to intentionally create tension. Each agent not only brings their expertise but also their convictions about what makes them superior to others. They're not just solving their part of the problem, they're competing to prove their philosophy produces better results than their competition.
This approach resonates with me personally because I love competition. Whether it's gaming, sports, or problem-solving, I've always found that a bit of healthy rivalry pushes me to think more about my approach. I'm curious to see if this competitive dynamic can create the same effect in AI agents, driving them to not just solve problems, but to constantly one-up each other while learning from their rival's tactics.
Key Differences
Mixture of Experts
- Cooperative specialization
- Clear domain boundaries
- Harmonious integration
- Predictable roles
Competitive Agents
- Philosophical rivalry
- Overlapping challenges
- Creative tension
- Dynamic evolution
Finally Free to Experiment
One of the biggest challenges I've faced with agent experiments has been the cost of API calls. When you're paying per token, you naturally limit how many iterations you run. But now... Going for a run? Start a 10-round competition. Heading to sleep? Let them battle for 100.
What This Enables
- Run experiments while you sleep or work on other things
- Test numerous agent personalities
- Try longer competitive chains / iterations
- Compare results from multiple parallel competitions
- Experiment with temperature and other parameters freely
- Don't spend tons of time over-engineering around cost savings
This freedom to experiment has me excited to just play. Instead of carefully planning each interaction to minimize tokens, I can focus on discovering which competitive dynamics produce the most valuable results for a given task.
Understanding the Agent Loop
The competitive agent loop I designed for today follows a pattern where two UX designers with opposing philosophies tackle the same design challenge (a sketch of how personas like these could be encoded follows the two lists):
Minimalist Designer
- Believes in ruthless simplicity
- Uses whitespace as a weapon
- Eliminates unnecessary elements
- Learns from expressive insights
Expressive Designer
- Creates delightful experiences
- Focuses on rich interactions
- Prioritizes user engagement
- Learns from minimalist clarity
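To make the lists above concrete, here's a rough sketch of how these personas could be encoded as system prompts. The names and wording are my own illustration, not necessarily what rivals.py ships with:

```python
# Hypothetical persona definitions; rivals.py's actual prompts may differ.
AGENTS = {
    "minimalist": (
        "You are a minimalist UX designer. You believe in ruthless simplicity, "
        "use whitespace as a weapon, and eliminate every unnecessary element. "
        "You want to prove minimalism beats expressive design."
    ),
    "expressive": (
        "You are an expressive UX designer. You create delightful experiences, "
        "focus on rich interactions, and prioritize user engagement. "
        "You want to prove expressive design beats minimalism."
    ),
}
```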
The Script in Action
The script creates a competitive loop, sketched after this list, where each agent:
- Maintains their core design philosophy
- Reviews their competitor's previous thoughts (not their solutions)
- Extracts valuable insights while staying true to their approach
- Iterates on their design incorporating these learnings
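Put together, the loop might look something like this simplified sketch. It assumes the hypothetical ask_model helper from the setup section and the AGENTS personas above; extract_thoughts and strip_thoughts are sketched in the next section. The real rivals.py also logs each round and saves results to JSON:

```python
# Simplified sketch of the competitive loop; not the actual rivals.py code.
def run_competition(task: str, rounds: int = 10) -> dict:
    thoughts = {name: "" for name in AGENTS}  # each agent's latest reasoning
    designs = {name: "" for name in AGENTS}   # each agent's latest design
    for _ in range(rounds):
        for name, persona in AGENTS.items():
            rival = next(n for n in AGENTS if n != name)
            prompt = (
                f"{persona}\n\nDesign task: {task}\n\n"
                f"Your previous design:\n{designs[name]}\n\n"
                # The key twist: agents see the rival's *thoughts*, not their design.
                f"Your rival's latest thinking:\n{thoughts[rival]}\n\n"
                "Extract anything useful from their thinking, stay true to "
                "your philosophy, and produce an improved design."
            )
            reply = ask_model(prompt)
            thoughts[name] = extract_thoughts(reply)
            designs[name] = strip_thoughts(reply)
    return designs
```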
Key Components
- Agent definitions with distinct personalities
- Design task specification
- Thought extraction and evolution tracking
- Multi-round competitive iteration
The logic around each agent being able to see the other's thoughts is what I'm testing first. I want to see if this reasoning model can help drive more inspired solutions over time by anchoring off thought patterns rather than a deliverable like a technical specification or JSON data.
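This is where Deepseek-R1 is a convenient choice: the model wraps its chain-of-thought in <think>...</think> tags in the generated text, so separating the reasoning from the deliverable is a simple matter of parsing. A minimal sketch of the two helpers referenced above:

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def extract_thoughts(reply: str) -> str:
    """Pull the <think>...</think> reasoning out of a Deepseek-R1 reply."""
    match = THINK_RE.search(reply)
    return match.group(1).strip() if match else ""

def strip_thoughts(reply: str) -> str:
    """Return the reply with the reasoning removed (i.e. the design itself)."""
    return THINK_RE.sub("", reply).strip()
```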
Running Your Own Experiments
The script is a super light entrypoint to get you started, and the hope is that it's easy to modify for your own experiments. Here are some variables you might want to try messing with (many of these sit right at the top of the script or could easily be elevated to command line arguments; a sketch follows the list):
- Agent personalities and expertise
- Design challenge complexity
- Number of competitive rounds
- Incentives for the agents to want to win
- Interaction patterns between agents
- The model you're using
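As an example, a handful of constants at the top of the script could expose most of these. The names here are hypothetical; check the top of rivals.py for the real ones:

```python
# Hypothetical knobs for the experiment; rivals.py's actual names may differ.
MODEL = "deepseek-r1:32b"  # swap in deepseek-r1:14b or :7b for less memory
ROUNDS = 10                # also exposed as --rounds on the command line
TEMPERATURE = 0.8          # passed through Ollama's "options" payload
TASK = "Design a settings screen for a mobile note-taking app."
WINNER_INCENTIVE = "The winning design ships; the losing design is discarded."
```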
I'm planning to keep testing it out, tweaking the variables mentioned above, analyzing the results, and posting follow-ups here. Maybe it'll just point me right back to OpenAI/Anthropic and my token-conscious agent loops, but I'm eager to see how much I can leverage a fully local setup with Deepseek. The initial results have proven it worth continuing to research: each agent takes inspiration from the other's best ideas while upholding its own core beliefs.