How to Build an AI Agent Under $10/Month Using DeepSeek + Gemini

How to Build an AI Agent Under $10/Month Using DeepSeek + Gemini

(Updated: ) 📖 2 min read

AI Agents are the defining technology of 2026. However, if your agent runs multiple loops of “thinking,” “tool use,” and “verifying” using flagship models (like Claude Opus or GPT-4o-Pro), a single task execution can easily cost $0.50 to $2.00.

If your agent runs hundreds of tasks daily, your API bill will skyrocket.

To solve this, we can design a multi-model agent architecture that combines two of the cheapest models on the market: DeepSeek-R1 (for planning and reasoning) and Google Gemini Flash-Lite (for fast, structured tool execution).

Here is the step-by-step guide to building this agent pipeline for under $10.00/month.

🧮 Estimate your agent costs: Use our AI API Pricing Calculator to project token charges based on your expected agent loop frequency.


The Concept: Multi-Model Orchestration

Instead of using one expensive model for the entire agent run, we split the responsibilities:

[User Request] 
       │
       ▼
1. DeepSeek-R1 (Reasoning / Planning) ──► Generates list of actions
       │
       ▼
2. Gemini Flash-Lite (Tool Execution)  ──► Runs python code, queries API
       │
       ▼
3. Gemini Flash-Lite (JSON Parser)     ──► Formats final output for user

The Cost Breakdown (Per 1,000 Runs)

  • DeepSeek-R1 Reasoning: 4,000 input tokens + 2,000 output tokens = $0.005 per execution.
  • Gemini Flash-Lite Execution: 2,000 input tokens + 500 output tokens = $0.0004 per execution.
  • Total Cost per Agent Run: $0.0054.
  • Cost for 1,500 Runs/Month: $8.10/month (Leaving you $1.90 for hosting!).

Step 1: Writing the Agent Coordinator in Python

We will write a simple python coordinator that uses DeepSeek to plan, and Gemini to parse and execute a mock weather retrieval tool.

First, install the required packages:

pip install google-genai openai

Here is the implementation:

import os
from openai import OpenAI
from google import genai
from google.genai import types

# 1. Initialize Clients
# DeepSeek API uses the standard OpenAI-compatible client library
deepseek_client = OpenAI(
    api_key=os.environ.get("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1"
)

gemini_client = genai.Client(
    api_key=os.environ.get("GEMINI_API_KEY")
)

# Mock database tool
def query_weather_api(city: str):
    # Standard database lookups or API calls go here
    return f"Weather in {city}: 72°F, Sunny."

def run_cheap_agent(user_prompt: str):
    print("🧠 Step 1: Offloading Planning to DeepSeek...")
    
    planning_prompt = f"""
    The user wants: '{user_prompt}'
    We have a tool available: query_weather_api(city).
    Reason step-by-step and write a plan.
    At the end, print the exact tool call as: TOOL_CALL: query_weather_api('city_name')
    """
    
    # We use deepseek-reasoner (DeepSeek-R1) for thinking
    plan_response = deepseek_client.chat.completions.create(
        model="deepseek-reasoner",
        messages=[{"role": "user", "content": planning_prompt}]
    )
    
    plan = plan_response.choices[0].message.content
    print(f"\n[DeepSeek Plan]:\n{plan}\n")
    
    # 2. Extract Tool Call using Gemini Flash-Lite
    print("🤖 Step 2: Parsing Tool Commands with Gemini Flash-Lite...")
    parser_prompt = f"Extract the tool call target from this text: '{plan}'"
    
    parse_response = gemini_client.models.generate_content(
        model='gemini-2.5-flash-lite',
        contents=parser_prompt,
        config=types.GenerateContentConfig(
            max_output_tokens=100
        )
    )
    
    parsed_command = parse_response.text.strip()
    print(f"[Gemini Output]: Tool Target is '{parsed_command}'")
    
    # 3. Tool Execution
    if "query_weather_api" in parsed_command:
        # Simple extraction for demo purposes
        city = parsed_command.split("'")[1]
        tool_result = query_weather_api(city)
        print(f"\n[Tool Result]: {tool_result}")
        return tool_result
        
    return "No tool executed."

if __name__ == "__main__":
    # Ensure keys are loaded in environment
    # run_cheap_agent("Check the weather for Seattle")
    pass

Step 2: Optimizing the Agent for $0 Hosting

To deploy your agent and keep your total monthly cost under $10.00:

  1. FastAPI Backend: Wrap the Python script in a FastAPI API and deploy it to Railway or Zeabur (using their starter tier for ~$5.00/month).
  2. Database Storage: Use Neon or Supabase free tiers to store agent history and system memory (PostgreSQL).
  3. Task Scheduler: Use GitHub Actions or CronJobs on the free tier to trigger periodic background agent tasks.

💡 Key Cost Optimization Rules for Agents

  1. Stop Flagship Chatter: Don’t let DeepSeek or Gemini generate long essays explaining their thought processes. Force concise planning using strict developer prompt templates.
  2. Enable Prompt Caching: Since agent system prompts are repetitive, structure your templates to reuse prefixes.
  3. Compress Agent History: Agents accumulate massive histories over multiple loops. Summarize older conversation loops to keep your context window thin.

Professor XAI
Professor XAI ML Engineer passionate about advancing AI technologies and building intelligent systems.
comments powered by Disqus