---
title: "Open Interpreter + custom provider: cheap inference for a code-running agent"
description: Open Interpreter runs code on your machine and needs a model behind it. With one --api_base flag you can route through jusCode and drop inference cost 60-80% without changing how OI runs code.
tldr: Run `interpreter --api_base https://api.juscode.co/v1 --api_key <jcg_token> --model jusCode-auto`. OI's litellm layer passes the OpenAI-compatible request through; jusCode picks the cheapest capable model per turn. Local code execution, tool approvals, and conversation memory are unaffected.
date: 2026-05-27
author: jusCode
cluster: integration
tags: open-interpreter, litellm, openai-compatible, custom-base-url, cost-optimization
---

# Open Interpreter + custom provider: keep the agent, drop the bill

[Open Interpreter](https://openinterpreter.com) (OI) lets a model run code on your machine to actually accomplish a task: read files, install packages, query databases, plot data. The agent loop is short and tight: model emits code → OI runs it → OI feeds stdout/stderr back → model decides next step. That tight loop means a lot of model calls per task, which means a lot of money on a frontier model. Pointing OI at jusCode drops the per-call cost without changing the code-execution behavior at all.

## Why it works

OI's model layer is [litellm](https://github.com/BerriAI/litellm), which speaks OpenAI-compatible by default. Anything that accepts a base URL and bearer token works.

## Setup

### 1. Mint a jusCode API key

Sign in at [juscode.co/login](https://juscode.co/login). Open [juscode.co/developer](https://juscode.co/developer) → **Keys** tab → **Mint key**. Copy the `jcg_…` token.

### 2. Launch OI against jusCode

CLI flags (one-off):
```bash
interpreter \
  --api_base https://api.juscode.co/v1 \
  --api_key jcg_your_token_here \
  --model jusCode-auto
```

Or set them once in `~/.openinterpreter/config.yaml`:
```yaml
llm:
  api_base: https://api.juscode.co/v1
  api_key: jcg_your_token_here
  model: jusCode-auto
```

### 3. Verify

Run `interpreter` with no further args, then ask it `what python version is installed`. You'll see OI emit a one-liner, ask permission to run, execute, and report: same flow as before. The model behind it is now jusCode-auto.

## What changes vs default

| Aspect | Default OI (frontier model) | OI + jusCode |
|---|---|---|
| First-token latency | 400-800ms | 200-500ms (smaller models warm up faster) |
| Cost per "read file → decide" loop step | $0.01-0.05 | $0.002-0.01 |
| Cost per "write 200-line script" step | $0.04-0.10 | $0.02-0.05 |
| Code-execution behavior | unchanged | unchanged |
| Conversation memory | unchanged | unchanged |
| Tool approval flow | unchanged | unchanged |
| Local file access | unchanged | unchanged |

## What about safety mode / `--safe_mode`?

OI's safe mode (interactive approval before each code execution) is a CLIENT-side feature. The model behind it doesn't know whether you'll approve, deny, or modify the code. Switching to jusCode doesn't weaken safe mode: your approvals still gate every execution.

## What about offline / `--local`?

`--local` runs an Ollama or LM Studio model on your machine. Don't combine `--local` with `--api_base`; they're mutually exclusive. Use `--local` when you want privacy + zero per-call cost; use jusCode when you want frontier quality at routed-down cost.

## Multi-step tasks: where the savings compound

OI tasks tend to be long. "Clean this dataset and produce three charts" might be 30-50 model calls: read CSV, inspect schema, write cleaning script, run it, check output, write plotting script, run it, check output, iterate. Per-call cost matters more than per-token cost.

Sample task, "load `sales.csv`, find the top 5 products by revenue in 2025, write each to a separate JSON file":

| Model | Total calls | Total cost | Wall time |
|---|---|---|---|
| Claude Sonnet (direct) | 12 | ~$0.18 | 38s |
| GPT-4.1 (direct) | 11 | ~$0.14 | 41s |
| jusCode-auto | 12 | ~$0.03 | 35s |

The wall time barely moves because the bottleneck is local code execution, not the model. The cost moves a lot because every "tell me what the columns are" step now lands on a small fast model.

## When you'd stay on the direct provider

- **Custom system prompts that depend on provider-specific tool calling.** OI uses litellm's normalized tool-use, so this is rare.
- **Your org has an inference contract you need to bill against.** jusCode is a passthrough; underlying providers see jusCode's account.

## Switching back

Drop the `--api_base` / `--api_key` flags or comment them out in `config.yaml`. OI falls back to whatever was set in environment variables (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.).

## Further reading

- [Custom agent harness on an OpenAI-compatible base URL](/blog/custom-agent-harness-openai-compatible): the same pattern for any agent runtime
- [Aider + cheap inference](/blog/aider-cheap-inference): for non-code-executing pair-programming agents
- [jusCode API reference](/docs/api-reference): the endpoint OI's litellm layer hits
