Usage
localmodelproxy follows Unix-style command line conventions and is configured primarily through a YAML file.
localmodelproxy [OPTIONS]
By default, the config file is read from:
~/.localmodelproxy
Override it with --config or LOCALMODELPROXY_CONFIG.
Flags
| Flag | Argument | Notes |
|---|---|---|
--config | path | YAML config path. Overrides LOCALMODELPROXY_CONFIG and the default ~/.localmodelproxy. |
--log | path | Appends request and response payload logs to the specified file. |
--headless | Skips the interactive TUI and runs the proxy silently in the foreground. The process continues running until interrupted | |
--version | Prints version and exits. | |
--help | Prints help and exits. |
Environment Variables
| Variable | Purpose |
|---|---|
LOCALMODELPROXY_CONFIG | Config file path when --config is not provided. |
GOOGLE_APPLICATION_CREDENTIALS | Service account JSON file used by Google Application Default Credentials. |
GOOGLE_CLOUD_PROJECT | Optional Google Cloud project fallback for Google ADC backends. |
CLOUDSDK_CORE_PROJECT | Optional Google Cloud project fallback for Google ADC backends. |
GOOGLE_CLOUD_LOCATION | Optional Google Cloud location fallback for Google ADC backends. |
GOOGLE_CLOUD_REGION | Optional Google Cloud location fallback for Google ADC backends. |
CLOUDSDK_COMPUTE_REGION | Optional Google Cloud location fallback for Google ADC backends. |
NO_COLOR | Disables color in terminal output when set to any value. |
Environment variable expansion is supported in sensitive config fields by using ${NAME}.
Config Structure
server:
host: 127.0.0.1
port: 8080
ui:
recent_requests: 10
test:
enabled: true
system_message: "You are a helpful assistant."
user_message: "Reply with a short test message to confirm this connection is working."
backends:
- name: backend-name
type: openai_compatible
base_url: http://127.0.0.1:11434/v1
insecure_skip_verify: false
auth:
type: none
models:
- id: local-model-name
Server
| Field | Required | Default | Notes |
|---|---|---|---|
server.host | no | 127.0.0.1 | Must be loopback. |
server.port | no | 8080 | Local listening port. |
UI
| Field | Required | Default | Notes |
|---|---|---|---|
ui.recent_requests | no | 10 | Number of model request rows to show in the TUI recent list. Set to 0 to hide it. Maximum 100. |
ui.test.enabled | no | true | Enables the Test tab in the TUI for sending test requests to models. Set to false to disable. |
ui.test.system_message | no | "You are a helpful assistant." | System message sent with test requests. |
ui.test.user_message | no | "Reply with a short test message to confirm this connection is working." | User message sent with test requests. |
The app uses the TUI when stdout is an interactive terminal and --headless is not set. When stdout is not a terminal, it prints plain startup and summary lines. When --headless is provided, all UI output is suppressed and the proxy runs silently until interrupted.
TUI Navigation
When the Test tab is enabled (the default), the TUI has two views accessible via the Tab key:
- Stats – Displays per-model statistics and recent requests (the default view).
- Test – Lists all configured models. Use ↑/↓ to select a model and Enter to send a test request. The response is displayed inline.
Test requests go through the proxy like any other request, so they count towards stats and token usage.
Backends
Each backend declares where requests go, how they authenticate, and which models they serve.
| Field | Required | Notes |
|---|---|---|
name | yes | Unique backend name shown in diagnostics. |
type | yes | gcp_openai or openai_compatible. |
base_url | yes* | OpenAI-compatible base URL. Required for openai_compatible; optional for gcp_openai when project/location are provided. |
project | yes* | Google Cloud project for gcp_openai when base_url is omitted. |
location | yes* | Google Cloud location for gcp_openai when base_url is omitted. Defaults to global when project is configured. |
insecure_skip_verify | no | Disables TLS certificate verification for backend API calls. Defaults to false. |
auth | yes | Auth configuration. |
models | yes | all or a list of model entries. |
When insecure_skip_verify: true, the app prints a startup warning. This is intended only for local debugging and self-signed development endpoints.
Models
Backends can expose specific models:
models:
- id: local-model
upstream_id: provider/model-name
cost:
input_per_million: 0.30
output_per_million: 2.50
cache_per_million: 0.075
Or pass through all models:
models: all
Routing rules:
- Exact configured model IDs win.
models: allacts as a fallback backend.- If more than one backend uses
models: all, config order decides. - If no backend matches, the proxy returns an OpenAI-style 400
model_not_founderror. costis optional. When present, values are interpreted as USD per 1 million tokens and are shown per request, per model, and in total.
Authentication
None
Use this for local LLM servers that do not require auth.
auth:
type: none
Bearer Token
Use this when the upstream accepts a fixed bearer token.
auth:
type: bearer
token: ${UPSTREAM_API_TOKEN}
The proxy strips inbound Authorization and replaces it with the configured token.
Google Application Default Credentials
Use this for Google Cloud OpenAI-compatible endpoints.
auth:
type: google_adc
Authenticate locally with:
gcloud auth application-default login
Or use a service account:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
OAuth Client Credentials
Use this for internal or vendor endpoints that require an OAuth token exchange.
auth:
type: oauth_client_credentials
token_url: https://auth.example.internal/oauth/token
client_id: local-client
client_secret: ${LOCAL_CLIENT_SECRET}
scopes:
- models.invoke
insecure_skip_verify: false
| Field | Required | Notes |
|---|---|---|
token_url | yes | OAuth token endpoint. |
client_id | yes | OAuth client ID. |
client_secret | yes | OAuth client secret. Environment expansion is recommended. |
scopes | no | Optional OAuth scopes. |
insecure_skip_verify | no | Disables TLS verification only for token exchange. Defaults to false. |
When token exchange TLS verification is disabled, the app prints a startup warning.
Token Accounting
The TUI and logs track:
- uncached input tokens
- output tokens
- thinking tokens
- cached tokens
- total tokens
The proxy captures standard OpenAI-compatible usage fields such as:
prompt_tokenscompletion_tokenstotal_tokensprompt_tokens_details.cached_tokenscompletion_tokens_details.reasoning_tokens
It also captures Google-style usage metadata when present.
When cached tokens are reported, they are subtracted from the input token count before display and cost calculation so cached input is not double-counted.
Request/Response Log
Use --log PATH to append request payloads, response payloads, streaming response chunks, and pre-forward failures to a file while keeping the TUI active.
Request log entries are timestamped and include the model, backend, and full upstream token value. Store the log file accordingly.