Expose and Secure Your Self-Hosted Ollama API

Ollama is a locally deployed AI model runner that lets you download and execute large language models (LLMs) on your own machine. By combining Ollama with ngrok, you can give your local LLM an endpoint on the internet, enabling remote access and integration with other applications.

Putting your entire Ollama API on the public internet, however, exposes your LLM to potential abuse. Instead, you can use Traffic Policy to add a layer of authentication that restricts access to only yourself or trusted colleagues.

1. Reserve a domain

Navigate to the Domains section of the ngrok dashboard and click New + to reserve a free static domain like https://your-ollama-llm.ngrok.app or a custom domain you already own.

We'll refer to this domain as $NGROK_DOMAIN from here on out.

2. Create a Traffic Policy file

On the system where Ollama runs, create a file named ollama.yaml and paste in the following policy:

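Here's a minimal sketch, assuming the current Traffic Policy syntax and its add-headers action:

```yaml
# ollama.yaml
on_http_request:
  # Rewrite the Host header so Ollama accepts forwarded requests.
  - actions:
      - type: add-headers
        config:
          headers:
            host: localhost
```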

What's happening here? This policy rewrites the Host header of every HTTP request to localhost so that Ollama accepts the requests.

3. Start your Ollama endpoint

On the same system where Ollama runs, start the ngrok agent pointing at port 11434, which is Ollama's default port, and reference the ollama.yaml file you just created. Be sure to also replace $NGROK_DOMAIN with the domain you reserved earlier.

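A sketch of the command, assuming a recent v3 agent that supports the --url and --traffic-policy-file flags (older agents use --domain instead of --url):

```bash
ngrok http 11434 \
  --url $NGROK_DOMAIN \
  --traffic-policy-file ollama.yaml
```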

4. Try out your Ollama endpoint

You can use curl in your terminal to send a prompt to your LLM through your $NGROK_DOMAIN, replacing $MODEL with the Ollama model you pulled.

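A sketch using Ollama's /api/generate endpoint, assuming NGROK_DOMAIN and MODEL are exported as shell variables (the prompt text is just an example):

```bash
curl "https://$NGROK_DOMAIN/api/generate" \
  -d "{\"model\": \"$MODEL\", \"prompt\": \"Why is the sky blue?\"}"
```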

Optional: Protect your Ollama instance with Basic Auth

You may not want everyone to be able to access your LLM. ngrok can quickly add authentication to your LLM without any changes to your Ollama configuration.

Edit your ollama.yaml file and add the policy below.

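A sketch of the updated file, assuming the basic-auth Traffic Policy action with a credentials list; the user:password1 pair is a placeholder you should replace with your own:

```yaml
# ollama.yaml
on_http_request:
  # Reject requests that lack valid Basic Auth credentials.
  - actions:
      - type: basic-auth
        config:
          credentials:
            - user:password1
  # Rewrite the Host header so Ollama accepts forwarded requests.
  - actions:
      - type: add-headers
        config:
          headers:
            host: localhost
```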

What's happening here? This policy first checks whether the incoming HTTP request carries an Authorization: Basic header whose value is the base64 encoding of one of the username:password pairs you specified in ollama.yaml. Only requests with valid Basic Auth credentials are passed through to your ngrok agent and forwarded to your Ollama API.

Restart your ngrok agent to apply the new policy.

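Stop the running agent (Ctrl+C), then start it again with the same command as before (same flag assumptions as above):

```bash
ngrok http 11434 \
  --url $NGROK_DOMAIN \
  --traffic-policy-file ollama.yaml
```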

You can test your policy by sending the same LLM prompt to Ollama's API with the Authorization: Basic header, once again replacing $NGROK_DOMAIN and $MODEL.

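One way to do this is curl's -u flag, which base64-encodes the credentials and sets the Authorization: Basic header for you; user:password1 here stands in for one of the pairs you configured:

```bash
curl "https://$NGROK_DOMAIN/api/generate" \
  -u "user:password1" \
  -d "{\"model\": \"$MODEL\", \"prompt\": \"Why is the sky blue?\"}"
```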

If you send the same request without the Authorization header, you should receive a 401 Unauthorized response.
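For example, repeating the request without credentials and with curl's -i flag to print the response status should show the rejection:

```bash
curl -i "https://$NGROK_DOMAIN/api/generate" \
  -d "{\"model\": \"$MODEL\", \"prompt\": \"Why is the sky blue?\"}"
# The response should begin with an HTTP 401 Unauthorized status line.
```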

Your personal LLM is now locked down to only accept authenticated users.

What's next?