The landscape of local Large Language Models (LLMs) has just become significantly more accessible and versatile. Ollama, the platform for running LLMs locally, now boasts built-in compatibility with the OpenAI Chat Completions API. This exciting development opens up a world of possibilities, allowing you to seamlessly integrate Ollama’s powerful models with a wider array of tools and applications, all while leveraging the familiar OpenAI API framework through the Ollama server address.
Getting Started with Your Ollama Server Address
Before you can harness the power of Ollama’s OpenAI compatibility, a quick setup is required. If you haven’t already, begin by downloading Ollama for your operating system. Once installed, pulling a model is as simple as using the command line. Popular choices like Llama 2 or Mistral are readily available:
ollama pull llama2
This command downloads the Llama 2 model to your local machine, ready to be accessed via the Ollama server.
Utilizing the Ollama Server Address: Usage Examples
With Ollama and your chosen model set up, you can now interact with it using the OpenAI-compatible API. The key to this integration lies in directing your API requests to the Ollama server address.
cURL Access via Ollama Server Address
For direct API interaction, cURL provides a straightforward method. The syntax mirrors the standard OpenAI format, with the crucial difference being the hostname. Instead of api.openai.com, you’ll point your requests to http://localhost:11434, which is the default Ollama server address:
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ]
  }'
This command sends a simple chat request to the Llama 2 model running on your local Ollama server, demonstrating how easily you can communicate with your local LLM using the Ollama server address.
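Because the endpoint speaks plain HTTP, you are not limited to cURL or the official SDKs. As a minimal sketch (assuming the requests package is installed and the default Ollama server address), the same call can be made from a short Python script:

import requests

# Send the same chat request to the local Ollama server's OpenAI-compatible endpoint.
response = requests.post(
    "http://localhost:11434/v1/chat/completions",
    json={
        "model": "llama2",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
    },
    timeout=120,  # local generation can take a while on modest hardware
)
response.raise_for_status()

# The response body follows the OpenAI Chat Completions format.
print(response.json()["choices"][0]["message"]["content"])

Most applications, however, will use the OpenAI client libraries, as shown next.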
Python Integration via OpenAI Library and Ollama Server Address
For developers working in Python, the official OpenAI library offers a smooth integration path. By simply adjusting the base_url when initializing the OpenAI client, you can redirect API calls to your Ollama server address:
from openai import OpenAI
client = OpenAI(
    base_url='http://localhost:11434/v1',  # Ollama server address
    api_key='ollama',  # required, but unused
)

response = client.chat.completions.create(
    model="llama2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The LA Dodgers won in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
print(response.choices[0].message.content)
Here, the base_url parameter is set to http://localhost:11434/v1, explicitly defining the Ollama server address as the endpoint for OpenAI API requests. The api_key is a placeholder required by the OpenAI library but is not used by Ollama.
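The same client also supports streaming, which the Vercel AI SDK example later in this post relies on. As a brief sketch (same assumptions as above: the default Ollama server address and the llama2 model already pulled), pass stream=True and iterate over the returned chunks:

from openai import OpenAI

client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

stream = client.chat.completions.create(
    model="llama2",
    messages=[{"role": "user", "content": "Write a haiku about local LLMs."}],
    stream=True,  # receive tokens incrementally instead of a single final message
)

for chunk in stream:
    # Each chunk carries a delta with newly generated text (can be None on the final chunk).
    print(chunk.choices[0].delta.content or "", end="", flush=True)
print()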
JavaScript Integration via OpenAI Library and Ollama Server Address
Similarly, JavaScript developers can leverage the OpenAI JavaScript library. Configuration is analogous to the Python example, requiring you to specify the Ollama server address in the baseURL parameter:
import OpenAI from 'openai'
const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1', // Ollama server address
  apiKey: 'ollama', // required but unused
})

const completion = await openai.chat.completions.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(completion.choices[0].message.content)
Again, the baseURL: 'http://localhost:11434/v1' line is key, directing the OpenAI library to the Ollama server address for local LLM access.
Practical Applications: Examples Using Ollama Server Address
The OpenAI compatibility via the Ollama server address unlocks integration with various AI tools and frameworks. Let’s explore a couple of compelling examples:
Vercel AI SDK and Ollama Server Address
The Vercel AI SDK simplifies building conversational and streaming AI applications. Integrating Ollama is remarkably easy. Starting with the Vercel AI SDK example repo:
npx create-next-app --example https://github.com/vercel/ai/tree/main/examples/next-openai example
cd example
You then need to modify app/api/chat/route.ts to point the OpenAI client to your Ollama server address:
const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1', // Ollama server address
  apiKey: 'ollama',
});
and ensure the chat completion call uses your desired model:
const response = await openai.chat.completions.create({
  model: 'llama2',
  stream: true,
  messages,
});
After these minor adjustments, running npm run dev will launch the Vercel AI SDK example at http://localhost:3000, now powered by your local Ollama model through the specified Ollama server address.
Autogen and Ollama Server Address
Autogen, Microsoft’s powerful multi-agent framework, can also seamlessly utilize Ollama. For this example, we’ll employ the Code Llama model. First, pull the Code Llama model:
ollama pull codellama
Install the Autogen library:
pip install pyautogen
Create a Python script named example.py and configure Autogen to use the Ollama server address:
from autogen import AssistantAgent, UserProxyAgent
config_list = [
    {
        "model": "codellama",
        "base_url": "http://localhost:11434/v1",  # Ollama server address
        "api_key": "ollama",
    }
]
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding", "use_docker": False})
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")
Executing python example.py will run the Autogen script, leveraging Code Llama via the Ollama server address to generate code for plotting stock data.
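If a local model loops through too many automated turns, the UserProxyAgent constructor accepts options to bound the conversation. The snippet below is a drop-in variation of the user_proxy from example.py; the parameter names are standard Autogen options, but verify them against your installed pyautogen version:

from autogen import UserProxyAgent

# A drop-in replacement for the user_proxy defined in example.py above.
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",         # run fully automated, without prompting for human input
    max_consecutive_auto_reply=5,     # cap automated turns so a local model cannot loop indefinitely
    code_execution_config={"work_dir": "coding", "use_docker": False},
)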
The Future of Ollama Server Address and OpenAI Compatibility
This initial OpenAI API compatibility is just the beginning. Ollama’s roadmap includes further enhancements such as:
- Embeddings API support
- Function calling capabilities
- Vision support for multimodal models
- Access to log probabilities
Your feedback is invaluable in shaping the future of Ollama’s OpenAI compatibility. Thoughts and feature requests are welcome via GitHub issues. For deeper technical details, consult the OpenAI compatibility docs.
By providing a straightforward Ollama server address for OpenAI API access, Ollama significantly simplifies the process of using local LLMs in your existing AI workflows and projects. This advancement empowers developers and enthusiasts to harness the power of local AI with greater ease and flexibility.