Simple 2-step guide to use the 52 AI models in your IDE - takes 3 minutes.
Choose based on where you are:
If you're logged into the server via SSH and running VSCode/Cursor there:
Option 1: No configuration needed - Copilot auto-detects localhost:11434
Just use Copilot! It will find Ollama automatically.
Option 2: Explicit configuration (optional)
Create/edit .vscode/settings.json in your workspace:
{
"github.copilot.advanced": {
"ollamaUrl": "http://localhost:11434"
}
}
Reload VSCode, then use Copilot chat → Select models from dropdown.
Create/edit .vscode/settings.json in your workspace:
{
"cursor.overrideOpenAIBaseURL": "http://localhost:11434/v1",
"cursor.overrideOpenAIKey": "ollama",
"cursor.general.overrideOpenAIBaseURL": "http://localhost:11434/v1",
"cursor.general.overrideOpenAIKey": "ollama",
"cursor.chat.overrideModelString": "llama3.1:70b"
}
⚠️ Important for Cursor: You need BOTH the override settings above:
Change the model: Edit cursor.chat.overrideModelString to any model from the complete list below.
Restart Cursor, then use the chat - it will use your specified Ollama model (free!).
If you're on your laptop/desktop and connecting to the server:
# In a terminal on YOUR LAPTOP, run:
ssh -L 11434:localhost:11434 your-username@10.100.226.153
# Keep this terminal open while working!
Option 1: No configuration needed - Copilot auto-detects localhost:11434
Just open Copilot chat → Click model dropdown → Select "Ollama" → Pick a model!
Option 2: Explicit configuration (optional)
Create/edit .vscode/settings.json:
{
"github.copilot.advanced": {
"ollamaUrl": "http://localhost:11434"
}
}
Create/edit .vscode/settings.json in your workspace:
{
"cursor.overrideOpenAIBaseURL": "http://localhost:11434/v1",
"cursor.overrideOpenAIKey": "ollama",
"cursor.general.overrideOpenAIBaseURL": "http://localhost:11434/v1",
"cursor.general.overrideOpenAIKey": "ollama",
"cursor.chat.overrideModelString": "llama3.1:70b"
}
⚠️ Important for Cursor: You need BOTH the override settings above:
Change the model: Edit cursor.chat.overrideModelString to any model from the complete list below.
Restart Cursor, then use the chat - it will use your specified Ollama model (free!).
What is an SSH tunnel?
localhost on your laptopHow to keep it running:
# To run in background:
ssh -f -N -L 11434:localhost:11434 your-username@10.100.226.153
# To stop it:
pkill -f "ssh.*11434:localhost:11434"
Replace placeholders:
your-username → Your actual server username⚠️ You MUST have both override settings in .vscode/settings.json:
Without the override URL, Cursor will try to use cloud APIs (paid).Without the model string, it will try to use your API plan.
⚠️ Cursor Limitation: You can only use ONE model at a time in the override settings.
To switch models:
Method 1: Comment/Uncomment (Easiest)
{
"cursor.overrideOpenAIBaseURL": "http://localhost:11434/v1",
"cursor.overrideOpenAIKey": "ollama",
"cursor.general.overrideOpenAIBaseURL": "http://localhost:11434/v1",
"cursor.general.overrideOpenAIKey": "ollama",
// Only ONE can be active (uncommented):
"cursor.chat.overrideModelString": "llama3.1:70b", // Currently active
// "cursor.chat.overrideModelString": "deepcoder:14b", // Comment out
// "cursor.chat.overrideModelString": "llama3.1:8b", // Comment out
}
To switch: Comment out the current one, uncomment your desired model, restart Cursor.
Method 2: Edit the Value (Quick)
// Change from:
"cursor.chat.overrideModelString": "llama3.1:70b"
// To:
"cursor.chat.overrideModelString": "deepcoder:14b"
Then restart Cursor.
Pick any model from the complete list below.
Just select from the dropdown - no configuration needed:
For Cursor users: Set cursor.chat.overrideModelString to one of these models:
Best for coding:
deepcoder:14b ⭐ Best coding modeldevstral:24b Great for codeBest overall quality:
llama3.1:70b (default, excellent all-around)hf.co/unsloth/Llama-3.3-70B-Instruct-GGUF:latest (latest/best)Best for reasoning:
deepseek-r1:70b (shows thinking process)Fastest:
llama3.1:8b (quick responses)qwen2.5:32b (balanced speed/quality)All models below can be used in Cursor (set in cursor.chat.overrideModelString) or selected in Copilot:
llama3:8b - Fast, good qualityllama3:70b - High qualityllama3:latest - Latest Llama 3llama3.1:8b - Fast Llama 3.1llama3.1:70b - Excellent all-around ⭐llama3.2:1b - Tiny, very fastllama3.2:3b - Small, fastllama3.2-vision:90b - Vision/image understandingllama3.3:70b - Latest Llama 3.3llama4:latest - Latest Llama 4llama4:16x17b - Mixture of expertsllama4-12k:latest - Extended contexthf.co/unsloth/Llama-3.3-70B-Instruct-GGUF:latest - Best Llama ⭐deepseek-r1:8b - Fast reasoningdeepseek-r1:14b - Good reasoningdeepseek-r1:32b - Strong reasoningdeepseek-r1:32b-8k - 8k contextdeepseek-r1:70b - Best reasoning ⭐deepseek-r1:70b-8k - 8k contextdeepseek-r1:70b-12k - Extended contextdeepcoder:14b - Best for coding ⭐qwen2.5:3b - Very fastqwen2.5:14b - Good qualityqwen2.5:14b-instruct - Instruction tunedqwen2.5:14b-instruct-8k - 8k contextqwen2.5:14b-instruct-q4_K_M - Quantizedqwen2.5:32b - Balanced speed/quality ⭐qwen2.5vl:latest - Vision modelqwen2.5vl:32b - Vision model 32Bqwen3:8b - Latest Qwen gen 3qwen3:14b - Qwen 3 mediumqwen3:32b - Qwen 3 largeQwen3-30B-A3B-Thinking:latest - Thinking modelhf.co/unsloth/Qwen3-30B-A3B-Instruct-2507-GGUF:latest - Optimizedhf.co/unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:latest - Thinking optimizedmistral:latest - Latest Mistralmistral-small:24b - Mistral Smallmistral-small3.2:24b - Latest small versiondeepcoder:14b - Best coding model ⭐devstral:24b - Mistral for codedevstral:latest - Latest Devstralphi4-reasoning:14b - Reasoning modelphi4-reasoning:14b-12k - Extended contextphi4-mini-reasoning:latest - Mini reasoningllama3.2-vision:90b - Best visionqwen2.5vl:latest - Qwen visionqwen2.5vl:32b - Qwen vision 32Bllava-llama3:latest - LLaVA visionminicpm-v:latest - MiniCPM visiongpt-oss:20b - Open-source GPT-stylegpt-oss:120b - Large GPT-stylegranite4:micro - IBM Granite microgranite4:tiny-h - IBM Granite tinygemma3:latest - Google Gemma 3nomic-embed-text:latest - Embeddings (not for chat)unsloth.Q8_0.gguf:latest - Quantized modelTotal: 52 models available
To see this list anytime:
curl http://localhost:11434/api/tags | jq -r '.models[].name' | sort
Test the connection:
# While SSH tunnel is running, in a new terminal:
curl http://localhost:11434/api/tags
You should see a list of models. ✅
Test in your IDE:
Solution: Make sure SSH tunnel is running
# Check if tunnel is running:
ps aux | grep "ssh.*11434"
# If not, start it:
ssh -L 11434:localhost:11434 your-username@10.100.226.153
Solution 1: Restart your IDE (VSCode/Cursor)
Solution 2: Check SSH tunnel is running (see above)
Solution 3: Verify connection
curl http://localhost:11434/api/tags
Solution:
Ctrl+Shift+P → "Reload Window"localhost:11434Summary:
No server configuration needed!No local installation needed!Just connect and use!
That's all you need! You're now using powerful AI models in your IDE for free. 🎉
Server: 10.100.226.153:11434 | Access Method: SSH tunnel | Cost: $0