This article provides a comprehensive guide to deploying AnythingLLM, an open-source AI-powered knowledge base tool, on Windows Server 2019 or 2022 without Docker, to serve as a website customer support chatbot. It covers two deployment methods, the AnythingLLMDesktop.exe desktop application and manual Node.js deployment, both of which enable web access and integration with free local AI models (e.g., Llama3 via Ollama). The guide addresses whether the two methods can coexist, how they differ, and the practical setup steps, tailored for users seeking a self-hosted, privacy-focused solution with no additional model costs.
Why AnythingLLM for Website Customer Support?
AnythingLLM is ideal for website customer support due to its ability to:
- Manage Knowledge Bases: Upload FAQs, product manuals (PDF, TXT, Markdown), and create Retrieval-Augmented Generation (RAG) workflows for accurate responses.
- Integrate Free AI Models: Use open-source models like Llama3 via Ollama, avoiding costs of commercial APIs (e.g., OpenAI).
- Provide Web Access: Offer a web interface (default port 3001) for management and an embeddable JavaScript chatbot widget for websites.
- Ensure Privacy: Run entirely locally, keeping data on your server.
The tool supports non-Docker deployment on Windows Server, making it accessible for users avoiding containerization due to compatibility or complexity issues.
Prerequisites
Before deployment, ensure your Windows Server (2019/2022) meets these requirements:
- Hardware: Minimum 8GB RAM, 4-core CPU, 10GB storage (for Llama3 model and data). An NVIDIA GPU (4GB+ VRAM) is optional for faster AI inference.
- Software: Windows Server 2019 or 2022, updated with the latest patches (check the version via `winver`).
- Network: Internet access for initial setup; firewall configured to allow ports 3001 (AnythingLLM) and 11434 (Ollama).
- Website: A static or dynamic website (e.g., WordPress) to embed the chatbot widget via HTML/JavaScript.
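Optionally, you can gather these specs in one PowerShell pass before starting. This is a minimal sketch using standard cmdlets, not an official requirement checker:

```powershell
# Quick prerequisite snapshot (run in an elevated PowerShell session).
$os    = Get-CimInstance Win32_OperatingSystem
$ramGB = [math]::Round($os.TotalVisibleMemorySize / 1MB, 1)   # value is reported in KB
$disk  = Get-PSDrive -Name C

Write-Host "OS:        $($os.Caption)"
Write-Host "RAM:       $ramGB GB (8GB minimum)"
Write-Host "Free on C: $([math]::Round($disk.Free / 1GB, 1)) GB (10GB minimum)"
```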
Deployment Methods
AnythingLLM can be deployed in two ways, both supporting web access and website integration. Below are the steps for each, followed by a comparison and guidance on running them simultaneously.
Method 1: AnythingLLMDesktop.exe (Desktop Application)
The AnythingLLMDesktop.exe is a pre-packaged Windows application that simplifies deployment by bundling Node.js, Electron, and dependencies into a single executable.
Installation Steps
- Download:
  - Visit AnythingLLM Desktop and download the latest `AnythingLLMDesktop.exe` (approx. 100-200MB).
- Install:
  - Double-click the .exe file and follow the installation wizard (default path: `C:\Program Files\AnythingLLM`).
  - If Windows Defender SmartScreen prompts, select “More info” > “Run anyway” (common for unsigned apps).
- Run:
  - Launch the application via the desktop icon or Start menu.
  - It automatically starts a web server on port 3001.
  - Access the web interface at `http://localhost:3001` (local) or `http://<server-ip>:3001` (remote).
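If the interface does not load, a quick way to confirm the web server is actually listening, using standard PowerShell networking cmdlets:

```powershell
# Confirm something is listening on port 3001 and see which process owns it.
Test-NetConnection -ComputerName localhost -Port 3001
Get-NetTCPConnection -LocalPort 3001 -State Listen |
    ForEach-Object { Get-Process -Id $_.OwningProcess }
```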
- Configure Storage:
  - Set the data storage path in the app settings (e.g., `C:\AnythingLLMData\desktop`).
- Set Up Ollama (Free AI Model):
  - Download Ollama for Windows from Ollama.
  - Run `ollama pull llama3` in PowerShell. This downloads the Llama3 8B model (~4-5GB) to `C:\Users\<YourUser>\.ollama\models`.
  - Verify with `ollama run llama3` and test with a prompt (an HTTP-level check is sketched below).
  - In the AnythingLLM web interface (Settings > LLM Provider), set the Ollama API to `http://localhost:11434`.
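Before wiring Ollama into AnythingLLM, you can exercise its REST API directly from PowerShell. A sketch against Ollama's documented `/api/tags` and `/api/generate` endpoints:

```powershell
# List installed models (the same endpoint AnythingLLM queries).
Invoke-RestMethod -Uri "http://localhost:11434/api/tags" |
    Select-Object -ExpandProperty models

# Send a one-off, non-streaming test prompt to Llama3.
$body = @{ model = "llama3"; prompt = "Say hello in one sentence."; stream = $false } |
    ConvertTo-Json
(Invoke-RestMethod -Uri "http://localhost:11434/api/generate" -Method Post `
    -ContentType "application/json" -Body $body).response
```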
- Configure Knowledge Base:
  - Create a workspace (e.g., “CustomerSupport”).
  - Upload FAQs or product documents (PDF, TXT, Markdown).
  - Test the chatbot in the web interface by asking questions like “How do I return a product?”
- Embed Chatbot in Website:
  - In the web interface, go to Settings > Embed Widget and generate the JavaScript snippet:

```html
<script src="http://<server-ip>:3001/embed/chat.js"></script>
<div id="anything-llm-chat" data-bot-id="your-bot-id"></div>
```

  - Add this to your website’s HTML (e.g., inside the `<body>` tag, or via a WordPress plugin like “Insert Headers and Footers”).
- Firewall Configuration:
  - Open ports 3001 (AnythingLLM) and 11434 (Ollama):

```powershell
New-NetFirewallRule -Name "AnythingLLM-Desktop" -DisplayName "Allow AnythingLLM Desktop" -Protocol TCP -LocalPort 3001 -Action Allow
New-NetFirewallRule -Name "Ollama-Desktop" -DisplayName "Allow Ollama Desktop" -Protocol TCP -LocalPort 11434 -Action Allow
```
Notes
- Advantages: Quick setup (<5 minutes), no need to install Node.js or Git, ideal for testing or small websites.
- Limitations: Less flexible for customization, higher memory usage (~1-2GB due to Electron), potential compatibility issues on virtualized Windows Server (e.g., Hyper-V, see GitHub Issue #752).
- Updates: Re-download the latest .exe or use the in-app update feature.
Method 2: Node.js Manual Deployment
Manual deployment involves cloning the AnythingLLM GitHub repository and running it with Node.js, offering greater control and stability for production environments.
Installation Steps
- Install Prerequisites:
  - Install Node.js (v18+ LTS), Git, and Yarn (`npm install -g yarn`).
- Clone Repository:
  - Create a directory (e.g., `D:\AnythingLLM\node`) and clone the source:

```powershell
mkdir D:\AnythingLLM\node
cd D:\AnythingLLM\node
git clone https://github.com/Mintplex-Labs/anything-llm.git
cd anything-llm
```
- Install Dependencies:
  - Run `yarn setup` to install frontend, server, and collector dependencies.
  - If errors occur, try `yarn cache clean` or switch to a mirror registry (useful behind restrictive networks): `yarn config set registry https://registry.npmmirror.com`.
- Configure Environment:
  - Copy the example environment file: `copy server\.env.example server\.env`.
  - Edit `server\.env` (e.g., with Notepad++):

```ini
STORAGE_DIR="D:\AnythingLLM\node\data"
PORT=3002                       # Avoid conflict with the .exe instance on 3001
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11435
NODE_ENV=production
```

  - Ensure the storage directory exists and has write permissions (a quick check follows below).
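For the last point, a small sketch that creates the directory from the `.env` example and probes write access:

```powershell
# Create the storage directory and verify the current account can write to it.
$storage = "D:\AnythingLLM\node\data"
New-Item -ItemType Directory -Path $storage -Force | Out-Null
try {
    $probe = Join-Path $storage ".write-test"
    Set-Content -Path $probe -Value "ok" -ErrorAction Stop
    Remove-Item $probe
    Write-Host "Storage directory is writable: $storage"
} catch {
    Write-Warning "Cannot write to $storage - check NTFS permissions for the service account."
}
```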
- Set Up Database (Prisma):
  - In the server directory (`cd server`), run:

```powershell
npx prisma generate --schema=./prisma/schema.prisma
npx prisma migrate deploy --schema=./prisma/schema.prisma
```

  - This sets up SQLite (the default) or MySQL if configured.
- Build Frontend:
  - In the frontend directory (`cd frontend`), run `yarn build`.
  - Copy the build output into the server’s public folder: `xcopy dist ..\server\public /s /e /i /y` (plain `copy` has no recursive `/s` switch, so `xcopy` is needed here).
- Run Services:
  - Server (in a Command Prompt, from the repository root): `cd server && set NODE_ENV=production && node index.js`
  - Collector (in a second Command Prompt): `cd collector && set NODE_ENV=production && node index.js`
  - Access the web interface at `http://<server-ip>:3002`.
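For ad-hoc testing you can launch both services from a single PowerShell session instead; a minimal sketch, assuming the repository path from the clone step (for production, prefer PM2 below):

```powershell
# Environment variables set here are inherited by the child node processes.
$root = "D:\AnythingLLM\node\anything-llm"
$env:NODE_ENV = "production"
Start-Process node -ArgumentList "index.js" -WorkingDirectory (Join-Path $root "server")
Start-Process node -ArgumentList "index.js" -WorkingDirectory (Join-Path $root "collector")
```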
- Set Up Ollama (Second Instance):
  - Copy the Ollama installation to a new directory (e.g., `C:\Program Files\Ollama2`).
  - Start it on a different port (Command Prompt syntax): `set OLLAMA_HOST=127.0.0.1:11435 && ollama.exe serve`
  - Pull the model against that instance: `set OLLAMA_HOST=127.0.0.1:11435 && ollama pull llama3`
  - In the AnythingLLM web interface, set the Ollama API to `http://localhost:11435`.
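The same steps collected into a PowerShell sketch, assuming the example directory (`OLLAMA_HOST` is Ollama's documented variable for both the server bind address and the CLI target):

```powershell
# Run the second Ollama instance on port 11435; the environment variable
# is inherited by both the server process and the pull command.
$env:OLLAMA_HOST = "127.0.0.1:11435"
Start-Process "C:\Program Files\Ollama2\ollama.exe" -ArgumentList "serve"
& "C:\Program Files\Ollama2\ollama.exe" pull llama3

# Confirm the instance answers on the new port.
Invoke-RestMethod -Uri "http://localhost:11435/api/tags"
```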
- Process Management (PM2):
  - Install PM2: `npm install -g pm2`.
  - Create `ecosystem.config.js` in the repository root (note: with `cwd` set, PM2 resolves each `script` path relative to that directory):

```javascript
module.exports = {
  apps: [
    {
      name: 'node-llm-server',
      script: 'index.js',   // relative to cwd, i.e. server/index.js
      cwd: 'server',
      env: { NODE_ENV: 'production' }
    },
    {
      name: 'node-llm-collector',
      script: 'index.js',   // relative to cwd, i.e. collector/index.js
      cwd: 'collector',
      env: { NODE_ENV: 'production' }
    }
  ]
};
```

  - Run: `pm2 start ecosystem.config.js && pm2 save`.
- Firewall Configuration:
  - Open ports 3002 (AnythingLLM) and 11435 (Ollama):

```powershell
New-NetFirewallRule -Name "AnythingLLM-Node" -DisplayName "Allow AnythingLLM Node" -Protocol TCP -LocalPort 3002 -Action Allow
New-NetFirewallRule -Name "Ollama-Node" -DisplayName "Allow Ollama Node" -Protocol TCP -LocalPort 11435 -Action Allow
```
- Embed Chatbot:
  - In the web interface at `http://<server-ip>:3002`, generate the widget code (Settings > Embed Widget) and embed it in your website’s HTML.
Notes
- Advantages: More stable, customizable (edit source code, adjust ports), ideal for production with high concurrency.
- Limitations: Requires technical setup (Node.js, Git, Yarn), longer initial configuration.
- Updates: Run `git pull origin main && yarn setup`, rebuild the frontend, and restart PM2; a consolidated script follows below.
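The whole update flow collected into one hypothetical PowerShell script; the paths and rebuild/copy steps match the installation section above:

```powershell
# Hypothetical update script - adjust the repository path to your layout.
Set-Location D:\AnythingLLM\node\anything-llm
git pull origin main
yarn setup

# Rebuild the frontend and refresh the served files.
Set-Location frontend
yarn build
Set-Location ..
xcopy frontend\dist server\public /s /e /i /y

# Restart the managed processes.
pm2 restart all
```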
Running Both Methods Simultaneously
You can run both .exe and Node.js deployments on the same Windows Server, provided you isolate their configurations:
- Ports:
  - Desktop: 3001 (AnythingLLM), 11434 (Ollama).
  - Node.js: 3002 (AnythingLLM), 11435 (Ollama).
- Storage:
  - Desktop: `C:\AnythingLLMData\desktop`.
  - Node.js: `D:\AnythingLLM\node\data`.
- Ollama Instances:
  - Run two Ollama instances on different ports (11434 and 11435).
  - Ensure sufficient RAM (16GB+ recommended) for two Llama3 models (~4-5GB each).
- Use Cases:
  - Use the .exe for testing new features or knowledge bases.
  - Use Node.js for production customer support with high traffic.
  - Example: one instance for an FAQ chatbot, another for technical support.
- Embedding:
  - Both generate similar JavaScript widgets, embeddable in different website sections:

```html
<!-- Desktop Chatbot -->
<script src="http://<server-ip>:3001/embed/chat.js"></script>
<div id="desktop-chat" data-bot-id="desktop-bot-id"></div>

<!-- Node.js Chatbot -->
<script src="http://<server-ip>:3002/embed/chat.js"></script>
<div id="node-chat" data-bot-id="node-bot-id"></div>
```
Comparison of Deployment Methods
| Aspect | AnythingLLMDesktop.exe | Node.js Manual Deployment |
|---|---|---|
| Ease of Setup | Simple: install and run (~5 minutes). | Complex: requires Node.js, Git, Yarn setup (~15-30 minutes). |
| Web Access | Auto-starts web server on 3001. | Manual start on a configurable port (e.g., 3002). |
| Customer Support Features | Identical: RAG, knowledge base, embeddable chatbot. | Identical: same features, no functional difference. |
| Resource Usage | Higher (~1-2GB RAM due to Electron). | Lower (~500MB-1GB, excluding the model). |
| Customization | Limited: GUI-based, no source access. | High: edit source, configure via .env. |
| Production Suitability | Best for testing/small sites. | Ideal for high-traffic production with PM2/Nginx. |
| Stability | Potential issues in virtualized Server environments. | More stable; WSL2 option for compatibility. |
Resource Requirements
- Minimum: 8GB RAM, 4-core CPU, 10GB storage (Llama3 8B).
- Recommended: 16GB RAM, NVIDIA GPU (4GB+ VRAM), 20GB storage for dual deployments.
- Firewall: Open ports 3001/3002 (AnythingLLM) and 11434/11435 (Ollama).
Troubleshooting
- Desktop.exe:
  - Won’t Start: Check logs in `%APPDATA%\AnythingLLM\logs`, run as administrator, or disable Defender temporarily.
  - VM Issues: Enable VT-x/AMD-V in the BIOS for virtualized servers.
- Node.js:
  - Dependency Errors: Clear the Yarn cache (`yarn cache clean`) or use a mirror registry.
  - Port Conflicts: Check with `netstat -ano | findstr "3001 3002 11434 11435"`.
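`netstat` only reports a PID; a PowerShell alternative that resolves the owning process by name (port 3002 used as the example):

```powershell
# Who is holding port 3002?
Get-NetTCPConnection -LocalPort 3002 -ErrorAction SilentlyContinue |
    Select-Object LocalPort, State, OwningProcess,
        @{ Name = "Process"; Expression = { (Get-Process -Id $_.OwningProcess).ProcessName } }
```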
- Ollama:
  - No Response: Verify with `curl http://localhost:11434/api/tags` or `curl http://localhost:11435/api/tags`.
  - Performance: Use Llama3 8B on low-RAM servers; with 16GB+ RAM, consider a larger model (note that Llama3 itself ships in 8B and 70B sizes; 13B-class models come from older families such as Llama 2).
- Chatbot Issues:
  - Widget Not Loading: Check the website console (F12) for JavaScript errors; verify the server IP/port.
  - Inaccurate Responses: Refine knowledge base documents or adjust the System Prompt in AnythingLLM.
Optimizing for Website Customer Support
- Knowledge Base:
  - Upload structured FAQs (e.g., “Returns.txt” in Q&A format; an illustrative file follows below).
  - Test RAG with common queries (e.g., “What’s your refund policy?”).
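In practice, “structured” means short, self-contained Q&A pairs, which chunk and retrieve well in RAG. An illustrative `Returns.txt` created via PowerShell (the answers are placeholders, not real policy text; the target folder is arbitrary since AnythingLLM ingests the file via upload):

```powershell
# Illustrative FAQ content only - replace with your actual policies.
@"
Q: What's your refund policy?
A: Unused items can be returned within 30 days of delivery for a full refund.

Q: How do I return a product?
A: Request a return label under Account > Orders, then ship the item back within 14 days.
"@ | Set-Content -Path "C:\AnythingLLMData\docs\Returns.txt" -Encoding UTF8
```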
- Chatbot Customization:
  - Edit the System Prompt (e.g., “You are a friendly customer support bot; respond concisely and professionally”).
  - Support multi-language FAQs for global users (Llama3 handles multiple languages).
- Production Setup:
  - Configure HTTPS with Let’s Encrypt and IIS/Nginx.
  - Use PM2 (Node.js) or Task Scheduler (.exe) for auto-start.
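For the .exe, a scheduled task is the simplest auto-start route. A sketch using the built-in ScheduledTasks cmdlets; the executable name and path are assumptions, so check your actual install location:

```powershell
# Start the desktop app when the admin logs on (an Electron GUI app
# needs an interactive session, so -AtLogOn rather than -AtStartup).
$action  = New-ScheduledTaskAction -Execute "C:\Program Files\AnythingLLM\AnythingLLM.exe"
$trigger = New-ScheduledTaskTrigger -AtLogOn
Register-ScheduledTask -TaskName "AnythingLLM-AutoStart" -Action $action -Trigger $trigger
```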
- Monitoring:
  - Check logs (.exe: `%APPDATA%\AnythingLLM\logs`; Node.js: PM2 logs via `pm2 logs`).
  - Monitor CPU/RAM usage in Task Manager.
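For a scriptable alternative to Task Manager, a snapshot sketch (the process names are assumptions; adjust to what actually runs on your server):

```powershell
# CPU time and memory for the processes involved in this setup.
Get-Process node, ollama, AnythingLLM -ErrorAction SilentlyContinue |
    Select-Object ProcessName, Id, CPU,
        @{ Name = "MemoryMB"; Expression = { [math]::Round($_.WorkingSet64 / 1MB) } }
```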
Can Both Methods Run Simultaneously?
Yes, but it requires careful isolation:
- Ports: Use 3001/11434 for .exe, 3002/11435 for Node.js.
- Storage: Separate paths to avoid data overwrite.
- Resources: Ensure 16GB+ RAM for two Llama3 instances.
- Use Case: Run .exe for testing, Node.js for production, or separate chatbots for different website sections.
Recommendations
- Small Websites/Testing: Use .exe for quick setup and minimal configuration.
- Production/High Traffic: Use Node.js with PM2 and WSL2 for stability and scalability.
- Simultaneous Use: Only if you need distinct instances (e.g., testing vs. production); otherwise, choose one to simplify management.
- Next Steps:
  - Confirm your server specs (RAM, CPU, GPU) and website scale to choose the best method.
  - For production, configure HTTPS and monitor performance.
  - If issues arise, share error logs or server details for targeted troubleshooting.
This guide ensures you can deploy AnythingLLM as a robust, cost-free customer support solution, leveraging its powerful AI capabilities while maintaining full control on your Windows Server.