Connectivity Issues
Diagnose and resolve agent communication problems, network connectivity issues, and firewall configuration errors.
Common Symptoms
Connectivity issues typically present with these symptoms:
| Symptom | Description |
|---|---|
| Agent Shows as "Offline" | Agent cannot establish connection to CastellanAI backend |
| Intermittent Connection Drops | Agent alternates between online and offline status |
| Events Not Appearing | Agent is connected but event data is not being transmitted |
| High Latency | Events take longer than 30 seconds to appear in dashboard |
Diagnostic Steps
Follow these steps systematically to identify the root cause:
Step 1: Check Agent Service Status
Verify the CastellanAI agent service is running on the affected host.
Windows:
Get-Service CastellanAgent | Select-Object Status, StartType
Linux:
sudo systemctl status castellan-agent
Step 2: Test Network Connectivity
Verify the agent can reach CastellanAI backend endpoints.
Test HTTPS connectivity (port 443):
Test-NetConnection -ComputerName api.castellanai.com -Port 443
Test WebSocket connectivity:
Test-NetConnection -ComputerName ws.castellanai.com -Port 443
Step 3: Review Agent Logs
Check agent logs for connection errors or authentication failures.
| Platform | Log Location |
|---|---|
| Windows | C:\ProgramData\CastellanAgent\logs\agent.log |
| Linux | /var/log/castellan-agent/agent.log |
Look for error patterns:
- Connection timeout errors
- Certificate validation failures
- Authentication token expired/invalid
- DNS resolution failures
Step 4: Verify Firewall Rules
Ensure firewall allows outbound HTTPS and WebSocket connections.
Required outbound rules:
| Destination | Protocol | Port |
|---|---|---|
| api.castellanai.com | HTTPS | 443 |
| ws.castellanai.com | WebSocket (WSS) | 443 |
| updates.castellanai.com | HTTPS | 443 |
Step 5: Check Proxy Configuration
If your organization uses a proxy, verify agent proxy settings.
Configure proxy in agent.yaml:
proxy:
enabled: true
server: proxy.company.com
port: 8080
authentication:
username: proxy-user
password: proxy-password
bypass:
- localhost
- 127.0.0.1
Common Issues and Solutions
SSL Certificate Validation Failures
Symptom: Agent logs show "SSL certificate validation failed" or "certificate chain cannot be verified"
Solutions:
- Ensure system date/time is accurate (certificate validity is time-sensitive)
- Install latest root certificates on the system
- For corporate environments with SSL inspection, import the corporate CA certificate
- Update agent to latest version (may include updated certificate bundle)
Authentication Token Expired
Symptom: Agent shows as offline with "401 Unauthorized" errors in logs
Solutions:
- Navigate to Agents page in dashboard
- Find the affected agent and click Regenerate Token
- Copy the new token and update agent configuration file
- Restart agent service:
Restart-Service CastellanAgent
Intermittent Connection Drops
Symptom: Agent frequently disconnects and reconnects every few minutes
Solutions:
- Check for network stability issues (packet loss, high latency)
- Adjust WebSocket keep-alive interval in agent.yaml:
keepalive_interval: 30 - Increase connection timeout:
connection_timeout: 60 - Check firewall for aggressive timeout settings on long-lived connections
DNS Resolution Failures
Symptom: Logs show "Unable to resolve hostname" or "DNS lookup failed"
Solutions:
- Test DNS resolution:
nslookup api.castellanai.com - Verify DNS servers are reachable and functioning
- Add DNS entries to hosts file as fallback:
C:\Windows\System32\drivers\etc\hosts - Configure custom DNS servers in agent configuration if needed
Automated Connectivity Test
CastellanAI provides a built-in connectivity test tool for comprehensive diagnostics:
Run the connectivity test:
castellan-agent test-connection
Expected output for healthy connection:
Connectivity Test Results:
✓ DNS Resolution: api.castellanai.com → 52.168.10.5
✓ HTTPS Connection: Port 443 (200 OK)
✓ WebSocket Connection: Port 443 (Connected)
✓ Authentication: Token valid
✓ Event Transmission: Test event sent successfully
Status: All checks passed
If any checks fail, the tool provides specific error messages and resolution suggestions.
Prevention Best Practices
-
Monitor Agent Health Proactively - Enable agent health notifications to receive alerts when agents go offline or experience connectivity issues.
-
Document Network Requirements - Provide firewall teams with complete CastellanAI network requirements during deployment planning.
-
Keep Agents Updated - Enable automatic agent updates or regularly update agents manually to receive connectivity improvements.
-
Test After Network Changes - Run connectivity tests after firewall rule changes, proxy updates, or network maintenance.
What's Next?
- Agent Health Monitoring - Learn how to monitor agent health status and configure health check alerts
- Agent Troubleshooting - Comprehensive troubleshooting guide for all agent-related issues