Agent Health Monitoring

Monitor and troubleshoot CastellanAI agents across your infrastructure.

Understanding Agent Status

Each agent reports its status to the portal every 30 seconds via heartbeat. The portal displays real-time health information including connection status and event collection activity.

Status Indicators

🟢 Connected
⚫ Disconnected
🟡 Pending
🔴 Error
🔵 Updating

Connected (Green)

Agent is connected and actively sending security events. All systems operational.

Indicators:

Green status dot (pulsing animation)
Last seen within the last few minutes
Events being collected and transmitted

Healthy State

This is the expected state for all your agents. No action required.

Monitoring Agent Health

Portal Agents Page

The primary method for monitoring agent health is through the Customer Portal:

Log in to your Customer Portal
Click Agents in the header navigation
View the agent list with status indicators
Review: Hostname, Platform, Status, Last Seen, Events Today

Status Summary

The status summary cards at the top show counts for each status type (Connected, Disconnected, Pending, etc.)

Local Agent Logs

Check agent logs directly on the endpoint for detailed diagnostic information:

Windows
Linux
macOS

# Event Viewer
# Event Viewer → Applications and Services Logs → CastellanAgent

# Or check log files:
Get-Content "C:\ProgramData\CastellanAI\logs\agent.log" -Tail 50

# Using journalctl
sudo journalctl -u castellan-agent -n 50

# Or check log files:
sudo tail -f /var/log/castellan-agent/agent.log

# Using unified logging
sudo log show --predicate 'process == "CastellanAgent"' --last 1h

# Or check log files:
sudo tail -f /Library/Logs/CastellanAI/agent.log

Check Service Status

Windows
Linux
macOS

Get-Service "Castellan Agent"
# Should show Status: Running

sudo systemctl status castellan-agent
# Should show Active: active (running)

sudo launchctl list | grep com.castellanai.agent
# Should show PID if running

Common Issues & Solutions

🔌 Agent shows Disconnected but service is running

Check network connectivity to the CastellanAI servers. Verify firewall allows outbound HTTPS (port 443).

Windows
Linux/macOS

Test-NetConnection -ComputerName api.castellanai.com -Port 443

curl -v https://api.castellanai.com

🟡 Agent stuck in Pending status

The agent may not have completed enrollment successfully.

Check agent logs for enrollment errors
Verify the enrollment token is valid and not expired
Re-enroll if necessary:

# Check current status
castellan-agent status

# Re-enroll with fresh token
castellan-agent enroll --token YOUR_TOKEN --portal-url https://api.castellanai.com --force

🔐 Authentication errors in logs

Agent credentials may be invalid or corrupted.

# Check enrollment status
castellan-agent status

# Re-enroll with new token from Portal
castellan-agent enroll --token YOUR_NEW_TOKEN --portal-url https://api.castellanai.com --force

Then restart the service:

Windows
Linux

Restart-Service "Castellan Agent"

sudo systemctl restart castellan-agent

📊 Events not appearing in portal

Verify event sources are configured correctly and agent has permissions to access logs.

Check agent logs for "permission denied" errors
Verify agent service is running with appropriate privileges
Test event collection with a known security event (e.g., failed login attempt)

❌ Agent service won't start

Check for configuration file errors or missing dependencies.

Review agent logs for startup errors
Verify configuration file syntax is correct
Ensure all required dependencies are installed
Try running agent manually to see error messages:

# Run in foreground for debugging
castellan-agent run

Agent Health Best Practices

Proactive Monitoring

Follow these practices to maintain healthy agent deployments.

Practice	Description
Monitor Status Regularly	Check the Agents page periodically to catch issues early
Keep Agents Updated	Apply agent updates regularly for performance improvements and bug fixes
Review Logs Periodically	Check agent logs to catch warnings before they become critical
Ensure Network Access	Verify agents can reach CastellanAI servers through any firewalls or proxies
Document Configurations	Maintain records of agent deployments for troubleshooting

Agent CLI Commands

Use these commands directly on the endpoint for troubleshooting:

Command	Description
`castellan-agent status`	Show enrollment status and agent information
`castellan-agent version`	Show agent version, platform, and runtime
`castellan-agent run`	Start agent in foreground (for debugging)
`castellan-agent enroll --token TOKEN --portal-url URL`	Enroll or re-enroll the agent
`castellan-agent unenroll`	Remove enrollment and clear credentials

What's Next?

Guide	Description
Agent Troubleshooting	Deep dive into advanced agent troubleshooting
Agent Updates	Learn how to update agents and manage versions
Agent Configuration	Configure agent settings

Need Help?

If you're experiencing persistent agent health issues, contact support for assistance.

Contact Support

Contact Support for personalized assistance.

Agent Health Monitoring

Understanding Agent Status

Status Indicators

Connected (Green)

Disconnected (Gray)

Pending (Yellow)

Error (Red)

Updating (Blue)

Monitoring Agent Health

Portal Agents Page

Local Agent Logs

Check Service Status

Common Issues & Solutions

Agent Health Best Practices

Agent CLI Commands

What's Next?

Need Help?

Understanding Agent Status​

Status Indicators​

Connected (Green)​

Disconnected (Gray)​

Pending (Yellow)​

Error (Red)​

Updating (Blue)​

Monitoring Agent Health​

Portal Agents Page​

Local Agent Logs​

Check Service Status​

Common Issues & Solutions​

Agent Health Best Practices​

Agent CLI Commands​

What's Next?​

Need Help?​

Understanding Agent Status

Status Indicators

Connected (Green)

Disconnected (Gray)

Pending (Yellow)

Error (Red)

Updating (Blue)

Monitoring Agent Health

Portal Agents Page

Local Agent Logs

Check Service Status

Common Issues & Solutions

Agent Health Best Practices

Agent CLI Commands

What's Next?

Need Help?