Monitoring
To ensure that your instance of Horizon is performing correctly, we encourage you to monitor it and provide both logs and metrics to do so.
Metrics
Metrics are collected while a Horizon process is running and they are exposed privately via the /metrics
path, accessible only through the Horizon admin port. You need to configure this via --admin-port
or ADMIN_PORT
, since it's disabled by default. If you're running such an instance locally, you can access this endpoint:
- Example
$ stellar-horizon --admin-port=4200 &
$ curl localhost:4200/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 1.665e-05
go_gc_duration_seconds{quantile="0.25"} 2.1889e-05
go_gc_duration_seconds{quantile="0.5"} 2.4062e-05
go_gc_duration_seconds{quantile="0.75"} 3.4226e-05
go_gc_duration_seconds{quantile="1"} 0.001294239
go_gc_duration_seconds_sum 0.002469679
go_gc_duration_seconds_count 25
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 23
and so on...
Logs
Horizon will output logs to standard out. Information about what requests are coming in will be reported, but more importantly, warnings or errors will also be emitted by default. A correctly running Horizon instance will not output any warning or error log entries.
Below we present a few standard log entries with associated fields. You can use them to build metrics and alerts. Please note that these represent Horizon app metrics only. You should also monitor your hardware metrics like CPU or RAM Utilization.
Starting HTTP request
Key | Value |
---|---|
msg | Starting request |
client_name | Value of X-Client-Name HTTP header representing client name |
client_version | Value of X-Client-Version HTTP header representing client version |
app_name | Value of X-App-Name HTTP header representing app name |
app_version | Value of X-App-Version HTTP header representing app version |
forwarded_ip | First value of X-Forwarded-For header |
host | Value of Host header |
ip | IP of a client sending HTTP request |
ip_port | IP and port of a client sending HTTP request |
method | HTTP method (GET , POST , ...) |
path | Full request path, including query string (ex. /transactions?order=desc ) |
streaming | Boolean, true if request is a streaming request |
referer | Value of Referer header |
req | Random value that uniquely identifies a request, attached to all logs within this HTTP request |
Finished HTTP request
Key | Value |
---|---|
msg | Finished request |
bytes | Number of response bytes sent |
client_name | Value of X-Client-Name HTTP header representing client name |
client_version | Value of X-Client-Version HTTP header representing client version |
app_name | Value of X-App-Name HTTP header representing app name |
app_version | Value of X-App-Version HTTP header representing app version |
duration | Duration of request in seconds |
forwarded_ip | First value of X-Forwarded-For header |
host | Value of Host header |
ip | IP of a client sending HTTP request |
ip_port | IP and port of a client sending HTTP request |
method | HTTP method (GET , POST , ...) |
path | Full request path, including query string (ex. /transactions?order=desc ) |
route | Route pattern without query string (ex. /accounts/{id} ) |
status | HTTP status code (ex. 200 ) |
streaming | Boolean, true if request is a streaming request |
referer | Value of Referer header |
req | Random value that uniquely identifies a request, attached to all logs within this HTTP request |
Metrics
Using the entries above you can build metrics that will help understand performance of a given Horizon node. For example:
- Number of requests per minute.
- Number of requests per route (the most popular routes).
- Average response time per route.
- Maximum response time for non-streaming requests.
- Number of streaming vs. non-streaming requests.
- Number of rate-limited requests.
- List of rate-limited IPs.
- Unique IPs.
- The most popular SDKs/apps sending requests to a given Horizon node.
- Average ingestion time of a ledger.
- Average ingestion time of a transaction.
Alerts
Below are example alerts with potential causes and solutions. Feel free to add more alerts using your metrics:
Alert | Cause | Solution |
---|---|---|
Spike in number of requests | Potential DoS attack | Lower rate-limiting threshold |
Large number of rate-limited requests | Rate-limiting threshold too low | Increase rate-limiting threshold |
Ingestion is slow | Horizon server spec too low | Increase hardware spec |
Spike in average response time of a single route | Possible bug in a code responsible for rendering a route | Report an issue in Horizon repository. |
I'm Stuck! Help!
If any of the above steps don't work or you are otherwise prevented from correctly setting up Horizon, please join our community and let us know. Either post a question at our Stack Exchange or chat with us on Keybase in #dev_discussion to ask for help.