Tier 1 Organizations
To help with Stellar’s decentralization, the most reliable and advanced Stellar teams join the ranks of “Tier 1 Organizations.” These organizations run three validators, coordinate any changes to their quorum sets, and hold themselves to a higher standard of uptime and responsiveness.
SDF works closely with Tier 1 Orgs to ensure the health of the network, maintain good quorum intersection, and build in redundancy to minimize network disruptions. This guide outlines what it takes to be a Tier 1 Org.
The most important function of a Tier 1 Org is to set up and maintain three Full Validators. Why three?
On Stellar, validators choose to trust organizations when they build a quorum set. If you are a trustworthy organization, you want your presence on the network to persist even if a node fails or you take it down for maintenance. A trio of validating nodes allows that to happen: other participants can create a quorum slice for your organization that requires two of your three validating nodes to agree. If one has issues, no big deal: the other two still vote on your organization’s behalf, so the show goes on. To ensure redundancy, it’s also important that those three Full Validators are geographically dispersed: if they’re in the same data center, they run the risk of going down at the same time.
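The arithmetic behind the trio can be sketched in a few lines of Python. This is an illustration only, not Stellar Core code: the node names and the `org_can_vote` helper are hypothetical.

```python
# Hypothetical illustration: an organization's three validators with a
# 2-of-3 threshold, as other participants might configure its slice.
ORG_NODES = {"node-a", "node-b", "node-c"}  # made-up node names
THRESHOLD = 2  # two of the three nodes must agree


def org_can_vote(online_nodes):
    """Return True if enough of the organization's nodes are online."""
    return len(ORG_NODES & set(online_nodes)) >= THRESHOLD


print(org_can_vote({"node-a", "node-b", "node-c"}))  # all three up
print(org_can_vote({"node-a", "node-c"}))            # one down for maintenance
print(org_can_vote({"node-b"}))                      # two down at once
```

With one node down the organization still meets its threshold; with two down it does not, which is why the three nodes should not share a single point of failure such as a data center.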
Here’s what else Tier 1 Orgs expect of one another:
In addition to participating in SCP, a full validator publishes an archive of network transactions. To do that, you need to configure Stellar Core to record history to a publicly accessible archive, and add the location of that archive to your stellar.toml. To be a Tier 1 Org, you should set each of your nodes to record history to a separate archive.
Public archives make the network more resilient: when new nodes come online, or when existing nodes lose sync, they need to consult an archive to figure out what they missed. Sharing snapshots of the ledger, which detail transactions and their results, allows those nodes to catch up, and more archives mean more redundancy and greater decentralization. Plus, sharing history keeps everyone honest.
To maximize network resilience, we’re asking every Tier 1 node to use the same quorum set configuration, which is made up of one inner quorum set per Tier 1 Org, each containing all of that organization’s validators.
That way, the validator community can experiment with a larger quorum, and can analyze the results of those experiments without disrupting the network. Using existing Tier 1 Orgs as a safety net, we can work together to expand the quorum methodically and deliberately. To see what that quorum set currently looks like, check out the example Full Validator config file.
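As a rough picture of that shape, here is a minimal Python sketch of a nested quorum set with one inner set per organization. The organization and validator names are made up, and the two-thirds rounding rule applied at both levels is an assumption for illustration; the thresholds Stellar Core actually uses are the ones in the example config file.

```python
def two_thirds(n):
    """Smallest integer that is at least two-thirds of n (assumed rule)."""
    return (2 * n + 2) // 3


def build_quorum_set(orgs):
    """Build a nested quorum set: one inner set per organization."""
    inner_sets = [
        {"threshold": two_thirds(len(nodes)), "validators": sorted(nodes)}
        for nodes in orgs.values()
    ]
    return {"threshold": two_thirds(len(inner_sets)), "inner_sets": inner_sets}


def is_satisfied(quorum_set, agreeing):
    """Check whether enough organizations have enough agreeing validators."""
    satisfied_orgs = sum(
        len(set(inner["validators"]) & agreeing) >= inner["threshold"]
        for inner in quorum_set["inner_sets"]
    )
    return satisfied_orgs >= quorum_set["threshold"]


# Made-up organizations and validator names.
orgs = {
    "org-x": ["x1", "x2", "x3"],
    "org-y": ["y1", "y2", "y3"],
    "org-z": ["z1", "z2", "z3"],
}
qset = build_quorum_set(orgs)
print(qset["threshold"], [s["threshold"] for s in qset["inner_sets"]])
print(is_satisfied(qset, {"x1", "x2", "y2", "y3"}))  # two orgs at 2 of 3
print(is_satisfied(qset, {"x1", "y1", "z1"}))        # no org reaches 2 of 3
```

In this sketch, two validators from each of two organizations satisfy the set, while one validator from each of three organizations does not: agreement is counted per organization first, then across organizations.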
SEP-20 is an open spec that explains how self-verification of validator nodes works. The steps it specifies are pretty simple: you set the home domain of your validator’s Stellar account to your website, where you publish information about your node and your organization in a stellar.toml file.
It’s an easy way to propagate information, and it harnesses the network to allow other participants to discover your node and add it to their quorum sets without the need for a centralized database.
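The two-way link that self-verification relies on can be sketched as a small Python check. This is a simplified illustration, not a SEP-20 implementation: it assumes the stellar.toml has already been fetched and parsed into a dict, that validators appear in a `VALIDATORS` list with a `PUBLIC_KEY` field, and the domain and keys shown are placeholders rather than real values.

```python
def is_self_verified(account_home_domain, public_key, toml_domain, toml_data):
    """Check both directions of the link between a node and its domain.

    account_home_domain: home domain set on the validator's Stellar account
    public_key:          the validator's public key
    toml_domain:         domain the stellar.toml was fetched from
    toml_data:           stellar.toml contents, already parsed into a dict
    """
    # Direction 1: the validator's account points at the domain.
    if account_home_domain != toml_domain:
        return False
    # Direction 2: the domain's stellar.toml lists the validator.
    listed = {v.get("PUBLIC_KEY") for v in toml_data.get("VALIDATORS", [])}
    return public_key in listed


# Placeholder data; these are not valid Stellar keys.
toml_data = {
    "VALIDATORS": [
        {"ALIAS": "example-1", "PUBLIC_KEY": "GEXAMPLEKEY1"},
        {"ALIAS": "example-2", "PUBLIC_KEY": "GEXAMPLEKEY2"},
    ]
}
print(is_self_verified("example.com", "GEXAMPLEKEY1", "example.com", toml_data))
print(is_self_verified("example.com", "GUNLISTEDKEY", "example.com", toml_data))
print(is_self_verified("other.example", "GEXAMPLEKEY1", "example.com", toml_data))
```

A claim holds only when both sides agree: the account names the domain, and the domain names the validator. Either side alone could be set by someone who controls only one of them.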
Running a validator requires vigilance. You need to keep an eye on your nodes, keep them up to date with the latest version of Stellar Core, and check in on public channels for information about what’s currently happening with other validators.
The two best ways to do that:
- Join the validators email list
- Download Keybase and join the #validators channel on the stellar.public team
We always announce new Stellar Core releases in those channels. You can also find those releases on our GitHub.
It’s also critical that you pay attention to information about what those updates mean: often, you’ll need to set your validators to vote on something timely, such as when to upgrade the network as a whole, or how high to set the operations-per-ledger limit.
Whether you run a trio of validators or a single node, it’s important that you coordinate with other validators when you make a significant change or notice something wrong. You should let them know when you plan to:
- Take your node down for maintenance
- Make changes to your quorum set
Letting other validators know when you plan to take your node down for maintenance or to upgrade to the latest version of stellar-core prevents a critical mass of nodes from going offline at the same time.
Letting other validators know when you plan to change your quorum set allows them to respond, adjust, and think through the implications of expanding the quorum. For the quorum to expand safely, we all need to coordinate to ensure we maintain good quorum intersection.
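To see why an uncoordinated threshold change matters, here is a brute-force Python sketch that checks whether every pair of quorums shares at least one validator. It makes a simplifying assumption, namely that every node uses the same nested quorum set, so any group of validators that meets the thresholds counts as a quorum. The names and thresholds are made up, and this is not the analysis Stellar Core itself performs.

```python
from itertools import combinations


def meets_thresholds(qset, nodes):
    """True if enough organizations have enough of their validators in nodes."""
    ok_orgs = sum(
        len(set(org) & nodes) >= qset["inner_threshold"] for org in qset["orgs"]
    )
    return ok_orgs >= qset["org_threshold"]


def has_quorum_intersection(qset):
    """Enumerate every quorum and check that each pair overlaps."""
    all_nodes = [n for org in qset["orgs"] for n in org]
    quorums = [
        frozenset(group)
        for size in range(1, len(all_nodes) + 1)
        for group in combinations(all_nodes, size)
        if meets_thresholds(qset, set(group))
    ]
    return all(a & b for a, b in combinations(quorums, 2))


# Three made-up organizations with three validators each.
ORGS = [["x1", "x2", "x3"], ["y1", "y2", "y3"], ["z1", "z2", "z3"]]

careful = {"orgs": ORGS, "org_threshold": 2, "inner_threshold": 2}
too_loose = {"orgs": ORGS, "org_threshold": 1, "inner_threshold": 2}

print(has_quorum_intersection(careful))    # any two quorums share a validator
print(has_quorum_intersection(too_loose))  # {x1, x2} and {y1, y2} are disjoint
```

Exhaustive enumeration only works for a handful of validators, but it shows the failure mode: lowering the organization threshold to one lets two separate groups each reach agreement without consulting the other.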
We recommend using Prometheus to scrape and store your stellar-core metrics, and Grafana to render that data for human consumption. You can find step-by-step instructions for setting up monitoring and alerts in Monitoring and Diagnostics, along with links to Grafana dashboards we’ve created to make things easier.
You can also use stellarbeat.io to view validators’ quorum configurations and get information about their availability and uptime, and you can use the quorum command to diagnose problems with the local node’s quorum set.
You should do regular check-ins on your quorum set. If nodes have bad uptime or prove otherwise unreliable, you may need to remove them from your quorum set so that you don’t get stuck and so that the network doesn’t halt. You may also want to add new organizations that come online and prove reliable. If you plan to do either of those things, remember to communicate and coordinate with other validators.
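A regular check-in can be as simple as comparing observed uptime against a bar you have agreed on. The Python sketch below is hypothetical: the 99% cutoff, the validator names, and the uptime figures are invented for illustration, and in practice the numbers would come from your own monitoring or a service such as stellarbeat.io.

```python
def review_quorum_set(uptime_by_validator, min_uptime=0.99):
    """Split validators into those to keep and those to raise with other operators."""
    keep, flag = [], []
    for name, uptime in sorted(uptime_by_validator.items()):
        (keep if uptime >= min_uptime else flag).append(name)
    return keep, flag


# Invented uptime figures (fraction of time each validator was reachable).
observed = {"x1": 0.999, "x2": 0.997, "y1": 0.92, "z1": 0.995}
keep, flag = review_quorum_set(observed)
print("keep:", keep)
print("discuss before removing:", flag)
```

The flagged list is a starting point for a conversation with other validators, not an automatic removal list.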
If you think you can be a Tier 1 Org, let us know on the #validators channel on Keybase. We can help you through the process, and once you’re up and running, we’ll work to fold you into the quorum so that you can take your rightful place as a pillar of the network. Once you’ve proven that you are responsive, reliable, and maintain good uptime, we will adjust the quorum set recipe above to include your validators.
As Stellar grows, and more and more businesses build on the network, Tier 1 Orgs will be crucial to the methodical expansion of the network.
Last updated Jul. 13, 2020