XDR
Stellar stores and communicates ledger data, transactions, results, history, and messages in a binary format called External Data Representation (XDR). XDR is optimized for network performance but not human readable. Horizon and the Stellar SDKs convert XDRs into friendlier formats.
.X files
Data structures in XDR are specified in .x
files. These files only contain data structure definitions, no operations or executable code. The .x
files for the XDR structures used on the Stellar Network are available on GitHub.
JSON and XDR Conversion Schema
The canonical JSON and XDR Schema is defined by the stellar-xdr crate and also exposed by the @stellar/stellar-xdr-json npm package. The schema provides a round-trippable means for converting any Stellar XDR type to JSON and converting that JSON back to the identical XDR.
This conversion is visible in the Stellar CLI, Lab View XDR, and Hubble events table.
Key Characteristics of the JSON and XDR Conversion Schema
-
Round-Trippable: The JSON format allows for converting from XDR to JSON and back to XDR without loss of information.
-
Self-Describing: The JSON format describes the internals of the type but does not identify the type that is encoded. This is similar to XDR, which also does not identify the encoded type.
-
64-bit Integers: The JSON format includes integers up to 64-bit in size. JavaScript runtimes do not support 64-bit integers, so a custom decoder must be used, such as lossless-json.
-
Escaped ASCII Strings: The JSON format includes strings that UTF-8 safe, escaped ASCII strings. This is because XDR strings are not UTF-8 encoded, but are instead byte streams that can contain any values. Non-ASCII characters are escaped.
The defined schema (linked above) is not backwards compatible between a given protocol version and prior versions. The best practice is to store required ledger data in XDR format, not in JSON format.
More About XDR
XDR is specified in RFC 4506 and is similar to tools like Protocol Buffers or Thrift. XDR provides a few important features:
- It is very compact, so it can be transmitted quickly and stored with minimal disk space.
- Data encoded in XDR is reliably and predictably stored. Fields are always in the same order, which makes cryptographically signing and verifying XDR messages simple.
- XDR definitions include rich descriptions of data types and structures, which is not possible in simpler formats like JSON, TOML, or YAML.
Parsing XDR
Since XDR is a binary format and not as widely known as simpler formats like JSON, the Stellar SDKs all include tools for parsing XDR and will do so automatically when retrieving data.
In addition, the Horizon API server generally exposes the most important parts of the XDR data in JSON, so they are easier to parse if you are not using an SDK. The XDR data is still included (encoded as a base64 string) inside the JSON in case you need direct access to it.
Digging into XDR structures
Since the XDR format is a fundamental underpinning of the Stellar Network, we often find it necessary to dig into these raw binary data structures for clients to interact with things like transactions and authorization entries. It can be hard to understand because of its structure, but this section will aim to enlighten you on the ways to interact with it in a handful of popular languages. The Protocol's schema is defined in the stellar/stellar-xdr
repository, though that contains far more than what we'll need for our purposes.
XDR's Common Forms
In the Stellar Protocol, XDR takes a handful of different forms that are important to understand and distinguish:
Basic Primitives
These are the easiest to understand for anyone familiar with a programming language and includes things like integers, strings, and arrays of bytes. These are fairly straightforward to interact with, though things can get more complicated when you have aliases.
For example, you have the Uint64
alias in the Stellar Protocol (defined Stellar-types.x), defined for readability instead of XDR's native "unsigned hyper" value. You can treat it exactly as you'd expect:
import { xdr } from "@stellar/stellar-sdk";
const u64 = new xdr.Uint64(12345678);
Some language variations will let you instantiate it in a number of ways. For example, since 64-bit integers are a little... wonky in JavaScript, you can also initialize it with a BigInt
or even a string type:
import { xdr } from "@stellar/stellar-sdk";
let u64 = new xdr.Uint64("1_000_000_000_000_000");
u64 = new xdr.Uint64(1_000_000_000_000_000n);
You can see the full set of options to construct a Uint64
in the definition for UnsignedHyper
in the TypeScript definition here: curr.d.ts. In fact, this file should be your primary guide to navigating the XDR if you're using JavaScript.
Other languages, such as Python (see stellar-sdk
's uint64.html documentation), support arbitrarily-sized integers by default, so there are no such variations:
import xdr from stellar_sdk;
u64 = xdr.Uint64(1_000_000_000_000_000)
As a result, many of the basic primitives can be initialized in the "intuitive" way for your respective language.
Unions
This is where things get a little more complicated. A union type is a generic type that switches between different "arms", each of which is a different internal type. When parsing an XDR union type, it's important to determine which arm is being used before you try parsing what's inside.
Let's take an SCAddress
, for example. From the type definition, we can see that it's either an account (i.e. SC_ADDRESS_TYPE_ACCOUNT
) or the hash of a contract (SC_ADDRESS_TYPE_CONTRACT
). Right above that definition, you can see that these are constants of the SCAddressType
enum defined as 0 or 1, respectively.
In general, unions follow a particular pattern: switch on the type and match the appropriate enum value. So when you have one of these unions, you need to figure out which one it is:
- TypeScript
- Python
import { xdr } from "@stellar/stellar-sdk";
import Buffer from "buffer";
// suppose `scAddr` is an instance of `xdr.ScAddress` we're parsing
switch (scAddr.switch()) {
case xdr.ScAddressType.scAddressTypeAccount().value:
console.log("The account is:", scAddr.accountId());
break;
case xdr.ScAddressType.scAddressTypeContract().value:
console.log("The contract is:", scAddr.contractId());
break;
default:
throw new Error(`Unexpected address type: ${scAddr.switch()}`);
}
import SCAddressType, SCAddress from stellar_sdk.xdr.sc_address_type
if scAddr.type == SCAddressType.SC_ADDRESS_TYPE_ACCOUNT:
print("The account is:", scAddr.account_id);
break
elif scAddr.type == SCAddressType.SC_ADDRESS_TYPE_CONTRACT:
print("The contract is:", scAddr.contract_id);
break
else:
raise Exception(f"Unexpected address type: {scAddr.type}")
You can see the references for these in the documentation (Python and JavaScript, respectively).
It's worth noting that you could just use the defined enum constants directly, e.g.
import { xdr } from "@stellar/stellar-sdk";
import Buffer from "buffer";
// suppose `scAddr` is an instance of `xdr.ScAddress` we're parsing
switch (scAddr.switch().value) {
case 0:
console.log("The account is:", scAddr.accountId());
break;
case 1:
console.log("The contract is:", scAddr.contractId());
break;
default:
throw new Error(`Unexpected address type: ${scAddr.switch()}`);
}
However, this is generally more error-prone and harder to read. You can also refer to them by their name (.name
in JS and str(...)
in Python): this is marginally better for readability but worse for performance since you're doing string comparisons. It's very useful for logs, user-facing rendering, or other debugging, though.
Unions appear in many places throughout the XDR and they can be just one arm or a whole bunch of arms. For example, the smart contract value ScVal
has 22 possible values since it's the universal representation for a value inside of a smart contract. You can see all of them being inspected in the JavaScript utility scValToNative
.
Remember, XDR is the lowest level of communication on the Stellar Network. Generally speaking, there are high-level abstractions that should help with these structures. For example, the above parsing routine is exactly the purpose of the Address.fromScAddress
and Address.from_xdr_sc_address
abstractions in the JavaScript and Python SDKs, respectively. This greatly simplifies your code:
- TypeScript
- Python
import { Address } from "@stellar/stellar-sdk";
// suppose `scAddr` is an instance of `xdr.ScAddress` we're parsing
const address = Address.fromScAddress(scAddr);
import Address from stellar_sdk;
# suppose `scAddr` is an instance of `xdr.ScAddress` we're parsing
addr = Address.from_xdr_sc_address(scAddr)
Always try to find a higher-level abstraction before digging into the raw structure.
Structures
Like in any C-like language, a structure is a packed object containing a bunch of fields of arbitrary types. Structures can themselves contain structures, so be sure you traverse the entire tree of fields when you're constructing one.
Let's take, for example, the ContractEvent
structure. Its definition is a little complicated, so let's break it down:
- An
ExtensionPoint
is a way for the protocol to be extended while maintaining backwards binary compatibility.
It's basically an empty slot into which you can add variations in the future. The SorobanTransactionMeta
structure is a great example of this in action: When the structure was initially added to the protocol, it just contained the details of an invocation (events and a return value). Later, we needed more details about the invocation in the resulting metadata, so SorobanTransactionMetaExt
was injected into the ExtensionPoint
with a V1
variant containing metadata about fees charged as a result of the invocation.
- The
contractID
should be self-explanatory: it's the hash of the contract ID that caused this event. It's a pointer, though, because not all events are related to a specific contract. Some may come from the system, diagnostics, etc.
It should not be surprising that the field is stored as a Hash
: even though at the ecosystem layer we refer to contracts by their "strkey" or string representation (in the form C...
), these are actually "friendlier" representations of a raw SHA-256 hash. This is common at the protocol layer: the simplest, most compact representation will always be used. There are generally SDK tools to make translating these into more user-friendly types (as demonstrated in the example below).
-
The
ContractType
is essentially a way to differentiate which of the fields in the structure you should expect to be present and isn't worth elaborating on. -
Finally, we have the curious
v
union field.
You will often see unions in structures and vice-versa. If you have a handle on how each of these is traversed individually, you can handle them interleaved amongst each other. As a demonstration of how this is done with an SDK, even catered specifically to this structure, we can take a look at the source of the humanizeEvents
helper method in the JavaScript SDK. Specifically, we'll look at the details of the extractEvent
subroutine, replicated here for convenience:
function extractEvent(event) {
return {
...(typeof event.contractId === "function" &&
event.contractId() != null && {
contractId: StrKey.encodeContract(event.contractId()),
}),
type: event.type().name,
topics: event
.body()
.value()
.topics()
.map((t) => scValToNative(t)),
data: scValToNative(event.body().value().data()),
};
}
In this short and sweet example we can see tons of the nuances we've discussed in this guide at work:
- By differentiating on whether or not the
contractId
returns a value or not (it's a pointer, remember?), we decide whether or not we want to turn the raw hash into an ecosystem-compatible contract ID. - We access the
.name
field of the event type, which turns it into a human-readable string rather than an opaque structure, just like we discussed in Unions, earlier. - We inspect deeply into the
body
member to extract thetopics
anddata
fields separately, converting each of those opaqueScVal
ues into JavaScript-friendly equivalents.
In the last part, we "abuse" the fact that there is only one variation of the body
union we discussed: there is only the initial version (an arm with v == 0
), so we can use the .value()
accessor to give us the underlying value directly rather than going through the rigamarole of differentiating the different cases. Calling .v0()
would have been equivalent, but the benefit here is that if a v == 1
variation is introduced that also has a topics
array and a data
field (i.e. it adds additional fields rather than changing the v0 structure), this code will continue to work!
There are many cases in which the different union arms share structure, and .value()
lets you take advantage of that.
This overview should give you a strong baseline on understanding how to inspect XDR in your respective SDK to dig into the fields that you're interested in. Specifics will, of course, be language dependent, but if you start at the foundation--that being the raw .x files themselves--you will get an understanding of the structure itself and should be able to access that same structure directly in your language of choice, as we've outlined here.