Content stored on Autonomi is first broken into chunks, hashed and then encrypted, in a unique process known as Self-Encryption. Those chunks are run through a hashing algorithm to create a unique 256-bit hash for each chunk. Only chunks that are exactly identical will have the same hash value. This hash serves as the XOR address on the Network where that chunk will be stored, which in turn determines the nodes that will manage it.
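The link between a chunk's content and its address can be illustrated with a minimal sketch. It makes several assumptions that do not come from the Autonomi codebase: SHA-256 stands in for whichever 256-bit hash the Network uses, the 4-byte chunk size is chosen only to keep the demonstration small, the function names are hypothetical, and the encryption step of Self-Encryption is left out entirely.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny value used only for this demonstration


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_address(chunk: bytes) -> str:
    """Return a 256-bit address for a chunk as 64 hex characters.

    SHA-256 is an assumption here, standing in for the Network's own hash.
    """
    return hashlib.sha256(chunk).hexdigest()


data = b"abcdabcdwxyz"
chunks = split_into_chunks(data)
addresses = [chunk_address(chunk) for chunk in chunks]

for chunk, address in zip(chunks, addresses):
    print(chunk, address)
```

The first two chunks are byte-for-byte identical, so they map to the same address, while the third maps to a different one.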
Chunks with hashes that lie within a certain address range (say 000010... to 000011...) will be secured, stored and managed by the nodes whose IDs are closest (in XOR terms) to that address. These will change over time as new nodes join and others leave.
Nodes are paid for data chunks they hold, but not for replicated data. They are expected to hold replicated data as a condition of participating in the network and having the chance to earn.
As nodes fill up, the price of storing a chunk of data rises. All nodes on the Network fill up at more or less the same rate although there will be some local variations and hence a range of prices.
Nodes cache data when a new close node appears and they share some of their primary data store with that node. This cache isn't included in the fullness calculations. It is just something they are required to do to ensure smooth data transfer. For security, nodes cannot delete or edit data, only store and move it.
libp2p allows for the existence of specialised nodes for providing services such as caching. These will be introduced in later iterations of the Autonomi Protocol.
It's important to note that all the features above emerge from simple local interactions between nodes. Likewise, the codebase is simple, making it easier to maintain, update, audit and secure.
libp2p is an open-source peer-to-peer networking framework that originated in the IPFS project and includes a Kademlia-based DHT. It consists of a collection of protocols, specifications and libraries that facilitate P2P communication between network peers (nodes).
It is used in the Autonomi Network to allow nodes to discover and connect to other nodes, to manage those connections thereafter, and to facilitate data routing.
libp2p is modular, meaning Autonomi developers can use the parts they want without having to worry about the parts they don't need. The Rust implementation of libp2p is the one adopted.
Among libp2p features useful to Autonomi are:
libp2p supports TCP, UDP, μUDP and QUIC and can be configured to allow connections with nodes behind home routers or business firewalls (NAT traversal), which has long been a problematic area for decentralized networks.
libp2p includes several security features, such as peer identity verification using public key cryptography and encrypted communication between nodes.
libp2p is designed to recover quickly from disruptions or failures. It also offers protection against network attacks through the use of mitigation techniques.
libp2p's publish/subscribe messaging allows for the transmission and receipt of messages via the gossip protocol, meaning that nodes can listen out for and react to messages that affect them while ignoring the rest.
libp2p is interoperable, opening the door to future collaborations with other projects that use it.
Peers using libp2p are assigned a ‘multiaddress’, a type of URL that encodes multiple layers of addressing information into a single “future-proof” path structure. It defines human-readable and machine-optimised encodings of common transport and overlay protocols and allows many layers of addressing to be combined and used together. A multiaddress looks like this:
/ip4/139.59.181.245/tcp/37569/p2p/12D3KooWPHE8qcKL4CB2n8QvPpE25TRsP9nmkfeM6Qa61aAsokib
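The layered structure of the address above can be made visible with a small parser. This is a sketch and not the libp2p multiaddr library: the function name is hypothetical, and it assumes every protocol component is followed by exactly one value, which holds for ip4, tcp and p2p but not for every protocol a multiaddress can contain.

```python
def parse_multiaddr(addr: str) -> list[tuple[str, str]]:
    """Split a textual multiaddress into (protocol, value) pairs.

    Simplifying assumption: each protocol takes exactly one value.
    """
    parts = addr.strip("/").split("/")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected protocol/value pairs, got: {addr!r}")
    return list(zip(parts[0::2], parts[1::2]))


example = "/ip4/139.59.181.245/tcp/37569/p2p/12D3KooWPHE8qcKL4CB2n8QvPpE25TRsP9nmkfeM6Qa61aAsokib"
for protocol, value in parse_multiaddr(example):
    print(f"{protocol:>4}: {value}")
```

Read left to right, the layers are the IPv4 address, the TCP port and finally the peer's identity.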
Importantly, on joining or rejoining the Network a node cannot simply pick its own XOR address.
Storing data and eventually using applications on the Autonomi Network will be as simple as on the current Web but with greatly enhanced privacy, security and control.
Autonomi is designed with ease of use in mind for app developers, too. Developing simple apps is just a short step from what developers are familiar with in terms of APIs and methodologies.
However, those wishing to perform more complex system-level tasks will need to go deeper into its architecture—there are some big differences between traditional client-server systems and decentralized architectures.
This section provides a brief introduction to the underlying technical architecture of the Network…
The first step in understanding the architecture of Autonomi is to take a look at distributed hash tables. These are structures that map a unique ID to something on the distributed network, be that data, a device or a service.
Kademlia is a distributed hash table (DHT) protocol which provides a way for millions of computers to self-organize into a network, communicate with other computers on the network, and share resources between devices, all without a central controlling entity.
Petar Maymounkov and David Mazières released the Kademlia distributed hash table in 2002. The idea is that nodes form a network overlay in which each node is identified by its own ID rather than by its IP address. So a node with an IP address of 96.251.182.97 might have a 256-bit XOR address that looks like this: 17846cb8a4b53c9e44c616d2415a15984283eee975a1dac8f488dd91d0aed1cd.
Bitwise Exclusive OR (XOR) has the feature that each address is a unique distance from any other address in the address range. XOR distance bears no relation to physical distance. Indeed, two pieces of data on the network may be very close XOR-wise but be sitting on machines located on opposite sides of the world.
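The uniqueness property can be checked directly. The sketch below uses 8-bit addresses instead of 256-bit ones purely to keep the numbers readable; the function name and the example addresses are illustrative.

```python
def xor_distance(a: int, b: int) -> int:
    """XOR distance between two addresses, interpreted as an integer."""
    return a ^ b


a = 0b00001010
b = 0b00001011  # differs from a only in the lowest bit
c = 0b10001010  # differs from a only in the highest bit

print(xor_distance(a, b))  # 1
print(xor_distance(a, c))  # 128

# From a's point of view, every one of the 256 possible 8-bit addresses
# sits at a different distance, so no two addresses tie.
distances = {xor_distance(a, other) for other in range(256)}
print(len(distances))  # 256
```

The distance is also symmetric: `xor_distance(a, b)` equals `xor_distance(b, a)`.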
With potentially millions of devices on a network, there's no way a single node could keep track of them all without running out of resources. Instead it keeps a record (a routing table) containing information about a small number of other nodes and lets the magical uniqueness of XOR distances do the rest.
Each node is a unique distance from every XOR address in the space, which means its routing table is also unique. The address space is enormous: 2^256 is roughly 1.16 × 10^77, a number comparable in scale to estimates of the number of atoms in the observable universe. To make this space manageable it is broken into k-buckets. The furthest away bucket (in terms of XOR distance) contains half the network, the next furthest away represents a quarter of the network, the next furthest an eighth, and so on.
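The halving pattern follows from taking the position of the highest set bit of the XOR distance as the bucket number, which is the usual Kademlia convention. The sketch below assumes that convention and uses an 8-bit address space so that every address can be counted; the node ID is arbitrary.

```python
from collections import Counter

BITS = 8  # small address space for illustration; the Network uses 256 bits


def bucket_index(node_id: int, other_id: int) -> int:
    """Bucket number = position of the highest bit set in the XOR distance."""
    distance = node_id ^ other_id
    if distance == 0:
        raise ValueError("a node does not place itself in a bucket")
    return distance.bit_length() - 1


node_id = 0b10110010
counts = Counter(
    bucket_index(node_id, other)
    for other in range(2 ** BITS)
    if other != node_id
)

# Furthest bucket first: 128, 64, 32, 16, 8, 4, 2, 1 addresses.
for index in sorted(counts, reverse=True):
    print(f"bucket {index}: {counts[index]} addresses")
```

Bucket 7 covers half of the 256 addresses, bucket 6 a quarter, and so on down to bucket 0, which covers a single address.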
Autonomi is based on Kademlia. Each node's routing table contains information on up to 20 nodes in each k-bucket. This means it knows everything about the nodes in the address space close to it but very little about the space furthest away. Crucially, though, it does know some nodes in that space, and those nodes, of course, know everything about other nodes close to them and can pass messages on. The bigger the Network becomes, the more secure it will get because an individual node will have influence over a decreasing range of addresses.
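A toy routing table shows how the 20-node cap produces detailed knowledge nearby and sparse knowledge far away. This is a simplified sketch with a hypothetical class name: it uses a 16-bit address space, and it simply ignores a newcomer when a bucket is full, whereas a real Kademlia implementation applies a more careful replacement policy.

```python
K = 20  # maximum peers remembered per bucket, as described above


class RoutingTable:
    """Toy k-bucket table; full buckets simply reject newcomers."""

    def __init__(self, own_id: int, bits: int = 256) -> None:
        self.own_id = own_id
        self.buckets: list[list[int]] = [[] for _ in range(bits)]

    def add(self, peer_id: int) -> bool:
        """Try to remember a peer; return True if it is now in the table."""
        if peer_id == self.own_id:
            return False
        bucket = self.buckets[(self.own_id ^ peer_id).bit_length() - 1]
        if peer_id in bucket:
            return True
        if len(bucket) >= K:
            return False
        bucket.append(peer_id)
        return True


# A 16-bit network in which every possible address is an online node.
table = RoutingTable(own_id=0x1234, bits=16)
for peer in range(2 ** 16):
    table.add(peer)

print(len(table.buckets[0]))                 # 1 (the single nearest address)
print(len(table.buckets[15]))                # 20 (out of 32768 candidates)
print(sum(len(b) for b in table.buckets))    # 251 peers known in total
```

Out of 65,535 other nodes, the table keeps only 251, yet it holds every node in its five nearest buckets and at least some contacts in every region of the address space.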
Every piece of data has a unique address; in the jargon it is ‘content-addressable’—its content defines where it is stored. When a node wants to retrieve a piece of data, it checks its address, then asks the node closest to that address whether it's holding the data. That node in turn checks the address and sees whether it knows a node that's closer. If it does, it passes the message on to that node, and so on, until it reaches the node that's actually holding the data. The same messaging process happens when saving data too. It is very fast and efficient, with the enormous address space being traversable in just a few hops.
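The hop-by-hop forwarding described above can be simulated in a few lines. The sketch is not the Autonomi or libp2p lookup algorithm: it assumes a 16-bit address space, 500 randomly placed nodes, at most 3 peers per bucket, and routing tables filled in using global knowledge, which a real network never has. All names are hypothetical.

```python
import random

BITS = 16  # small address space for the simulation
K = 3      # peers kept per bucket (kept small so several hops are needed)


def build_routing_table(own_id: int, all_ids: list[int]) -> list[int]:
    """Return the peers a node knows: at most K from each bucket."""
    buckets: dict[int, list[int]] = {}
    for peer in all_ids:
        if peer == own_id:
            continue
        index = (own_id ^ peer).bit_length() - 1
        bucket = buckets.setdefault(index, [])
        if len(bucket) < K:
            bucket.append(peer)
    return [peer for bucket in buckets.values() for peer in bucket]


def lookup(start: int, target: int, tables: dict[int, list[int]]) -> tuple[int, int]:
    """Forward towards the target until no known peer is closer."""
    current, hops = start, 0
    while True:
        best = min(tables[current], key=lambda peer: peer ^ target)
        if best ^ target >= current ^ target:
            return current, hops
        current, hops = best, hops + 1


rng = random.Random(42)
node_ids = rng.sample(range(2 ** BITS), 500)
tables = {node: build_routing_table(node, node_ids) for node in node_ids}

target = rng.randrange(2 ** BITS)
holder, hops = lookup(node_ids[0], target, tables)
print(f"reached node {holder:#06x} for target {target:#06x} in {hops} hops")
```

Each node only ever consults its own small table, yet the message ends up at the node that is closest to the target across the whole simulated network.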
In the Autonomi Network, a node's closest neighbours (again, in XOR terms) are called its ‘close group’. Data stored at a node is automatically replicated to the nodes in its close group for redundancy. As nodes leave and others join (a process known as churn) the node's close group will change. The size of the close group is defined by a tuneable parameter, currently set at 5.
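The effect of churn on a close group can be seen in a small example. It is a sketch under stated assumptions: 4-bit addresses, a hand-picked set of node IDs, and a hypothetical `close_group` helper that has a global view of the network.

```python
CLOSE_GROUP_SIZE = 5  # the tuneable parameter mentioned above


def close_group(node_id: int, all_ids: list[int], size: int = CLOSE_GROUP_SIZE) -> list[int]:
    """Return the `size` nodes nearest to node_id in XOR terms."""
    others = [peer for peer in all_ids if peer != node_id]
    return sorted(others, key=lambda peer: peer ^ node_id)[:size]


me = 0b0000
nodes = [0b0001, 0b0010, 0b0100, 0b0111, 0b1000, 0b1010, 0b1100, 0b1111]

print(close_group(me, nodes))  # [1, 2, 4, 7, 8]

nodes.append(0b0011)           # a new node joins close to us
print(close_group(me, nodes))  # [1, 2, 3, 4, 7]

nodes.remove(0b0001)           # our nearest neighbour leaves
print(close_group(me, nodes))  # [2, 3, 4, 7, 8]
```

When node 3 joins it displaces node 8 from the group, and when node 1 leaves node 8 is pulled back in, so the set of nodes holding replicas shifts with every change in membership.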
So, nodes must constantly update their routing tables. They do this by observing which nodes are alive, dead and newly arrived during the course of their day-to-day operations.