Pillars of a Global Monolith
A Global Monolith builds on 5 fundamental pillars of technological innovation.
Post Quantum Cryptography
Security, and more specifically cryptography, is the most fundamental and essential requirement for building a Global Monolith.
The existing internet and the software that enables it are fundamentally insecure and impossible to secure. The existing internet cannot be used for anything that requires real security.
The security threat of the future is not only nation-state level hackers. The security threat of the future is AI.
In order to build a secure system it is now necessary to secure data from the computer itself. As the world races toward AGI this is the most pressing security threat that we face.
A Global Monolith must be built on a foundation of Post Quantum Cryptography with careful integration of both software and hardware into the overall security design.
Every single piece of data must be encrypted using a Post Quantum encryption algorithm and there must be a hardware level system design that physically isolates the storage layer from the computation layer and allows for out-of-band monitoring, auditing, and control of all data access.
The “Post Quantum” algorithms being developed by the NSA/NIST and any similar mathematical fugazi can never be trusted and will most likely be insecure by design.
One-Time Pad (OTP) is the only cryptographic algorithm that is provably secure, and it is simple enough for anyone to understand, so it is the natural starting point for building any truly secure cryptographic system.
OTP is simple algorithmically but difficult to implement practically.
The practical implementation of OTP requires physical security layers and systems. This has the advantage of creating opportunities to patent the various inventions needed to build the physical systems that an OTP based cryptographic system requires.
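As a concrete illustration, here is a minimal sketch of the OTP operation itself in TypeScript using Node's crypto module. The hard parts, generating a truly random pad, distributing it securely, and never reusing it, are exactly the physical problems described above and are not shown here.

```typescript
import { randomBytes } from "node:crypto";

// One-Time Pad sketch: ciphertext = plaintext XOR pad.
// The pad must be truly random, at least as long as the plaintext, and never reused.
function xorBytes(data: Buffer, pad: Buffer): Buffer {
  if (pad.length < data.length) throw new Error("pad must be at least as long as the data");
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) out[i] = data[i] ^ pad[i];
  return out;
}

const plaintext = Buffer.from("attack at dawn", "utf8");
const pad = randomBytes(plaintext.length); // a real OTP requires a true hardware randomness source
const ciphertext = xorBytes(plaintext, pad);
const recovered = xorBytes(ciphertext, pad); // XOR with the same pad recovers the plaintext
console.log(recovered.toString("utf8")); // "attack at dawn"
```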
Identity and Data Ownership
One aspect of the fundamental insecurity of the current internet is that it can be used anonymously and it is impossible to attribute activity to a specific individual.
In a Global Monolith every single request and every single data access must be digitally signed and attributable to a verified identity.
The mechanism for doing this will be a physical device that users buy and link to an account where they have performed ID verification.
Whenever a user wants to access the Global Monolith they must have their physical ID device present and every single request they make must be digitally signed using that device.
In order to secure the data layer from a potentially malicious AI it is necessary to attribute every single data access to an individual user.
Autonomous computer processes will only be able to access data using delegated authority from a user, and all data access by the automated process will be attributed to the user who authorized it.
Unlike current computer systems, where the “root” computer process has unlimited access to all data in the system, in a Global Monolith no computer process will ever have access to any data outside the scope of the individual user account whose delegated authority it is operating under.
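A hypothetical sketch of what per-request signing with delegated authority could look like, using Ed25519 signatures from Node's crypto module. The message formats, field names, and scope string here are illustrative assumptions, not the actual protocol.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Key held on the user's physical ID device, and a key held by an autonomous process.
const user = generateKeyPairSync("ed25519");
const agent = generateKeyPairSync("ed25519");

// The user signs a delegation that scopes what the autonomous process may access.
const delegation = Buffer.from(JSON.stringify({
  agentKey: agent.publicKey.export({ type: "spki", format: "pem" }),
  scope: "read:/users/alice/*",
  expires: "2030-01-01T00:00:00Z",
}));
const delegationSig = sign(null, delegation, user.privateKey);

// Every request the process makes is signed by the process and carries the delegation.
const request = Buffer.from(JSON.stringify({ op: "read", path: "/users/alice/notes" }));
const requestSig = sign(null, request, agent.privateKey);

// The data layer verifies both signatures, checks the scope, and attributes
// the access to the user who granted the delegation.
const authorized =
  verify(null, delegation, user.publicKey, delegationSig) &&
  verify(null, request, agent.publicKey, requestSig);
console.log({ authorized });
```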
In order to build a Global Monolith that enforces legal rights it is essential to build Identity and Data Ownership into the lowest level of the system. In a Global Monolith every single piece of data will have an identified owner and there will be an irrefutable auditable record of when that data was created.
For a more in-depth (though somewhat dated) review of this topic see: Digital Real IDs that Preserve Anonymity are Possible and Necessary
Distributed Consensus and Consistency
Consistency is the most difficult problem in a Global Monolith. Consistency is achieved in a distributed system through a Distributed Consensus algorithm.
A single consistent global database requires that a user be able to write data anywhere on earth and have that data be visible and readable from any other location on earth without ever producing a conflict in the data, such as two different users attempting to write to the same location at the same time.
The level of consistency required for a Global Monolith, if it is to be able to handle every possible use case, is Linearizability.
A database that offers this level of consistency can typically only be replicated over a very short distance on a very low latency local network.
A Global Monolith requires Linearizability in replication over the entire globe, which may entail latencies of a second or even more, compared to microseconds (millionths of a second) on high speed local networks.
The latency of fiber optic communications is a fixed physical fact. Consistency in a global database can therefore only be achieved through innovations at the algorithmic level.
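A rough back-of-the-envelope check of that physical floor, assuming light travels through fiber at roughly two thirds of its speed in a vacuum:

```typescript
// Back-of-the-envelope check of the physical floor on global latency.
const speedInFiberKmPerSec = 200_000; // roughly 2/3 of the speed of light in a vacuum
const antipodalDistanceKm = 20_000;   // half of Earth's ~40,000 km circumference
const oneWayMs = (antipodalDistanceKm / speedInFiberKmPerSec) * 1000; // ~100 ms
const roundTripMs = 2 * oneWayMs;                                     // ~200 ms
// Real fiber routes are much longer than the geodesic and consensus protocols
// need multiple round trips, so global agreement can approach a second or more.
console.log({ oneWayMs, roundTripMs });
```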
A Global Monolith requires innovations in Distributed Consensus algorithms combined with the unique properties of Immutable Content Addressable Storage.
Immutable Content Addressable Storage
Traditional computer systems are built on storage systems that use location based addressing and are mutable.
The typical model of computation is to read data from a location on a device and then write data back to the same or another location on that device. Data is mutable, which means that it can be changed.
In a Content Addressable Storage system data is read and written based on a cryptographic hash of the data. This means, for all practical purposes, that the address is the data.
Content Addressable Storage is unique because it is impossible for there to be inconsistency between the address and the data. The same address will always have the same data because the address is the data.
Content Addressable Storage is inherently immutable because it is impossible to change the data for an address. Different data will always have a different address.
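A minimal in-memory sketch of that property, using SHA-256 as the content hash purely for illustration:

```typescript
import { createHash } from "node:crypto";

// Content-addressable store sketch: the address of a piece of data is the hash of the data.
const store = new Map<string, Buffer>();

function put(data: Buffer): string {
  const address = createHash("sha256").update(data).digest("hex");
  store.set(address, data); // writing the same data again is a harmless no-op
  return address;
}

function get(address: string): Buffer | undefined {
  const data = store.get(address);
  // Any corruption or substitution is detectable by re-hashing the data.
  if (data && createHash("sha256").update(data).digest("hex") !== address) {
    throw new Error("data does not match its address");
  }
  return data;
}

const addr = put(Buffer.from("hello"));
// There is no way to change the data stored at addr:
// different bytes always hash to a different address.
console.log(get(addr)?.toString("utf8")); // "hello"
```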
People often assume that immutability means that data cannot be deleted. Data can always be deleted. Immutability at the algorithmic level only means that data can never be modified. A system built on an immutable data model can be designed to tolerate the deletion of data without impacting its operations.
The fact that Content Addressable Storage can never be in an inconsistent state means that it is possible to build a distributed data store that breaks the CAP “Theorem”. Since inconsistency is impossible at the algorithmic level, Consistency is guaranteed, and so it is only necessary to achieve Availability and Partition Tolerance.
Graph Data Model
Relational Databases, such as Oracle and Postgres, are the standard way of storing structured data and are used for the vast majority of enterprise and web based applications.
A Relational Database relies on a schema that defines Tables with a set list of Columns. Individual pieces of data are stored as Rows in the Table. Rows in a Table can have Columns with values that point to Rows in other Tables which allows for building associations between related data.
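An illustrative sketch of that structure, with hypothetical table and column names:

```typescript
// Relational model sketch: every row in a table has the same fixed columns,
// and a foreign key column points at rows in another table.
interface CustomerRow { id: number; name: string }                    // "customers" table
interface OrderRow { id: number; customerId: number; total: number }  // "orders" table

const customers: CustomerRow[] = [{ id: 1, name: "Alice" }];
const orders: OrderRow[] = [{ id: 100, customerId: 1, total: 42.5 }];

// Associating related data means joining rows across tables via the foreign key.
const ordersWithCustomer = orders.map(order => ({
  ...order,
  customer: customers.find(c => c.id === order.customerId),
}));
```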
The Relational Data Model was designed to deal with the hardware constraints of computers that existed 50+ years ago. Computers today are billions of times more powerful.
Putting data into a Relational Database Schema requires a laborious process of “cleaning” the data so that it can be reduced down to its common features and then “normalizing” the data so that it can be put into multiple related Tables where no Column is ever duplicated in the entire system.
Managing Relational Databases is a highly labor intensive process and a large percentage of overall software engineering effort is spent on this task or on fixing the errors that occur at the application level if it is not done properly.
The Relational Data Model is not only labor intensive. It also strips away and discards much of the valuable context that surrounds data. Because data must be reduced to its common features, anything that is not common to the entire data set is thrown away.
The relationship between an actual enterprise data set and what ends up in an enterprise Relational Database is similar to the relationship between a company's Ledger and Balance Sheet and the actual activity that the company performs. The Ledger and Balance Sheet record a list of transactions and a current balance, and the dollar amounts can be compared between different Ledgers and Balance Sheets, but they tell you very little about the actual activity that the company performed.
A Graph Data Model starts with unstructured Data Objects that can have arbitrary schemas. Every single object, or collection of data, is recorded exactly as it exists. Associations between objects are a separate fact from the data of an object itself and so Associations between Objects in a Graph can be added and removed over time.
While a Relational Database Table has only 2 dimensions, a Graph is n-dimensional, which means that it can have as many dimensions as actually exist in the underlying data. The real world is n-dimensional, so a Graph Data Model allows for storing data as it exists in the real world.
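An illustrative sketch of that structure, with hypothetical object shapes and association kinds:

```typescript
// Graph model sketch: objects keep their original, arbitrary shape and
// associations between objects are recorded as separate facts.
type ObjectId = string;

interface GraphObject {
  id: ObjectId;
  data: Record<string, unknown>; // arbitrary schema, stored exactly as it exists
}

interface Association {
  from: ObjectId;
  to: ObjectId;
  kind: string;      // e.g. "authored", "purchased", "replies-to"
  createdAt: string; // associations can be added and removed over time
}

const objects: GraphObject[] = [
  { id: "person:1", data: { name: "Alice", note: "met at conference, prefers email" } },
  { id: "doc:42", data: { title: "Q3 plan", draft: true, tags: ["finance", "internal"] } },
];

const associations: Association[] = [
  { from: "person:1", to: "doc:42", kind: "authored", createdAt: "2024-06-01T12:00:00Z" },
];
```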
Storing data in 2-dimensional tables makes sense when the primary consumers of data are humans because humans are not really capable of thinking n-dimensionally. The 2-dimensional representation of data in a Relational Database arranges data in a way that makes it queryable and understandable by human users.
AI changes everything because AI is very good at thinking n-dimensionally. For AI to be the most effective it needs as much data as possible. All of the context and “messiness” that gets stripped out of real world data and thrown away before putting it into a 2-dimensional Relational Database is gold for AI.
A Graph Data Model is essential for maximizing the capabilities of AI.
The key performance advantage of Relational Databases is that computers have always been much faster at performing sequential IO (reading and writing data) than random IO. Even though the gap between sequential and random IO has narrowed over time it still remains to this day.
Graph Databases are highly dependent on random IO and so an essential requirement for a Global Monolith is that it be optimized for random IO. A Global Monolith should have NO random IO penalty at all at the storage layer.
Maximizing random IO depends on hardware innovations that will increase concurrency and reduce latency for random IO at the SSD storage layer combined with algorithmic innovations that allow SSDs to be used more efficiently. The combination of these innovations will result in SSDs that have much simpler and lower power controller chips while still offering more channels for concurrent access and higher storage density than existing SSDs.
Building a Global Monolith
Building a Global Monolith requires innovations at all layers of the hardware and software stack.
The 5 Pillars of a Global Monolith (Post Quantum Cryptography, Identity and Data Ownership, Distributed Consensus and Consistency, Immutable Content Addressable Storage, and a Graph Data Model) are only the most essential technologies, without which a Global Monolith is impossible to build.
In some cases, such as with Post Quantum Cryptography and Content Addressable Storage, there is only one possible algorithmic solution to the problem. If it were possible to patent those algorithms, or the features of the system necessary to practically implement them, those patents would be extremely valuable.
Proof-of-concepts for aspects of the Post Quantum Cryptography system have been open sourced with the Ciph Encrypted Video Streaming Platform and a Provisional Patent, METHOD FOR ENCRYPTING DATA FOR TRANSMISSION OR STORAGE VIA A CLOUD SERVICE SUCH THAT THE SERVICE OPERATOR CANNOT IDENTIFY THE DATA BEING TRANSMITTED OR STORED, which was filed and allowed to expire in order to place it permanently in the public domain.
A proof-of-concept for building web based applications using a Graph Data Model and immutable data has been open sourced with the immutable-app and immutable-core-model libraries.
Complete designs, algorithms, and proof-of-concepts exist in proprietary form for the remaining pillars. This IP portfolio is owned by a non-US domiciled company and is available for licensing on non-discriminatory terms to any company in the world.