On Data as a renewable resource
Data Veracity is key

Since the Gutenberg printing press, modernity has carried a new paradigm of what it means to be human. For the first time, information that had only been available via private wires was democratized into affordable, commodifiable units: ‘tokenised’, to use the AI term. To borrow a concept from Peter Thiel, though, this shift leaned more towards the world of bits than the world of atoms. AI is now well past the tipping point: it is everywhere, and the funds are gushing in.
Before we dissect that further, let’s motivate the concept with a few quotes from John Naisbitt:
We are drowning in information but starved for knowledge.
The more high technology around us, the more the need for human touch...HighTech/High Touch. The principle symbolizes the need for balance between our physical and spiritual reality.
The new source of power is not money in the hands of a few, but information in the hands of many.
In a world that is constantly changing, there is no one subject or set of subjects that will serve you for the foreseeable future, let alone for the rest of your life. The most important skill to acquire now is learning how to learn.
Machine learning algorithms rely on mechanisms such as reinforcement, repetition, recall, replication, and redundancy to improve on previously trained data sets and to evolve on new field data, structured data, and the value locked in unstructured data. These algorithms can attack structured data with supervised learning methods and unstructured data with reinforcement learning (trial and error, the 10-armed bandit, etc.).
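For readers who have not met the 10-armed bandit before, here is a minimal sketch of the trial-and-error idea, assuming an epsilon-greedy strategy and Gaussian rewards. The function name `run_bandit`, its parameters, and the reward model are illustrative assumptions, not something taken from a specific library or from the case study in this post.

```python
import random


def run_bandit(n_arms=10, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a simulated n-armed bandit (illustrative)."""
    rng = random.Random(seed)
    # Hidden mean reward of each arm; the learner never sees these directly.
    true_means = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms  # learner's running estimate for each arm
    counts = [0] * n_arms       # number of times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Noisy reward drawn around the arm's hidden mean.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental sample average: nudge the estimate towards the reward.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return true_means, estimates, counts, total_reward


if __name__ == "__main__":
    true_means, estimates, counts, total = run_bandit()
    favourite = max(range(len(counts)), key=lambda a: counts[a])
    print("most-pulled arm:", favourite)
    print("its estimated reward:", round(estimates[favourite], 2))
    print("its hidden mean reward:", round(true_means[favourite], 2))
```

The learner spends a small fraction of its pulls (`epsilon`) exploring at random and the rest exploiting its current best guess, which is the "trial and error" loop in miniature: repetition sharpens the estimates, and reinforcement steers future choices.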
Before going any further down the technical rabbit hole, let's consider how to avoid drowning in data, which is a huge problem with big data. Five considerations stand out:
Filtering
Veracity
Trusted Source(s)
Vetting
Risk Savvy Considerations
We will not consider methods such as batching, stacking, harvesting… basically anything that is data warehouse or data lake oriented (structured tabular data), or anything sitting in one of the big three clouds (Google BigQuery, Amazon Redshift, Microsoft Azure). No formal ETL procedures or tools will be discussed below.
To distill the five points outlined above, we will build a case study around a simple example: a news story.