What if our Content was just Data?

20 Jun

Data aggregation in the modern web.

It strikes me in this day and age that we are dealing with a very different beast when it comes to data, and yet so much of the web is still stuck in old paradigms.

Monolithic

Back in the day we were all starting out. Everybody needed a website and content and data started to grow, though they were always two very different things. Now everybody has data, and they have plenty of content - possibly even sporting multiple websites to boot.

Technology got more powerful, and web development split into frontend and backend as complexity & capacity grew. New problems arose and new problems were tackled. UIs and APIs became the norm, and yet we were still building our ‘content’ in content management systems (CMS) that were tied to the UI. The two were always entirely dependent on each other. We refer to these CMSs as ‘monolithic’ because content and UI are essentially one.

The lines between content and presentation were vague and interspersed. It has always been normal for us to store our content in a database field with all the templating markup wrapped around what we were trying to convey to the world. Because of this, our content was delivered with its use and its look & feel pre-determined. Re-usability of content was never a thing. Cross platform consisted of my work desktop and my personal laptop. All of that was fine as we were only intending to use it on a simple website.

Now people are realising they have collected so much data in their field of relevance and they don’t know how to make sense of it. They don’t know how to surface it, how to make it consumable or how to work with it efficiently. Monolithic lockups are not viable answers to modern business problems. These tools do not scale well, and they impose limitations on technology and architecture.

Authentication

Authentication concerns & complexity have risen to the point where it’s becoming the recommendation and the norm to use 3rd party authentication providers, storing your users’ profile data separately from their authentication credentials. This separation of credentials from profile information is a good thing. It frees us to use the same credentials for multiple purposes, allowing the same user credentials to be used to access multiple applications.

Anything from a small marketing landing page to a large enterprise application is now being tooled out & tackled with separate APIs and cloud services, though there is still this innate dissonance between data and content.

Microservice architecture and similar notions are becoming increasingly valuable and necessary to handle growth in this increasingly interconnected world.

Cloud functions and cloud compute, asset CDNs, caching, SSR, preloading, hydration, performance, service-workers, offline first, security and scalability are becoming musts in order to capture your audience.

People move fast on the web. You’ve got to be more engaging and do it faster or your lead, your customer, your user is gone. Tools and technologies have continued to advance at an alarming rate and it’s not about to slow down any time soon.

Our old monolithic paradigms are not serving us so well anymore. We are now well inside the age of IoT, and the content we’ve spent large sums of money (more than most are willing to admit to) curating and finessing is no longer usable in the modern web.

Stuck Inside

Our Data (read content) is stuck inside monolithic CMSs and lost inside template markup that isn’t even in alignment with our blog’s theme intentions, let alone ready for cross platform use. The old tables and articles with imagery are painfully un-responsive and quite frankly embarrassing when viewed on mobile.

People don’t use their mobiles to browse the web, right? If they are connecting with us, they’ll go to the app store and download our app, surely? Not so much. People are consuming most of their content on mobile web pages, which leads to the question: “Are our sites ready?”

It’s expensive and complicated to try and get our data out of a CMS like WordPress. Often redevelopment of a site will have started elsewhere before the question gets asked: “How will we get the content from our old CMS site into our nice new expensive CMS?” It’s even harder to consolidate and migrate users, at least without a little user disruption. This is the world of the monolithic CMS: data (blog posts, product information, company information) and page layouts (templates) are all locked up inside the same box.

Content is Data

So how do we handle this increasing problem?

Well, like with auth, we get our content out, we unlock it, we unburden it. We connect our UIs to one gateway that can provide access to, and communicate with, all the APIs & services it requires – delivering content through one access point. We remove the templating logic that is wrapping our content and make it reusable again.

We need to stop thinking about our content as page content. Content is Data, and when we frame that data right we can use it any way we choose, on any platform, at any size - even on our wrist watch.

When we distil our content down to just data, it opens many doors for its use, its purpose, its limitations (or lack of). We should be able to show our content on screens in cars, on kiosks and mini displays, screen readers, any media display type. When our content is just data, we can connect it to other data, and make relationships and references to all of our existing data. We can unpack, better understand and plan all of our data types and use cases.
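As a rough illustration of what “content as data” can look like, here is a minimal sketch in TypeScript. The `Article` and `Author` shapes, the field names and both render functions are assumptions made up for this example; they are not taken from any particular CMS. The point is only that one record, stored without markup, can feed very different displays.

```typescript
// Hypothetical content model: all field names here are illustrative.
interface Author {
  id: string;
  name: string;
}

interface Article {
  id: string;
  title: string;
  summary: string;
  paragraphs: string[]; // plain text only, no templating markup
  authorId: string; // a reference to another record, not an embedded byline
}

const authors: Author[] = [{ id: "a1", name: "Sam Example" }];

const article: Article = {
  id: "p1",
  title: "What if our Content was just Data?",
  summary: "Content stored without markup can be reused on any device.",
  paragraphs: ["Content is data.", "Presentation is a UI problem."],
  authorId: "a1",
};

// Escape text so it is safe to place inside HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Renderer 1: a web page fragment, resolving the author reference.
function renderHtml(a: Article, people: Author[]): string {
  const author = people.find((p) => p.id === a.authorId);
  const byline = author ? `<p class="byline">${escapeHtml(author.name)}</p>` : "";
  const body = a.paragraphs.map((p) => `<p>${escapeHtml(p)}</p>`).join("");
  return `<article><h1>${escapeHtml(a.title)}</h1>${byline}${body}</article>`;
}

// Renderer 2: a single short line for a very small display such as a watch.
function renderGlance(a: Article, maxChars: number = 40): string {
  const text = `${a.title}: ${a.summary}`;
  return text.length <= maxChars ? text : text.slice(0, maxChars - 1) + "…";
}

console.log(renderHtml(article, authors));
console.log(renderGlance(article));
```

Because the author is a reference rather than text baked into the article, the same author record could just as easily be reused on a profile page or in a feed.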

Ability to Scale

At Cucumber we are a tech company – we’re not just building blogs and marketing bumf. We’re solving real world problems through connected technology, working with big data & aggregated feeds from many sources, cloud functions and microservices, and surfacing it all in elegant, meaningful ways that lead to new pathways and new opportunities.

One of the most limiting factors in terms of being able to scale and evolve a product is its dependencies. We evaluate our solutions to find the balance between cost & scale within our architecture designs – we measure against ‘is this closing doors or opening doors?’ and ask what limitations this decision could be creating or imposing. With a monolithic CMS the answer to that question is most definitely ‘closing doors’!

We’ve recently been investigating options and alternatives in this space and it has been an interesting investigation. There are many wonderful tools out there. Some designed to deliver websites very quickly, others designed to provide a huge array of integrations to other sites. Some of these tools are very powerful indeed, and may require a super human effort to learn.

However as in life, too much power can be crippling.

Providing a tool where our clients could do ‘anything’ sounds like a big ask, as there are so many factors at play. But let’s just say for a moment that, with regards to content design and structure, you can drag and drop, scale, resize and rotate anything. How many people who are not trained web designers could actually do their company, their brand and their content justice?

Headless

Enter the headless CMS.

What if we separated it all out, so our content was just display agnostic data (as ignorant of the other moving parts as the rest of the APIs and services are)?
What if we centralised the core concerns of our digital ecosystem into the right buckets and split the responsibility of complexity into logical camps of common concerns?
What if our content was just data? We could use it anywhere, any way, we could remain unencumbered, we could do ‘anything’ within the capacity of the web. And that’s a big playing field.

That certainly sounds like a better position to be in when things are moving so fast. It would be nice if we could keep up; it is critical that we don’t fall too far behind.

Scalability

Scalability is a key consideration and very important when looking at any software platform. Is the product designed for scale naturally or are there factors which limit the ability to scale?

Understanding the key focus of the product, and the guiding principles behind the build and implementation of the components that go into a solution package, is key.

This has already been pointed to above. What matters is whether that focus can be seen in the implementation, because efficiency, scalability and the range of possible use cases will all be greatly impacted by it.

When evaluating which product to use, it is very important to understand the target audience and the feature sets included or excluded, including the rationale for such decisions. Every product has a target audience. Sometimes the finer details can be hard to identify, but if a product is being built out with a focus that doesn’t suit your use case it should be dropped from the list.

Control of data

Can we download our data? Where does our data live? Who ultimately owns it?

The answers to these need to be: yes, where we want, us (as in the client, not the provider). Data ownership and data sovereignty are key considerations when choosing any platform that will host your data. The ability to host your own system, manage and control access to the underlying database(s) and know exactly where the data is, who has access and for what purposes matters in any IT decision today.

Control of UI

Does the product build in the capacity to easily control, scale and modify the UI for the CMS? It’s common for CMSs, particularly low-code CMSs, to be a black box in this regard. It’s very important to be able to work with any UI that we provide and not be locked into solutions and limitations.

API options (GraphQL/REST)

The quality and the customisability of the API are very important factors. GraphQL is a great base benchmark as it is inherently scalable & flexible and ideal for gathering data requirements for UIs. The ability to produce a REST API for other applications and mobile apps is a very important feature when separating content from presentation.

But on top of that, how much control do we have over data visibility & access, model schemas and customisations, relationships and aggregations? These are all base requirements for a system to be powerful and scalable for CMS data.
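To make the “gathering data requirements for UIs” point a little more concrete, here is a small TypeScript sketch of how two different UIs might ask a GraphQL API for different slices of the same content. The `articles` field, its `limit` argument and the field names are hypothetical schema names chosen for illustration, and `buildArticlesRequest` is our own helper, not part of any library. The `{ query, variables }` shape is the usual JSON body for a GraphQL request over HTTP.

```typescript
// Shape of a typical GraphQL-over-HTTP request body.
interface GraphQLRequest {
  query: string;
  variables: Record<string, unknown>;
}

// Build a query that asks only for the fields a given UI needs.
// `articles` and its `limit` argument are hypothetical schema names.
function buildArticlesRequest(fields: string[], limit: number): GraphQLRequest {
  // Allow only simple field names so arbitrary text cannot be injected into the query.
  const safeFields = fields.filter((f) => /^[A-Za-z_][A-Za-z0-9_]*$/.test(f));
  if (safeFields.length === 0) {
    throw new Error("At least one valid field name is required");
  }
  const query = `query Articles($limit: Int) { articles(limit: $limit) { ${safeFields.join(" ")} } }`;
  return { query, variables: { limit } };
}

// A web page and a watch face ask the same API for different amounts of data.
const webRequest = buildArticlesRequest(["title", "summary", "body"], 10);
const watchRequest = buildArticlesRequest(["title"], 3);

console.log(JSON.stringify(webRequest));
console.log(JSON.stringify(watchRequest));
```

Each request body would then be sent as JSON in a POST to whatever GraphQL endpoint the chosen platform exposes. The content itself does not change; only the question each UI asks of it does.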

The Tin

At the end of the day a Headless CMS is exactly what it says on the tin: a Content Management System. It is not what we’re more accustomed to expecting from a monolithic CMS, which comes free with a templating engine and a theme-able UI. That sounds great, but it means your content is inherently embedded in template structure and your theme is applied at such a high level that, without a huge amount of extra work, you’ve got yourself locked into quite an inflexible solution that is expensive to scale and limited in ultimate scope.

With a Headless CMS, your content is just data, and we can use that data to create any UI and render any subset of that data on any device. When we are working so close to the data we can define relationships, filters and collections (sets) between and within any part of that data, and so create our own notion of what we know “widgets” to be. That lets us provide bite-sized building blocks so you can ultimately create any kind of page or element structure you want. And the onus of responsive design and cross platform stability/consistency is on us.
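One way to picture a “widget” in this sense is as nothing more than a named filter, order and size applied to the data. The TypeScript sketch below assumes a made-up `Post` shape, made-up sample posts and a hypothetical `WidgetSpec`; a real platform would express the same idea through its own query options.

```typescript
interface Post {
  id: string;
  title: string;
  tags: string[];
  publishedAt: string; // ISO 8601 date, so string order matches date order
}

// A "widget" here is just a named query over the data: a filter, an order and a size.
interface WidgetSpec {
  name: string;
  tag?: string; // when set, only include posts carrying this tag
  limit: number;
}

function selectForWidget(posts: Post[], spec: WidgetSpec): Post[] {
  return posts
    .filter((p) => spec.tag === undefined || p.tags.includes(spec.tag))
    .sort((a, b) => b.publishedAt.localeCompare(a.publishedAt)) // newest first
    .slice(0, spec.limit);
}

// Made-up sample data for illustration only.
const posts: Post[] = [
  { id: "1", title: "Headless basics", tags: ["cms"], publishedAt: "2023-05-01" },
  { id: "2", title: "Scaling APIs", tags: ["api"], publishedAt: "2023-06-10" },
  { id: "3", title: "Content is data", tags: ["cms", "data"], publishedAt: "2023-06-20" },
];

const latestCms: WidgetSpec = { name: "Latest CMS posts", tag: "cms", limit: 2 };
const titles = selectForWidget(posts, latestCms).map((p) => p.title);
console.log(titles);
```

The widget definition says nothing about layout. How those two posts are drawn on a desktop page, a phone or a kiosk stays entirely with the UI.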

Initially it’s just a case of understanding, together, what those components (“widgets”) look like, what else is interacting with them, and which pages need additional (custom) functionality or integration with data from other APIs. There can easily be a blended/hybrid approach where we achieve a powerful web interface with full freedom of control. All responsibilities around responsive design, device handling and what your content actually looks like are a UI problem. The data knows nothing about the UI, only about the data.

A different UI can tackle a completely different problem with the same data and deliver a new business outcome.

What did we choose?

After reviewing a range of products and solutions that are available, we settled on one product as being best for our needs: Directus.

Directus was the only one to make it through testing without identifying a “failure to meet” against it.

Directus, combined with Hasura (an open-source platform for bringing federated data together into a consistent structure), enables Cucumber to unlock data that has been locked away in monolithic CMSs, backend databases, ERP tools, spreadsheets and any number of other sources holding data hostage.

Once unlocked, this data can be brought forward and made to work for your business, delivering real value to your customers in meaningful ways, on the devices your customers are using, where and when they need it.