Thoughts around second-order effects on text

My friend Paulo Vasconcellos recently shared a post on LinkedIn that sparked a deep reflection for me. One part of his analysis, on data quality and AI partnerships, particularly caught my attention:

“A practical example is GPT Image, the one from the Studio Ghibli trend. Behind it was Sora, a model trained on data from these platforms. Even when denied by executives like Mira Murati, the facts show that quality comes from access to this data, combined with the architectural quality of the model.

Quality data still exists, and in large quantities. It’s just harder to obtain access. That’s why we’ve seen major multi-million dollar partnerships between AI providers and data holders, such as the NY Times licensing content to Amazon. Those who don’t have this capability are now investing heavily in synthetic data as an alternative.”

I responded with some thoughts on what I believe will be the “second-order effects” of these shifts on text and LLMs:

Sensational reflection, Paulo. I’m still in the camp that will always bet on text, but your reflection shows how these “walled gardens” will continue to dominate online text.

I also don’t buy into the intellectual hipsterism that suggests an industry collapse is imminent; however, I think about three specific things:

  1. Cognitive Retention: We are in uncharted waters regarding how these new forms of information transmission will affect cognitive aspects, especially knowledge retention.
  2. Linguistic Erosion: How much spoken language will affect written language, specifically by further lowering lexical and grammatical barriers.
  3. Algorithmic Incentives: How the incentives of audio/video production align with the rewards handed out by the “algorithm™” of these walled gardens, specifically the injection of mannerisms, speech habits, and aesthetic and stylistic elements (or the lack thereof), and how this will affect knowledge production.

If I had to make a prediction, I would say that the first company or platform that manages to index and summarize audio/video content, and to deliver it at the speed that writing offers today (without all the scenic and rhetorical aspects of audio/video), will have a trillion-dollar business.