Morpheus: a perceptual-hash based registry with proofs of transformation

Context

Motivation

We are in the midst of a steep decline in the cost of content creation and distribution. For anyone with an internet connection, it is becoming increasingly trivial to share a news snippet, take and upload an image, or write and produce a video. Social media cracked mass content distribution, and artificial intelligence is cracking (and will continue to crack) content creation at scale. Now, we find ourselves in a scenario where:

<aside> 💡 (1) social media can disproportionately amplify fake news and hearsay,

(2) AI can effortlessly create content, and

(3) citations/references are sparse and faulty

</aside>

This scenario is nightmarish when we consider the psychology behind content consumption. The weight we give to content is a function of its perceived authenticity; our willingness to internalize it is a function of its relevance and timeliness. If perceived authenticity can be manipulated (through social media) and relevance can be engineered (through AI), online content consumption is about to resemble a noisy, chaotic, and ambiguous echo chamber.

So, where do we go from here? We can:

<aside> 💡 (1) ban AI-generated content (AIGC) on the web,

(2) severely limit the scope of what can be generated, or

(3) track and watermark AIGC.

</aside>

The cat is out of the bag, so (1) is impossible at this point. (2) is just as hard: open-source software (OSS) models and tooling, once distributed, cannot be recalled or restricted. Thus, we’re left with (3).

But how do we approach this?

An ideal scheme

Remember that the goal with option (3) isn’t to remove or ban AIGC, but rather, to track and watermark it. Thus, in a perfect world, you might imagine that all content on the web is marked with some kind of invisible identifier. The identifier could contain a cryptographic signature attesting to the origin of the content, or additional metadata like the model used to create a deepfake or the approximate location of a photo.
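To make the idea of an identifier concrete, here is a minimal sketch of a payload that bundles metadata with a tag attesting to its origin. Everything in it is an assumption made for illustration: the field names, the `make_identifier` and `verify_identifier` helpers, and the demo key are hypothetical. It also uses an HMAC from the Python standard library as a stand-in for a real public-key signature, and an exact SHA-256 digest to bind the tag to the content, which means any re-encoding of the content would invalidate it.

```python
import hashlib
import hmac
import json

# Hypothetical shared key, for illustration only. A real scheme would use a
# public-key signature so that anyone can verify without holding a secret.
DEMO_KEY = b"demo-key-not-for-production"


def _encode(payload: dict) -> bytes:
    # Canonical JSON so the same payload always produces the same bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def make_identifier(content: bytes, metadata: dict) -> dict:
    """Bundle a content digest and metadata with an HMAC tag over both."""
    payload = {
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "metadata": metadata,
    }
    tag = hmac.new(DEMO_KEY, _encode(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_identifier(content: bytes, identifier: dict) -> bool:
    """Check that the tag matches the payload and the payload matches the content."""
    payload = identifier["payload"]
    expected = hmac.new(DEMO_KEY, _encode(payload), hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, identifier["tag"])
    digest_ok = payload["content_sha256"] == hashlib.sha256(content).hexdigest()
    return tag_ok and digest_ok


# Hypothetical content and metadata.
photo = b"raw image bytes"
identifier = make_identifier(photo, {"model": "example-model", "location": "approximate"})
is_valid = verify_identifier(photo, identifier)
```

The sketch only covers what the identifier might contain; how to embed it invisibly and robustly in the content is a separate problem.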

The ideal identifier should be imperceptible to human senses and impervious to algorithmic analysis. It should also be robust to both unintentional transformations, such as compression, screenshotting, and re-capture, and intentional attacks, such as histogram or Gaussian noise manipulation.

<aside> 💡 Our ultimate aim is to embed critical information into content in a resilient and invisible manner, so that we can reason about the creator and the context in which the content was created.

</aside>

This, unsurprisingly, sounds a lot like what image steganography and digital watermarking can enable us to do.

Image steganography and digital watermarking

Both image steganography and digital watermarking are techniques used to hide information within an image. Image steganography involves embedding secret data within an image in such a way that it is not visible to the human eye. Digital watermarking, in a similar vein, is a technique used to mark content to indicate ownership or copyright, or to detect unauthorized copies.
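As a rough illustration of hiding data in an image, the sketch below uses least-significant-bit (LSB) substitution, a common textbook technique chosen here only for its simplicity. The flat list of 8-bit grayscale values standing in for an image, and the `embed_lsb` and `extract_lsb` helpers, are assumptions made for this example.

```python
def embed_lsb(pixels: bytes, message: bytes) -> bytearray:
    """Hide `message` in the least significant bit of each 8-bit pixel value."""
    # Unpack the message into bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message does not fit in the cover image")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # overwrite only the lowest bit
    return stego


def extract_lsb(pixels: bytes, num_bytes: int) -> bytes:
    """Read `num_bytes` back out of the least significant bits."""
    out = bytearray()
    for i in range(num_bytes):
        value = 0
        for pixel in pixels[i * 8:(i + 1) * 8]:
            value = (value << 1) | (pixel & 1)
        out.append(value)
    return bytes(out)


# Hypothetical 8x8 grayscale cover image, flattened to 64 pixel values.
cover = bytes((x * 37 + 11) % 256 for x in range(64))
secret = b"origin"

stego = embed_lsb(cover, secret)
recovered = extract_lsb(stego, len(secret))

# Each pixel changes by at most one intensity level, which is why the
# embedding is hard to see.
max_change = max(abs(a - b) for a, b in zip(cover, stego))

# Clearing the lowest bit of every pixel, as a crude re-quantization might,
# wipes the hidden message entirely.
degraded = bytes(p & 0xFE for p in stego)
after_attack = extract_lsb(degraded, len(secret))
```

The last two lines show the trade-off: an embedding this simple is invisible but fragile, since any operation that disturbs the lowest bits destroys the payload.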

At a high level, steganography and watermarking use three approaches to embed data: