Letter to Mr. Matt Rickard on 2022-11-28

Dear Mr. [[Rickard]],

It's Sunghyun. I hope you had a great Thanksgiving break; mine was great! Lately I've been delving into curating human knowledge and sentience, and I hope to work on some projects in that area.

I recently had an idea for a hypothetical product, "Let's Alter." I have a vision but am unsure where to start.

Let's Alter is a non-profit product that uses generative AI to describe any image on the web. For example, given an image of dogs playing frisbee, the AI would produce alt text such as the following.

ALT: Three white dogs are playing frisbee on a lawn with a woman wearing a red skirt

Facebook and Instagram have been doing this for a while, but AFAIK there are no open-source products. As generative AI has evolved, I expect it is now possible to run it in reverse: take an image and produce several variations of descriptive text.

The final goal is to create a product like Let's Encrypt, which contributed heavily to reducing the number of unencrypted websites. With this technology, I imagine a future where no image on the web is missing its alt attribute.

What would be a logical first step toward this goal? Do any ideas or meta-ideas come to mind?

Any help would be greatly appreciated.

Best regards,
Sunghyun Cho

Response on 2022-11-29

Resources

openai/CLIP: Contrastive Language-Image Pretraining
CLIP: Connecting Text and Images
Vision Transformers (ViT)
CLIP Interrogator - a Hugging Face Space by pharma

Implementation Suggestion

A CLIP inference server, plus a library that wraps an image component (a raw img or next/image).

function AltTextImage() {
  // 1. Hash the image on the client.
  // 2. Check whether the hash exists in AltTextDB.
  // 3. If it does, set the alt text.
  // 4. If it doesn't, upload the image; the alt text is returned
  //    (the hash, and maybe the image, is stored in the [[Database|DB]]).
}
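The lookup flow above could be sketched as follows. This is only a minimal sketch under assumptions: the note does not define an API for AltTextDB or the inference server, so `AltTextDeps`, `resolveAltText`, and `makeInMemoryDeps` are hypothetical names, the database and server are replaced by in-memory stand-ins, and the hash is a toy FNV-1a function standing in for a real content hash such as SHA-256.

```typescript
// Hypothetical interfaces: the note does not define an API for AltTextDB or the server.
interface AltTextDeps {
  hashImage(bytes: Uint8Array): Promise<string>; // content hash computed on the client
  lookup(hash: string): Promise<string | undefined>; // read from AltTextDB
  describe(hash: string, bytes: Uint8Array): Promise<string>; // upload; server returns alt text
}

// Hash first, reuse a stored description when one exists, upload only on a miss.
async function resolveAltText(bytes: Uint8Array, deps: AltTextDeps): Promise<string> {
  const hash = await deps.hashImage(bytes);
  const cached = await deps.lookup(hash);
  if (cached !== undefined) return cached;
  return deps.describe(hash, bytes);
}

// In-memory stand-ins so the sketch runs without a network or a model.
function makeInMemoryDeps(caption: (bytes: Uint8Array) => string) {
  const db = new Map<string, string>();
  let uploads = 0;
  const deps: AltTextDeps = {
    // Toy FNV-1a hash, not collision resistant; a real client would want SHA-256.
    hashImage: async (bytes) => {
      let h = 0x811c9dc5;
      for (let i = 0; i < bytes.length; i++) {
        h = Math.imul(h ^ bytes[i], 0x01000193) >>> 0;
      }
      return h.toString(16);
    },
    lookup: async (hash) => db.get(hash),
    describe: async (hash, bytes) => {
      uploads += 1;
      const alt = caption(bytes);
      db.set(hash, alt); // the server side stores hash -> alt for later lookups
      return alt;
    },
  };
  return { deps, uploadCount: () => uploads };
}
```

A wrapper around img or next/image would presumably call `resolveAltText` once the image bytes are available and set the result as the alt attribute. Injecting the dependencies keeps the caching logic testable without committing to a particular server design.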

Transparency Logs

Possibly incorporate a transparency log, as Let's Encrypt does.
Maybe a log of image hash → alt text that anyone can contribute to?
Maybe Common Crawl is a better model?
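
One way to picture a log of image hash → alt entries is an append-only hash chain, where each entry commits to the one before it. This is a sketch under assumptions, not how Let's Encrypt's logs work: Certificate Transparency logs are built on Merkle trees, and a plain hash chain is swapped in here because it is shorter. The `LogEntry` shape and the `appendEntry` and `verifyLog` names are hypothetical, and the hash function is passed in by the caller (a real log would need a cryptographic hash such as SHA-256).

```typescript
// Hypothetical entry shape; the note only says "image hash -> alt".
interface LogEntry {
  index: number;
  imageHash: string;
  alt: string;
  prevHash: string; // entryHash of the previous entry ("" for the first entry)
  entryHash: string; // hash over this entry's fields plus prevHash
}

// Supplied by the caller; assumed to be a cryptographic hash in any real deployment.
type HashFn = (input: string) => string;

// Return a new log with one more entry chained onto the last one.
function appendEntry(log: LogEntry[], imageHash: string, alt: string, hash: HashFn): LogEntry[] {
  const index = log.length;
  const prevHash = index === 0 ? "" : log[index - 1].entryHash;
  const entryHash = hash(JSON.stringify([index, imageHash, alt, prevHash]));
  return log.concat([{ index, imageHash, alt, prevHash, entryHash }]);
}

// Recompute every link; an edited, removed, or reordered entry breaks the chain.
function verifyLog(log: LogEntry[], hash: HashFn): boolean {
  let prevHash = "";
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    const expected = hash(JSON.stringify([i, e.imageHash, e.alt, prevHash]));
    if (e.index !== i || e.prevHash !== prevHash || e.entryHash !== expected) {
      return false;
    }
    prevHash = e.entryHash;
  }
  return true;
}
```

Anyone holding a copy of the log can run `verifyLog` to check that earlier entries were not rewritten. Deciding who may append, and how conflicting descriptions for the same image are handled, is left open here, as it is in the note.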

Perceptual Hashing

Perceptual Hashing
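
An exact content hash changes completely when an image is resized or recompressed, so the same picture would miss the lookup. A perceptual hash aims to give similar images similar hashes instead. The sketch below uses average hash (aHash), one of the simplest perceptual hashes, chosen here only for illustration; the note does not say which algorithm was meant. It assumes the image has already been decoded to a non-empty grid of grayscale values, and the function names are hypothetical.

```typescript
type Gray = number[][]; // rows of luminance values, assumed non-empty and rectangular

// Shrink the image to size x size cells by averaging the pixels in each cell.
function downscale(img: Gray, size: number): number[] {
  const h = img.length;
  const w = img[0].length;
  const cells: number[] = [];
  for (let gy = 0; gy < size; gy++) {
    for (let gx = 0; gx < size; gx++) {
      const y0 = Math.floor((gy * h) / size);
      const y1 = Math.max(y0 + 1, Math.floor(((gy + 1) * h) / size));
      const x0 = Math.floor((gx * w) / size);
      const x1 = Math.max(x0 + 1, Math.floor(((gx + 1) * w) / size));
      let sum = 0;
      for (let y = y0; y < y1; y++) {
        for (let x = x0; x < x1; x++) sum += img[y][x];
      }
      cells.push(sum / ((y1 - y0) * (x1 - x0)));
    }
  }
  return cells;
}

// One bit per cell: "1" if the cell is brighter than the mean of all cells.
function averageHash(img: Gray, size = 8): string {
  const cells = downscale(img, size);
  const mean = cells.reduce((a, b) => a + b, 0) / cells.length;
  return cells.map((v) => (v > mean ? "1" : "0")).join("");
}

// Number of differing bits; a small distance suggests visually similar images.
function hammingDistance(a: string, b: string): number {
  if (a.length !== b.length) throw new Error("hash lengths differ");
  let d = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) d++;
  }
  return d;
}
```

With hashes like these, the database lookup would become a nearest-neighbor search under some distance threshold rather than an exact key match; the threshold would have to be tuned on real images.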
