260409
260409

260409

  • AutoBuilder

Cursor is incredibly good harness.

Opus is also not Oputhetic anymore

Backlinks (0)

No backlinks found.

260407
260407

260407

AutoBuilder

Backlinks (0)

No backlinks found.

Screenshot as an API
Screenshot as an API

Screenshot as an API

  • Letter to Mr. Matt Rickard on 2022-10-03

References

Screenshots as the Universal API

  • With ML advancements, screenshots are now a universal data format.
    • (decoder) relatively easy to extract...
      • meaning (image-to-text)
      • layout information (object recognition)
      • text (OCR)
      • other metadata (formatting, fonts, etc.)
    • (encoder) diffusion-based models like Stable Diffusion and DALL-E (text-to-image) Prompt Engineering
  • What's good:
    • Easier to parse than highly complex layout formats
      • No need to understand PDF data format
      • No need to hydrate webs for web crawlers
    • Universally available, easily copyable
      • Images aren't the most efficient encoding method for text.
      • But they can be the simplest for humans
      • You can copy objects from photos in the latest Apple iOS 16 update.
  • Permissionless.
    • Many applications won't allow you to export data.
    • Screenshots are always available.
    • Related to when Naver Vibe attempted to steal other music players' market cap with Screenshot Recognition technology.
      • [튜토리얼] 다른 뮤직앱 플레이리스트, 쉽게 가져오는 법
      • '타사 음원 리스트 수초만에 이동' 네이버 바이브에 OCR 적용 - 전자신문
  • More complex metadata
    • Look how effective image search is on mobile. Dogs, City, Oceans...
    • Some come from the actual image metadata, and others are inferred with On-device models.
    • Automatically encoding this data in traditional formats like PDF takes much longer.
  • I wrote a reply like the following. Letter to Mr. Matt Rickard on 2022-10-03

Rethinking the PDF

  • It's founder, John Warnock (co-founder of Adobe), prototyped a compatibility layer where documents would look and, most importantly, print (!) the same regardless of the computer they were viewed on (1993 video). This is the PDF.
  • The "killer app" for PDF was tax returns - the IRS adopted PDF in 1996 because of a rumored frustration with the US Postal Service.
  • Things that lack:
    • Enterprise-grade OCR for PDF documents still doesn't exist in 2022, albeit having state-of-the-art computer vision techniques.
    • Interactive and web-enabled forms. Sometimes it saves without the data filled in
    • Slow page loads. Better alternatives. EPUB, MOBI for texts. For generic use cases, DjVu.

References

  • PDF processing and analysis with open-source tools
Backlinks (4)
  • Unsemantic
  • 221010
  • 221003
  • Matt Rickard
Index
cho.sh
I prefer CLIBB9A08260619260619컴퓨트로늄37A88F컴퓨트로늄0CF03F컴퓨트로늄2C60FB260618260618260418260418260528260528AutoBuilder63849A260419260419Setup9AC296StellaD226F7260415260415Debian SetupD2F701260414260414anaclumos/configs/AGENTS.mdED86A3Ramp의 AX (회사를 AI로 물들이는 법)840774260413260413How to get your company AI pilled46544C260411260411260409260409260407260407260406260406Separating Claude Code Personal Sub and Claude Code Company Sub33A53C
Warning
This post is more than a year old. Information may be outdated.