Image: Hero image of the iPhone's Korean keyboard: "Sky, Earth, and Human."

💎Give me the App Store Link first!

Of course! Here is the App Store Link. Also available on GitHub.

More and more Korean citizens are considering iPhones. Interestingly, the elderly Korean generation is purchasing iPhones more than ever. While many state the primary reasons for choosing a Galaxy to be call recording and Samsung Pay, my observation after my parents switched to iPhones differed.

Unexpectedly, the most significant difficulty for older generations was the keyboard. Korean customers have had no problem typing Korean with a 10-key dial pad since the very early days because there has been a powerful input method known as "천지인" (Cheon-Ji-In, which translates to "Sky, Earth, and Human") to input Hangul, the Korean characters. This was unlike Roman alphabet keyboards, which required several characters to be crammed onto a single button. Koreans had far less of a need to switch to QWERTY keyboards because of this. Many people still use Cheon-Ji-In unless they are members of Gen Z who grew up with smartphones.

The patent for Cheon-Ji-In entered the public domain in 2010. iPhone added support for Cheon-Ji-In in 2013, but its shape differed from that of the standard Cheon-Ji-In. The starkest difference was that the space button and the next character buttons were separate.

The space button and the next character button

💎For example, to type "오 안녕"...
  • Galaxy: → Space → Space
  • iPhone: → Space → Next Character

Moreover, the size of each button was smaller, making people produce more typos than ever. For these reasons, I decided to replicate the Galaxy Cheon-Ji-In experience on iPhones.

💎Goal

Let's recreate the original Cheon-Ji-In for iPhones!

🍯Extra Tip

I also open-sourced the research notes for this project.

First of all, I checked the legal rights. I found that the patent holder, 조관현 (Cho Kwan-Hyeon), had donated the patent to the Korean government, and Cheon-Ji-In had become the national standard for input methods, publicizing the legal rights to the keyboard. So I confirmed these details and then moved on to the development process.

🛠 Readying the tech

I first read through Apple's Creating a Custom Keyboard document. It was similar to creating a regular iOS app: create ViewControllers and embed the logic inside. However, I wanted to try SwiftUI since it was my first time using it. Moreover, SwiftUI Grid looked like a clean approach to organizing buttons. Still, I figured that Grid is more suitable for things like the Photos app, which has numerous elements to lay out, and that a simple HStack and VStack (similar to display: flex in the Web ecosystem) would suffice for my needs.

iPhone third-party keyboards use a unique structure known as extensions. Anything not running on the main iOS app is an extension — custom keyboards are extensions, iOS widgets are extensions, and parts of Apple Watch apps are extensions. I read through Ray Wenderlich and understood how keyboard extensions worked.

keyboard image of having gray background around `ㅇ`


A few early prototypes

The gray background around "ㅇ" came from iOS's setMarkedText, which takes an NSRange. It helps text entry by marking the characters currently being edited, but this approach seemed better suited to Pinyin input for Chinese than to Cheon-Ji-In for Hangul.

Another interesting observation was that the colors of the default iPhone keyboards differed from any default system colors provided with iOS. I had to extract the color with Color Meters one by one.

😶‍🌫️ But then how do we make Cheon-Ji-In?

Supplementary YouTube video on how Hangul system works.

I first tried to work out the input logic of Cheon-Ji-In case by case, then realized that this was tremendously difficult. For example:

  • To input , we start with 안ㅅ and press ㅅㅎ to acquire . That is, we must check if the characters are "re-mergeable" with the character before.
  • From , when we input , it must be 안즈. Therefore, we must check if the last consonant is extractable from the previous character.
  • From , when we input ㅂㅍ, it should result in 깔ㅃ. We must check if the consonants are extractable and switch between fortis and lenis (strong and weak sounds, like '/p/ and /b/', '/t/ and /d/', or '/k/ and /ɡ/' in English).
  • From , when we input ㅅㅎ, it should result in . More than switching between , , and , we must consider double consonant endings like .

These are just a few examples. Even with KS X 1001 Combinational Korean Encodings, considering all cases would take a lot of work. I concluded that a Finite State Machine would require more than 20 data stacks and dozens of states. (I am unsure of this calculation because I guessed some parts of it; there may be a more straightforward implementation.) If you want to try building such an algorithm, refer to this patent's diagrams. I found some implementations online, but they were long and spaghettified. Translating them to Swift and understanding the code would take significant time.

But then I came to an epiphany:

💎If there are too many cases...

why don't I hardcode every combination?

After all, aren't keyboards supposed to input the same character, given the input sequence is the same? What if I generate all possible combinations and put them into a giant JSON file? Korean character combinations are around 11,000. Even considering previous characters, the combinations seemed to be at most 100K levels. The size of the JSON file will not exceed 2MB.

We are not living in an era where we must golf with KBs of RAM on embedded hardware. As long as Hangul coexists with the human species, someone will recreate Cheon-Ji-In in the future, making constructing the complete Hangul map worth it.

🖨️ Hwalja: The most straightforward Cheon-Ji-In implementation

Therefore, I created Hwalja: the complete map 🗺️ of Hangul, containing all such states and combinations of Cheon-Ji-In. There are around 50,000 states, and the minified JSON is about 500 KB. (Note: Hwalja means movable type in Korean.)

To implement additional higher-level features (such as removing consonants, not characters, on backspace or using timers to auto-insert "next character" buttons), we need more functional workarounds; however, the critical input logic is as simple as the following:

const type = (prev: string, Hwalja: hwalja, key: string, editing: boolean) => {
  const last_two_char = prev.slice(-2)
  const last_one_char = prev.slice(-1)
  // Prefer the longer match: replace the last two characters if the map knows them.
  if (editing && last_two_char in Hwalja[key]) return prev.slice(0, -2) + Hwalja[key][last_two_char]
  // Otherwise, try to merge the key into the last character only.
  if (editing && last_one_char in Hwalja[key]) return prev.slice(0, -1) + Hwalja[key][last_one_char]
  // No match: start a new character.
  return prev + Hwalja[key]['']
}

I boldly claim this is the simplest implementation of Cheon-Ji-In, given that it is a five-liner.
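To make the lookup concrete, here is a minimal, self-contained sketch that runs the same logic against a tiny hand-written table. The table entries, the key names ('ㅇㅁ', 'ㄴㄹ', and so on), and the helper names typeKey and typeSequence are assumptions made up for illustration; the real Hwalja map is far larger and may be shaped differently.

```typescript
// Shape assumed from the snippet above: key -> (trailing text -> replacement).
type HwaljaMap = Record<string, Record<string, string>>

// Hypothetical toy entries for illustration only; not taken from the real Hwalja data.
const toyHwalja: HwaljaMap = {
  'ㅇㅁ': { '': 'ㅇ', 'ㅇ': 'ㅁ', 'ㅁ': 'ㅇ' },
  'ㅣ': { '': 'ㅣ', 'ㅇ': '이' },
  'ㆍ': { '': 'ㆍ', '이': '아' },
  'ㄴㄹ': { '': 'ㄴ', 'ㄴ': 'ㄹ', '아': '안' },
}

const typeKey = (prev: string, map: HwaljaMap, key: string, editing: boolean): string => {
  const table = map[key] ?? {}
  const lastTwo = prev.slice(-2)
  const lastOne = prev.slice(-1)
  // Longest match first: two trailing characters, then one, then a fresh character.
  if (editing && lastTwo in table) return prev.slice(0, -2) + table[lastTwo]
  if (editing && lastOne in table) return prev.slice(0, -1) + table[lastOne]
  return prev + (table[''] ?? '')
}

// Feed a whole key sequence through the lookup, always in "editing" mode.
const typeSequence = (keys: string[], map: HwaljaMap): string =>
  keys.reduce((text, key) => typeKey(text, map, key, true), '')

const result = typeSequence(['ㅇㅁ', 'ㅣ', 'ㆍ', 'ㄴㄹ'], toyHwalja)
console.log(result)
```

Every key press is just a dictionary lookup plus a string slice, which is why no state machine is needed. Passing editing as false models the "next character" button: the key always starts a new character.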

Some may ask how I preprocessed such large combinations; I set the 11,000 final Hangul characters as the destination and traced back what would've been the previous state and what button the user must have entered last. For example, to input 역, the previous state must have been 여, and the keypress must have been ㄱ. Of course, there were many more edge cases. My work from four years ago helped a lot. The following is an interactive example of Cheon-Ji-In, made with Hwalja.

🧪Try it out!
This is an interactive demo of Cheon-Ji-In, made with Hwalja.

I open-sourced Hwalja for platform-agnostic usage.
Please try out the above demo!
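The backward tracing mentioned earlier can be sketched for its simplest case with plain Unicode arithmetic: a precomposed Hangul syllable is 0xAC00 + (initial × 21 + vowel) × 28 + final, so peeling off the final consonant yields the previous state and the jamo that was added last. The function name traceBackFinal is hypothetical, and this covers only that single case; cycling keys and compound finals need the extra rules the real preprocessing had to handle.

```typescript
const HANGUL_BASE = 0xac00

// The 28 final-consonant slots in Unicode order (index 0 means "no final").
const FINALS = [
  '', 'ㄱ', 'ㄲ', 'ㄳ', 'ㄴ', 'ㄵ', 'ㄶ', 'ㄷ', 'ㄹ', 'ㄺ', 'ㄻ', 'ㄼ', 'ㄽ', 'ㄾ',
  'ㄿ', 'ㅀ', 'ㅁ', 'ㅂ', 'ㅄ', 'ㅅ', 'ㅆ', 'ㅇ', 'ㅈ', 'ㅊ', 'ㅋ', 'ㅌ', 'ㅍ', 'ㅎ',
]

// 19 initials x 21 vowels x 28 finals: the "around 11,000" destinations.
const SYLLABLE_COUNT = 19 * 21 * FINALS.length

interface TraceStep {
  previous: string // the state before the last consonant was added
  lastJamo: string // the final consonant that completed the syllable
}

// Hypothetical helper: returns null when there is no final consonant to peel off.
// Compound finals (such as ㄶ) are returned as-is; splitting them is out of scope here.
function traceBackFinal(syllable: string): TraceStep | null {
  const offset = syllable.charCodeAt(0) - HANGUL_BASE
  if (syllable.length !== 1 || offset < 0 || offset >= SYLLABLE_COUNT) return null
  const finalIndex = offset % FINALS.length
  if (finalIndex === 0) return null
  return {
    previous: String.fromCharCode(HANGUL_BASE + offset - finalIndex),
    lastJamo: FINALS[finalIndex],
  }
}

console.log(traceBackFinal('역'))
```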

💎Don't be mistaken...
Hwalja is the simplest implementation, not the lightest.
Can't we use combinatory Hangul sets and normalize the combinations to reduce the case count?

On the Hwalja project, Engineer 이성광 (Lee Sung-kwang) pointed out that using Normalization Form D and decomposing consonants would reduce the case count. I only considered Normalization Form D, but Engineer 이성광 is correct. For example, we decompose 안녕 as 안 ᄂᆞᆞㅣㅇ and use Hwalja to gather ᆞᆞㅣ into ㅕ and then normalize ㄴㅕㅇ into 녕.
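As a rough illustration of what that normalization step involves, the sketch below uses JavaScript's built-in String.prototype.normalize. It shows that a syllable decomposes into three conjoining jamo under NFD and recomposes under NFC, while the keypad-style compatibility jamo (ㄴ, ㅕ, ㅇ) are left untouched by NFC. Converting between the two jamo families would therefore be an extra step; that mapping is not shown here and is an assumption about what a lighter Hwalja would need.

```typescript
// NFD splits a precomposed syllable into conjoining jamo (initial, vowel, final).
const decomposed = '녕'.normalize('NFD')
const codePoints = Array.from(
  decomposed,
  (ch) => 'U+' + ch.codePointAt(0)!.toString(16).toUpperCase(),
)

// NFC puts the conjoining jamo back together.
const recomposed = decomposed.normalize('NFC')

// Compatibility jamo, as printed on the keypad, do not compose under NFC.
const keypadJamo = 'ㄴㅕㅇ'.normalize('NFC')

console.log(codePoints, recomposed, keypadJamo.length)
```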

I decided to maintain Hwalja's current approach because it aims for the easiest and simplest Cheon-Ji-In implementation. The current system enables developers to stick with "substring" and "replace." If I add dependencies on Normalization Form D and Unicode Normalization, the Hwalja project may be lighter, but the developers using Hwalja must add additional handlers for normalizations. I created Hwalja because using Automata and Finite State Machines had steep learning curves. Thus, requiring any learning curves to use Hwalja violates the original purpose. Also, the final minified version is already 500KB, which is manageable for a full-fledged input engine.

🤖 Implementing Keyboard Autocompletes

Cheon-Ji-In users can type at blazing speeds because of their active use of autocompleted texts (Apple QuickType). In addition, these autocompleted texts continuously learn from the user to assist with typing.

Fortunately, Apple's UIKit supports UITextChecker, which frees us from going down to Core ML and Neural Engine levels. Korean is also supported, and we can use learnWord() and unlearnWord() to record data on user activities.

import UIKit

let uiTextChecker = UITextChecker()
let input = "행복하"
let guesses = uiTextChecker.completions(
    // NSRange counts UTF-16 code units, so measure the string the same way.
    forPartialWordRange: NSRange(location: 0, length: input.utf16.count),
    in: input,
    language: "ko-KR"
)

/*
[
"행복한", "행복합니다", "행복하게", "행복할", "행복하다", "행복하고", "행복하지",
"행복하다고", "행복하다는", "행복하기", "행복하면", "행복할까", "행복하길",
"행복함을", "행복하기를", "행복함", "행복하니", "행복한테", "행복하자", "행복하네"
]
*/

I used such features to implement the autocomplete feature. Sometimes the flow feels unnatural, or the keyboard does not suggest anything, but this is a perfect implementation for an MVP.

Happy 2023 💙


⌨️ Advancing Keyboard Functionalities

Cheon-Ji-In, rooted in the 10-key keypad, has many higher-level functionalities, such as long-pressing backspace to delete multiple characters until you release the key or holding any key to input the corresponding number key. I used Swift closures to extend the keyboard component.

struct KeyboardButton: View {
    var onPress: () -> Void
    var onLongPress: () -> Void
    var onLongPressFinished: () -> Void
    var body: some View {
        Button(action: {})
            .simultaneousGesture(
                DragGesture(minimumDistance: 0) // <-- A
                    .onChanged { _ in
                        // Code to be executed when long pressed or dragged
                        onLongPress()
                    }
                    .onEnded { _ in
                        // When long press or drag gesture finishes
                        onLongPressFinished()
                    }
            )
            .highPriorityGesture(
                TapGesture()
                    .onEnded { _ in
                        // Code to be executed on tap
                        onPress()
                    }
            )
    }
}

Code simplified for explanation. KeyboardButton.swift

I found an ingenious implementation in the part marked A. With this, I can implement two features with a single piece of code.

  • Flicking (swiping) on a button to input numbers.
  • Long-pressing on a button to input numbers.

It utilizes iOS's behavior that when the minimum distance of DragGesture is set to 0, iOS cancels the highPriorityGesture when it recognizes long-press and falls back to DragGesture.

Furthermore, I used Combine, introduced with iOS 13. The Combine framework is a declarative Swift API for implementing asynchronous operations. With this, we can create timers to implement the "long press backspace" action.

struct DeleteButton: View {
    @State var timer: AnyCancellable?
    var body: some View {
        KeyboardButton(systemName: "delete.left.fill", primary: false, action: {
            // on tap, execute the default delete action.
            options.deleteAction()
        },
        onLongPress: {
            // when long pressed, create a timer that will trigger every 0.1 seconds.
            timer = Timer.publish(every: 0.1, on: .main, in: .common)
                .autoconnect()
                .sink { _ in
                    // while pressing the button, execute the delete action every 0.1 seconds.
                    options.deleteAction()
                }
        },
        onLongPressFinished: {
            // when the long press finishes, cancel the timer.
            timer?.cancel()
        })
    }
}

Code simplified for explanation. HangulView.swift

With this code, I implemented particular functionalities using long-press or drag gestures.

🦾 Accessibility and usability

I added a few helpful accessibility features. For example, if the user enables "bold text," the keyboard button will reflect the change. The following code implements such behavior.

let fontWeight: Font.Weight = UIAccessibility.isBoldTextEnabled ? .bold : .regular

Bold Text Enabled


Bold Text Disabled


Also, I found one feature particularly inspirational. This keyboard is primarily for people coming from Galaxy Android devices, which have a "back" button in the bottom right corner. Galaxy users are used to dismissing the keyboard with the "back" button. So I placed the keyboard's dismiss button in the bottom right corner to resemble this.

Pressing the bottom right corner button dismisses the keyboard.


🧑🏻‍🎨 Using Midjourney to create the app icon

Midjourney Images

Images created with Midjourney

I used Midjourney, a text-to-image AI program, to create the app icon. This is called prompt engineering. Creating paintings with various keywords was amusing.

☁️ CI/CD with Xcode Cloud

Finally, I built CI/CD using Xcode Cloud (released in 2022). It works much like pushing React code to GitHub and letting Vercel build and deploy it on its own, except that iOS apps are compiled and stored on Apple's Xcode Cloud servers. Because iPhone apps go through the App Store review process, they are not distributed automatically. (You must select a build in the App Store console and hit the "request review" button.) Still, it's much easier than creating an archive file in Xcode and manually uploading it.

You can check the build linked with GitHub on the App Store console


Push notifications are supported.


🏁 Finishing up

It has been a while since I did iOS development; it was a thrilling experience. The iOS platform has greatly matured. In particular, while working on Hwalja, I felt that Hangul was meticulously engineered. Most of all, I felt good because I made this app for my parents as a present. I will finish this article by attaching the links.

💙A five-star review on the App Store and a star on GitHub would really help me!

🗣Talk is cheap; show me right now!
Of course. Click on the black oval below. It will display a song I am currently listening to or any of my 30 most recently played songs.

Click on the black oval above!

Good artists copy; great artists steal. So I am now replicating an idea from LeeRob.io, the website of Lee Robinson, Vercel's VP of DX. Known for being an excellent testing bed for new Next.js features, the site has one outstanding functionality: it displays the song its owner is currently playing.

leerob.io

Now Playing — Spotify @ leerob.io

I have an unmistakable taste in music genres and longed to implement this one day on my website. However, I wanted it to be a technical challenge rather than simply recreating it. I also tried out various music services, making me postpone the development. I kept delaying the action until Apple released an exciting feature in 2022 called the Dynamic Island.

The Dynamic Island

The punch-hole camera at the top will reshape itself into different widgets.

The Dynamic Island perfectly satisfied my desire for a technical hurdle, so I planned to implement it with Web technologies. I also checked out different copies of the Dynamic Island for Android products, which all had awkward animation curves, further getting me interested in learning such details.

💡Goal

Let's recreate the Dynamic Island on the Web!

💰Extra Tip

I also open-sourced my research notes for this project.

🛠 Readying the Tech

I went with the most familiar choice for the framework: Next.js and Tailwind. However, the animation troubled me. I had never dealt with anything more complicated than ease-in-ease-out CSS animations. I learned about a production-ready motion library named Framer Motion and opted for it.

Framer Motion


🧑🏻‍🏫 The Physics of Animations

We first want to understand why Apple's animations look different from the others. We can classify animations into two big categories. (At least Apple classifies theirs into these two categories in their platform.)

Parametric Curve. Given a start and an endpoint, place a few control points and interpolate the curve in between using mathematical formulas. Depending on the type of interpolation formula, it can be a linear curve, polynomial curve, spline curve, etc. The Bézier curve that many developers often use falls under this category.

Spring Curve. Based on Newtonian dynamics (Hooke's law, the law that governs a spring's physical motion), we calculate the physical trajectory using stiffness and damping. Learn More: Maxime Heckel

Any further discussions on animation curves will be out of the scope of this post. Most replications of the Dynamic Island choose parametric curves (it's the easiest, standardized in CSS). Apple uses spring motion, supposedly to mimic real-world physics. Framer Motion, the library I chose for this project, also provides a React hook named useSpring() to give control of such physical animations.

import { useSpring } from 'framer-motion'
useSpring(x, { stiffness: 1000, damping: 10 })
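To see what stiffness and damping actually do, here is a small standalone sketch that integrates a damped spring step by step using Hooke's law. It is not Framer Motion's solver (which may use a different method); the function name simulateSpring, the unit mass, and the step size are assumptions chosen for illustration.

```typescript
interface SpringOptions {
  stiffness: number // spring constant k in Hooke's law
  damping: number // friction coefficient opposing velocity
  mass?: number // assumed to be 1 unless specified
}

// Returns sampled positions of a value animating from `from` to `to`.
function simulateSpring(
  from: number,
  to: number,
  { stiffness, damping, mass = 1 }: SpringOptions,
  seconds = 2,
  stepsPerSecond = 600,
): number[] {
  const dt = 1 / stepsPerSecond
  let position = from
  let velocity = 0
  const samples: number[] = [position]
  for (let i = 0; i < seconds * stepsPerSecond; i++) {
    const springForce = -stiffness * (position - to) // pulls toward the target
    const dampingForce = -damping * velocity // bleeds off energy
    const acceleration = (springForce + dampingForce) / mass
    // Semi-implicit Euler: update velocity first, then position.
    velocity += acceleration * dt
    position += velocity * dt
    samples.push(position)
  }
  return samples
}

const springSamples = simulateSpring(0, 1, { stiffness: 400, damping: 30 })
const peak = Math.max(...springSamples)
const settled = springSamples[springSamples.length - 1]
console.log({ peak, settled })
```

With these values the motion overshoots the target slightly before settling, which is the small "bounce" that a parametric easing curve does not produce on its own.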

🛥 To the Dynamic Island

Source: Apple


I had to study the different behaviors of the Dynamic Island with Apple's official documents. The Dynamic Island can be any of the following forms:

Minimal: The widget takes one side of the Dynamic Island when two or more background activities are ongoing.

Compact: The standard form, where the widget takes both sides of the Dynamic Island when there is one ongoing background activity.

Expanded: The biggest size of the Dynamic Island, shown when the user long-presses on the Dynamic Island. It cannot display content in the red area.

Furthermore, I found the following image on the Web. Apple labels every large size as Expanded, but this image breaks the Dynamic Island's expanded state down into several sizes.


Different sizes of the Dynamic Island. Considering that there is a typo on the image, it doesn't seem like an official document.

I declared the type as the following, reflecting the earlier information.

export type DynamicIslandSize =
| 'compact'
| 'minimalLeading'
| 'minimalTrailing'
| 'default'
| 'long'
| 'large'
| 'ultra'

Then I spent a whole night (2022-10-16) figuring out how to shift sizes naturally with Framer Motion. It uses the following code. I experimented with many stiffness and damping values; the sweet spot was const stiffness = 400 and const damping = 30.

<motion.div
  id={props.id}
  initial={{
    opacity: props.size === props.before ? 1 : 0,
    scale: props.size === props.before ? 1 : 0.9,
  }}
  animate={{
    opacity: props.size === props.before ? 0 : 1,
    scale: props.size === props.before ? 0.9 : 1,
    transition: { type: 'spring', stiffness: stiffness, damping: damping },
  }}
  exit={{ opacity: 0, filter: 'blur(10px)', scale: 0 }}
  style={{ willChange }}
  className={props.className}
>

As of Oct 16th, 2022


📞 Hello?

Before connecting with external APIs, I mimicked Apple's incoming phone call widget. There's no big reason for this; it was just to get used to the animations. I love how it turned out; it looks exactly like the official Apple animation! Finished on Oct 20th, 2022.

↑ Click ↑

🍎 Apple Music API

Then I needed to integrate with Apple Music's API. I previously made a technical demo with Spotify's API at the beginning of 2021. Spotify officially has a Now Playing API, so naturally, I expected a similar Now Playing API at Apple Music.

I was a huuuuge fan of IZ*ONE back then... 😅


Spotify Now Playing API

Spotify for Developers

When Apple Music API 1.1 was released, Apple released an API named Get Recently Played Tracks — the closest we ever got to the Now Playing API. FYI, such an API did not even exist two years ago.

Apple Music Get Recently Played Tracks


Now we need to issue and save the different tokens needed for OAuth 2.0. Spotify almost precisely followed the OAuth 2.0 standards, while Apple required a little more processing. This API, in particular, touches both data on the Apple Music server and the user's private data, so I needed a separate layer of privilege control backed by the user's access grant. Moreover, none of this was well documented, which made it significantly more complicated. I needed the following:

| Aviation Industry | Same Concept at Apple | Explanation |
| --- | --- | --- |
| Establishing your Aviation Company | Apple Developer Paid Account | $99 |
| Pilot License | Apple Music Key from Apple Developer Website | Ensures that you have permissions to get data from Apple Servers |
| Air Carrier Operating Permit | Apple Music Main Token from requesting to Apple Server | A permit to attach when requesting to Apple |
| Airline tickets for Passengers | Apple Music User Token from user's grant | Ensures that the user wants to use my service |

All four pieces of information must work together to retrieve the user's data (what they were listening to). Most of them were pretty straightforward (More Info: Research Note). The trickiest one was the User Token. It is designed for iOS, macOS, and MusicKit on the Web. MusicKit on the Web was intended for Apple Music Web clients, like music.apple.com, Cider, and LitoMusic, and was not designed for API request bots like mine. Still, Apple only says that MusicKit on the Web will take care of the token automatically, without documenting how. So what are we going to do? Reverse engineer the API.

Apple: Documentation? Nah.


MusicKit on the Web

MusicKit on the Web. Is Apple using Storybook? Based on Apple's track record, this MUST be an Alpha of Alpha.

🦾 Cracking the MusicKit

First, I mimicked the specs of MusicKit on the Web, creating a website.

This website will do nothing else than calling the authorization process.


It will show the Apple Music access grant page like this.


Then digging into the request headers of the website will reveal the media-user-token.

There we go.


Finally, I could get a JSON response from the Apple server by filling in the remaining information in Postman. Finished on Oct 28th, 2022.
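For readers who prefer code to Postman, the request can be sketched roughly as follows. This is a minimal sketch, not the author's actual setup: it assumes the Get Recently Played Tracks endpoint at /v1/me/recent/played/tracks, with the developer token sent as a Bearer token and the user token sent in the Music-User-Token header. The token values are placeholders, and buildRecentlyPlayedRequest is a hypothetical helper name.

```typescript
interface AppleMusicTokens {
  developerToken: string // the "main token", signed with the Apple Music key
  musicUserToken: string // the user token obtained from the user's grant
}

const RECENTLY_PLAYED_URL = 'https://api.music.apple.com/v1/me/recent/played/tracks'

// Hypothetical helper: only assembles the request, it does not send it.
function buildRecentlyPlayedRequest(tokens: AppleMusicTokens) {
  return {
    url: RECENTLY_PLAYED_URL,
    init: {
      method: 'GET',
      headers: {
        Authorization: `Bearer ${tokens.developerToken}`,
        'Music-User-Token': tokens.musicUserToken,
      },
    },
  }
}

// Placeholder tokens for illustration; real ones must never be committed to a repository.
const exampleRequest = buildRecentlyPlayedRequest({
  developerToken: 'DEVELOPER_TOKEN_PLACEHOLDER',
  musicUserToken: 'USER_TOKEN_PLACEHOLDER',
})
console.log(exampleRequest.url, Object.keys(exampleRequest.init.headers))
```

Sending it would then be a matter of calling fetch(exampleRequest.url, exampleRequest.init) with real tokens and reading the JSON body.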

It sounds straightforward, but it took me days to figure it out. 😭


Requesting this information every time someone visits the website would deplete my API quota in minutes. I wanted to make a cache server of some sort. But remember, the best database is no database.

Don't use the database when avoidable. Which is always more often than I think. I don't need to store the 195 countries of the world in a database and join when showing a country-dropdown. Just hardcode it or put in config read on boot. Hell, maybe your entire product catalogue of the e-commerce site can be a single YAML read on boot? This goes for many more objects than I often think. It's not Ruby that's slow, it's your database

So I stored my private keys in GitHub Secrets and set up a GitHub Actions workflow that retrieves the data every few minutes and publishes it on GitHub.


I don't know how long I struggled to find this typo.

🎼 Equalizers

Similar to finishing the phone call component, I completed the music player component.

But something felt empty.


There were no equalizers! I searched for good equalizers for React but later decided to implement this with Framer Motion. So these are the few iterations of the product.

FANCY by TWICE

After Like by IVE

Lavender Haze by Taylor Swift

Hype Boy by NewJeans

Each bar of the equalizer has a random length. But as seen in the last song, something still felt awkward. Usually, vocal music has smaller amplitudes at low and high frequencies, but fully randomizing the amplitude gives those frequencies the same ups and downs as the rest. So I set a base length for each bar as an anchor and let the random values nudge it only slightly. Finally, I set the equalizer color to match the album cover's key color. This needed no additional work; the color came with the Apple Music API.
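The "base length plus a small shake" idea can be sketched like this. The base heights, the jitter amount, and the function name equalizerHeights are all made-up values for illustration, not the ones used on the site.

```typescript
// Hypothetical base shape: quieter at the low and high ends, louder in the middle.
const baseHeights = [0.35, 0.6, 0.85, 0.6, 0.35]

// Returns one height per bar, each within `jitter` of its base and clamped to [0, 1].
// `random` is injectable so the behavior can be checked deterministically.
function equalizerHeights(
  base: number[],
  jitter = 0.15,
  random: () => number = Math.random,
): number[] {
  return base.map((height) => {
    const offset = (random() * 2 - 1) * jitter // uniform in [-jitter, +jitter]
    return Math.min(1, Math.max(0, height + offset))
  })
}

// Each animation tick would call this again to get the next set of bar heights.
console.log(equalizerHeights(baseHeights))
```

Because every bar stays close to its anchor, the overall silhouette remains stable while the individual bars still move.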

Much smoother.


🔎 The Physics of Squircles

We're not done yet! Such completed widgets still felt slightly off; the curves felt too sharp. We needed squircles.

Squircle

Source: Apple's Icons Have That Shape for a Very Good Reason @ HackerNoon

A standard rounded corner made with border-radius is a circular arc with constant curvature. Where the straight edge meets the arc, the curvature jumps abruptly, which makes the corner feel sharp. In contrast, gradually increasing and then decreasing the curvature produces a much more natural curve.

For those AP Physics nerds, it's like uniformly increasing the jerk instead of uniformly accelerating.

For those AP Calc nerds, a squircle is a superellipse: the set of points satisfying the following equation, where $n$ is the curvature, $a$ is the length along the $x$ axis, and $b$ is the length along the $y$ axis. For any deeper dives, check out Figma's article on squircles.

$$\left|\frac{x}{a}\right|^{n} + \left|\frac{y}{b}\right|^{n} = 1$$

I used tienphaw/figma-squircle to create an SVG squircle and clipped the Dynamic Island with the clipPath property.
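To get a feel for the superellipse equation itself, the sketch below samples points on the curve and joins them into an SVG path string. It is only an illustration of the formula: the figma-squircle library builds its shape differently (following Figma's corner smoothing), and the function names here are hypothetical.

```typescript
interface Point {
  x: number
  y: number
}

// Samples the superellipse |x/a|^n + |y/b|^n = 1 using the parametric form
// x = a * sgn(cos t) * |cos t|^(2/n), y = b * sgn(sin t) * |sin t|^(2/n).
function superellipsePoints(a: number, b: number, n: number, segments = 64): Point[] {
  const points: Point[] = []
  for (let i = 0; i < segments; i++) {
    const t = (2 * Math.PI * i) / segments
    const cos = Math.cos(t)
    const sin = Math.sin(t)
    points.push({
      x: a * Math.sign(cos) * Math.abs(cos) ** (2 / n),
      y: b * Math.sign(sin) * Math.abs(sin) ** (2 / n),
    })
  }
  return points
}

// Joins the sampled points into a closed SVG path (straight segments between samples).
function toSvgPath(points: Point[]): string {
  const [first, ...rest] = points
  const lines = rest.map((p) => `L ${p.x.toFixed(2)} ${p.y.toFixed(2)}`).join(' ')
  return `M ${first.x.toFixed(2)} ${first.y.toFixed(2)} ${lines} Z`
}

const squirclePoints = superellipsePoints(100, 50, 4)
const squirclePath = toSvgPath(squirclePoints)
console.log(squirclePath.slice(0, 40))
```

With n = 2 the same code draws an ordinary ellipse; raising n flattens the sides and tightens the corners toward a rectangle.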

I saw a similar bug at the iOS 16 Notification Center. Maybe Apple is also clipping?


However, to clip all frames of the animations, we would have to create squircles for every frame, risking speed. Therefore, I opted to use borderRadius for the animation and clipped it right after the animation finished. It was barely noticeable, even if you looked very closely, so it was a good trade-off between performance and detail. Finished on Nov 11th, 2022.

Look closely. The border cuts into the squircle when the animation exits.


💨 Optimizing Performance

CSS has a will-change property. It tells the browser what is about to change on the screen so that the browser can prepare for it beforehand. Without will-change, the browser rasterizes every frame; with it, the browser reuses a static image while the animation runs and rasterizes again only when the animation finishes. Therefore, the animation may look blurry depending on its type, but transform, scale, or rotate animations become more fluid.

The Dynamic Island usually modifies scale and opacity, so it was perfect for will-change. We can apply the property in Framer Motion, as in the following example:

import { motion, useWillChange } from 'framer-motion'

// ...

const willChange = useWillChange()

// ...

<motion.div style={{ willChange }}/>

🔗 Integration

Last but not least, I made pages for integration purposes (/embed-player, /embed-phone-call). I did not want to add Tailwind or Framer Motion as a dependency on other websites, so I used the iframe method. I used davidjbradshaw/iframe-resizer to make a responsive iframe. I also used CSS's position: sticky property to make it stick on specific pages. It's on this website, too!

💭 Postmortem

This completes the project. Here are some thoughts:

First of all, I succeeded in managing a mid-to-long-term side project. I have always respected people with persistence, and I was very happy to finally complete the project after working on it for more than a month. I was also delighted that I successfully juggled 🤹 CS Major classes, job searching, and side projects simultaneously (although they still need to be completed).

Second, I would like to express my gratitude to Tim (cometkim), whom I met during my previous internship. I had a memorable experience during this internship when Tim showed me that it is possible to reverse engineer a compiled webpack codebase. It was indeed a spiced-up 🌶 and intense learning environment. However, that gave me confidence when I was blocked by Apple's undocumented API services.

I am also developing the habit of note-taking. There's a saying that people's will is weaker than we think, so it's better to reshape the environment. I did a decent job remodeling my website as a digital garden (or Extracranial Memex) that is optimized for note-taking. I want to continue taking notes and learning new stuff. Tim also had a significant effect on my note-taking by showing his workspace on Roam Research.

Anyhow, this concludes the project. Thank you, everyone!


I worked as a full-time Mini App researcher intern at Karrot (Korean Unicorn Company 🇰🇷🦄). This is what I found and learned from it.

📱 Mini Apps

Mini Apps are a collection of third-party services that run on top of a native Super App.


Imagine the Shopify app hosting thousands of small shopping mall web apps. You sign in once, and you can access all the apps. No need to log in, no need to download, no need to update; it goes beyond Shop Pay, which simply provides a payment gateway. There could be a Game Super App that hosts thousands of mini-games, a Shopping Super App that hosts thousands of mini-shopping malls, a Social Super App that hosts thousands of mini-social networks, and so on.

How is this different from the status quo? You get the best of both worlds: it is deployed like an app (with the best retention and metrics) but built like a website (simple JavaScript development).

At the same time, you can use Super App's complete account and wallet information (no need to sign up or bother to enter data)

Therefore,

  • it is faster than making an app
  • it reaches a wider demographic than a website
  • it can target a larger user base than an app
  • it guarantees unparalleled reachability, retention, and payment conversions.

The so-called BAT (Baidu, Alibaba, and Tencent) is already dominating the Chinese market. WeChat, the first player in the market, already has a Mini App ecosystem of 400 million active daily users and 900 million active monthly users. Apple and Google are struggling to maintain their platform power in the Chinese market because of these Mini Apps. For Chinese users, the App Store and the Play Store are like Internet Explorer. Just as IE only exists to download Chrome, so the App Store and the Play Store are simply gateways for downloading WeChat.

Of course, international businesspeople have reacted by replicating this outside of China. Snap tried to create Snap Mini, and Line tried to implement Line Mini Apps. Karrot, a Korean Unicorn company that has 60% of Korean citizens as their user base, also wants to become a Super App and create a Mini App environment. Offering more information on the Mini App system is out of the scope of this post; please refer to Google's in-depth review on Mini Apps.

💡So far
  • A Mini App is easy to make (web-like developer experience) while having powerful business effects (app-like user experience).
  • Karrot wants its internal and external partners to provide service through the Mini App within the Karrot App.
  • Karrot thinks that all Super Apps will want to make Mini App systems and that there will be repeated work and a fragmented developer experience if every Super App makes its own.
  • Goal: figure out a Mini App model that will succeed in Korea, Japan, the United States, the United Kingdom, and so on (Karrot's business regions).

🔥 For Thriving Ecosystems

The previously mentioned BAT companies have created their own proprietary languages and browsers, seemingly inspired by the web. These three companies possess immense platform power; they can demand whatever they want from developers. Most Super App services, however, cannot justify such demands, like asking developers to use non-standard SDKs or to add branching logic that detects a Mini App environment. Faced with those demands, developers will give up on the Mini App and spend that effort on iOS and Android apps instead (which have a much higher chance of success). If you think otherwise, ask why PWAs are still stagnating. Therefore, a standard Mini App should follow web standards. Developers should be able to deploy their web app as a Mini App with little to no change.

😻 For Beautiful Interfaces

Having a pretty design is much more important than you think. This is especially true for permission request screens. If, for example, a service asks for location without context, the user will likely decline, which hurts the service's stability. Permission requests have to make sense to the user, and that takes persuasive interfaces and designs. Therefore, they need to be pretty.

Let us take Starbucks as an example. The following image shows permission requests from Starbucks Web, App, and Mini App. Which one do you think you will grant? Which one will you decline?

Web

Mini App

App

Users become more likely to grant the request as we move to the right, where more detail is given. A standard Mini App should provide at least the level of context shown in the middle screenshot.

📨 For Prettier Permission Requests

The geolocation permission requests mentioned above display whenever JavaScript calls the Geolocation API. It's not magic — executing the following code will prompt the permission request.

navigator.geolocation.getCurrentPosition((position) => console.log(position.coords))

Based on backgrounds 1 and 2, we would need to provide a more persuasive and prettier permission request when we execute the above code, based on the Web Standards.

🌐 But Isn't That the Browser's Job?

Yes, displaying such a request screen falls under the browser's responsibility. Therefore, we will meet the above permission request if we call the Geolocation API inside a Web View (specifically, WKWebView for iOS). This behavior also happens inside Karrot Mini, an intermediary version of the Mini App system built by Karrot. So, how can we solve this? Do we plan on making a new browser?

Even worse, an unknown URL can urge people to deny such a request.

🎭 We don't care who's who

99.99% of web apps don't care who's who. They call the function wherever they need it. So, what if we make a fake navigator like the following?

const navigator = {
  geolocation: {
    getCurrentPosition(success, error) {
      // do some random stuff...
    },
  },
}

JavaScript does not check the authenticity of navigator. Therefore, we can inject whatever behavior we want. This technique is called a shim.
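To make the idea a bit more concrete, here is a minimal sketch of a shim that asks through the Super App's own consent screen before it touches the real API. `createGeolocationShim`, `askUser`, and the stand-in objects are hypothetical names of mine, not part of any browser or Karrot API, and the prompt is reduced to a callback so the sketch can run outside a browser.

```javascript
// Hypothetical helper: wraps a real Geolocation-like object so that every call
// first goes through the Super App's own consent UI (askUser is an assumption).
function createGeolocationShim(realGeolocation, askUser) {
  return {
    getCurrentPosition(success, error, options) {
      askUser('This Mini App wants your location to find nearby stores.', (granted) => {
        if (!granted) {
          // Mimic the shape of GeolocationPositionError (code 1 = PERMISSION_DENIED).
          if (error) error({ code: 1, message: 'User denied the request' });
          return;
        }
        // Only now is the real API called.
        realGeolocation.getCurrentPosition(success, error, options);
      });
    },
  };
}

// Stand-ins so the sketch runs without a browser.
const fakeRealGeolocation = {
  getCurrentPosition(success) {
    success({ coords: { latitude: 37.5, longitude: 127.0 } });
  },
};
const alwaysAllow = (message, decide) => decide(true);
const alwaysDeny = (message, decide) => decide(false);

const shimResults = [];
createGeolocationShim(fakeRealGeolocation, alwaysAllow)
  .getCurrentPosition((position) => shimResults.push(position.coords.latitude));
createGeolocationShim(fakeRealGeolocation, alwaysDeny)
  .getCurrentPosition(() => shimResults.push('unexpected'), (err) => shimResults.push(err.code));
```

The Mini App code keeps calling `getCurrentPosition` as usual; only the object behind it changes, which is the whole point of a shim.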

In computer programming, a shim is a library that transparently intercepts API calls and changes the arguments passed, handles the operation itself, or redirects the operation elsewhere. — Shim (computing) - Wikipedia

I have created a demo website where a cat gif asks for location permission.

Default behavior

Injected behavior

If we advance this methodology and implement the Document Object Model in JavaScript, we can inject all behaviors that are deemed suitable for Mini Apps.

🗿 For Consistent Experiences

A Mini App is all about a consistent experience. It's akin to universal components like Refresh, Favorite, or Close buttons not changing in browsers when you navigate different websites. For more information on consistent experiences, please refer to Google's Mini App User Experiences document. Of course, this consistency will require us to inject standard components.

⚡️ For Snappy Experiences

Opening and closing different Mini Apps should at least be faster than websites, if not faster than their app versions. For this, we would need prefetching policies for Mini Apps. We also want data to persist when apps are opened and closed, so we can contain each Mini App inside an iframe and delegate its management to the Super App's web view. This also requires checking crossOriginIsolated and setting the Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy headers so that the code inside the iframes cannot access data outside.
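As a rough sketch of the header part, assuming the Super App controls the response headers of its own shell page: `withIsolationHeaders` and `isIsolated` are hypothetical helper names, while the two header values are the ones browsers require before they report `crossOriginIsolated` as true.

```javascript
// Response headers that opt a page into cross-origin isolation.
const ISOLATION_HEADERS = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Hypothetical helper: merges the isolation headers into whatever headers the
// Super App's server already sends for its shell page.
function withIsolationHeaders(headers = {}) {
  return { ...headers, ...ISOLATION_HEADERS };
}

// In a browser, `crossOriginIsolated` is a read-only global. The scope is passed
// in as a parameter here so the check can also run outside a browser.
function isIsolated(globalScope) {
  return globalScope.crossOriginIsolated === true;
}

const shellHeaders = withIsolationHeaders({ 'Content-Type': 'text/html; charset=utf-8' });
```

In the page itself, the check would be `isIsolated(globalThis)`; if it returns false, features such as SharedArrayBuffer are not available.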

🥶 How'd You Solve the Icing Problem?

Super App force-quitting frozen Mini App

There's another problem here: an iframe runs on the same thread as the page that hosts it, so when the Mini App freezes, the entire Super App freezes too, including the quit button.

🕸 Multi-threaded Web

🤔Isn't JavaScript Single-Threaded?

Yes and no.

  • JavaScript inside a browser is single-threaded.
  • We can, however, create multiple threads with web workers.

Then, if we run the Mini App's code inside a web worker, a frozen Mini App no longer freezes the Super App, which effectively solves the icing problem.

🧑‍🔧 No DOM APIs in Workers

Web workers do not have access to DOM APIs. However, just as we shimmed the Geolocation API, the DOM API is also an object model that can be written in JavaScript. Therefore, we would effectively solve this problem if we provided a fake DOM API inside the web worker and mirrored its manipulations to the real DOM. We can also police the traffic between the two DOMs by verifying whether each operation is permitted.
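Here is a minimal, single-threaded sketch of that mirroring idea. It is not how WorkerDOM or Partytown are implemented; `createFakeElement`, `applyMutation`, and the allowlist are hypothetical, a plain object stands in for the real element, and an array stands in for `postMessage`.

```javascript
// Worker side (simulated): a fake element that records writes instead of touching the DOM.
function createFakeElement(id, postToMain) {
  return new Proxy({}, {
    set(target, property, value) {
      target[property] = value;            // keep a local copy for later reads
      postToMain({ id, property, value }); // mirror the write to the main thread
      return true;
    },
  });
}

// Main-thread side: apply a mirrored write only if the policy allows it.
const ALLOWED_PROPERTIES = new Set(['textContent', 'className', 'hidden']);

function applyMutation(realElements, mutation) {
  if (!ALLOWED_PROPERTIES.has(mutation.property)) return false; // blocked by policy
  const element = realElements.get(mutation.id);
  if (!element) return false;
  element[mutation.property] = mutation.value;
  return true;
}

// Stand-ins: a plain object plays the real element, an array plays postMessage.
const realElements = new Map([['title', { textContent: '' }]]);
const outbox = [];
const fakeTitle = createFakeElement('title', (message) => outbox.push(message));

fakeTitle.textContent = 'Hello from the Mini App';
fakeTitle.innerHTML = '<b>not allowed</b>';

const applied = outbox.map((mutation) => applyMutation(realElements, mutation));
```

The Mini App believes it wrote both properties, but only the allowed one reaches the real element. That middle layer is where the Super App gets to say no.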

👻 Mission Impossible

In the film Mission Impossible 4, the protagonist, Ethan, stands between two terrorist groups and poses as each one to the other, steering the negotiation in his favor.

Luckily, there is prior work. Google created WorkerDOM for Accelerated Mobile Pages, and BuilderIO created Partytown to move third-party code into web workers. However, neither is fully appropriate for Mini Apps. Google started WorkerDOM when the Spectre security vulnerability was a pressing concern, so it does not use SharedArrayBuffer and Atomics. Therefore, WorkerDOM cannot make synchronous data transfers (elaborated later). Partytown cannot call event.preventDefault(). But fundamentally, we can use this Mission Impossible model to isolate and quarantine third-party code.

💽 No Synchronous Data Transfer

Web Workers do not have synchronous data transfer by default. Many things depend on it; for example, drawing animations or displaying a map requires it because we need to measure what is on the screen to render the next frame. Since there are no synchronous DOM APIs inside Workers, such animation code will simply not work.

🤝 Then Make It Synchronous!

JavaScript was meant to be asynchronous from the beginning because it has to respond to user interactions. That is why we have the notorious triumvirate: callbacks, promises, and async/await. Running such asynchronous JavaScript synchronously means that when I call a function, everything sits there and waits until it gets the response.

We can make this synchronous using the following two methods.

  1. Synchronous XMLHttpRequest
  2. SharedArrayBuffer and Atomics
    • SharedArrayBuffer is a shared data channel between Web Worker and the main thread. The Atomics operation ensures thread safety in such mutual operations. At the same time, it means we can pause the Worker thread, harnessing the power of Atomics. Mini Apps already use Web Workers, so using SharedArrayBuffer and Atomics seems more suitable.
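The second method can be sketched in Node.js as follows. Treat it as an illustration under assumptions rather than the Mini App implementation: browsers do not allow `Atomics.wait` on the main thread, so in the real setup the Mini App's worker would be the one waiting and the main thread the one answering. Here the roles are swapped only so the example runs as a single file, and the value 1280 is a made-up "measured width".

```javascript
const { Worker } = require('node:worker_threads');

// Two 32-bit cells: index 0 is a "ready" flag, index 1 carries the answer.
const shared = new SharedArrayBuffer(2 * Int32Array.BYTES_PER_ELEMENT);
const cells = new Int32Array(shared);

// Plays the side that owns the real data: it writes the answer, flips the
// flag, and wakes up whoever is waiting on the flag.
const responderSource = `
  const { workerData } = require('node:worker_threads');
  const cells = new Int32Array(workerData);
  Atomics.store(cells, 1, 1280); // pretend this is a measured width
  Atomics.store(cells, 0, 1);    // mark the answer as ready
  Atomics.notify(cells, 0);
`;
new Worker(responderSource, { eval: true, workerData: shared });

// Plays the side that needs a synchronous answer: blocks until the flag changes.
function readSync(view, timeoutMs) {
  // Sleeps only while view[0] is still 0; returns at once if it already changed.
  Atomics.wait(view, 0, 0, timeoutMs);
  if (Atomics.load(view, 0) !== 1) throw new Error('No answer before the timeout');
  return Atomics.load(view, 1);
}

const width = readSync(cells, 5000);
console.log(width);
```

From the caller's point of view, `readSync` behaves like an ordinary blocking function call, which is exactly what synchronous DOM getters need.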

✂️ Oops, You Got Disconnected

A regular web environment is not accessible offline. If we have a calculator Mini App, for example, we expect it to work without network access. This also relates closely to initial loading speed. Although a progressive web app can cache the website for offline use, it needs plenty of network requests up front to fill that cache, which makes it inefficient.

📦 Pack it up!

Source: web.dev/web-bundles

There is a solution for this, too. Google is already experimenting with WebBundle, which is based on the CBOR file format. A WebBundle packs all the files a website needs, including HTML, CSS, JS, and images, into one file. WebBundle is already enabled in Chrome, and Google is experimenting with this technology in various ways. But sadly, Google's hidden goal is to disarm and bypass URL-based adblocking technologies. Related Thread.

🦠 What if it gets malicious code?

Perfectly fine code on GitHub can suddenly turn into attack code on npm. For example, UAParser.js, a popular library with 40M+ monthly downloads, was once hijacked and distributed malicious code. Accident Records.

Even a trusted library backed by big names can suddenly turn on you.

In any case, the Super App provider should receive the packages from Mini App providers, audit them, and host them itself so that others cannot swap out the code. However, there is very little to say because this part of the system is developed almost wholly.

😊 Conclusion.

If we solve all the abovementioned problems, we can finally construct a proper Mini App environment. However, as you can tell, each issue poses a vast range of technical and administrative challenges. I focused on problems #2 and #3 during my internship, but resources were extremely scarce since the work delved into such a niche area. I imagine a Mini App environment that is ① internationally accessible, ② scalable, ③ interoperable with Web Standards, and ④ maximizes value for creators and users without being confined to a specific geographic region like China.

But the challenges will only delay our joyful union.

tossface.cho.sh

info

I would like to thank @sudosubin and the Tossface team for reviving Korean emojis with Unicode PUA!

Background

Tossface is an emoji typeface created by Viva Republica, a Korean (almost) decacorn company. Tossface initially included a series of intentionally divergent emoji designs, replacing culturally specific Japanese emojis with designs representing related Korean concepts and outdated technologies with contemporary technologies.

Tossface's first release. Toss: "Right Now, The Right Us (Hinting Modern & Korean Values)"

Unfortunately, these replacements caused backlash from multiple stakeholders, and Viva Republica had to remove the emojis.

Unicode Private Use Area

However, Unicode has a hidden secret: an unused area spanning U+E000 to U+F8FF, U+F0000 to U+FFFFD, and U+100000 to U+10FFFD, known as the Unicode Private Use Area. This area will never be assigned to standard emojis, and companies can use it as they see fit.
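For reference, the three ranges can be checked in a few lines. This is only an illustrative sketch; `isPrivateUse` is a helper name I made up, and it looks at the first code point of the string it is given.

```javascript
// The three Private Use ranges defined by Unicode.
const PRIVATE_USE_RANGES = [
  [0xe000, 0xf8ff],     // Private Use Area in the Basic Multilingual Plane
  [0xf0000, 0xffffd],   // Supplementary Private Use Area-A (plane 15)
  [0x100000, 0x10fffd], // Supplementary Private Use Area-B (plane 16)
];

// Returns true when the first code point of `character` is a private-use code point.
function isPrivateUse(character) {
  const codePoint = character.codePointAt(0);
  return PRIVATE_USE_RANGES.some(([start, end]) => codePoint >= start && codePoint <= end);
}

const checks = {
  tossfaceGlyph: isPrivateUse(String.fromCodePoint(0xe10a)),
  latinLetter: isPrivateUse('A'),
  standardEmoji: isPrivateUse(String.fromCodePoint(0x1f600)),
};
```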

Regrettably, those glyphs, with their Korean and contemporary style and their clean, neat tone and manner, had disappeared into history. Therefore, I proposed bringing the emojis back using a standard mechanism: the Unicode Private Use Area.

@toss/tossface/issues/4

After about three months, Viva Republica accepted the request. They redistributed those emojis in Tossface v1.3, from PUA U+E10A to U+E117.

But how shall I type?

However, these emojis have no place in the Unicode standard. PUA U+E10A to U+E117 cannot be typed with a standard keyboard, nor do they appear on the emoji chart. Ironically, we finally got the glyphs back but could not type them.
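Since the keyboard cannot produce these code points, a page has to generate the characters itself. A minimal sketch of that step follows; `listGlyphs` is a hypothetical helper, and the characters only render as emojis where the Tossface font is loaded.

```javascript
// The Tossface range mentioned above.
const FIRST_CODE_POINT = 0xe10a;
const LAST_CODE_POINT = 0xe117;

// Builds one entry per code point: a readable label plus the copyable character.
function listGlyphs(first, last) {
  const glyphs = [];
  for (let codePoint = first; codePoint <= last; codePoint += 1) {
    glyphs.push({
      label: 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0'),
      character: String.fromCodePoint(codePoint),
    });
  }
  return glyphs;
}

const tossfaceGlyphs = listGlyphs(FIRST_CODE_POINT, LAST_CODE_POINT);
```

Each `character` can then be put on a button that copies it to the clipboard.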

So I have created a small website where you can check the glyphs and copy them. I call these Microprojects. They're perfect for trying out new technologies; I wanted to try Astro, but it kept giving me unrecognizable errors primarily because the platform was still in an early stage, so I used Next.js, Vercel, and Tailwind.

Now, it somehow became a Museum of Korean Culture

Once the website was built, it looked like a Museum of Korean Culture, so I added some text in English and shared it publicly.

Page View Nationality Break Down

Postmortem

It was a fast and fun project before the beginning of school!

Banner image showing onboarding goods such as MacBook, charger, sticker, guide, etc.

It has already been a week since I started as an intern at Karrot (2022-05-22), a Korean unicorn company. The internship lasts three months, but I wanted to write down the interview and onboarding experience before it's too late.

Application and Interview

A Great Start

It starts with the Karrot Market Team Recruiting Site. I could tell that Karrot was putting a lot of energy into discovering good talent. The recruitment website was neatly run, and Karrot wrote down all the information applicants might be curious about. Above all, Karrot wrote the JD (Job Description) specifically and clearly. Some companies I interviewed with did not disclose the JD at all, so Karrot was much more considerate.

Karrot Mini R&D Engineer Intern JD

Who we are looking for.

Karrot Market is still actively using web technology to create mobile apps. The web is a great tool, but it still has a lot of limitations when it comes to native platform support. The OS's WebView environment is unsuitable for running multiple apps simultaneously. Due to the difference between the web security model and the basic OS security model, it is challenging to replicate the native experience. For example, if you request user location information through the web API, you will experience a different UI/UX from the user consent seen in native. The Karrot Mini team is looking for a breakthrough from the modern web, not the OS WebView. We are looking for someone who will break through what was initially thought to be challenging to achieve on the web and create an OS-level experience that can run entirely in the browser.

Specifically, you will

  • Study the next-generation web-based execution environment to be used in Karrot Market
  • Provide a sandbox environment to isolate multiple apps
  • Provide Karrot Market integration through a web-standard interface
  • Implement a scheduler that can observe and control the running state of multiple apps
We are looking for someone who

  • Is familiar with HTML, CSS, and JavaScript-based web development
  • Is skilled in program development using JavaScript and TypeScript
  • Is interested in reading the DOM standard and implementing it themselves
  • Is interested in various web standard APIs
  • Has a basic understanding of the security model of web browsers
  • Wants to operate an open-source project from the beginning

Even better if you

  • Have experience contributing to or operating an open-source project in which many people participate.
  • Have good knowledge of OS, scheduling, and concurrent programming
  • Know how to handle various programming languages
  • Have experience with system programming languages such as C/C++, Go, Rust, or Zig

Please Note...

  • This position is held for three months, and in some cases, a 6-month extension is possible

Procedure...

  1. Document submission
  2. Job interview
  3. Final acceptance

Document Screening

Karrot Market is accepting freestyle applications. Please freely express various information that shows your strengths. You can freely select the document format, such as word, pdf, or web link, excluding hwp files. Please forward your portfolio, GitHub link, etc., as needed.

Job Interview

This is the stage where you have an in-depth talk about your job-related experiences and competencies based on your resume and assignments. The job interview lasts from 1 hour to 1 hour 30 minutes with the Karrot Market team members who are highly related to the job.

Doesn't it look very detailed and precise? The information transparency was excellent, allowing me to predict what position I would hold and what responsibilities I would be given even before the interview. The application process was also straightforward. I didn't have to write a cover letter, etc.; I only had to attach my existing resume. It took less than 15 minutes to apply.

The Interview

As mentioned in the JD, the interview was scheduled for 1 hour and 30 minutes. I had interviewed with several companies before. Up to this point, the interviews I had experienced could be divided into two types.

Example of Behavioral Interview
  • If this happened within your team, how would you deal with it?
  • What do you think is the most important thing as a PM or developer?
  • Please describe this project written on your resume. What did you learn? What did you miss the most?
Example of Technical Interview
  • ~ Please solve this problem.
  • (In the case of a Web3 company interview) Please explain the concept of blockchain Proof of Stake. How is it different from Proof of Work? What problem is it trying to solve?
  • Please explain the difference between HTTP POST/GET/PUT, etc.

Of the two, if you were applying as a Computer Science intern, you had to prepare well for the second type, the technical interview. Most of the companies I had interviewed with could be prepared for in this way. Karrot's interview was different. They did not quiz the interviewee's knowledge; within 5 minutes of starting, we were discussing practical work. I felt like I was in a team meeting rather than an interview. First, the interviewer explained the team's current problem. Then, he asked me to analyze the candidate solutions and their strengths and weaknesses. During the actual interview, he explained the following information.

On Mini-Apps

  • In China's WeChat, there are Mini Programs, called Xiaochengxu.
  • It is a feature that allows sideloading of small programs within WeChat.
  • You don't need to install an app; you scan a QR code, and the mini-app loads super-fast, giving you an app-like experience.
  • At the same time, membership registration and payment connection are not required. Since WeChat ID and WeChat Pay are automatically linked, no roadblocks hinder the user's flow.
  • In China, the mini-app ecosystem already dominates the market, and Line and Snap are also preparing for this trend.
  • Apple has also launched its mini-app, App Clips.

If you are curious about the mini-app, please refer to this article!

Karrot

So, how did it go?

The answers to the previous question lead to the next question.

Interview Questions
  • In the case of WeChat, they create their native client, and the native client runs the mini-app. However, in this case, mini-apps do not comply with web standards and use their security model, making it difficult to introduce them globally. Karrot Market is also envisioning a similar mini-app environment. What is the appropriate strategy for this?
  • → It would be sufficient to implement a general-purpose mini-app that complies with standard web specifications and perfectly follows the web security model. In other words, you want to run a WebView inside the web. The first method that comes to mind is an iframe. What's the problem with implementing this in an iframe?
  • → Since the external and internal codes of an iframe run on the same thread, the client app also freezes if the mini-app freezes. What should I do to solve this?
  • → With a Web Worker, it is possible to separate the mini-app and the client app into separate threads. However, if you do this, the Web Worker cannot access the DOM API. For example, you cannot use the DOM API called getBoundingClientRect. What should I do to solve this?
  • → Provide a virtual DOM API that Web Workers can access. To solve this problem, Google developed a model called WorkerDOM. And an open-source project called Partytown, an implementation that separates third-party JS code into a separate Web Worker, was recently released. So how can we implement a mini-app system using this?
  • → Let's assume that the mini-app system is implemented using the underlying technologies of Web Worker and WorkerDOM. Then, can we implement forced shutdown and multitasking on the web within the web? What should I do?

It was not a typical interview, but it aimed to find out how to come up with an idea and find a solution at a practical level. It felt like I was having a coffee chat, and I thought I was receiving significant consideration even though I was the interviewee. The people team's efforts were evident in many aspects, such as promising to inform both successful and unsuccessful applicants of the results within three days and asking for understanding via e-mail when the announcement of the results was delayed.

If you are curious about the questions above, you can learn the answers by looking at the articles below.

Interesting Things about Karrot

Onboarding

Interns with Power

Our team consists of 8 people, and I felt like I became a core member of a tiny startup, not an intern at a Unicorn company. Interns were also given a fair amount of voice and power, and information and opportunities were unrestricted. Even as an intern, I could develop ideas, contribute to production-level products, and suggest new directions for product design. The team leader supported me in expressing more opinions on the first day, which helped me tremendously. I was learning on the job in the actual field.

Great Power and Responsibility

Karrot showed me trust first by giving me as much freedom as possible. For example, I come to work any time between 9:00 and 11:00 and don't need to announce it when I leave. I don't record my work hours; I only have to prove myself through performance. Being an intern, I asked about a bunch of things, and I was impressed when team members said: We trust you, proceed as you wish.

Working Anywhere

It's related to the above, but our team has never gathered offline. I went to work alone with my team for the last few days. Currently, one of our team's developers is living on Jeju Island for a month. Nevertheless, all team members have maintained the best performance. The concept of asynchronous communication also impressed me. As we move increasingly to remote work, it takes too much energy to hold meetings where everyone gathers in real time. Instead, we document and record everything and use corporate messengers like Slack so that each of us can handle things during our own working hours. Of course, this is based on trust between members and the freedom and responsibility described above (that is, the system runs on the belief that no employee will free-ride the way people do in group assignments).

Transparent Information

Weekly team meeting where all information is shared

There are no restrictions for interns. You can view the server code of the Karrot market, check the sales volume of local advertisements, and view the minutes of meetings with Karrot investors. After reading this, you will probably think of Reed Hastings's No Rules Rules book.

In the corporate meeting every Monday, we share updates from each team. It was also impressive that no one used presentation slides; they were all written in a shared document in Notion for easier future reference. Overall, the culture promoted creative and powerful expression of opinions. In addition, it induced a responsible attitude by first trusting the employees.

Endless Debates

Our team wrote all 277 replies that morning 😔

I saved the one that impressed me the most for last. The discussion culture based on mutual respect was awe-inspiring. Two days after I arrived, we had a 6-hour meeting, and our team exchanged hundreds of Slack messages until dawn to discuss the direction of the product. While I've done various group assignments and student startup projects, I've never seen such deep and delicate affection for a product or such heated discussions. Debating how to allocate limited resources to succeed in the market inspired me. Everyone communicated logically to understand other people's points of view and find a middle ground. Even so, it was cool to see that the discussion never got personal and that everyone respected each other.

Moving Forward

I will be working on sandboxing for the mini-app standard in the future. Simply put, it's about creating a web within the web and the basis for the mini-app environment. I have a variety of technical & product goals. I plan to write another article after finishing my internship. Please look forward to the Karrot Mini Team!

After a few years of technical writing, I felt that the limitations of writing platforms kept me from writing best-in-class articles. Technological knowledge is so dynamic and intertwined that none of the current formats (academic papers, lecture videos, code examples, or straightforward posts) can represent it well. I have examined some attempts to address this issue, namely the so-called second brains or digital gardens, but none of them seemed to solve the problem correctly. Therefore, I have distilled my inconveniences into this huge mega-post and imagined what I would do if I created the next incarnation of the digital brain.

Update 2022-06-12

Since this post, I have extensively studied non-linear PKM software, such as Roam, Obsidian, Logseq, and Foam. I acknowledge that I misunderstood the concept of manual linking; that PKM software performs a fuzzy search to intelligently identify linked and unlinked references. I found some PKM software with automatic linkings, such as Saga or Weavit. But none of them worked how I expected. Manual linking helps refine the database. So, even if I make a Next-gen digital brain, I will not remove the linking process.

Update 2022-07-01

Well, you're now watching my next-gen digital brain! For the past two weeks, I have worked on the WWW project that built this website. It checks off almost all of the marks detailed in this post!

TL;DR
  • Create an aesthetic-interactive-automatic pile of code-image-repo-text that organizes-presents-pitches itself.
  • There is no manual tagging, linking, or image processing, etc., etc.
  • You just throw in random pieces of knowledge, and they form a knowledge mesh network.
  • The algorithm operates everything. It will be contained, processed, organized, and distributed all around the world in different languages.
  • You don't tend knowledge. The algorithm penalizes outdated content (you can mark the post as evergreen to avoid this.)

So what's the issue?

Contrary to popular belief, I noticed the best method for managing a digital garden is not tending it. Instead, try to make a digital jungle: you don't take care of it; nature will automatically raise it. In other words, the digital brain should create as little friction as possible. The less you tend, the more you write.

Especially,

I despise the [[keyword]] pattern prevalent in so-called second brains (Obsidian, Dendron, ...). Not only does it perform poorly for non-alphabetical documents, it is also manual, which creates a lot of friction. The fact that you must explicitly wrap terms in brackets doesn't make sense... What if you realize you want to link a term you've been writing about for 200 posts? Do you go back and link them all one by one? No! The solution must lie in algorithmic keyword extraction.
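As a toy illustration of what I mean by algorithmic keyword extraction, here is a plain term-frequency sketch. It is deliberately naive: `extractKeywords` and the stop-word list are my own assumptions, and languages that are not separated by spaces would need a proper tokenizer instead of this regular expression.

```javascript
// A tiny English stop-word list; a real system would use a much larger one.
const STOP_WORDS = new Set(['the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'is', 'it', 'for', 'with']);

// Counts word frequencies and returns the `limit` most frequent words.
function extractKeywords(text, limit = 3) {
  const counts = new Map();
  for (const word of text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []) {
    if (word.length < 3 || STOP_WORDS.has(word)) continue;
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0])) // frequency, then alphabetical
    .slice(0, limit)
    .map(([word]) => word);
}

const samplePost =
  'Web workers run JavaScript off the main thread. Workers cannot touch the DOM, so the DOM is mirrored for workers.';
const keywords = extractKeywords(samplePost, 2);
```

Every post would get its keywords this way, and two posts that share a keyword become link candidates without anyone typing a bracket.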

Organizing Contents

Interconnected entities

Practical knowledge does not exist in simple posts (though they might be straightforward). Create a knowledge bundle that interconnects GitHub Repository, Codes, GitHub README, and other posts in the same brain network. Examine how Victor's post has rich metadata for the paper, dataset, demo, and post. This is what I see as interconnected entities.

Interactive Contents & Animations

victordibia.com. Seems like using MDX.

bluewings.github.io. Confirmed using MDX.

pomb.us. Reacts to user scroll.

qubit.donghwi.dev. This isn't a blog; it's a web app that demonstrates key concepts of Quantum Computers. But still interesting.

Unorganized Graphing.

Trust me, manually fiddling with tags sucks. Tagging posts and organizing them into subdirectories resembles organizing your computer. However, you wouldn't want to do this if you have thousands of posts; the borders also get blurry. What if a post has two properties? Which becomes the primary tag, and which becomes the secondary tag?

A notable trend: Gen Z doesn't organize folders anymore! The recent trend, I would say, is dumping everything into a mega folder and searching for things whenever needed. I also used to organize folders a lot more, but as search tools like Spotlight and Alfred improve, I don't see the need to manage them all by hand, considering I always pull up those search commands to open a file.

You don't need to manually organize all of the files when algorithms can read all the texts and organize them for you! Use algorithmic inspection to analyze how the posts interrelate with each other.

Velog.io, the Korean version of dev.to, links relevant posts for every post.

Therefore, I want clusters of posts that are classified not by me but by bots and algorithms. WordPress also has a plugin for this. This is similar to backlinking, which most so-called digital brains such as Obsidian and Dendron are doing.
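One simple way to let an algorithm do the clustering is word-set overlap (Jaccard similarity). The sketch below is an assumption about how such a bot could work, not how Velog or the WordPress plugin actually do it; the post objects and `relatedPosts` are made up for the example.

```javascript
// Turns a text into a set of lowercase words.
function tokenize(text) {
  return new Set(text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? []);
}

// Jaccard similarity: shared words divided by all distinct words.
function jaccard(a, b) {
  let shared = 0;
  for (const token of a) if (b.has(token)) shared += 1;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Returns the posts most similar to `target`, best match first.
function relatedPosts(target, posts, limit = 2) {
  const targetTokens = tokenize(target.body);
  return posts
    .filter((post) => post.id !== target.id)
    .map((post) => ({ id: post.id, score: jaccard(targetTokens, tokenize(post.body)) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const examplePosts = [
  { id: 'mini-app', body: 'web worker sandbox for mini apps' },
  { id: 'worker-dom', body: 'worker dom mirrors the dom inside a web worker' },
  { id: 'emoji', body: 'korean emoji in the private use area' },
];
const related = relatedPosts(examplePosts[0], examplePosts);
```

Here only the `worker-dom` post shares words with the `mini-app` post, so it is the only suggestion. A real system would likely weight rare words more heavily, but the principle stays the same.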

Example of backlinking from Dendron

I agree with the importance of interlinking knowledge crumbs, but I can't entirely agree with their method. Manually linking posts is inconsistent and troublesome; it can only be done on a massive communal scale, like Wikipedia. You cannot apply the same logic to individual digital brain systems.

SEO and Open Graphs

Precis Bots for Meta description

I can apply the above technique for crosslinking to TL;DR bots for meta tag descriptions.

Automatic Open Graph Image Insertion

For example, GitHub creates automatic open graph images with their metadata.

Example open graph image from GitHub

Quite a few services use this technique. GitHub wrote an excellent post on implementing this feature. I also tried to implement this on top of Ghost CMS, but I gave up after figuring out that the Ghost Core Engine itself would have to support it. However, I have created a fork that I can extend later. http://og-image.cho.sh/

GitHub - anaclumos/cho-sh-og-image: Open Graph Image as a Service - generate cards for Twitter, Facebook, Slack, etc

Multilanguage

Proper multilanguage support

Automatic Language Detection. The baseline is to reduce the workload: I write random things, and the algorithm automatically organizes the corresponding data, using hreflang tags and HTTP content negotiation. I have found no service that uses this trick properly (outside of megacorporate i18n products).
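A minimal sketch of both pieces, assuming the site knows which languages each post exists in. `parseAcceptLanguage`, `pickLanguage`, and `hreflangTags` are hypothetical helpers, and the Accept-Language parsing is simplified compared with the full HTTP grammar.

```javascript
// Parses an Accept-Language header into tags sorted by preference.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      return { tag: tag.trim().toLowerCase(), quality: q ? Number(q.slice(2)) : 1 };
    })
    .filter((entry) => entry.tag && entry.quality > 0)
    .sort((a, b) => b.quality - a.quality);
}

// Picks the best available language, falling back to the primary subtag ("ko-kr" -> "ko").
function pickLanguage(header, available, fallback) {
  for (const { tag } of parseAcceptLanguage(header)) {
    if (available.includes(tag)) return tag;
    const primary = tag.split('-')[0];
    if (available.includes(primary)) return primary;
  }
  return fallback;
}

// Builds the <link rel="alternate"> tags that tell crawlers about each translation.
function hreflangTags(baseUrl, available) {
  return available.map(
    (lang) => `<link rel="alternate" hreflang="${lang}" href="${baseUrl}/${lang}" />`
  );
}

const chosenLanguage = pickLanguage('ko-KR,ko;q=0.9,en;q=0.8', ['en', 'ko'], 'en');
const alternateTags = hreflangTags('https://cho.sh', ['en', 'ko']);
```

Content negotiation decides what a visitor sees first, while the hreflang tags tell search engines that the other versions exist.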

Translations

At this point, I might write one English post and let Google Translate do the heavy lifting. Also, I can get contributions from GitHub.

While supporting multilanguage and translations, I want to put some 3D WebGL globe graphics. Remember infrastructure.aws in 2019? It used to show an awesome 3D graphic of AWS's global network. AWS Edge Cloud Continuum

I kind of want this back too. Meanwhile, this looks nice:

Also made some contributions...

Fonts and Emoji

I want to go with the standard SF Pro series with a powerful new font Pretendard.

font-family: ui-sans-serif, -apple-system, BlinkMacSystemFont, 'Apple SD Gothic Neo', Pretendard, system-ui, -system-ui,
  sans-serif, 'Apple Color Emoji';

However, I am exploring other options for emoji. I liked TossFace's bold attempt to infuse Korean values into the Japan-based emoji system. (lol, but they canceled it.)

Tossface Original Emojis

Honestly, I want this back. They could use the Unicode Private Use Area, but Toss seems too lazy to do that, considering they still haven't released a WOFF webfont. So I might use Twemoji.

Domains and Routes

URL Structures

Does URL structure matter for SEO? I don't think so, as long as the exhaustive URL list is provided through sitemap.xml. For SEO purposes (although I still doubt the effectiveness), automatically appending the URLified title to the end of the URL might help (like Notion).

Nameless routes

Autolinks with alphanumeric IDs | GitHub Changelog

I don't like naming routes like cho.sh/blog/how-to-make-apple-music-clone. What if I update the title and want the URL to follow? Changing the URL structure affects SEO, so I would need to stick to the original URL even after changing the entity title. But then the title and URL would be inconsistent. Therefore, I would give each interconnected entity a UID, which would be a hash. Maybe the randomized hash UID could be a color hex that doubles as the theme color for the entity? Emoji routes seem cool, aye? I would also need the Web Share API since Chrome doesn't support copying Unicode URLs. Some candidates I am thinking of:

  • cho.sh/♥/e5732f/ko
  • cho.sh/🧠/e5732f/en

Also found that Twitter doesn't support Unicode URLs.
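
The hashed-UID idea above could be sketched as follows. This is only an illustration under stated assumptions: the UID is derived from some stable seed that never changes after publication (here a hypothetical creation timestamp), and `entity_uid` and `entity_route` are made-up names. Taking the first six hex digits of a SHA-256 digest gives a value that can also be read as a color.

```python
import hashlib


def entity_uid(seed: str) -> str:
    """Return a 6-digit hex UID derived from a stable seed (hypothetical scheme)."""
    digest = hashlib.sha256(seed.encode("utf-8")).hexdigest()
    return digest[:6]


def entity_route(section: str, seed: str, lang: str) -> str:
    """Build a title-independent route of the form cho.sh/<section>/<uid>/<lang>."""
    return f"cho.sh/{section}/{entity_uid(seed)}/{lang}"


seed = "2022-04-01T09:00:00Z"  # assumed seed: the entity's creation timestamp
print(f"#{entity_uid(seed)}")  # can be used directly as a theme color
print(entity_route("🧠", seed, "en"))
```

Six hex digits allow about 16.7 million values, so a real system would probably need a collision check before assigning a UID.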

Miscellany

Headline for Outdated Posts

There should be a method to penalize old posts; they should exist in the database but wouldn't appear as much on the data chain. i.e., put a lifespan or "valid until" for posts.

Hong Minhee's blog (홍민희 블로그)

Kat Huang

Footnotes

An excellent addition, but not necessary. If I ever have to make a footnote system, I want to make it hoverable, which namu.wiki does very well. I do not want footnotes to jump down to the bottom of the page with a cringy ↩️ icon to link back.

ToC

A nice addition. But not necessary.

Comments

Will go with Giscus.

I recently saw this Gist and Interactive Page, so I thought it would be cool to update it for the 2020s. This can serve as a visualization of how fast a modern computer is.

How to read this calendar

Imagine that 1 CPU cycle took 1 second. A modern 4.0 GHz CPU has a cycle time of approximately 0.25 ns, so this stretches time by a factor of 4,000,000,000. Now, imagine how long one second of real life would feel to that CPU.

| Action | Physical Time | CPU Time |
| --- | --- | --- |
| 1 CPU Cycle | 0.25 ns | 1 second |
| L1 cache reference | 1 ns | 4 seconds |
| Branch mispredict | 3 ns | 12 seconds |
| L2 cache reference | 4 ns | 16 seconds |
| Mutex lock | 17 ns | 68 seconds |
| Send 2KB | 44 ns | 2.93 minutes |
| Main memory reference | 100 ns | 6.67 minutes |
| Compress 1KB | 2 μs | 2.22 hours |
| Read 1MB from memory | 3 μs | 3.33 hours |
| SSD random read | 16 μs | 17.78 hours |
| Read 1MB from SSD | 49 μs | 2.27 days |
| Round trip in the same data center | 500 μs | 23.15 days |
| Read 1MB from the disk | 825 μs | 38.20 days |
| Disk seek | 2 ms | 92.60 days |
| Packet roundtrip from California to Seoul | 200 ms | 25.35 years |
| OS virtualization reboot | 5 s | 633 years |
| SCSI command timeout | 30 s | 3,802 years |
| Hardware virtualization reboot | 40 s | 5,070 years |
| Physical system reboot | 5 m | 38,026 years |
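
The conversion behind the table is a single multiplication by the clock frequency. Here is a minimal sketch, assuming the 4.0 GHz clock from the text and a 365.25-day year; `cpu_seconds` and `humanize` are illustrative names, and the unit boundaries in `humanize` are my own choice, so a few rows may come out in a different unit than the table uses.

```python
CLOCK_HZ = 4.0e9  # the 4.0 GHz clock assumed in the text


def cpu_seconds(physical_seconds: float) -> float:
    """Scale a real duration so that one CPU cycle lasts one second."""
    return physical_seconds * CLOCK_HZ


def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that fits (365.25-day year assumed)."""
    units = [
        ("years", 365.25 * 86400),
        ("days", 86400),
        ("hours", 3600),
        ("minutes", 60),
    ]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.2f} {name}"
    return f"{seconds:.2f} seconds"


print(humanize(cpu_seconds(100e-9)))  # main memory reference
print(humanize(cpu_seconds(200e-3)))  # packet round trip, California to Seoul
```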

OK — I admit. The title is slightly misleading. You are reading a technical post about converting any video into an ASCII Art text stream that one can play on the terminal. The text stream here is a subtitle file. You can use any video player or terminal program to parse and display subtitles to play the music video. But the playing part is out of the scope of this post. Still don't get it? Here's a demo:

Enable subtitles and wait for a couple of seconds. If the video errors out, check out the following screen recording:

My text streams use braille to represent pixels. And to display consecutive streams of texts paired with music playback, what would be more suitable than the subtitle format? Therefore, I aim to convert any video into a YouTube subtitle. The tech stack is:

  • OpenCV (the cv2 Python bindings to the C++ library): used to convert video into frames
  • Python Imaging Library (Pillow for Python 3): used to convert frames into ASCII art (braille)
  • Python Standard Library (sys, os, pathlib): used to read and write files
  • ffmpeg (optional): used to pack everything into a video

Open-sourced on GitHub: anaclumos/video-in-dots.

note

Technically, braille characters are not ASCII characters. They are Unicode, but let's not be too pedantic.


Design

We need to first prove the concept (PoC) that the following technologies achieve our goal:

  1. Converting any image into a monochrome image
  2. Converting any monochrome image into ASCII art
  3. Converting any video into a series of images
  4. Converting any frames into a series of ASCII art and then packaging them into a subtitle file.
  5. (Figured out later) Compressing the subtitle files under a specific size.
  6. (Figured out later) Dithering the images to improve the quality of the ASCII art.

1. Converting any image into a monochrome image

A monochrome image is an image with 1-bit depth, composed only of #000000 and #FFFFFF. Note that grayscale images are not monochrome images: grayscale images also contain a wide range of grays between #000000 and #FFFFFF. We can use pure black and white to represent the raised and lowered dots of the braille characters and so visually distinguish borders and shapes. Therefore, we convert an image into a grayscale image and then convert that into a 1-bit image. One detail to note is that subtitles are usually white, so we want the white pixel in the monochrome image to represent 1, the raised dot in braille.

As you can see in the right three images, you can represent any image with border and shape with pure black and white. DemonDeLuxe (Dominique Toussaint), CC BY-SA 3.0, via Wikimedia Commons.

The leftmost image has 256 shades of gray, and the right three images have only two shades of gray, represented in different monochrome conversion algorithms. I used the Floyd-Steinberg dithering algorithm in this project.

Converting the image

There are many ways to convert an image into a monochrome image. However, this project only uses the sRGB color space, so I used the CIE 1931 sRGB luminance conversion (see Wikipedia). Sounds fancy, but it's just a formula:

def grayscale(red: int, green: int, blue: int) -> int:
    return int(0.2126 * red + 0.7152 * green + 0.0722 * blue)

red, green, and blue are the RGB values of the pixel, represented as integers from 0 to 255. If their weighted sum exceeds a threshold (hex_threshold in my code), the pixel is white (1); otherwise, it is black (0). We can now run this code for every pixel. This grayscale code is only for understanding the fundamentals. In practice, we will use the convert function of Python's PIL to convert the image into a monochrome image. This function also applies the Floyd-Steinberg dithering algorithm to the image.

resized_image_bw = resized_image.convert("1")  # apply dithering
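
To make the thresholding step concrete, here is a minimal pure-Python sketch of the per-pixel decision described above. The default threshold of 128 is an assumed midpoint, not the project's actual setting, and `to_bit` is an illustrative name.

```python
def grayscale(red: int, green: int, blue: int) -> int:
    """Luminance using the sRGB weights quoted above."""
    return int(0.2126 * red + 0.7152 * green + 0.0722 * blue)


def to_bit(red: int, green: int, blue: int, threshold: int = 128) -> int:
    """Return 1 (white, raised dot) when luminance exceeds the threshold, else 0.

    The default of 128 is an assumption made for this sketch.
    """
    return 1 if grayscale(red, green, blue) > threshold else 0


print(to_bit(255, 255, 255))  # white pixel
print(to_bit(0, 0, 0))        # black pixel
print(to_bit(0, 0, 255))      # pure blue is dark in luminance terms
print(to_bit(0, 255, 0))      # pure green is bright in luminance terms
```

A hard threshold like this discards all gradients, which is why the dithered conversion is used instead.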

2. Converting any monochrome image into arbitrary-sized ASCII art

The above sentence has three parts. Let's break them down.

  1. Converting any monochrome image into
  2. Arbitrary-sized
  3. ASCII art

We figured out the first, so now let's explore the second.

Resizing images with PIL

We can use the following code to resize an image in PIL:

from PIL import Image


def resize(image: Image.Image, width: int, height: int) -> Image.Image:
    # A height of 0 means "keep the aspect ratio of the source image".
    if height == 0:
        height = int(image.height / image.width * width)
    # Round both dimensions down to a whole number of braille cells.
    if height % braille_config.height != 0:
        height = int(braille_config.height * (height // braille_config.height))
    if width % braille_config.width != 0:
        width = int(braille_config.width * (width // braille_config.width))
    return image.resize((width, height))

I will use two-by-three braille characters, so I slightly adjust the width and height of the image to make them divisible by 2 and 3, respectively.

Converting the image

Seeing the image will help you understand. For example, let's say we have the left image (6 by 6). We cut the image into two-by-three pieces and convert each piece into a braille character.

Left → Right

The key here is to find the correct braille character to represent the two-by-three piece. A straightforward approach is to map all the two-by-three pieces into an array, especially since two-by-three braille characters only have 64 different combinations. But we can do better by understanding how Unicode assigns the character codes.

Note: Braille Patterns from Wikipedia and Unicode Tables

To convert a two-by-three piece into a braille character, I made a simple util function. This code uses the above logic to resize the image, convert it into braille characters, and color them on the terminal. You can color the terminal output with "\033[38;2;{};{};{}m{}\033[38;2;255;255;255m".format(r, g, b, chr(output)). For more information, see ANSI Color Escape Code. If you want to try it out, here is the code: anaclumos/tools-image-to-braille
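
One way to exploit the Unicode layout, sketched below, is to treat each dot as one bit added to the base code point U+2800. In the Unicode braille block, dots 1 to 3 (left column, top to bottom) are bits 0 to 2, and dots 4 to 6 (right column) are bits 3 to 5. The function names are illustrative and this is not necessarily how the linked util function is written.

```python
BRAILLE_BASE = 0x2800  # U+2800 is the blank braille pattern

# Bit weight of each dot in a cell of 3 rows by 2 columns:
# left column = dots 1-3 (bits 0-2), right column = dots 4-6 (bits 3-5).
DOT_BITS = [
    [0x01, 0x08],
    [0x02, 0x10],
    [0x04, 0x20],
]


def block_to_braille(block: list[list[int]]) -> str:
    """Convert 3 rows of [left, right] 0/1 pixels into one braille character."""
    offset = 0
    for row in range(3):
        for col in range(2):
            if block[row][col]:
                offset |= DOT_BITS[row][col]
    return chr(BRAILLE_BASE + offset)


def image_to_braille(pixels: list[list[int]]) -> list[str]:
    """Convert a 0/1 grid (height divisible by 3, width by 2) into text lines."""
    lines = []
    for top in range(0, len(pixels), 3):
        line = ""
        for left in range(0, len(pixels[0]), 2):
            block = [pixels[top + r][left:left + 2] for r in range(3)]
            line += block_to_braille(block)
        lines.append(line)
    return lines


print(block_to_braille([[1, 1], [1, 1], [1, 1]]))  # all six dots raised
print("\n".join(image_to_braille([[1, 0] * 3] * 6)))  # 6x6 image, left dots only
```

Because the code point is computed arithmetically, no 64-entry lookup table is needed.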

tip

This code uses an ANSI True Color profile with 16M colors. macOS Terminal does not support 16M colors; it only supports 256. You can use iTerm2 or VS Code's integrated terminal to see the full color range.


3. Converting any video into a series of images

I planned to experiment with different dimensions with the same image, so I wanted to cache the images physically. I decided to use Python OpenCV to do this.

  1. Set basic configurations and variables.
  2. Read the video file.
  3. Create a directory to store the images.
  4. Loop through the video frames.
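
The four steps above can be sketched with OpenCV's Python bindings roughly as follows. This assumes the opencv-python package is installed; the zero-padded PNG file naming and the function names are my own choices for illustration.

```python
from pathlib import Path


def frame_path(out_dir: Path, index: int) -> Path:
    """Zero-padded name so cached frames sort in playback order (assumed naming)."""
    return out_dir / f"frame_{index:06d}.png"


def extract_frames(video_path: str, out_dir: str) -> int:
    """Write every frame of the video as a PNG and return the number of frames."""
    import cv2  # requires the opencv-python package; imported lazily

    target = Path(out_dir)                      # step 1: basic configuration
    capture = cv2.VideoCapture(video_path)      # step 2: read the video file
    target.mkdir(parents=True, exist_ok=True)   # step 3: create the directory
    count = 0
    while True:                                 # step 4: loop through the frames
        ok, frame = capture.read()
        if not ok:  # no more frames, or the file could not be read
            break
        cv2.imwrite(str(frame_path(target, count)), frame)
        count += 1
    capture.release()
    return count
```

Caching frames on disk this way means different output dimensions can be tried later without decoding the video again.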

An example screenshot. I didn't use GPU acceleration, so it took about 19 minutes. I could've optimized this, but this function runs only once for any video, so I didn't bother.

4. Convert text streams into formalized subtitle files

I already had the braille conversion tool from section 2; now, I needed to run this function for every cached image. I first tried to use the .srt (SubRip) format. The .srt file looks like this:

1
00:01:00,000 --> 00:02:00,000
This is an example
SubRip caption file.

The first line is the sequence number, and the second is the time range in the Start --> End format ( HH:mm:ss,SSS ). Lastly, the third line is the subtitle itself. I chose SubRip because it supported colored subtitles.
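
As a small illustration of that layout, the following sketch formats one SubRip entry from millisecond timestamps. `srt_timestamp` and `srt_entry` are illustrative names, not functions from the project.

```python
def srt_timestamp(milliseconds: int) -> str:
    """Format milliseconds as HH:mm:ss,SSS."""
    hours, rest = divmod(milliseconds, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def srt_entry(index: int, start_ms: int, end_ms: int, text: str) -> str:
    """Build one entry: sequence number, time range, then the subtitle text."""
    time_range = f"{srt_timestamp(start_ms)} --> {srt_timestamp(end_ms)}"
    return f"{index}\n{time_range}\n{text}\n"


print(srt_entry(1, 60_000, 120_000, "This is an example\nSubRip caption file."))
```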

It turned out that SubRip's text stylings are non-standard. Source: en.wikipedia.org

I made several SubRip files with different colors, but YouTube wouldn't recognize the colors; it turned out that SubRip's color styling is nonstandard.

Types of subtitles YouTube supports

No style info (markup) is recognized in SubRip.

Simple markups are supported in SAMI.

The YouTube docs show the above table. I figured that SAMI files support simple markup, so I used SAMI. (Oddly enough, I am very familiar with SAMI because .smi is the standard file format for Korean subtitles.) Creating subtitles is simple, since it just means appending text to a file in a specific format, so switching formats didn't require many code changes. The Microsoft docs show the structure of SAMI files.

<SAMI>
<HEAD>
<STYLE TYPE = "text/css">
<!--
/* P defines the basic style selector for closed caption paragraph text */
P {font-family:sans-serif; color:white;}
/* Source, Small, and Big define additional ID selectors for closed caption text */
#Source {color: orange; font-family: arial; font-size: 12pt;}
#Small {Name: SmallTxt; font-size: 8pt; color: yellow;}
#Big {Name: BigTxt; font-size: 12pt; color: magenta;}
/* ENUSCC and FRFRCC define language class selectors for closed caption text */
.ENUSCC {Name: 'English Captions'; lang: en-US; SAMIType: CC;}
.FRFRCC {Name: 'French Captions'; lang: fr-FR; SAMIType: CC;}
-->
</STYLE>
</HEAD>
<BODY>
<!-- The closed caption text displays at 1000 milliseconds. -->
<SYNC Start = 1000>
<!-- English closed captions -->
<P Class = ENUSCC ID = Source>Narrator
<P Class = ENUSCC>Great reason to visit Seattle, brought to you by two out-of-staters.
<!-- French closed captions -->
<P Class = FRFRCC ID = Source>Narrateur
<P Class = FRFRCC>Deux personnes ne venant la r&eacute;gion vous donnent de bonnes raisons de visiter Seattle.
</BODY>
</SAMI>

You can see it's just a simple HTML-like markup file. Looking closely, you can also see how multi-language subtitles are handled in one SAMI file.


5. Compressing the text files

You would never imagine _compressing_ a _text_ file...

I finally got my hands on the SAMI file, only to discover that it was over 70MB. I couldn't find any official size limit for YouTube subtitles, but empirically, I discovered the file size limit was around 10MB. So I needed to compress the files.

I thought of three ways to compress the files:

  1. Reduce the width and height.
  2. Skip some frames.
  3. Use color stacks.

I had already separated the configuration from the main code, so I could easily change the width, height, and frame rate. However, after many experiments, I figured that YouTube only supports 8 to 10 frames per second for subtitles, so I decided to skip some frames to reduce the file size.

class braille_config:
    # 2 * 3 braille
    base = 0x2800
    width = 2
    height = 3


class video_config:
    width = 56
    height = 24
    frame_jump = 3  # jumps 3 frames
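
The effect of frame_jump on subtitle timing is simple arithmetic. The sketch below assumes a 24 fps source video, which is my assumption but is consistent with the 125 ms steps between SYNC entries in the sample output later in this post. The function names are illustrative.

```python
def effective_fps(source_fps: float, frame_jump: int) -> float:
    """Subtitle frame rate when only every frame_jump-th frame is kept."""
    return source_fps / frame_jump


def sync_start_ms(kept_index: int, source_fps: float, frame_jump: int) -> int:
    """Start time in milliseconds of the n-th kept frame (0-based)."""
    return round(kept_index * frame_jump * 1000 / source_fps)


SOURCE_FPS = 24  # assumed source frame rate
print(effective_fps(SOURCE_FPS, 3))  # stays inside the 8 to 10 fps range
print([sync_start_ms(i, SOURCE_FPS, 3) for i in range(1, 6)])
```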

What I mean by "color stacks" is that I could push the same color to the stack and pop it when the color changes. Let's take a look at the original SAMI file:

<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<FONT color="#FFFFFF">⠿</FONT>
<!-- Text Length: 371 -->

Although they are all the same color, the code appended the color tag for every character. Therefore, I can reduce the repetition by using color stacks:

<FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT>
<!-- Text Length: 41. Reduced by 89% -->
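
The same-color merging can be sketched as a run-length grouping. Note that this version uses itertools.groupby rather than an explicit stack, and `compress_line` is an illustrative name, so treat it as one possible way to get the same output rather than the project's code.

```python
from itertools import groupby


def compress_line(cells: list[tuple[str, str]]) -> str:
    """Merge consecutive (color, character) cells sharing a color into one FONT tag."""
    parts = []
    for color, run in groupby(cells, key=lambda cell: cell[0]):
        text = "".join(char for _, char in run)
        parts.append(f'<FONT color="{color}">{text}</FONT>')
    return "".join(parts)


line = [("#FFFFFF", "⠿")] * 12
compressed = compress_line(line)
print(compressed)
print(len(compressed))  # compare with the text lengths quoted above
```

Only adjacent cells are merged, so a line that alternates colors on every character gains nothing from this step.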

It's not the complete-search-maximal-compression you usually see when Leetcoding, but it's still an excellent compression to make it under 10MB. This simple algorithm is especially good when you have black-and-white videos.

<SYNC Start=125><P Class=KOKRCC><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR></SYNC>
<SYNC Start=250><P Class=KOKRCC><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR></SYNC>
<SYNC Start=375><P Class=KOKRCC><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR></SYNC>
<SYNC Start=500><P Class=KOKRCC><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR></SYNC>
<SYNC Start=625><P Class=KOKRCC><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR><FONT color="#FFFFFF">⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿⠿</FONT><BR></SYNC>

The file completed so far: raw.githubusercontent.com (No Dithering)


6. Dithering

I uploaded the file I had created so far, but something was off. It seemed like a problem with how mobile devices handle braille characters. For example, a blank braille character appeared as empty circles on computers but as empty space on mobile devices. (Maybe for legibility?) I needed an extra modification to resolve this issue: dithering.

Mobile devices show space instead of an empty circle. On the left, you can see almost no details, but on the right, you can see more gradients and details. The right one is the dithered version. Dithering especially shines when you have a black background or color gradients.

The original image from the video. BTS Jimin

Dithering is a technique to compensate for image quality loss when converting an image to a lower color depth by adding noise to the picture. Let me explain it with an example from Wikipedia:

The first image uses 16M colors, and the second and third use 256 colors. Dithered images use compressed color space, but you can feel the details and gradients. Image from en.wikipedia.org

Can you see the difference between the second and third images? Both use 256 colors, but the third image shows more details and gradients. By placing pixels carefully, we can represent the image more faithfully.

Dithering is also used in GIF image conversion, which is why most GIF images show dotted patterns. Digital artifacts are also related to dithering. You lose some details when you convert an image to a lower color depth, and if this conversion happens repeatedly, you get a picture with many artifacts. (Of course, digital artifacts have many other causes. See dithering and color banding for more information.)

Monochrome conversion also requires dithering because we are compressing the 16M color space into two colors. We can do this with the PIL library mentioned above.

resized_image_bw = resized_image.convert("1")  # apply dithering
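
For intuition, here is a minimal pure-Python sketch of Floyd-Steinberg error diffusion on a grayscale grid. It shows the standard 7/16, 3/16, 5/16, 1/16 error weights; Pillow's internal implementation may differ in details such as integer arithmetic, and the function name and the cutoff of 128 are my own choices.

```python
def floyd_steinberg(gray: list[list[float]]) -> list[list[int]]:
    """Dither a grayscale grid (values 0-255) down to pure 0 and 255."""
    height, width = len(gray), len(gray[0])
    work = [[float(value) for value in row] for row in gray]  # working copy
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255.0 if old >= 128 else 0.0  # assumed cutoff
            work[y][x] = new
            error = old - new
            # Push the quantization error onto pixels not yet visited.
            if x + 1 < width:
                work[y][x + 1] += error * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += error * 3 / 16
                work[y + 1][x] += error * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += error * 1 / 16
    return [[int(value) for value in row] for row in work]


flat_gray = [[128] * 4 for _ in range(4)]
for row in floyd_steinberg(flat_gray):
    print(row)
```

A flat mid-gray area comes out as a mix of black and white pixels instead of a single solid color, which is the effect that preserves gradients in the braille output.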

Let us check this in action.

Can you perceive the difference, especially from 1:33?


Results

I completed the project and uploaded the video to YouTube. I aim to study computer graphics and image processing further. If you are interested in this topic, please check out my previous post: How Video Compression Works

Butter

Fiesta


Added 2021-07-09: Irregular Subtitle Specs?

I tested the subtitle file on the YouTube app on iOS/iPadOS, and macOS Chrome, Firefox, and Safari. However, I heard that the subtitle file does not work on some devices, like the Android YouTube app and Windows Chrome. I have attached a screen recording of the subtitle file on macOS 11 Chrome 91. You can expect the subtitle file to work when using an Apple device.

I also made the screen recording in 8K to show crisp dots in motion 😉

💬Work in Progress

I wrote this post in another language. I did not translate it to other languages yet. If you speak different languages, look for this post in that language.

note

This document is machine-translated. It is readable, but may contain awkward phrasing. Please let me know if you find any errors!

Recently, I applied to a highly competitive on-campus technology start-up club (with an acceptance rate of around 5%), and the application included the following question.

Here, a meme means a short video that makes you laugh.

It was a club I had always wanted to join, so I agonized over the answer. Everyone has different interests; what is funny to one group can be offensive to another.

Then, I thought, "What if I could create a service that recommends memes based on choices?" Because it is not a responsive recommendation system (a system that dynamically changes recommendations based on newly accumulated data), the technical complexity did not seem that high. But I only had a weekend, so I decided to build it quickly.

🎬 Designing the System

First of all, I have listed the videos that I enjoyed watching. (I thought it would be okay to use YouTube instead of TikTok, Twitter, and Instagram.)

I could easily find the videos I thought were funny by searching for the keyword `youtu` in a group chat room.

I've since broken it down by category on the Notion page. For example, it could be divided into Music, Movies, Games, Coding, and General memes.

As I envisioned the selection-based recommendation test, there seemed to be two approaches. One is a system that gives weight to each answer to a question, calculates a final score, and recommends results. The other is to set up the entire scenario tree and recommend results according to the combination of options. The popular MBTI results analysis uses the first score-based recommendation system. However, I used the second scenario tree-based recommendation system. Here's why:

The score flag system was too complex to configure.

  • MBTI has a simple score flag. Since there are only four flags: E/I, N/S, T/F, and J/P, it is relatively convenient to manage the score status.
  • I immediately had five categories: Music, Movies, Games, Coding, and General, and each type had various subcategories, so it wasn't easy to pinpoint the flags.
  • For example, I cannot recommend LoL meme videos simply because the game flag score is high, because you may not know the rules of LoL or may not find the jokes funny. To fix this, you need to add either a separate LoL score flag or a "favorite game" flag.
  • In this case, state management becomes very complicated, and I didn't want to increase the technical difficulty.
  • However, the design difficulty rises higher than the technical difficulty. Above all, I felt it was tough to elaborately plan which score range should be recommended for each flag. In other words, it is difficult to make a perfectly-fit meme recommendation based on the score.

I wanted to make checking all endings possible.

  • In a choice-based game, you may want to see a different ending by changing only one final decision (especially these meme recommendations that are not just MBTI).
  • But score-based systems usually require the test to be restarted from scratch and more engineering to add optional 'undo' actions.
  • With a scenario tree, this becomes more convenient, because I only need to navigate to the parent node.
  • As described later, because I used next/link, simply going back in the browser works as the undo action.

I wanted to include curated, tailored choices rather than generic options.

  • In a score-based system, you only ask questions and answers in a general form. That is, you cannot ask follow-up questions.
  • I tried to use the scenario tree to make each question and its answers fit each other exactly, so that the test feels like a natural conversation.
  • Also, as a result, this system is intended to be "attached to the club application".
  • Even if you recommend a funny video, if you can't remember my name and only remember the Video, it serves no purpose!

In the process, I wanted to give the feeling that I want to join this club!!!

🥞 Choosing the Stack

I didn't worry too much about the front end. Since I had recently fallen in love with TypeScript and Next.js, they were a natural choice, and knowing how well Vercel works with Next.js, I decided to host it on Vercel. For styling, I used styled-components.

Where to store the data was a problem. Since the meme data is not dynamic and there is no need to store user information, I decided to hard-code all data in modules instead of using a separate DB or DBaaS. You can see the hardcoded data here.

The backend likewise didn't need to be configured. So I decided to make it serverless.

💻 Dev

It can be summarized as follows:

  1. Each Question and each Video has a unique link, and selecting an option takes you to the corresponding link.
  2. Each option is in an Object with a 'next question' or 'result video' field, and an interface is constructed based on this.
  3. Use getStaticProps and getStaticPaths to make responsiveness super fast.

Each Question and Video has the following URI structure:

https://smile.cho.sh/question/[id]
https://smile.cho.sh/video/[id]

💬 2. Type Definitions

To take advantage of TypeScript, I have predefined the type structure.

export type Question = {
  id: number
  contents: string[]
  answers: Answer[]
}

export type Answer = {
  id: number
  content: string
  result: number | null
  nextQuestion: number | null
}

export type Video = {
  id: number
  title: string
  uploader: string
  desc: string
  youtubeId: string
}

In the Answer type, only one of result and nextQuestion holds a value, and links are created based on whichever is set. Keeping these two fields separate helped me avoid confusing questions with videos. I also wanted to avoid unintentional null errors by defaulting to 0 when writing data, so you can see traces of this at /question/0.
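
The link derivation can be sketched as below. It is written in Python for brevity, while the project itself is TypeScript. The `next_link` helper is hypothetical, and this sketch uses None for the unset field, following the `number | null` type, rather than the default of 0 mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Answer:
    id: int
    content: str
    result: Optional[int] = None         # video id, if this answer ends the test
    next_question: Optional[int] = None  # question id, if the test continues


def next_link(answer: Answer) -> str:
    """Return the route an answer leads to; exactly one target must be set."""
    if (answer.result is None) == (answer.next_question is None):
        raise ValueError("an answer needs exactly one of result or next_question")
    if answer.next_question is not None:
        return f"/question/{answer.next_question}"
    return f"/video/{answer.result}"


print(next_link(Answer(id=1, content="Games", next_question=3)))
print(next_link(Answer(id=2, content="I don't know", result=7)))
```

Since every answer resolves to a plain URL, the browser's back button naturally returns to the parent node of the scenario tree.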

🚀 3. Making it Blazing Fast

For example, pages corresponding to /question/[id] are statically created at build time through the following code:

export const getStaticPaths: GetStaticPaths = async () => {
  const paths = questionData.map((question) => ({
    params: { id: question.id.toString() },
  }))
  return { paths, fallback: false }
}

export const getStaticProps: GetStaticProps = async ({ params }) => {
  try {
    const id = params?.id
    const item = questionData.find((data) => data.id === Number(id))
    return { props: { item } }
  } catch (err) {
    return { props: { errors: err.message } }
  }
}

Here, getStaticPaths returns the list of paths of the pages to be generated statically, and getStaticProps retrieves the question data matching the path and passes it to the React app as props. This statically pre-generates all question and video pages. Furthermore, if you combine this with <Link> from next/link, pages are prefetched, making interactions very fast. (Literally, I don't see any loading indicator on the browser favicon!)

💅 4. Styling and Tidying up

This step meant creating the intro and ending pages and adding the missing details. Next, I worked on handling different types of views for exceptional cases. For instance, if the user answers that they don't know for every question, the following result is displayed. While other views embed the video right away, only in this case is it shown as a button.

Fallback Video

Check for yourself what kind of video it is!

✨ Results

  • smile.cho.sh
  • Try it yourself, and let us know what you think!
  • I finally got into the club!

🔥 Postmortem

  • There seems to be a good balance between the design and technical difficulties.
  • I am happy that I learned the map function of ES6+ properly!
  • I built a good understanding of how to use static generation with TypeScript and Next.js.
  • It's a bit disappointing that I ignored Favicon, Metadata, SEO, etc., but I don't think I'll add them separately because it doesn't require search or SNS inflow.
  • Grinding over the weekend delivers the product... 😉
📜Heads up!
  • I wrote this post more than 2 years ago.
  • That's enough time for things to change.
  • Possibly, I may not endorse the content anymore.

Let's create a calendar with JavaScript but without any external library. This project is based on my previous internship at Woowa Bros, a unicorn food-delivery startup in Seoul.

Show me the code first.

GitHub - anaclumos/calendar.js: Vanilla JS Calendar

Show me the demo first.

Goals

  • Use functional programming* instead of Object-oriented programming.
  • No DOM manipulation after initializing. This philosophy is borrowed from React (and other Single Page Application libraries). DOM manipulation can be highly confusing if 30 different pieces of code are trying to edit the same thing. So instead, we will rerender the components whenever we need to change something.

💡

Don't fix it. Buy a new one. — Rerendering in Front-end

Stack

  • JavaScript Date Object
  • CSS display: grid will be useful.

Basic Idea

  • There will be a global displayDate object that represents the displaying month.
  • navigator.js will change this displayDate object, and trigger renderCalendar() function with displayDate as an argument.
  • renderCalendar() will rerender the calendar.

Before anything, prettier!

Prettier helps write clean and neat code with automatic formatting.

// `.prettierrc`
{
  "semi": false,
  "singleQuote": true,
  "arrowParens": "always",
  "tabWidth": 2,
  "useTabs": false,
  "printWidth": 60,
  "trailingComma": "es5",
  "endOfLine": "lf",
  "bracketSpacing": true
}

Now throw in some HTML.

<!-- `index.html` -->
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>JavaScript Calendar</title>
  </head>
  <body>
    <div id="navigator"></div>
    <div id="calendar"></div>
  </body>
  <script>
    // code for rendering
  </script>
</html>

I generated this boilerplate with VS Code.

Then trick VS Code to read JS String as HTML Tags.

Since we use Vanilla JavaScript, we don't have access to fancy JSX-style highlighting. Instead, our generated HTML codes will live inside JavaScript String, which doesn't have syntax highlighting or Intellisense. Therefore, let's create a function that tricks VS Code to recognize JavaScript String as HTML Tags.

// `util.js`
const html = (s, ...args) => s.map((ss, i) => `${ss}${args[i] || ''}`).join('')

to be added - screenshot of highlighting

calendar.js

Then we connect calendar.js and index.html.

<!-- `index.html` -->
<script src="calendar.js"></script>

Defining constants will help before writing renderCalendar().

// `calendar.js`
const NUMBER_OF_DAYS_IN_WEEK = 7
const NAME_OF_DAYS = ['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat']
const LONG_NAME_OF_DAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']
const ACTUAL_TODAY = new Date()

Note that we use NUMBER_OF_DAYS_IN_WEEK to remove magic numbers from our code. It can be tough to decipher a random 7 that we run into in the middle of the code. Using a named constant instead increases the maintainability of the code.

for (let d = 0; d < NUMBER_OF_DAYS_IN_WEEK; d++) {
// do something
}

If there was a random 7, who knows if we are iterating through the number of Harry Potter Books?

This code block will be the baseline for our calendar generation. We will pass in the HTML target and a Date object. today represents the month being displayed. The today object will come from navigator.js. The navigator returns the actual date for the current month and the first day of the month for other months.

// `calendar.js`
const renderCalendar = ($target, today) => {
let html = getCalendarHTML(today)
// minify html
html = html.replace(/\n/g, '')
// replace multiple spaces with single space
html = html.replace(/\s{2,}/g, ' ')
$target.innerHTML = html
}

Now, we need four different Date objects for displaying the calendar. We could've used fewer objects, but it is up to the implementation. I think reducing the number of Date objects here would bring a minimal performance gain but hurt the understandability of the code, so using four objects seems like a fair middle ground.

Four Date objects we need

  • The last day of last month: needed to highlight last month's weekend and display the correct date for last month's row.
  • The first day of this month: needed to highlight this month's weekend and figure out how many days of last month we need to render.
  • The last day of this month: needed for rendering this month with iteration.
  • The first day of next month: needed to highlight the weekend of next month.

I made a function that computes these four dates from a given Date. Passing 0 as the day makes the Date constructor roll back to the last day of the previous month.

// `calendar.js`
const processDate = (day) => {
const month = day.getMonth()
const year = day.getFullYear()
return {
lastMonthLastDate: new Date(year, month, 0),
thisMonthFirstDate: new Date(year, month, 1),
thisMonthLastDate: new Date(year, month + 1, 0),
nextMonthFirstDate: new Date(year, month + 1, 1),
}
}

processDate bundles these four dates into a single object and returns it. It receives a Date object as its argument; in this calendar, the Date object corresponding to "today" is passed in.

2-2. Create getCalendarHTML

Now let's draw a calendar in earnest. I created a getCalendarHTML function that returns the contents of the calendar as HTML. The getCalendarHTML function is a bit bulky, so I framed it first.

// `today` is passed in by renderCalendar; it defaults to the current date
const getCalendarHTML = (today = new Date()) => {
  let { lastMonthLastDate, thisMonthFirstDate, thisMonthLastDate, nextMonthFirstDate } = processDate(today)
  let calendarContents = []

  // ...

  return calendarContents.join('')
}

Add a line at the top to display the day of the week. Use the const we added at the beginning to remove the magic number.

for (let d = 0; d < NUMBER_OF_DAYS_IN_WEEK; d++) {
calendarContents.push(html`<div class="${NAME_OF_DAYS[d]} calendar-cell">${NAME_OF_DAYS[d]}</div>`)
}

Then let's draw the last month. For example, if the first day of this month is Wednesday, this loop draws the days of last month that fall on Sunday, Monday, and Tuesday. For days that fall on Sunday, the sun HTML class is added.

for (let d = 0; d < thisMonthFirstDate.getDay(); d++) {
  // `+ 1` makes the last leading cell land on the last day of last month
  calendarContents.push(
    html`<div
      class="
        ${d % 7 === 0 ? 'sun' : ''}
        calendar-cell
        past-month
      "
    >
      ${lastMonthLastDate.getMonth() + 1}/${lastMonthLastDate.getDate() - thisMonthFirstDate.getDay() + d + 1}
    </div>`
  )
}

Let's draw this month on the same principle. For today's date, the today HTML class and the string "today" are added. Similarly, the sat and sun HTML classes are added for Saturday and Sunday, respectively.

for (let d = 0; d < thisMonthLastDate.getDate(); d++) {
calendarContents.push(
html`<div
class="
${today.getDate() === d + 1 ? 'today' : ''}
${(thisMonthFirstDate.getDay() + d) % 7 === 0 ? 'sun' : ''}
${(thisMonthFirstDate.getDay() + d) % 7 === 6 ? 'sat' : ''}
calendar-cell
this-month
"
>
${d + 1} ${today.getDate() === d + 1 ? ' today' : ''}
</div>`
)
}

Finally, draw the days of the next month in the remaining cells.

// Fill only the remainder of the last row; the outer % 7 avoids a full extra row
let nextMonthDaysToRender = (7 - (calendarContents.length % 7)) % 7

for (let d = 0; d < nextMonthDaysToRender; d++) {
calendarContents.push(
html`<div
class="
${(nextMonthFirstDate.getDay() + d) % 7 === 6 ? 'sat' : ''}
calendar-cell
next-month
"
>
${nextMonthFirstDate.getMonth() + 1}/${d + 1}
</div>`
)
}
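
The cell arithmetic in the three loops above can be sanity-checked outside the browser. The following is a minimal Python sketch, not part of calendar.js: `month_cells` and `leading_dates` are hypothetical helper names, and the weekday is converted to the Sunday-first numbering that JavaScript's getDay() uses.

```python
import calendar
from datetime import date


def month_cells(year, month):
    """Return (leading, days_in_month, trailing) for a Sunday-first grid."""
    # Python's weekday() is Monday=0; JavaScript's getDay() is Sunday=0
    first_weekday = (date(year, month, 1).weekday() + 1) % 7
    days_in_month = calendar.monthrange(year, month)[1]
    # Cells needed to complete the final row (0 if the row is already full)
    trailing = (7 - (first_weekday + days_in_month) % 7) % 7
    return first_weekday, days_in_month, trailing


def leading_dates(year, month):
    """Dates of the previous month that are shown before the 1st."""
    leading, _, _ = month_cells(year, month)
    prev_year, prev_month = (year, month - 1) if month > 1 else (year - 1, 12)
    prev_last = calendar.monthrange(prev_year, prev_month)[1]
    return [prev_last - leading + d + 1 for d in range(leading)]
```

For a month whose first day is a Wednesday and whose previous month has 31 days, the leading cells should read 29, 30, and 31, which is a quick way to catch off-by-one mistakes in the date expression.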

3. Writing CSS

3-1. Using display: grid

If you use display: grid on an element, you can neatly put its child elements into a grid (table).

  • grid-template-columns: Information on how to arrange columns. 1fr means 1 fraction, and since it is written 7 times in total, 7 columns with the same width are created.
  • grid-template-rows: You can define the size of rows. Here, there is only one 3rem, so the first row is defined as 3rem.
  • grid-auto-rows: You can define the size of the rows that are created implicitly. Here, it says 6rem, so all subsequent rows have a row size of 6rem.

Below we define additional styles.

#App {
/* grid */
display: grid;
grid-template-columns: 1fr 1fr 1fr 1fr 1fr 1fr 1fr;
grid-template-rows: 3rem;
grid-auto-rows: 6rem;

/* style */
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
border: 1px solid black;
max-width: 720px;
margin-left: auto;
margin-right: auto;
}

  • When drawing a table, you usually want every cell wrapped in a uniform border, just like Excel, but sometimes the outermost edges end up thinner than the inner lines. In HTML terms, this happens when borders are applied only to th and td.
  • I prefer to apply "n px to all cell borders, n px to the table border". This gives a uniform border of 2n px overall.

.calendar-cell {
border: 1px solid black;
padding: 0.5rem;
}

3-2. Highlighting Saturday, Sunday, and today

.past-month,
.next-month {
color: gray;
}

.sun {
color: red;
}

.sat {
color: blue;
}

.past-month.sun {
color: pink;
}

.next-month.sat {
color: lightblue;
}

.today {
color: #e5732f;
}

What I felt

  • At first, I got a little lost when connecting the JS to "initialize" the calendar. This was because I had connected renderCalendar at the top of body. Since the document is processed sequentially, a script connected at the top runs before the #App div appears, so renderCalendar cannot find the DOM element.
  • Also, I couldn't remember how to get markup generated in JS onto the screen. It was simply a matter of selecting the app element with querySelector in the JS file that plays the role of index.js and then assigning the markup to innerHTML.
  • In the Woowa Tech Camp project, magic numbers were used. This time, the magic numbers were removed to improve readability.
  • The Woowa Tech Camp project was written in object-oriented JavaScript (more precisely, the Singleton pattern), but this time it was written in small functions.
  • I tried to use ES6+ syntax. For example, I put variables in backticks (template literals) and destructured the return data of processDate. Also, let and const were mainly used.
  • I regret that getCalendarHTML could not have been written a little shorter.
📜Heads up!
  • I wrote this post more than 2 years ago.
  • That's enough time for things to change.
  • Possibly, I may not endorse the content anymore.
Google Latest Articles Instead

Recently I came across The Noun Project's API. Combined with the download function I created in the past, it lets you download hundreds of icons within seconds.

Beware

Do not use this tool to pirate others' intellectual property. Be careful about what you do with this code and The Noun Project's API. Read the license and API documents thoroughly. Unauthorized use cases are listed here. This entire post and its code are MIT licensed.

Importing libraries

import requests
import os
from tqdm import tqdm
from requests_oauthlib import OAuth1

You will need to install these libraries with pip3 if you do not have them.

The download function

def download(url, pathname):
    # Create the target folder if it does not exist yet
    if not os.path.isdir(pathname):
        os.makedirs(pathname)
    response = requests.get(url, stream=True)
    file_size = int(response.headers.get("Content-Length", 0))
    # Use the last URL segment as the file name and drop any query string
    filename = os.path.join(pathname, url.split("/")[-1])
    if filename.find("?") > 0:
        filename = filename.split("?")[0]
    progress = tqdm(
        response.iter_content(256),
        f"Downloading {filename}",
        total=file_size,
        unit="B",
        unit_scale=True,
        unit_divisor=1024,
    )
    with open(filename, "wb") as f:
        # Read from progress.iterable so the bar advances only through update()
        for data in progress.iterable:
            f.write(data)
            progress.update(len(data))

This code fetches the URL and saves it as a file at pathname.
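
The file-name step is the part of download() that is easiest to get wrong, so here is a small sketch of it in isolation. It swaps the manual "?" check for urllib.parse from the standard library; `filename_from_url` is a hypothetical helper name, not something from the original code.

```python
import os
from urllib.parse import urlparse


def filename_from_url(url, pathname):
    """Build the local path that a download of `url` would be saved to."""
    # urlparse separates the path from the query string and the fragment
    path = urlparse(url).path
    # The last path segment becomes the file name
    basename = path.rstrip("/").split("/")[-1]
    return os.path.join(pathname, basename)
```

Because the query string never reaches the path, a URL such as `https://example.com/icons/tree.png?size=200` maps to `tree.png` inside the target folder.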

The Noun Project API

# ---

DOWNLOAD_ITERATION = 3
# Returns 50 icons per iteration.
# Three iterations equal 150 icons.

SEARCH_KEY = "tree"  # Search Term
SAVE_LOCATION = "./icons"
auth = OAuth1("API_KEY", "API_SECRET")

# ---

for iteration in range(DOWNLOAD_ITERATION):
    endpoint = (
        "http://api.thenounproject.com/icons/"
        + SEARCH_KEY
        + "?offset="
        + str(iteration * 50)
    )
    response = requests.get(endpoint, auth=auth).json()
    for icon in response["icons"]:
        download(icon["preview_url"], SAVE_LOCATION)

For more advanced uses, please visit this docs page. In addition, you can get your API Key and API secret by registering your app here.
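
The paging in the loop above is just offset arithmetic, so it can be checked without calling the API. A minimal sketch, assuming the same endpoint shape and the page size of 50 used in this post; `build_endpoints` is a hypothetical helper name.

```python
def build_endpoints(search_key, iterations, page_size=50):
    """List the request URLs the download loop would visit, in order."""
    base = "http://api.thenounproject.com/icons/"
    return [
        f"{base}{search_key}?offset={i * page_size}"
        for i in range(iterations)
    ]
```

Printing the list before running the real loop is a cheap way to confirm how many API calls a given DOWNLOAD_ITERATION will spend.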

Result

I have run some benchmarks and found that downloading ~5k icons shouldn't be a problem.
However, The Noun Project's API has a call limit so beware of that.


My Ghost blog is not serverless. I keep a server running despite the continuous management it needs because blogging through a server has numerous advantages. However, managing a blog through a server has one huge drawback. If the server crashes, it will be challenging to restore the posts inside. There will be many more posts and photos in the future, but it would be too cumbersome to copy and back them up by hand each time. So I wanted to come up with a plan to improve this.

Problems with Ghost's built-in backup

Ghost provides a function to download a blog backup file in .json format. Everything that can be set in Ghost is backed up as it is, including the author's name, used tags, the content and structure of the article, the time the article was uploaded, and the summary in the HTML meta tag.

But there are two problems.

  • Ghost's built-in backup files are hard for humans to read. The file is minified JSON and also contains a lot of information, so the structure is complicated and the text is packed together.
  • Also, Ghost's built-in backup does not back up photos. Therefore, when you restore the blog, all photos show up as broken images ("not found"). It is fortunate if the blog server is still alive or copies of the photos exist, but there may be cases where I cannot restore the images.

Goals

Main Goal

  • You must back up all text and photos.

Bonus Goal

  • Must be in a human-readable format. (Human-Readable Medium)
  • It should be clear which picture goes in which position in which post. (In preparation for restoring the blog)
  • Backups should be convenient.
  • It should be possible to create clones outside the blog.

Concept

The answer I came up with is RSS. RSS is a technology that emerged during the blogging boom of the early 2000s and acts like a "subscription". Sites and blogs that support RSS provide an RSS feed address. At the RSS feed address, the contents updated on the site are organized in a machine-readable form. When users enter an RSS feed address into an RSS reader, the reader periodically fetches new content from that address.

These days, social media is dominant and RSS has largely fallen out of use, but it is sufficient to achieve my goal. The RSS feed serves as an API for receiving articles. Ghost supports RSS, so I decided to use it.

Rough Idea

  1. Enter the blog RSS address and copy the entire RSS feed.
  2. Parse the RSS and extract the HTML of the article.
  3. Create a folder for each article to save the HTML of the article.
  4. Connect to the src address of the img tag included in the HTML of the article and download the picture.
  5. For posts with pictures, create an images folder for each post folder to save pictures and change the src of the img tag in HTML to the relative path of the saved image.

Development

Reference

All examples below are based on v1 of [anaclumos/backup-with-rss](https://github.com/anaclumos/backup-with-rss). By the time you read this, I may have added some new features or bug fixes.

Also, the code attached to this article is intended to show a rough outline of the implementation, not the entire code. If you copy and run the code from this article as-is, it probably won't run! The complete code is publicly available in the GitHub repository.

1. Copy RSS feeds using Feedparser

Copy RSS feeds via a module called Feedparser in Python.

# -*- coding: utf-8 -*-
import feedparser


class RSSReader:
    origin = ""
    feed = ""

    def __init__(self, URL):
        self.origin = URL
        self.feed = feedparser.parse(URL)

    def parse(self):
        return self.feed.entries

RSSReader is used to load RSS feeds and pass entries items.

What this code does is:

  1. When an RSSReader Object is created, the RSS address is saved in self.origin, and the RSS address is parsed and stored in self.feed.
  2. When the parse function is executed, it returns entries among the values stored in self.feed.

Among them, entries contains the articles from the RSS feed in the form of a list. The following example is the parsed RSS of this blog, with entries left empty.

Structure of self.feed, from which parse() returns entries

{
"bozo": 0,
"encoding": "utf-8",
"entries": [],
"feed": {
"generator": "Ghost 3.13",
"generator_detail": {
"name": "Ghost 3.13"
},
"image": {
"href": "https://blog.chosunghyun.com/favicon.png",
"link": "https://blog.chosunghyun.com/",
"links": [
{
"href": "https://blog.chosunghyun.com/",
"rel": "alternate",
"type": "text/html"
}
],
"title": "Sunghyun Cho",
"title_detail": {
"base": "https://blog.chosunghyun.com/rss/",
"language": "None",
"type": "text/plain",
"value": "Sunghyun Cho"
}
},
"link": "https://blog.chosunghyun.com/",
"links": [
{
"href": "https://blog.chosunghyun.com/",
"rel": "alternate",
"type": "text/html"
},
{
"href": "https://blog.chosunghyun.com/rss/",
"rel": "self",
"type": "application/rss+xml"
}
],
"subtitle": "Sunghyun Cho's Blog",
"subtitle_detail": {
"base": "https://blog.chosunghyun.com/rss/",
"language": "None",
"type": "text/html",
"value": "Sunghyun Cho's Blog"
},
"title": "Sunghyun Cho",
"title_detail": {
"base": "https://blog.chosunghyun.com/rss/",
"language": "None",
"type": "text/plain",
"value": "Sunghyun Cho"
},
"ttl": "60",
"href": "https://blog.chosunghyun.com/rss/",
"namespaces": {
"": "http://www.w3.org/2005/Atom",
"content": "http://purl.org/rss/1.0/modules/content/",
"dc": "http://purl.org/dc/elements/1.1/",
"media": "http://search.yahoo.com/mrss/",
"status": 200,
"version": "rss20"
}
}
}
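
To make the shape of entries more concrete, here is a small sketch that pulls the same kind of per-article fields out of an RSS document. It deliberately uses xml.etree.ElementTree from the standard library instead of Feedparser so that it runs without extra packages; the sample feed, the `read_items` helper, and the example.com addresses are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up two-article feed, just large enough to show the structure
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First Post</title>
      <link>https://blog.example.com/first-post/</link>
      <description>Short summary</description>
    </item>
    <item>
      <title>Second Post</title>
      <link>https://blog.example.com/second-post/</link>
      <description>Another summary</description>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return one dict per <item>, similar in spirit to feed entries."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "summary": item.findtext("description"),
        }
        for item in root.iter("item")
    ]
```

Feedparser does considerably more than this (date parsing, tags, content modules, malformed-feed recovery), which is why the actual tool relies on it.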

2. Create a Markdown file with RSS data

Only some of the values in self.feed.entries returned by RSSReader are needed, so I created the MDCreator class to extract them and process the information provided by RSSReader.

import codecs
import os


class MDCreator:
    def __init__(self, rawData, blogDomain):
        self.rawData = rawData
        self.blogDomain = blogDomain

    def createFile(self, directory):
        try:
            os.makedirs(directory + "/" + self.rawData.title)
            print('Folder "' + self.rawData.title + '" Created ')
        except FileExistsError:
            print(
                'Folder "' + self.rawData.title + '" already exists'
            )
        self.directory = directory + "/" + self.rawData.title

        # codecs.open forces UTF-8 regardless of the platform default encoding
        MDFile = codecs.open(
            self.directory + "/README.md", "w", "utf-8"
        )
        MDFile.write(self.render())
        MDFile.close()

The blogDomain parameter is used later.

What this code does is:

  1. When the MDCreator object is created, save the blog address in self.blogDomain and the original RSS feed data in self.rawData. The actual data of this RSS feed is self.feed.entries returned from parse() of RSSReader.
  2. When the createFile() function is executed, a folder is created for each article in the backup folder. In this case, the folder title is the title of the article. Next, create README.md for each folder and put the article's contents in it.

The reason for creating the file through the codecs library is to use Unicode instead of the CP949 codec on Windows. This way, the emoji included in the RSS appear correctly 🚀🥊
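
Since the folder name comes straight from the article title, a title containing characters such as "/" or "?" could produce an unexpected path. The following is a hedged sketch of one way to guard against that; `safe_folder_name` is a hypothetical helper and is not part of the code shown in this post.

```python
import re


def safe_folder_name(title, fallback="untitled"):
    """Turn an article title into a name that is safe to use as a folder."""
    # Replace characters that common file systems reject in names
    cleaned = re.sub(r'[\\/:*?"<>|]', "-", title)
    # Collapse runs of whitespace, then trim spaces and dots at both ends
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned or fallback
```

Ordinary titles pass through unchanged, so existing backups would keep their folder names.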

3. Adding post information to the created Markdown file

I wanted to use Jekyll-type Front Matter when displaying text information. It was the easiest way to check the article's title, tags, links, authors, etc.

    def render(self):
        try:
            postTitle = str(self.rawData.title)
        except AttributeError:
            postTitle = "Post Title Unknown"
            print("Post Title does not exist")
        try:
            postTags = str(
                self.getValueListOfDictList(self.rawData.tags, "term")
            )
        except AttributeError:
            postTags = "Post Tags Unknown"
            print("Post Tags does not exist")
        try:
            postLink = str(self.rawData.link)
        except AttributeError:
            postLink = "Post Link Unknown"
            print("Post Link does not exist")
        try:
            postID = str(self.rawData.id)
        except AttributeError:
            postID = "Post ID unknown"
            print("Post ID does not exist")
        try:
            postAuthors = str(self.rawData.authors)
        except AttributeError:
            postAuthors = "Authors Unknown"
            print("Authors does not exist")
        try:
            postPublished = str(self.rawData.published)
        except AttributeError:
            postPublished = "Published Date unknown"
            print("Published Date does not exist")
        self.renderedData = (
            "---\nlayout: post\ntitle: "
            + postTitle
            + "\ntags: "
            + postTags
            + "\nurl: "
            + postLink
            + "\nauthors: "
            + postAuthors
            + "\npublished: "
            + postPublished
            + "\nid: "
            + postID
            + "\n---\n"
        )

What this code does is:

  1. In the RSS code, check if there is a title, tag, link, ID, author name, and publication date of the article; if there is a value, enter the value in Front Matter.
  2. If there is no value, enter a placeholder such as "Post Title Unknown" instead.

Tags are entered through code such as self.getValueListOfDictList(self.rawData.tags, "term") because Ghost specifies tags in the following format. The same goes for Gatsby and WordPress.

'tags': [{'label': None, 'scheme': None, 'term': 'English'},
         {'label': None, 'scheme': None, 'term': 'Code'},
         {'label': None, 'scheme': None, 'term': 'Apple'}],

    def getValueListOfDictList(self, dicList, targetkey):
        arr = []
        for dic in dicList:
            for key, value in dic.items():
                if key == targetkey:
                    arr.append(value)
        return arr

This way, only the term item is extracted from tags and added to Front Matter. Then, when executed, the following Jekyll Style Front Matter is completed.

---
layout: post
title: Apple's Easter Egg
tags: ['English', 'Code', 'Apple']
url: https://blog.chosunghyun.com/apples-easter-egg/
authors: [{ 'name': 'S Cho' }]
published: Sun, 19 Jan 2020 17:00:00 GMT
id: /* Some Post ID */
---

Jekyll Style Front Matter on GitHub

This is how Front Matter renders on GitHub.
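
The try/except chain in render() can also be expressed more compactly. Below is a minimal sketch of that idea, assuming the entry object raises AttributeError for missing fields so that getattr with a default works; `build_front_matter` is a hypothetical function, and SimpleNamespace only stands in for a feed entry here.

```python
from types import SimpleNamespace


def build_front_matter(entry):
    """Build Jekyll-style Front Matter, falling back to placeholders."""
    # Keep only the "term" value of each tag dictionary
    tags = [tag["term"] for tag in getattr(entry, "tags", []) if "term" in tag]
    fields = [
        ("title", getattr(entry, "title", "Post Title Unknown")),
        ("tags", tags if tags else "Post Tags Unknown"),
        ("url", getattr(entry, "link", "Post Link Unknown")),
        ("published", getattr(entry, "published", "Published Date unknown")),
    ]
    lines = ["---", "layout: post"]
    lines += [f"{key}: {value}" for key, value in fields]
    lines.append("---")
    return "\n".join(lines) + "\n"


# Stand-in for one feed entry; the published date is missing on purpose
example_entry = SimpleNamespace(
    title="Apple's Easter Egg",
    tags=[{"term": "English"}, {"term": "Code"}],
    link="https://blog.example.com/post/",
)
```

Calling `build_front_matter(example_entry)` fills in the placeholder for the missing publication date while the other fields use the real values.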

4. Adding summary and body text to the created Markdown file

Add Summary and Content items of RSS data to renderedData.

        self.renderedData += "\n\n# " + postTitle + "\n\n## Summary\n\n"

        try:
            self.renderedData += self.rawData.summary
        except AttributeError:
            self.renderedData += "RSS summary does not exist."

        self.renderedData += "\n\n## Content\n\n"

        try:
            for el in self.getValueListOfDictList(self.rawData.content, "value"):
                self.renderedData += "\n" + str(el)
        except AttributeError:
            self.renderedData += "RSS content does not exist."

One curious thing is that while Ghost and WordPress-based blogs support both RSS summary and content, Jekyll-based GitHub Pages or Tistory put all the article's contents in the RSS summary. (...) Ghost provides a function to set the Excerpt of the text, and this Excerpt value is used as RSS Summary.
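
Because of this difference, a consumer of the backup may want "the body, wherever it lives". A small sketch of that fallback follows; `pick_body` is a hypothetical helper, and it assumes content is a list of dictionaries with a "value" key, as in the code above.

```python
from types import SimpleNamespace


def pick_body(entry):
    """Prefer RSS content; fall back to the summary if content is missing."""
    content = getattr(entry, "content", None)
    if content:
        return "\n".join(str(part["value"]) for part in content if "value" in part)
    # Feeds that put the whole article in the summary end up here
    return getattr(entry, "summary", "")


ghost_like = SimpleNamespace(summary="Excerpt", content=[{"value": "<p>Full text</p>"}])
jekyll_like = SimpleNamespace(summary="<p>Whole article</p>")
```

With these stand-in entries, the Ghost-like one yields its content and the Jekyll-like one yields its summary.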

5. Adding Images to Generated Markdown Files

For a backup, the images must be preserved intact as well. Except for images embedded in the HTML as base64, every image is only referenced through the src of its img tag. If the server dies, the photos can no longer be loaded from img src, so I must download all images at the time of backup.

Reference: [How to Download All Images from a Web Page in Python](https://www.thepythoncode.com/article/download-web-page-images-python).

        # bs is BeautifulSoup (from bs4 import BeautifulSoup as bs)
        soup = bs(self.renderedData, features="html.parser")
        for img in soup.find_all("img"):

            # Look for the image address in src first, then in data-src
            for imgsrc in ["src", "data-src"]:
                try:
                    remoteFile = img[imgsrc]
                    break
                except KeyError:
                    continue
            else:
                # Neither attribute exists, so there is nothing to download
                continue

            if not self.isDomain(remoteFile):
                print("remoteFile", remoteFile, "is not a domain.")
                remoteFile = self.blogDomain + "/" + remoteFile
                print("Fixing it to", remoteFile)
            print(
                'Trying to download "'
                + remoteFile
                + '" and save it at "'
                + self.directory
                + '/images"'
            )
            self.download(remoteFile, self.directory + "/images")
            img["src"] = "images/" + remoteFile.split("/")[-1]
            img["srcset"] = ""
            print(img["src"])
        self.renderedData = str(soup)
        return self.renderedData

What this code does is:

  1. Parse the string renderedData as HTML and find all img tags.
  2. Check if there is an src or data-src attribute. data-src is there to handle WordPress.
  3. Create an images folder in each post folder and save the images in it. The name of each image is the last path segment of img src. For example, if img src is https://blog.someone.com/images/example.png, it will be saved as images/example.png.
  4. Change the existing img src to the relative path inside the images folder.
  5. Clear the srcset attribute if there is one (to handle Gatsby).
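
Steps 1 and 2 can be tried out without installing anything. The sketch below swaps BeautifulSoup for html.parser from the standard library, so it only illustrates the lookup order (src first, then data-src); `ImageCollector` and `collect_image_sources` are hypothetical names.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the address of every <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Prefer src; fall back to data-src
        source = attributes.get("src") or attributes.get("data-src")
        if source:
            self.sources.append(source)


def collect_image_sources(html_text):
    collector = ImageCollector()
    collector.feed(html_text)
    return collector.sources
```

Tags that carry neither attribute are skipped, which is the case that needs care in any implementation of this step.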

    def download(self, url, pathname):
        # Create the target folder if it does not exist yet
        if not os.path.isdir(pathname):
            os.makedirs(pathname)
        response = requests.get(url, stream=True)
        file_size = int(response.headers.get("Content-Length", 0))
        # Use the last URL segment as the file name and drop any query string
        filename = os.path.join(pathname, url.split("/")[-1])
        if filename.find("?") > 0:
            filename = filename.split("?")[0]
        progress = tqdm(
            response.iter_content(256),
            f"Downloading {filename}",
            total=file_size,
            unit="B",
            unit_scale=True,
            unit_divisor=1024,
        )
        with open(filename, "wb") as f:
            # Read from progress.iterable so the bar advances only through update()
            for data in progress.iterable:
                f.write(data)
                progress.update(len(data))

One problem is that the addresses of images are not consistent. Some sites write the full address including the domain, as in <img src = "https://example.png/images/example.png">, while others write it from the subdirectory, as in <img src = "/images/example.png">. In some places, it was just <img src = "example.png">. To handle as many cases as possible, I created a function isDomain() that detects domains. Other libraries recognized file extensions such as .png as top-level domains such as .com, so I added some exception handling.

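    def isDomain(self, string):
        # Absolute URLs already carry their domain
        if string.startswith("https://") or string.startswith("http://"):
            return True
        # Paths such as /images/example.png have no domain
        elif string.startswith("/"):
            return False
        # Otherwise, validate the first segment (requires `import validators`)
        else:
            return validators.domain(string.split("/")[0])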

If the address does not include a domain and therefore cannot be accessed directly, such as <img src = "/images/example.png">, the domain name is prepended. This is where the previously specified self.blogDomain is used.
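
As an alternative to detecting domains by hand, the standard library can resolve all three address styles against the blog address. This is a different technique from isDomain(), shown only as a sketch; `resolve_image_url` is a hypothetical helper and the example.com addresses are placeholders.

```python
from urllib.parse import urljoin


def resolve_image_url(blog_domain, src):
    """Resolve an img src against the blog address."""
    # A trailing slash makes the base behave like a directory
    base = blog_domain.rstrip("/") + "/"
    # urljoin leaves absolute URLs alone and completes relative ones
    return urljoin(base, src)
```

One difference to keep in mind: urljoin treats a bare `example.png` as a file relative to the base, so it never has to guess whether `.png` is a top-level domain.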

Result

I backed up this blog. This blog is a self-hosted Ghost blog. You only need to run main.py, and the backup proceeds.

These are the backed-up texts. The folder name is set as the title of the article.

This is the appearance of the article backed up on GitHub. Photos are also saved and displayed directly in a folder instead of on a blog server.

Photos used in the article are stored in the folder.

I tested the tool and confirmed that the following services are supported. The style or arrangement of the writing may be slightly different, but the purpose of backup is sufficiently achieved.

  • Ghost
  • WordPress
  • Jekyll-based GitHub Pages
  • Gatsby-based GitHub Pages
  • Medium
  • Tistory

Evaluation of achievement of goals

Main Goal

  • You must back up all text and photos. ★★★

The goal has been fully achieved. Videos are not backed up, but since videos are embedded via YouTube anyway, the probability of information loss is much lower. Because of this, they were excluded from the goal from the beginning.

Bonus Goal

  • Must be in a human-readable format. (Human-Readable Medium) ★★☆

Compared to Ghost's built-in backup, you can see important information at a glance in Front Matter, and the text is rendered in almost the same form as on the blog. Articles and photos are organized by folder, making it easy to find the data you want. However, even though the file is Markdown, it is inconvenient to edit the text because the body is HTML. It is a backup that serves just one purpose: "Lots of Copies Keep Stuff Safe".

  • It should be clear which picture goes in which position in which text. (In preparation for restoring the blog) ★★★

You can see which picture goes where in which text.

  • Backups should be convenient. ★★☆

You have to run main.py manually. I'm thinking of automating it with crontab someday.

Also, due to the nature of using RSS, only posts included in the RSS feed are backed up. RSS feeds often contain only the most recent posts to reduce bandwidth usage, but each blog has the option to adjust this. For example, a Ghost blog, by default, includes the 15 most recent posts in its RSS feed. I cannot change the number of posts in the RSS feed from within Ghost CMS, and [Ghost Core's code](https://github.com/TryGhost/Ghost/blob/master/core/server/models/plugins/pagination.js#L20) cannot be modified.

  • It should be possible to create clones outside the blog. ★★☆

WordPress may temporarily block access if you repeatedly download many photos from a WordPress blog.

Future Plans

Thinking about it after finishing, this would be a good tool for those who plan to move their blog elsewhere but are worried about the amount of data they have accumulated. I plan to improve it further so that it becomes a helpful tool to run before migrating a blog.

📜Heads up!
  • I wrote this post more than 3 years ago.
  • That's enough time for things to change.
  • Possibly, I may not endorse the content anymore.

Example of Video Ghosting

In this article, we will learn about the principle of video compression and discuss why the above phenomenon occurs.

Videos are too big

A video is a collection of photos. However, the file size becomes surprisingly large if we store a video as a series of full images. For example, if the 1920 x 1080 60FPS video we often watch on YouTube is not compressed, its size approaches 7GB per minute. However, if you watch a video with the same specifications on YouTube, at most about 40MB per minute is used. This compression reduces the size by a factor of almost 200. Still, we don't notice much of a difference. What happened?
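
The arithmetic behind these figures can be written out as a short sketch. The result depends on how many bytes each pixel takes, which is an assumption here: one byte per pixel reproduces the roughly 7GB figure, while uncompressed 24-bit RGB (three bytes per pixel) would be three times larger. `raw_video_bytes_per_minute` is a hypothetical helper name.

```python
def raw_video_bytes_per_minute(width, height, fps, bytes_per_pixel):
    """Size of one minute of video stored as full, uncompressed frames."""
    bytes_per_frame = width * height * bytes_per_pixel
    return bytes_per_frame * fps * 60


# Assumption: 1 byte per pixel (for 24-bit RGB, use 3 instead)
raw_size = raw_video_bytes_per_minute(1920, 1080, 60, bytes_per_pixel=1)
streamed_size = 40 * 1000 * 1000  # about 40MB per minute, as quoted above
compression_ratio = raw_size / streamed_size
```

Under the one-byte assumption, the ratio comes out a little under 200, in line with the estimate in the text.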

So we encode

Due to the large size of raw video, most videos use some level of compression. We call this video encoding, and the world of encoding algorithms is amazingly sophisticated and beautiful.

Video encoding finds the key to saving space in redundancy. For example, imagine a singer standing still and singing. Only the singer's mouth moves, and the background and the singer's body do not move at all. If so, is it necessary to send information about the black pixels in the background and the singer's body every time? No. Because those parts are redundant.

Video data is redundant in space and time. A method of removing spatial redundancy is called intra-frame coding (intra-frame compression), and a way of eliminating temporal redundancy is called inter-frame coding (inter-frame compression). As detailed implementation methods, there are the Discrete Cosine Transform, used to reduce adjacent pixel data, prediction using motion vectors, and in-loop filtering techniques.

Intra-frame coding

Reduce the size of the photo itself!

A video is a collection of photos. A picture is a set of pixels. We can reduce spatial redundancy if we reduce the information on overlapping pixels in the same image. One of the most straightforward implementations is to use averages. Suppose the data of one pixel is left empty while the information of the surrounding pixels is kept. In that case, the computer takes the values of the adjacent pixels when playing the video and displays their average.

What's interesting here is that the adjacent pixels are not the ones above, below, left, and right. Pixel data in a video is stored in order from left to right and top to bottom. If the top, bottom, left, and right pixels were averaged, the decoder would have to wait until the right and bottom pixel data were read and then come back to fill in the pixel. Since that is inefficient when a video must be displayed quickly, intra-frame coding temporarily stores the upper-left, upper, upper-right, and left data. When it encounters blank data, it calculates the average from the temporarily stored values.
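
The averaging idea can be sketched in a few lines. This is a deliberately simplified illustration, not how a real codec works: actual encoders choose among several prediction modes and also store a correction. In the sketch, None marks a pixel whose data was left empty, and `predict_pixel` and `fill_missing` are hypothetical names.

```python
def predict_pixel(image, row, col):
    """Average the neighbors that were already read in raster order."""
    candidates = [
        (row, col - 1),      # left
        (row - 1, col - 1),  # upper left
        (row - 1, col),      # upper
        (row - 1, col + 1),  # upper right
    ]
    values = [
        image[r][c]
        for r, c in candidates
        if 0 <= r < len(image) and 0 <= c < len(image[r]) and image[r][c] is not None
    ]
    return sum(values) // len(values) if values else 0


def fill_missing(image):
    """Fill every empty pixel, scanning left to right and top to bottom."""
    filled = [list(row) for row in image]
    for r, row in enumerate(filled):
        for c, value in enumerate(row):
            if value is None:
                filled[r][c] = predict_pixel(filled, r, c)
    return filled
```

Because only the left and upper neighbors are used, each empty pixel can be filled the moment the scan reaches it, without waiting for data further right or further down.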

Inter-frame coding

Don't resend information you've sent in the past; let's recycle it!

Remember the award ceremonies held at school before a vacation? Let's imagine that the same award is given to 30 people. How long would the ceremony be if the principal read out every certificate in full? How boring and painful would it be? But the principal doesn't do that. The principal just says "the contents are the same as above" and moves on. We can have a lovely vacation afternoon just because the principal expressed that each award is the same as the previous one. The principal did inter-frame compression.

The same goes for videos. Since many videos have similar consecutive frames, they can express only the relationship between the frames before and after, or omit the information altogether. This reduces temporal redundancy.

Who's better? Principal announcing for 2 hours or 2 minutes?

#1. I-Frame being the standard

An I-Frame (Intra-coded picture) is a complete picture. All information in the I-Frame is new information. The I-Frame becomes the reference for expressing the frames before and after it.

#2. P-Frame expressing only the amount of change

A P-Frame (Predicted picture) is inserted between I-Frames. The P-Frame expresses the amount of change from the previous frame. If the current frame has something in common with the previous frame, information from the previous frame is retrieved and reused. It is easier to understand by looking at the picture.

Copyright: Blender Foundation 2006. Netherlands Media Art Institute. www.elephantsdream.org

What is represented by an arrow is a motion vector representing the amount of change. In addition to this, the P-Frame includes transform values for prediction correction. In some cases, new image information is also included in the P-Frame. A P-Frame uses only about half the size of an I-Frame. Of course, in actual video encoding, instead of comparing all pixel information, the picture is divided into several blocks, and the blocks are compared. Such a block is called a macroblock, and in HEVC, the latest video codec, it is called a coding tree unit.
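
The relationship between an I-Frame and the P-Frames that follow it can be illustrated with a toy model. In this sketch, a frame is just a short list of brightness values and a P-Frame stores plain per-pixel differences; real codecs use block-based motion vectors instead, so this only shows the principle. `encode` and `decode` are hypothetical names.

```python
def encode(frames):
    """Store the first frame whole (I) and later frames as differences (P)."""
    encoded = [("I", list(frames[0]))]
    for previous, current in zip(frames, frames[1:]):
        delta = [c - p for p, c in zip(previous, current)]
        encoded.append(("P", delta))
    return encoded


def decode(encoded):
    """Rebuild every frame by applying each P-Frame to the frame before it."""
    frames = []
    for kind, data in encoded:
        if kind == "I":
            frames.append(list(data))
        else:
            frames.append([p + d for p, d in zip(frames[-1], data)])
    return frames


frames = [[10, 10, 10, 10], [10, 10, 12, 10], [10, 10, 14, 10]]
encoded = encode(frames)
```

Most entries in each P-Frame are zero because most of the picture did not change, and long runs of zeros are exactly what compresses well.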

#3. B-Frame saving data

A B-Frame (Bidirectionally Predicted Picture) is inserted between I-Frames and P-Frames. The B-Frame is computed from the I-Frames or P-Frames before and after it. It works much like a P-Frame, but B-Frames are used because they take less space. Since a B-Frame draws on data from both the preceding and following frames, it can omit that much more information. As a result, a B-Frame takes only about 25% of the size of a P-Frame.

Like the P-Frame, the B-Frame also uses motion vectors and transform values for prediction correction. A B-Frame refers to I-Frames and P-Frames, but in recent video codecs such as HEVC and VVC, a B-Frame can also refer to other B-Frames.
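A quick back-of-the-envelope calculation shows what these ratios buy. The sketch below takes the rough figures from this article (a P-Frame is about half an I-Frame, a B-Frame about 25% of a P-Frame) and applies them to a frame pattern I picked purely for illustration, `IBBPBBPBBPBB`; actual sizes vary a lot with the content and the encoder.

```python
# Sizes relative to one I-Frame, using the rough ratios from the text.
RELATIVE_SIZE = {"I": 1.0, "P": 0.5, "B": 0.5 * 0.25}


def pattern_size(pattern):
    """Total size of a frame pattern, measured in I-Frames."""
    return sum(RELATIVE_SIZE[frame_type] for frame_type in pattern)


pattern = "IBBPBBPBBPBB"  # hypothetical 12-frame group
compressed = pattern_size(pattern)
all_intra = pattern_size("I" * len(pattern))

print(compressed)              # 3.5
print(compressed / all_intra)  # roughly 0.29
```

Under these assumptions, twelve frames cost about as much as three and a half photographs.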

Copyright: Cmglee, CC BY-SA 4.0.


Reason for ghosting

The problem occurs when the communication packet containing the I-Frame is lost. As a result, the reference values for calculating the surrounding P-Frames and B-Frames disappear. Of course, a good video streaming program uses various algorithms to detect packet loss and request the packets again. Still, I-Frame loss can go unnoticed if the server is unstable or the streaming program is poor.

If the I-Frame is lost and only the next P-Frame and B-Frame arrive, the change value is applied to the wrong I-Frame.
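Here is a minimal simulation of that failure, assuming a toy stream where an I-Frame carries full pixel values, a P-Frame carries per-pixel change amounts, and a lost packet shows up as `None`. The decoder and its names are hypothetical and far simpler than a real one.

```python
def apply_change(reference, change):
    """Add the per-pixel change amounts to the reference frame."""
    return [pixel + delta for pixel, delta in zip(reference, change)]


def decode_stream(packets):
    """Decode ("I", pixels) and ("P", change) packets; None marks a lost packet.

    Assumes the stream starts with an I-Frame that actually arrives.
    """
    current = None
    shown = []
    for packet in packets:
        if packet is None:
            continue  # Nothing arrived, so the old reference stays in place.
        kind, data = packet
        if kind == "I":
            current = list(data)
        else:
            current = apply_change(current, data)
        shown.append(current)
    return shown


stream = [
    ("I", [10, 10, 10, 10]),      # scene A
    ("P", [0, 5, 0, 0]),
    ("I", [200, 200, 200, 200]),  # scene B starts here
    ("P", [0, 0, 20, 0]),
]
lossy = list(stream)
lossy[2] = None  # the second I-Frame never arrives

print(decode_stream(stream)[-1])  # [200, 200, 220, 200]
print(decode_stream(lossy)[-1])   # [10, 15, 30, 10]
```

In the lossy run, the movement from scene B (the change at the third pixel) still shows up, but it is painted on top of scene A. That is the ghost.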

A picture is worth a thousand words

If you still don't understand, check it out with your own eyes. Using a widely used video tool such as FFmpeg, the frame information of a video file can be intentionally corrupted. This type of art is called datamoshing.

Using Python and FFmpeg, I damaged the I-Frames in a music video to cause ghosting artificially.

  • All I-Frames in the video were overwritten with the values of the previous frame (probably a P-Frame or B-Frame). Therefore, the I-Frames bring no new information. So the scene does not change, yet the characters' movements still appear. This is because the amount of change (P-Frame and B-Frame) was applied to the wrong reference point (I-Frame).
  • At times, part of the screen looks clean for a moment. This is because a P-Frame may also carry new image information. However, since the I-Frames, which consist entirely of new information, have been deleted, the entire screen never becomes clean even if part of it looks clean temporarily.
  • You will notice that the broken video does not scatter like fine sand but breaks into large, easily visible squares. This is because video compression is calculated not per pixel but in units of macroblocks (coding tree units) that bundle several pixels. When this phenomenon occurs during a broadcast, it is commonly described as a "pixelated" video.

Keeping in mind that the I-Frames are gone, you will understand the relationship between I-Frame, P-Frame, and B-Frame more clearly after watching the video. When the opportunity arises, I will discuss how to damage a video using FFmpeg.


  • If there is an error in the article, please report it to [email protected].
  • "However, to prevent the error from getting bigger, B-Frames do not refer to other B-Frames. They only refer to I-Frames or P-Frames" is incorrect. In video codecs such as HEVC and VVC, B-Frames can reference other B-Frames. Thank you so much for reporting. Credit: (anonymous)

Hero Image. Building a payment system for school festivals

MinsaPay is a payment system that was built for the Minjok Summer Festival. It works like a prepaid tap-to-pay card. Every source and piece of anonymized transaction data is available on GitHub.

Stats

But why does a school festival need a payment system?

My high school, Korean Minjok Leadership Academy (KMLA), had a summer festival like any other school. Students opened booths to sell food and items they created. We also screened movies produced by our students and hosted dance clubs. The water party in the afternoon is one of the festival's oldest traditions.

Because there were a lot of products being sold, it was hard to use regular paper money (a subsequent analysis by the MinsaPay team confirmed that the total volume of payments reached more than $4,000). So our student council created proprietary money called the Minjok Festival Notes. The student council had a dedicated student department act as a bank to publish the notes and monitor the currency's flow. Also, the Minjok Festival Notes acted as festival memorabilia since each year's design was unique.

The Minjok Festival Note design for 2018 had photos of the KMLA student council members at the center of the bill. The yellow one was worth approximately $5.00, the green one was worth $1.00, and the red one was worth 50 cents.


But there were problems. First, it was not eco-friendly. Thousands of notes were printed and disposed of annually for just a single day. It was a waste of resources. The water party mentioned above was problematic as well. The student council made Minjok Festival Notes out of nothing special, just ordinary paper. That made the notes extremely vulnerable to water, and students lost a lot of money after the water party. Eventually, the KMLA students sought a way to resolve all of these issues.

Idea

The student council first offered me the chance to develop a payment system. Because I had thought about the case beforehand, I thought it made a lot of sense. I instantly detailed the feasibility and possibilities of the payment system. But even after designing the system in such great detail that I could immediately jump into the development, I turned down the offer.

I believe in the social responsibilities of the developer. Developers should not be copy-pasters who meet the technical requirements and deliver the product. On the contrary, they are the people with enormous potential to open an entirely new horizon of the world by conversing with computers and other technological media. Therefore, developers have started to possess the decisive power to impact the daily lives of the rest of us, and it is their bound responsibility to use that power to enhance the world. That means developers should understand how impactful a single line of code can be.

Of course, I was tempted. But I had never done a project where security was the primary interest. It was a considerable risk to start with a project like this without any experience or knowledge in security. Many what-ifs flooded my brain. What if a single line of code makes the balance disappear? What if the payment record gets mixed up? What if the server is hacked? More realistically, what if the server goes down?

People praise audacity, but I prefer prudence. Bravery and arrogance are just one step apart. A financial system should be flawless (or as flawless as possible). It should be both functional and resilient under any condition. It didn't seem impossible. But it was too naïve to believe nothing would happen, as I was (and am still) a total newbie in security. So I turned it down.

Wait, payment system using Google Forms?

The student council still wanted to continue the project. I thought they would outsource the task to some outside organization. It sounded better since they would at least have some degree of security. But the council thought differently. They were making it themselves with Google Forms.

When I was designing the system, the primary issue was payment authorization. The passcode shouldn't be shared with the merchant, yet the system should still authorize and process the order correctly. Users should be able to spend only the money deposited in their accounts. This authorization should happen in real time. But I couldn't think of a way to nail real-time authorization with Google Forms. So I asked one student council member for more technical details. The idea was as follows:

Abstract of a Google-Form-Powered Payment System
  • Create one Google Form per user. (We have about 400 users in total.)
  • Create QR codes with links to the Google Form. (So it's 400 QR codes in total.)
  • Create a wristband with the QR code, and distribute them to the users.
  • Show that wristband when purchasing something.
  • The merchant scans the QR code and opens the link in incognito mode.
  • Input the price and the name of the booth.
  • Confirm with the user (customer) and submit the response.
  • Close the incognito tab.

So the idea was to use the Google Form's unique address as a password. Since the merchants are supposed to use incognito mode, there should be a safety layer to protect the user's Google Form address (in theory). Users would then make a deferred payment after the festival. But from a developer's perspective, this approach had multiple problems:

Potential Problems I found
  • How are we going to manage all 400 Google Forms?
  • Intended or not, people will lose their wristbands. In that case, we will need to note the owner of the wristband in every Google form to calculate the spending. Can we deliver those QR codes to the correct owner if we do?
  • If the merchant doesn't use incognito mode, it will be hard for an ordinary person to tell the difference. If that happens, it is possible to attack the exposed Google form by submitting fake orders. We could also add a "password," but in that case, we cannot stop the customer from providing an incorrect password and claiming that they were hacked by someone else.
  • If the merchant has to select the booth and input the price manually, there will be occasions where they make a typo. Operators could fix a typo in the price value relatively quickly, but a typo or misselection in the booth value would be a pain since we would have to find out who made a mistake and the original order. Imagine there were 20 wrong booth values. How are we going to trace the real booth value? We could guess, but would that sort of record have its value as reliable data?
  • How are we going to make the deferred payment? How will we extract and merge all 400 of the Google Forms response sheets? Even worse, the day after the festival is a vacation. People care about losing money but not so much about paying their debts. There could be students who just won't come back. It would be excruciating to notify all those who didn't deliver. But if the money is prepaid, the solution is comparably easy. The council members could deposit the remaining balance to their phone number or bank account. We don't need to message dozens of students; we could do the work ourselves.
  • The student council will make the Google Form with the student council's Google account. That Google account will have restricted access, but a few students will be working together to create all 400 Google forms. Can we track who makes the rogue action if someone manipulates the Google form for their benefit?
  • Can this all be free from human error?

It could work in an ideal situation. But it would bring a great deal of confusion and noticeable inconvenience on the festival day. That made me think that even though my idea had its risks, mine would still be better. So, I changed my mind.

Development

Fortunately, I met a friend with the same intent; our vision and ideas for the project aligned. I explained my earlier concept, and after several discussions, including a few meetings at a cafe, we co-developed the actual product. I set up and managed the DNS and built the front end. Below are the things we considered while making the product.

Details that my team considered
  • We won't be able to use any payment gateway or third-party payment service since we are not officially registered, and we will use it for a single day. Some students don't own smartphones, so we won't be able to use Toss or KakaoPay (Both are well-known P2P payment services in South Korea, just like Venmo). Therefore, there cannot be any devices on the client-side. We would need to install computers on the merchant's side.
  • It is impossible to build a completely automated system. Especially in dealing with cash, we would need some help from the student council and the Department of Finances and Information. Trusted members from the committee will manually count and deposit the money.
  • There must be no errors in at least the merchant and customer fields since they would be the most difficult errors to fix later. But, of course, we cannot expect that people will make no mistakes. So, instead, we need to engineer an environment where no one can make a mistake even if they want to.
  • The booths may be congested. If each customer needs to input their username and password every time, that will pose a severe inconvenience. For user experience, some sort of one-touch payment would be ideal.
  • For this, we could use the Campus ID card. Each card has a student number (of course) and a unique value for identifying students at the school front door. We could use the number as the username and the unique value as the password. Since this password is guaranteed to be different for each student, we would only need the password for identification purposes.
  • The final payment system would be a prepaid tap-to-pay card.
  • Developers would connect each account with its owner's student ID.
  • Students could withdraw the remaining money after the festival.
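The considerations above can be sketched as a tiny prepaid ledger. This is a hypothetical illustration written for this article, not MinsaPay's actual code: the class names, the card values, and the in-memory dictionary are all assumptions. Note how the booth name is fixed once per terminal, so a merchant cannot mistype it during a sale.

```python
class InsufficientBalance(Exception):
    """Raised when a card does not hold enough prepaid money."""


class Ledger:
    def __init__(self):
        self.balances = {}  # card value -> prepaid balance
        self.log = []       # append-only record of every transaction

    def charge(self, card, amount):
        """Deposit prepaid money onto a card."""
        if amount <= 0:
            raise ValueError("charge amount must be positive")
        self.balances[card] = self.balances.get(card, 0) + amount
        self.log.append(("charge", card, None, amount))

    def pay(self, card, booth, amount):
        """Deduct a payment and return the remaining balance."""
        if amount <= 0:
            raise ValueError("price must be positive")
        if card not in self.balances:
            raise KeyError(f"unknown card: {card}")
        if self.balances[card] < amount:
            raise InsufficientBalance(card)
        self.balances[card] -= amount
        self.log.append(("pay", card, booth, amount))
        return self.balances[card]


class Terminal:
    """A booth's computer. The booth name is set once, at setup time."""

    def __init__(self, ledger, booth):
        self.ledger = ledger
        self.booth = booth

    def tap(self, card, price):
        """One tap of the ID card pays the given price at this booth."""
        return self.ledger.pay(card, self.booth, price)


ledger = Ledger()
ledger.charge("card-001", 10000)
terminal = Terminal(ledger, "booth-7")
print(terminal.tap("card-001", 3000))  # 7000
```

Because the money is prepaid, an overdraft is rejected on the spot rather than chased down after the festival.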

We disagreed on two problems.

  1. One was the platform. While my partner insisted on using Windows executable programs, I wanted the system to be multi-platform and asked to use web apps. (As you might expect, I use a Mac.)
  2. The other was the method of reading data from the Campus ID card. The card has an RFID chip and a bar code storing the same value. If we chose RFID values, we would have to purchase ten RFID readers, spending an additional $100. Initially, I insisted on using the embedded laptop webcam to scan the barcode because MinsaPay was a pilot experiment at that time. I thought that such an expense would make the entire system questionable in terms of cost-effectiveness. (I said "Wait, we need to spend an additional $100 even though we have no idea if the system will work?")

We chose web and RFID, each of us conceding one point. I agreed to use RFID after learning that using a camera to read bar codes wasn't that fast or efficient.

Main Home, Admin Page, and Balance Check Page of the product.


And it happened

Remember that one of the concerns was about the server going down?
On the festival day, senior students had to self-study at school. Then at one moment, I found my phone had several missed calls. The server went down. I rushed to the festival and sat in a corner, gasping and trying to find the reason. Finally, I realized the server was intact, but the database was not responding.
It was an absurd problem. (Well, no problem is absurd, per se, but we couldn't hide our disappointment after figuring out the reason.) When we set up our database, we thought the free plan would be more than enough. However, the payment requests surged and exceeded the database free tier. So we purchased a $9.99 plan, and the database went back to work. It was one of the most nerve-wracking events I have ever had.

The moment of upgrading the database plan. $10 can cause such chaos!


While the server was down, each booth made a spreadsheet and wrote down who needed to pay how much. Afterward, we settled the problem by opening a new booth for making deferred payments.

The payment log showed that the server went down right after 10:17:55 AM and returned at 10:31:10 AM. It was evident yet intriguing that the payments made per minute were around 10 to 30 before the crash but went down to almost zero right after restoring the server. If you are interested, please look here.
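For the curious, the length of the outage follows directly from those two timestamps. A minimal sketch using only the times quoted above (the function name is made up for this example):

```python
from datetime import datetime


def downtime(went_down, came_back, fmt="%H:%M:%S"):
    """Time elapsed between two same-day clock times."""
    return datetime.strptime(came_back, fmt) - datetime.strptime(went_down, fmt)


gap = downtime("10:17:55", "10:31:10")
minutes, seconds = divmod(int(gap.total_seconds()), 60)
print(f"{minutes} min {seconds} s")  # 13 min 15 s
```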

Due to exceeding the database free tier, the server went down for 13 minutes and 15 seconds after payment #1546.


Results

1. MinsaPay

The entire codebase for MinsaPay is available on GitHub. First, though, I must mention that I still question the integrity of this system. One developer reported a security flaw that we managed to fix before launch. However, the system has unaddressed flaws; for example, though unlikely, merchants can still copy RFID values and forge ID cards.

2. Payment Data

I wanted to give students studying data analysis more relatable and exciting data. Also, I wanted to provide financial insights for students planning to run a booth the following year. Therefore, we made all payment data accessible.

However, a data privacy problem arose. So I wrote a short script to anonymize personal data. If a CSV file is provided, it will anonymize a selected column. Identical values will have the same anonymized value. You can review the anonymized data here.
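The idea behind such a script can be sketched as follows. This is not the script I used, just a minimal illustration with made-up names and sample data: every distinct value in the chosen column gets a pseudonym, and identical values always receive the same one.

```python
import csv
import io


def anonymize_rows(rows, column, prefix="user"):
    """Replace each distinct value in `column` with a stable pseudonym."""
    pseudonyms = {}
    result = []
    for row in rows:
        original = row[column]
        if original not in pseudonyms:
            pseudonyms[original] = f"{prefix}-{len(pseudonyms) + 1:04d}"
        result.append({**row, column: pseudonyms[original]})
    return result


def anonymize_csv(text, column):
    """Anonymize one column of CSV text and return the new CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = anonymize_rows(reader, column)
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return output.getvalue()


sample = "student,booth,amount\nkim,A,1000\nlee,B,500\nkim,B,700\n"
print(anonymize_csv(sample, "student"))
```

One caveat of this sketch: sequential pseudonyms reveal the order in which people first appear, so shuffling the rows or using random tokens would be safer for real data.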

Note for Developers

I strongly recommend thoroughly auditing the entire code or rewriting it if you use this system. MinsaPay is under the MIT license.

What I Learned

There is ample room for improvement.

First, the code contains numerous compromises. We made a lot of trade-offs so as not to miss the product deadline (the festival day). We also wanted to include safety features, such as canceling payments, but we didn't have time. More time and development experience would have improved the product.

Since I wasn't comfortable with the system's security, I initially kept the repository quiet and undisclosed. Afterward, however, I realized this was a contradiction, as I knew that security without transparency is not the best practice.

Also, we were not free from human error. For example, RFID values were long strings of digits, and a few times someone entered one into the charge amount field by mistake, making the charge amount something like Integer.MAX_VALUE. We could have added a simple confirmation prompt, but we didn't anticipate such mistakes at the time.
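In hindsight, even a small sanity check would have caught these. Below is a minimal sketch of one; the cap of 100,000 is a hypothetical limit chosen for illustration, and `parse_charge_amount` is not a function from MinsaPay.

```python
MAX_CHARGE = 100_000  # hypothetical upper limit for a single charge


def parse_charge_amount(raw, max_charge=MAX_CHARGE):
    """Turn typed input into a charge amount, rejecting implausible values."""
    text = raw.strip()
    if not (text.isascii() and text.isdigit()):
        raise ValueError("amount must be a whole number")
    amount = int(text)
    if amount == 0:
        raise ValueError("amount must be greater than zero")
    if amount > max_charge:
        # A long digit string here is more likely a scanned card than money.
        raise ValueError(f"amount exceeds the limit of {max_charge}")
    return amount


print(parse_charge_amount("5000"))  # 5000
```

A rejected value can then trigger the confirmation prompt instead of silently becoming someone's balance.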

In hindsight, it was a great experience for me, as I had never done a large-scale, real-life project. I found myself compromising even after acknowledging the anti-patterns. I also understood that knowing and doing are two completely different things: knowing has no barriers, but doing comes with extreme pressure from both time and circumstances.

Still, it was such an exciting project.

Lastly, I want to thank everyone who made MinsaPay possible.

  • Jueon An, a talented developer who created MinsaPay with me
  • The KMLA student council and Department of Finances and Information, who oversaw the entire MinsaPay Initiatives
  • The open-source developers who reported the security flaws
  • Users who experienced server failures during the festival day
  • And the 400 users of MinsaPay

Thank you!

👋
