We are building software that should be invisible. Halfway to ambient computing, but not there yet. This is a brief post about the hundred micro-decisions that get us closer.

We like tools that you forget you are using. A mechanic doesn't think about the wrench while turning it. A writer doesn't think about the pencil while writing. The tool should vanish into whatever you are trying to do. That also applies to Yakki, our macOS desktop dictation app. You hold the key, you speak, you release the key, and the text appears wherever your cursor was, just as if you had typed it yourself. The tool should never get in the way. That's the core of the entire experience. No windows. No app switching. No copy-paste dance. Your voice becomes text exactly where it needs to be.

The problem with simplicity is twofold. First, it's only really appreciated once you experience it. Second, the complexity required to make something simple is far greater than the complexity of leaving things as they are. I read someone say this week that simplicity doesn't necessarily sell; complexity is far more appealing. Getting there requires us to fight almost every default that the operating system, the frameworks, and we ourselves impose. macOS wants your app to have a Dock icon, a main window, a presence. The system wants to give you window focus. Every one of those defaults is a step away from disappearance. This post is a record of the small decisions we made to resist that gravity, that pull.

What if you want to remove your app from the Dock?

We remove Yakki from the Dock with a single line:

NSApp.setActivationPolicy(.accessory)

It removes Yakki from the Dock. No icon bouncing at the bottom of the screen. No entry in the Cmd+Tab switcher. As far as the OS is concerned, the app barely exists. There is still a toggle in the settings to make the Dock icon visible if you want it. But our default, the way we translate our product philosophy, is to remain invisible.

We do keep one permanent visual presence: a template icon in the menu bar. It sits alongside the Wi-Fi symbol, the clock, and the other elements in your menu bar, and it adapts to light mode, dark mode, and wallpaper tinting automatically. It looks like it belongs to the system, not to us. It holds everything: starting and stopping recording, microphone selection, configuration. But during normal use you never open it. You press a hotkey, speak, release. The menu is there for the two percent of the time you need to change something, or to do something with your recordings, media uploads, or transcriptions. Dictation itself stays completely friction-free.
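A menu bar presence like this can be sketched with an NSStatusItem and a template image. The asset name and menu entries below are illustrative assumptions, not Yakki's actual implementation:

```swift
import AppKit

final class StatusItemController {
    private let statusItem = NSStatusBar.system.statusItem(
        withLength: NSStatusItem.squareLength
    )

    init() {
        if let icon = NSImage(named: "MenuBarIcon") {
            // Template images are recolored by the system, so the icon
            // adapts to light mode, dark mode, and wallpaper tinting.
            icon.isTemplate = true
            statusItem.button?.image = icon
        }

        let menu = NSMenu()
        let start = NSMenuItem(title: "Start Dictation",
                               action: #selector(startDictation),
                               keyEquivalent: "")
        start.target = self
        menu.addItem(start)
        menu.addItem(.separator())
        let settings = NSMenuItem(title: "Settings…",
                                  action: #selector(openSettings),
                                  keyEquivalent: ",")
        settings.target = self
        menu.addItem(settings)
        statusItem.menu = menu
    }

    @objc private func startDictation() { /* … */ }
    @objc private func openSettings() { /* … */ }
}
```

The `isTemplate = true` flag is what makes the icon look native: the system renders it as a mask rather than a fixed-color bitmap.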

But boss, I do need a window sometimes

When we do need a window (for settings, onboarding, or license activation), we briefly reveal the main application window. The app surfaces in the Dock only while you are actively interacting with its settings or home views. Then it sinks back below the waterline. Every window type, from column configuration to onboarding, license, and models, follows this pattern.
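The reveal-then-sink pattern can be sketched by flipping the activation policy around window lifetime. This is a hedged sketch, not Yakki's actual code; the class and method names are ours:

```swift
import AppKit

final class WindowVisibilityCoordinator: NSObject, NSWindowDelegate {
    func reveal(_ window: NSWindow) {
        // Surface in the Dock and Cmd+Tab only while a window is open.
        NSApp.setActivationPolicy(.regular)
        NSApp.activate(ignoringOtherApps: true)
        window.delegate = self
        window.makeKeyAndOrderFront(nil)
    }

    func windowWillClose(_ notification: Notification) {
        // Sink back below the waterline once the last window closes.
        let visible = NSApp.windows.filter { $0.isVisible }
        if visible.count <= 1 {   // only the closing window remains
            NSApp.setActivationPolicy(.accessory)
        }
    }
}
```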

SwiftUI fights us even here. The @main app struct requires a scene, so we give it an empty WindowGroup:

WindowGroup {
    EmptyView()
}
.windowStyle(.hiddenTitleBar)
.defaultSize(width: 0, height: 0)

It takes discipline to maintain the invisibility against a framework that keeps trying to make you visible. And for good reason: most applications don't behave like this. They are where the action happens. They are not, as Yakki is, enablers of action in other applications.

What happens when you need to communicate a state change?

During dictation, there is one thing you still need: a single visual element on screen communicating changes in state. This indicator can manifest in several ways. Being completely honest, we still have a long way to go here, and there are a couple of ideas I'm keen to experiment with. For the moment you can choose between a tiny floating glass indicator that changes with the state, a tiny cat that shows up when you are typing or dictating, and a Dynamic Island concept we are still working on. (I have high hopes for that one. I believe it will be the final shape Yakki's visual indicator takes.)
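A floating indicator like this is typically a non-activating, click-through panel, so it never steals focus from the app you are dictating into. A minimal sketch, with illustrative styling choices that are our assumptions rather than Yakki's implementation:

```swift
import AppKit

final class IndicatorPanel: NSPanel {
    init(contentRect: NSRect) {
        super.init(contentRect: contentRect,
                   styleMask: [.borderless, .nonactivatingPanel],
                   backing: .buffered,
                   defer: false)
        isFloatingPanel = true
        level = .statusBar                // float above normal windows
        ignoresMouseEvents = true         // clicks pass through to the app below
        isOpaque = false
        backgroundColor = .clear
        hidesOnDeactivate = false
        collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary]
    }

    // The panel must never become key or main; the user's cursor
    // should stay exactly where they are dictating.
    override var canBecomeKey: Bool { false }
    override var canBecomeMain: Bool { false }
}
```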

There was also the question of where to place the indicator. Early versions put it in the top right corner. The logic seemed sound: out of the way. But it was wrong. Users constantly glanced up at it during dictation, breaking their focus on the text they were trying to compose. Moving it to the center, either bottom or top, solved the issue. That's peripheral vision territory: you register that the indicator is there and that state is changing without looking directly at it, the same way you notice a traffic light changing without staring at it. There were also interesting details to sort out, like which screen to place it on and a couple of cases in multi-monitor setups. Now the indicator follows the user, rather than forcing the user to find it.
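The placement math itself is simple, but multi-monitor setups make it worth keeping pure and testable, since secondary displays can have negative origins in AppKit's coordinate space (origin at the bottom-left). A sketch, with names of our own invention:

```swift
import Foundation

// Bottom-center placement on a given screen, in AppKit coordinates.
func indicatorFrame(onScreen screen: CGRect,
                    size: CGSize,
                    bottomMargin: CGFloat) -> CGRect {
    CGRect(
        x: screen.midX - size.width / 2,   // horizontally centered
        y: screen.minY + bottomMargin,     // just above the bottom edge
        width: size.width,
        height: size.height
    )
}

// A secondary display to the left of the main one has a negative origin:
let secondary = CGRect(x: -1920, y: 0, width: 1920, height: 1080)
let frame = indicatorFrame(onScreen: secondary,
                           size: CGSize(width: 120, height: 32),
                           bottomMargin: 24)
// frame stays centered on that screen: frame.midX == secondary.midX
```

Feeding this the frame of the screen under the cursor (or under the focused window) is what makes the indicator "follow the user" across monitors.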

Why UX decisions like that matter

Invisible design isn't just an aesthetic choice. It has a function, a purpose.

Dictation is cognitively demanding in its own way. You're composing text in real time, organising thoughts, choosing words. Visual distractions, like a bouncing Dock icon, a focus-stealing window, or a modal asking for permission, interrupt that flow. The app needs to stay below the threshold of conscious attention so that all of the user's cognitive resources go to what they're actually doing: thinking and speaking.

This is the paradox of invisible software. The more work you put into the tool, the less the user should notice it. Every decision we make—the template icon, the 200ms grace period, the clipboard restoration, the click-through window, the multi-screen following—exists so that our user can forget we exist.

The best compliment we've received from a user: "It just feels like typing, but faster."

They didn't mention the app at all. That's the goal.


Yakki is a macOS dictation app that turns your voice into text, wherever your cursor is. It runs on-device, respects your privacy, and tries very hard to stay out of your way. Learn more at yakki.ai.