Introducing GoBot, an Autonomous AI Agent That Actually Controls Your Desktop

I really love Go for how small and portable its binaries are.

As a nod to Clawd (now Molt), I’ve been working on GoBot. It’s an open-source (MIT) AI agent that’s always running and can actually interact with your desktop, not just chat.

What Makes It Different?

For starters, it’s a single binary and supports multiple sub-agents.

GoBot is a persistent agent that:

Remembers everything. Conversations, facts, and preferences survive restarts because they’re stored in SQLite.

Controls your desktop. Tonight I asked it to check my Gmail. It opened Brave, took screenshots to “see” my inbox, composed an email, and clicked Send. All by itself. Very cool to see.

Runs scheduled tasks. Cron jobs that can execute bash commands OR spawn sub-agents.

Works across channels. Right now the agent answers on the web UI. I’m working on connecting Discord, Telegram, Slack, and more, so the same agent will answer wherever you reach it.
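To make the scheduled-task idea concrete, here’s a rough sketch of how a job could be dispatched either to bash or to a sub-agent. This is not GoBot’s actual code: the Job, JobKind, SubAgentSpawner, and RunJob names are made up for illustration, and the cron scheduling itself (deciding when a job fires) is left out.

```go
package main

import (
	"context"
	"errors"
	"os/exec"
	"strings"
	"time"
)

// JobKind distinguishes the two kinds of scheduled work described above.
type JobKind int

const (
	ShellJob JobKind = iota // run a bash command
	AgentJob                // hand a prompt to a sub-agent
)

// Job is a hypothetical scheduled task; the field names are illustrative only.
type Job struct {
	Name    string
	Kind    JobKind
	Payload string        // shell command for ShellJob, prompt for AgentJob
	Timeout time.Duration // applies to shell jobs; zero means 30 seconds
}

// SubAgentSpawner is an assumed hook that starts a sub-agent with a prompt
// and returns whatever the sub-agent reports back.
type SubAgentSpawner func(prompt string) (string, error)

// RunJob dispatches a job to bash or to a sub-agent and returns its output.
func RunJob(j Job, spawn SubAgentSpawner) (string, error) {
	switch j.Kind {
	case ShellJob:
		timeout := j.Timeout
		if timeout <= 0 {
			timeout = 30 * time.Second
		}
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		out, err := exec.CommandContext(ctx, "bash", "-c", j.Payload).CombinedOutput()
		return strings.TrimSpace(string(out)), err
	case AgentJob:
		if spawn == nil {
			return "", errors.New("no sub-agent spawner configured")
		}
		return spawn(j.Payload)
	default:
		return "", errors.New("unknown job kind")
	}
}
```

A scheduler would call RunJob each time a job’s cron expression fires. Keeping the dispatch in one place means a shell job and a sub-agent job can share the same logging and error handling.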

What I Demoed Tonight

“Check my Gmail” → It activated Brave, navigated to Gmail, and took a screenshot to read the unread count.

“Send Ben an email saying I’ll be late” → Composed the email, typed it out, clicked Send.

I also built 7 macOS automation plugins in one session: clipboard, screen capture, window management, app control, accessibility API access, notifications, and mouse/keyboard control.

The Desktop Control Plugins

Plugin	What it does
clipboard	Read/write clipboard, history
screen	Screenshots + OCR via Vision framework
accessibility	Read UI trees, click buttons by label
desktop	Mouse clicks, keyboard input, scrolling
window	List/focus/move/resize windows
app	Launch/quit apps, menu bar interaction
notification	System alerts, TTS

These plugins give GoBot the ability to see and interact with your desktop the way you do. It’s not just parsing APIs or running CLI commands. It’s literally clicking buttons and reading what’s on screen.
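The “see, then act” behavior can be pictured as a simple loop: take a screenshot, ask the model what to do next, perform that action, repeat. Here’s a minimal sketch of that loop. It’s an assumption about the general shape, not GoBot’s real implementation: Action, Desktop, Planner, RunTask, and DryRunDesktop are all hypothetical names, and the dry-run desktop only records actions instead of touching a real screen.

```go
package main

import "errors"

// Action is a hypothetical step the model asks the agent to perform.
type Action struct {
	Kind   string // "click", "type", or "done"
	Target string // button label, used by "click"
	Text   string // text to enter, used by "type"
}

// Desktop is an assumed abstraction over the screen and input plugins.
type Desktop interface {
	Screenshot() ([]byte, error)
	Click(label string) error
	Type(text string) error
}

// Planner stands in for the AI provider: given the goal and the latest
// screenshot, it returns the next action to take.
type Planner func(goal string, screenshot []byte) (Action, error)

// RunTask loops observe -> decide -> act until the planner says "done"
// or maxSteps is reached. It returns the number of steps taken.
func RunTask(goal string, d Desktop, plan Planner, maxSteps int) (int, error) {
	for step := 1; step <= maxSteps; step++ {
		shot, err := d.Screenshot()
		if err != nil {
			return step, err
		}
		act, err := plan(goal, shot)
		if err != nil {
			return step, err
		}
		switch act.Kind {
		case "done":
			return step, nil
		case "click":
			err = d.Click(act.Target)
		case "type":
			err = d.Type(act.Text)
		default:
			err = errors.New("unknown action: " + act.Kind)
		}
		if err != nil {
			return step, err
		}
	}
	return maxSteps, errors.New("step limit reached before the task completed")
}

// DryRunDesktop records actions instead of performing them, which is
// handy for trying out a planner without touching the real desktop.
type DryRunDesktop struct{ Log []string }

func (d *DryRunDesktop) Screenshot() ([]byte, error) { return []byte("fake-screenshot"), nil }

func (d *DryRunDesktop) Click(label string) error {
	d.Log = append(d.Log, "click "+label)
	return nil
}

func (d *DryRunDesktop) Type(text string) error {
	d.Log = append(d.Log, "type "+text)
	return nil
}
```

The step limit matters in a loop like this: an agent that clicks real buttons should stop on its own if the planner never reports that the task is finished.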

Tech Stack

Go backend (go-zero framework)
SvelteKit 2 + Svelte 5 frontend
Multiple AI providers (Anthropic, OpenAI, Gemini, Ollama)
hashicorp/go-plugin for extensibility
SQLite for persistence

The plugin architecture means you can extend GoBot with new capabilities without touching the core. Write a plugin, drop it in, and the agent can use it.
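As an illustration of the “drop it in” idea, here’s a small in-process sketch of a tool registry. To be clear about the assumptions: this does not use the hashicorp/go-plugin API, which runs plugins as separate processes, and the Tool, Registry, Register, Names, and Call names are hypothetical. It only shows the core idea that the agent looks capabilities up by name rather than hard-coding them.

```go
package main

import (
	"errors"
	"sort"
)

// Tool is a hypothetical capability that a plugin exposes to the agent.
type Tool struct {
	Name        string
	Description string // shown to the model so it knows when to use the tool
	Run         func(args map[string]string) (string, error)
}

// Registry maps tool names to implementations so the core never has to
// know about individual plugins at compile time.
type Registry struct {
	tools map[string]Tool
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{tools: make(map[string]Tool)}
}

// Register adds a tool and rejects unnamed, empty, or duplicate entries.
func (r *Registry) Register(t Tool) error {
	if t.Name == "" || t.Run == nil {
		return errors.New("tool needs a name and a Run func")
	}
	if _, dup := r.tools[t.Name]; dup {
		return errors.New("duplicate tool: " + t.Name)
	}
	r.tools[t.Name] = t
	return nil
}

// Names lists the registered tools in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.tools))
	for n := range r.tools {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Call runs a tool by name with the given arguments.
func (r *Registry) Call(name string, args map[string]string) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", errors.New("unknown tool: " + name)
	}
	return t.Run(args)
}
```

With something like this, adding a new capability is a single Register call, and the list from Names can be handed to the model as the set of tools it’s allowed to use.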

Why This Matters

I believe fully autonomous agents are the future of AI.

Right now, most AI assistants are chat-only. You ask a question, get an answer, copy-paste something, repeat. The human is still the executor.

GoBot flips that. You tell it what you want done, and it does it. It navigates apps, fills forms, clicks buttons, reads screens. It’s the difference between a consultant who gives advice and an employee who gets things done.

This is where we’re headed. Not AI that tells you how to do things, but AI that does things for you.

Get Involved

GoBot is open source under MIT. I’m actively working on it and welcome anyone who wants to help.

The alpha release is available now: github.com/localrivet/gobot

It’s early. It’s rough around the edges. But it works, and I think the core idea of a persistent, desktop-controlling agent is something worth building in public.

X: x.com/almatuck

LinkedIn: linkedin.com/in/almatuck
