OpenClaw is an open-source framework for building personal AI agents on your own hardware. A gateway process gives Claude (or any model) persistent identity and connects it to messaging channels, browser automation, cron jobs, and whatever APIs you wire up. Apart from the model API itself, there's no cloud dependency. It runs on a Mac Mini.
I wanted to see how far I could take it. Not by giving an agent the keys to everything on day one, but by adding one capability at a time and making sure each one had real limits on what it could do. This is what that looked like after a few months--what I wired up, what broke, and what surprised me.
My setup
I wanted an assistant that could actually do things--adjust the thermostat, book a restaurant, triage email, order household supplies--from an iMessage thread. Each skill got added only after I figured out what it should never be allowed to do.
My instance runs on a Mac Mini in my cabin. It talks to me through iMessage, with skills spanning smart home, dining, email, shopping, and productivity. Most of the work was plumbing, not AI.
The architecture
A Node.js gateway process runs as a system service on macOS. The gateway wraps Claude and connects it to iMessage. Sessions reset periodically, which keeps context fresh without breaking continuity mid-conversation.
System service (macOS)
  |
Wrapper script
  |-- Loads secrets via separate service account
  |-- Applies patches for known bugs
  |-- Spawns Node.js gateway
        |-- Claude (current model)
        |-- Skills (smart home, dining, email, shopping, productivity)
        |-- iMessage plugin
        |-- Browser automation
        |-- Cron scheduler
macOS system services run in a stripped-down environment--keychain access, password managers, and GUI interactions all behave differently than they do in a normal terminal session. The wrapper script exists because of this. It handles secrets, patches third-party bugs at startup, and wires up everything the gateway needs to boot cleanly on restart.
What it actually does
The skills fall into a few buckets.
On the smart home side: thermostat control across multiple locations, smart lights, and AC. I also built a climate dashboard--a lightweight web app served over a private network, fed by periodic snapshots.
For dining, it handles restaurant search and booking. Recurring cron jobs automate date night: the agent finds availability, books, creates a calendar event, and pings the group chat. (My partner knows about and opted into all of this.)
Email triage runs with consent against a shared family inbox. Unreads get sorted into categories--urgent, actionable, financial, shopping, informational. Urgent items get flagged with draft replies. A second pass later in the day archives stale threads and flags unsubscribe candidates.
Shopping goes through browser automation. The agent finds and recommends products but won't place an order without explicit out-of-band approval. This was one of the first skills where I spent more time on the restrictions than on the feature itself.
Then there's the grab bag: calendar, reminders, web search, content summarization, and the EchoNest music queue.
The hard parts
Secrets management in headless contexts
Getting password manager access working from a system service was the biggest headache. The CLI hangs when it can't reach the desktop app through the expected IPC channel--no timeout, no error. I ended up with a separate secrets pipeline that pre-loads what the agent needs at boot, backed by a dedicated service account with minimal permissions. Don't assume desktop tooling works headless. It won't.
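A hang with no timeout and no error can be turned into a visible failure by wrapping the call in a hard time budget. The following is a minimal Python sketch, not the actual pipeline described here: the helper name `read_secret`, the `SecretUnavailable` exception, and the command you pass in are all hypothetical placeholders for whatever your password manager's CLI expects.

```python
from __future__ import annotations

import subprocess


class SecretUnavailable(RuntimeError):
    """Raised when a secret cannot be fetched within the time budget."""


def read_secret(command: list[str], timeout_s: float = 5.0) -> str:
    """Run a secrets CLI with a hard timeout so a hang becomes an error.

    `command` is a placeholder for whatever invocation your tool needs.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_s,          # child is killed when this expires
            stdin=subprocess.DEVNULL,   # never block on an interactive prompt
            check=True,                 # non-zero exit raises CalledProcessError
        )
    except subprocess.TimeoutExpired as exc:
        raise SecretUnavailable(f"timed out after {timeout_s}s: {command[0]}") from exc
    except (subprocess.CalledProcessError, FileNotFoundError) as exc:
        raise SecretUnavailable(f"could not run {command[0]}: {exc}") from exc
    return result.stdout.strip()
```

Failing loudly at boot is the point: a wrapper can refuse to start the gateway when a required secret is missing, rather than starting half-configured.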
Browser auth persistence
Browser automation on macOS crashes when it hits encrypted cookies without a desktop session present. Isolated browser profiles with their own auth state get around this, but token rotation and profile corruption mean they need active maintenance. Budget real time for that if you go this route.
Network instability and third-party bugs
Some dependencies panic on network interface changes, which satellite internet triggers constantly. I wrote patches that get reapplied on every restart. Not elegant, but stable for months. If you're running something 24/7 on variable connectivity, plan for it from day one.
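Reapplying patches on every restart is only safe if the patch step is idempotent. As an illustration, here is a minimal Python sketch that assumes a patch can be expressed as a plain text substitution in a dependency's source file. The function name `apply_patch` and its return values are hypothetical, and real patches may need a proper diff tool instead.

```python
from pathlib import Path


def apply_patch(target: Path, old: str, new: str) -> str:
    """Replace `old` with `new` in `target`; safe to repeat on every boot.

    Returns "applied", "already-applied", or "not-found" so the caller can
    log which patches took effect (for example after a dependency update).
    """
    if not new:
        raise ValueError("`new` must be non-empty to detect an applied patch")
    text = target.read_text()
    if new in text:
        return "already-applied"
    if old not in text:
        return "not-found"  # upstream changed; the patch needs a fresh look
    target.write_text(text.replace(old, new))
    return "applied"
```

A "not-found" result is the useful signal here: it usually means the dependency was updated and the patch no longer matches.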
Cron jobs and state management
Cron definitions live in version control, but the gateway decorates them with runtime state--IDs, timestamps, execution history--that shouldn't be committed. A sync script strips the state when saving and merges it back on deploy. I can edit job configs in git, push, and have them picked up on the next restart without clobbering execution history.
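The strip-and-merge idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual sync script: it assumes jobs are a list of dictionaries with a stable `name` key, and the runtime field names (`id`, `lastRunAt`, `history`) are hypothetical stand-ins for whatever the gateway really adds.

```python
from __future__ import annotations

# Hypothetical runtime-only keys; the real gateway's field names may differ.
RUNTIME_KEYS = {"id", "lastRunAt", "history"}


def strip_state(jobs: list[dict]) -> list[dict]:
    """Return job configs without runtime state, suitable for committing."""
    return [{k: v for k, v in job.items() if k not in RUNTIME_KEYS} for job in jobs]


def merge_state(configs: list[dict], live: list[dict]) -> list[dict]:
    """Combine committed configs with live runtime state, matched by name.

    Config values come from `configs`; runtime keys come from `live`.
    Jobs that are new in `configs` simply start with no runtime state.
    """
    state_by_name = {
        job["name"]: {k: v for k, v in job.items() if k in RUNTIME_KEYS}
        for job in live
    }
    return [{**state_by_name.get(cfg["name"], {}), **cfg} for cfg in configs]


live = [{"name": "date-night", "schedule": "0 17 * * 4", "id": "j1", "history": ["ok"]}]
configs = strip_state(live)           # what goes into git
configs[0]["schedule"] = "0 18 * * 4"  # edit made in version control
merged = merge_state(configs, live)   # what gets deployed
```

After the merge, the job carries the edited schedule together with its original `id` and `history`. Matching on a human-chosen name rather than a generated ID is what lets an edited config find its old state.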
Security boundaries
Before adding any skill, I asked: what's the worst thing this could do?
Every integration gets its own service account with the minimum permissions it needs. Read-only where possible. Anything involving money or personal data goes through a separate approval channel--the agent can recommend, but a human confirms. Messaging is locked to allowlisted contacts. Auth checks run before any outbound action, and token problems get surfaced instead of swallowed.
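These rules amount to a small gate that every outbound action passes through. Here is a minimal Python sketch of that gate; the action kinds, the allowlisted number, and the `authorize` function are all hypothetical and only illustrate the order of the checks.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical values for illustration only.
ALLOWED_CONTACTS = frozenset({"+15550100"})
NEEDS_APPROVAL = frozenset({"purchase", "share_personal_data"})


@dataclass(frozen=True)
class Action:
    kind: str                        # e.g. "send_message", "purchase"
    recipient: Optional[str] = None
    approved: bool = False           # set only by the separate approval channel


def authorize(action: Action, token_valid: bool) -> str:
    """Return "allow" or "needs-approval"; raise PermissionError otherwise.

    Auth is checked first so token problems surface before anything is sent.
    """
    if not token_valid:
        raise PermissionError("auth token invalid or expired")
    if action.kind == "send_message" and action.recipient not in ALLOWED_CONTACTS:
        raise PermissionError(f"recipient not allowlisted: {action.recipient}")
    if action.kind in NEEDS_APPROVAL and not action.approved:
        return "needs-approval"
    return "allow"
```

In this sketch the agent never sets `approved` itself; the flag would be flipped only by whatever out-of-band confirmation a human gives.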
I also run a weekly activity report that flags unexpected auth failures, unusual request patterns, and skills that haven't run when they should have. Detection matters as much as prevention.
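The "skills that haven't run when they should have" check is the easiest part of such a report to sketch. The Python below is an illustration only: the skill names, the intervals, and the `overdue_skills` helper are hypothetical, and it assumes you already record a last-run timestamp per skill.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def overdue_skills(
    last_run: dict[str, datetime],
    expected_every: dict[str, timedelta],
    now: datetime,
) -> list[str]:
    """Return skills that never ran or whose last run is older than expected."""
    overdue = []
    for name, interval in expected_every.items():
        last = last_run.get(name)
        if last is None or now - last > interval:
            overdue.append(name)
    return sorted(overdue)


now = datetime(2024, 1, 8, 9, 0)
last_run = {
    "email-triage": datetime(2024, 1, 8, 7, 0),
    "date-night": datetime(2023, 12, 20, 17, 0),
}
expected_every = {
    "email-triage": timedelta(days=1),
    "date-night": timedelta(days=7),
    "climate-snapshot": timedelta(hours=1),
}
print(overdue_skills(last_run, expected_every, now))
# → ['climate-snapshot', 'date-night']
```

Treating "never ran" as overdue matters: a skill that silently failed to register would otherwise never show up in the report.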
What I'd do differently
I started with a smaller model and upgraded after a few months. The jump in multi-step reasoning was immediately noticeable. I should have started with the stronger model and dialed down if costs were a problem.
Headless OAuth is worth the upfront investment. Patching around macOS keychain quirks in a system service context gets fragile fast. Same goes for dependency patching--some kind of version-pinned verification would save time over manual auditing.
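One possible shape for version-pinned verification is to record a SHA-256 digest of each dependency file at the moment a patch was audited, then check those digests before patching. The Python sketch below assumes that approach; the `verify_pins` helper and the pin format (relative path mapped to hex digest) are hypothetical.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_pins(pins: dict[str, str], root: Path) -> list[str]:
    """Return relative paths that are missing or no longer match their pin."""
    changed = []
    for rel_path, expected in pins.items():
        file = root / rel_path
        if not file.is_file() or sha256_of(file) != expected:
            changed.append(rel_path)
    return sorted(changed)
```

Run against freshly installed, unpatched files, an empty result means the audited patches still target the code they were written for. Any path in the result is a file that needs a manual look before its patch is trusted again.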
The boring part
After a few months, the whole thing just runs. The machine boots, the service starts, secrets load, patches apply, skills come online. Recurring tasks do their thing. The weekly report tells me if anything is off.
Most of the engineering here wasn't AI work. It was system service behavior, browser automation, and secrets management. Handing Claude a bag of tools and watching it figure out how to use them was the easy part. Getting comfortable enough to stop checking on it every hour took longer.