Giving Claude Code a Face

Harry Smaje

Steve Yegge has been writing production code for over 40 years. He co-authored the Vibe Coding book with Gene Kim. When he went on the Pragmatic Engineer podcast and said that most developers will soon be programming by talking to a face, I paid attention.

His argument is straightforward: reading ability is becoming the main bottleneck for AI adoption. AI coding agents produce pages of reasoning, diffs, and tool call results. Power users absorb it. Most people don't. Yegge thinks the interface needs to change, and he's describing something that looks a lot like an AI agent avatar.

So I built one. It's called Clawd Face, it's open source, and it gives Claude Code a face you can talk to.

What Clawd Face actually is

Clawd Face is a browser-based interface for Claude Code, Anthropic's CLI coding agent. Instead of a terminal, you get a split screen: on one side, an Anam interactive avatar named Liv who speaks and listens in real time. On the other, a terminal panel showing what Claude Code is doing under the hood.

You talk to Liv. She sends your words to Claude Code. Claude Code does its thing: reads files, writes code, runs tests. Then Liv speaks the response back to you while the terminal scrolls alongside. The whole interaction is a conversation, not a command-line session.

The avatar is powered by Anam's avatar API. Anam handles the face, voice synthesis, and speech recognition. Claude Code handles the intelligence. They're completely decoupled, which is what makes the pattern interesting.

How it works (without the code)

The concept is simpler than it sounds. Anam runs in what we call CUSTOMER_CLIENT_V1 mode, which means we're telling Anam "don't use your own LLM, we'll handle the brain." Claude Code runs as a background process on the server. A WebSocket bridges the two: transcribed speech from the avatar SDK goes in, and Claude Code's text responses come back out, which Anam renders as a live avatar stream.
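Under the hood, the bridge mostly has to decide what each piece of agent output is for: some of it should be spoken aloud, some of it belongs only in the terminal panel. Here's a minimal sketch of that routing step. The event shapes and names are my assumptions for illustration, not the repo's actual schema:

```typescript
// Hypothetical simplification of the agent's streamed events.
type AgentEvent =
  | { type: "assistant"; text: string } // model's reply text
  | { type: "tool_use"; name: string }  // e.g. reading a file, running tests
  | { type: "result"; text: string };   // final summary of the run

interface Routed {
  speak: string[];    // lines the avatar voices aloud
  terminal: string[]; // lines shown in the terminal panel
}

// Pure routing step: reply text is spoken AND logged;
// tool activity is terminal-only, so Liv doesn't read out diffs.
function routeEvents(events: AgentEvent[]): Routed {
  const routed: Routed = { speak: [], terminal: [] };
  for (const e of events) {
    switch (e.type) {
      case "assistant":
      case "result":
        routed.speak.push(e.text);
        routed.terminal.push(e.text);
        break;
      case "tool_use":
        routed.terminal.push(`[tool] ${e.name}`);
        break;
    }
  }
  return routed;
}
```

In the real bridge, the routed messages would then travel over the WebSocket: `speak` lines to Anam for voicing, `terminal` lines to the browser panel.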

The result is that Anam acts as a real-time avatar layer on top of an existing agent. It doesn't matter that the agent underneath is a coding assistant. The same pattern works for any AI agent that produces text output. Customer support bots. Sales assistants. Internal tools. The streaming avatar face is the interface, the agent behind it is whatever you need it to be.

This is the same integration pattern that companies like Pipecat and VideoSDK use when they add avatar faces to their voice agent frameworks. The avatar API is agent-agnostic by design.

What it's like to use

For simple tasks, it's genuinely pleasant. "Explain what this function does." "Find all the TODO comments." You're having a conversation about your code rather than reading a screen of text. It feels like pair programming with someone who happens to have a face on your monitor.

For longer tasks, there's room to improve. When Claude Code spends 30 seconds reading files and running tools, Liv waits silently. That dead air feels unnatural. The obvious fix is streaming, having the avatar speak partial responses as they arrive rather than waiting for the full answer. Claude Code supports chunk-based output, so this is a near-term improvement.
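One plausible way to implement that streaming fix is to buffer the agent's partial output and release complete sentences as they form, so the avatar can start speaking before the full answer arrives. A sketch under that assumption (this is not the repo's code):

```typescript
// Buffer partial text chunks and emit complete sentences as soon as
// they appear, so the avatar can speak mid-generation.
class SentenceChunker {
  private buffer = "";

  // Feed a partial chunk; returns any sentences now ready to speak.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const ready: string[] = [];
    let boundary: number;
    // A sentence ends at . ! or ? followed by whitespace.
    while ((boundary = this.buffer.search(/[.!?]\s/)) !== -1) {
      ready.push(this.buffer.slice(0, boundary + 1).trim());
      this.buffer = this.buffer.slice(boundary + 2);
    }
    return ready;
  }

  // Call when the agent finishes: a trailing fragment (no sentence
  // terminator, or terminator at end-of-stream) is still speakable.
  flush(): string | null {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? rest : null;
  }
}
```

Sentence-level chunks are a reasonable granularity for speech synthesis: smaller pieces sound choppy, and waiting for paragraphs brings back the dead air.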

I should be honest: this is a proof of concept, not a polished product. The code was largely vibe-coded. But the experience is compelling enough that it changed how I think about AI agent interfaces.

Why an AI agent avatar matters for the future of dev tools

Yegge describes eight levels of AI adoption for engineers. Most developers are at level one or two, asking the AI for suggestions and manually reviewing the output. He thinks the developers who stay at those levels will be left behind.

The reason most people stall isn't capability. It's the interface. Current AI dev tools produce walls of text; as Yegge puts it, five paragraphs is already a lot to read for many developers. An AI talking head that can summarise what it did, explain its reasoning, and respond to follow-up questions changes the accessibility equation entirely.

This isn't just about coding. Every vertical where AI agents are deployed (customer service, education, sales, healthcare) faces the same output problem: the agent does the work, but the human struggles to consume the result. A live avatar interface solves that by making the interaction conversational rather than textual.

I think we're at an inflection point. The agent frameworks are maturing. The LLMs are getting better at reasoning. What's missing is the interface layer that makes these agents accessible to people who aren't power users. That's the gap an interactive avatar fills.

Try it yourself

The project is open source: github.com/anam-org/clawd-face. You'll need an Anam API key (free tier available), Claude Code installed, and Node 18+. Clone the repo, add your API key, run npm install && npm run dev, and open your browser.

The getting started docs cover the avatar SDK integration if you want to adapt this for a different agent. The core concept (an Anam avatar as a real-time face on any AI agent) is portable to whatever you're building.

What would you build if your AI agent had a face? 👀
