An AI agent that autonomously operates social media through a real browser.
- What is LocoAgent?
- How It Works
- Installation
- Configuration
- Model Providers
- Browser Automation
- Platform Skills
- Multi-Platform Targets
- Workflow Engine
- Operation Log
- Task Scheduling
- Trajectory Monitor
- Doctor: Health Check
- Project Structure
- Tech Stack
- Contributing
- License
LocoAgent is an AI-powered social-media agent that autonomously operates real accounts through genuine browser automation. It pairs an LLM-driven agentic loop with the agent-browser CLI to perceive → decide → act on live web pages: liking posts, writing replies, following users, and publishing content the way a human would.
Under the hood it is a fork of the Claude Code CLI source tree, re-purposed for social automation: the battle-tested agent loop, ~40 tools, ~90 slash commands, and Ink/React terminal UI are reused, with a thin LocoAgent-specific layer (skills, workflows, persona, operation log) added on top.
| Feature | Description |
|---|---|
| Real browser, real sessions | Drives Chrome via CDP with your actual login cookies: no fragile API hacks, no headless fingerprint |
| Platform skill system | Loads complete operation playbooks (37 operations for X.com) so the agent finishes composite tasks in a single pass |
| Workflow engine | Deterministic, LLM-free browser pipelines that the agent supervises: start, stop, or schedule as a daemon |
| Operation log | Persistent cross-session deduplication so the agent never repeats a like, follow, or reply |
| Multi-provider LLM | Any OpenAI-compatible API (OpenRouter, DeepSeek with thinking mode, OpenAI, Ollama, LM Studio), plus native Anthropic / Bedrock / Vertex |
| Multi-platform, concurrently | Drive X, LinkedIn, and Reddit at the same time: one isolated Chrome per platform, same-platform runs serial, cross-platform runs parallel |
| Cross-OS | One codebase runs on Windows, macOS, and Linux via a host/device abstraction layer |
flowchart LR
User([User / Task]) --> Loop
subgraph Agent["LocoAgent Core"]
Loop["Agentic Loop<br/>(query.ts)"]
Prompt["System Prompt<br/>(prompts.ts)"]
Loop <--> Prompt
end
Loop <-->|"Anthropic / OpenAI shim"| LLM["LLM Provider"]
Loop --> Tools["Tools · Bash"]
Tools --> AB["agent-browser CLI"]
AB --> CDP["Chrome CDP :9222"]
CDP --> Web[("Live Web Page")]
Skills["Platform Skills"] -. inject playbook .-> Prompt
Persona["persona/"] -. persona + tasks .-> Prompt
OpLog[("Operation Log")] -. dedup .-> Prompt
WF["Workflow Engine"] --> AB
The agent perceives a page with agent-browser snapshot, the LLM decides the next action, a tool acts (click, fill, open), and the result is verified before the loop continues. The operation log is checked first so nothing gets done twice.
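To make the perceive → decide → act cycle concrete, here is a minimal sketch of such a loop with the dedup check in front of every action. It is not the code in query.ts: the `Action` and `LoopDeps` types, the `runLoop` function, and the string keys are all hypothetical, and the dependencies are synchronous only to keep the example short (a real loop would await the browser and the LLM).

```typescript
// Hypothetical types: none of these names come from the LocoAgent source.
type Action = {
  kind: "click" | "fill" | "open" | "done";
  target?: string; // e.g. an @ref ID from a snapshot
  key?: string;    // dedup key, e.g. "x:like:<url>"
};

interface LoopDeps {
  perceive: () => string;                // e.g. wraps `agent-browser snapshot`
  decide: (snapshot: string) => Action;  // e.g. an LLM call
  act: (action: Action) => boolean;      // true when the result was verified
  alreadyDone: (key: string) => boolean; // operation-log lookup
  record: (key: string) => void;         // operation-log write
}

// Runs at most maxSteps iterations and returns the keys of actions performed.
function runLoop(deps: LoopDeps, maxSteps = 10): string[] {
  const performed: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = deps.decide(deps.perceive());
    if (action.kind === "done") break;
    // Check the log first so nothing is done twice.
    if (action.key && deps.alreadyDone(action.key)) continue;
    // Record only actions that were verified as successful.
    if (deps.act(action) && action.key) {
      deps.record(action.key);
      performed.push(action.key);
    }
  }
  return performed;
}
```

Passing the dependencies in as functions keeps the loop testable without a browser or a model: a test can feed a fixed queue of actions and an in-memory set as the log.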
| Requirement | Version | Notes |
|---|---|---|
| Bun | Latest | Runtime and package manager (Node is not enough) |
| Node.js | ≥ 18 | Required by some dependencies |
| agent-browser | Latest | Browser-automation CLI |
| Google Chrome | Latest | Driven over CDP |
| Git | Any | Powers context features |
One command installs Bun + agent-browser, clones the repo, scaffolds .env, and runs the health check. When run in a terminal it asks you to pick a provider (DeepSeek / Anthropic / OpenAI), enter your API key, then pick a model (Enter takes the latest; c lets you type any model name). The base URL is fixed per provider, so there is no need to type it.
macOS / Linux / WSL2
curl -fsSL https://raw.githubusercontent.com/LocoreMind/locoagent/main/install.sh | bash
Windows (PowerShell)
irm https://raw.githubusercontent.com/LocoreMind/locoagent/main/install.ps1 | iex
Installs into the current directory (an empty folder is used as-is; otherwise a ./locoagent subfolder is created; override with LOCO_DIR). The installer prints the exact target and lets you confirm it. Re-running from inside the checkout updates it in place. Afterwards: bun run setup-chrome && bun start. Chrome and Git are detected but not auto-installed; install them if the script warns.
git clone https://github.com/LocoreMind/locoagent.git
cd locoagent
bun install
# Verify your environment before first run
bun run doctor
# Interactive REPL
bun start
# Single query (headless / print mode)
bun start -p "open X.com and like the first post about AI agents"
# With a specific model
bun start --model anthropic/claude-sonnet-4.5
Tip
bun run doctor checks Bun, agent-browser, Chrome, and your .env in one shot. Add --check-cdp to probe the CDP port too.
Create a .env file in the project root; it is auto-loaded at startup (via the preloaded stubs/globals.ts). Configure your provider through the four neutral LLM_* variables; they are translated to the right internal settings at startup, so you never juggle provider-specific variable names:
# -- LLM Provider: pick ONE --
LLM_PROVIDER=deepseek # deepseek | openai | anthropic | custom
LLM_API_KEY=sk-...
LLM_MODEL=deepseek-chat # blank = provider default
LLM_BASE_URL= # only for `custom` / self-hosted OpenAI-compatible APIs
# Examples:
# OpenAI → LLM_PROVIDER=openai LLM_MODEL=gpt-5.5
# Anthropic → LLM_PROVIDER=anthropic LLM_MODEL=claude-sonnet-4-6
# Custom → LLM_PROVIDER=custom LLM_BASE_URL=http://localhost:1234/v1 LLM_MODEL=...
# -- Agent behavior --
SKIP_PERMISSIONS=1 # required for non-interactive / automated runs
Tip
The LLM_* block is the recommended front door. Under the hood it maps to the legacy CLAUDE_CODE_USE_OPENAI / OPENAI_* / ANTHROPIC_* variables, which still work directly for existing setups; an explicitly set legacy value always wins. DeepSeek, OpenAI, OpenRouter, and every OpenAI-compatible endpoint share the same OPENAI_* namespace internally; the provider is decided by the base URL + model, not by the variable name.
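The translation from the neutral LLM_* variables to the legacy ones might look roughly like the sketch below. Only CLAUDE_CODE_USE_OPENAI and the OPENAI_* / ANTHROPIC_* prefixes are named in this README; the specific names OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL, and ANTHROPIC_MODEL, as well as the `mapNeutralEnv` function itself, are assumptions made for illustration. The two default base URLs are the ones listed in the provider table.

```typescript
type Env = Record<string, string | undefined>;

// Defaults from the provider table; `custom` has no default and needs LLM_BASE_URL.
const PROVIDER_BASE_URLS: Record<string, string> = {
  deepseek: "https://api.deepseek.com",
  openai: "https://api.openai.com/v1",
};

// Hypothetical mapper: returns a copy of env with legacy variables filled in.
function mapNeutralEnv(env: Env): Env {
  const out: Env = { ...env };
  // An explicitly set legacy value always wins, so only fill in unset names.
  const setIfUnset = (name: string, value: string | undefined): void => {
    if (value && !out[name]) out[name] = value;
  };

  const provider = (env.LLM_PROVIDER ?? "").toLowerCase();
  if (!provider) return out; // nothing to translate

  if (provider === "anthropic") {
    // Native SDK path: no OpenAI shim involved.
    setIfUnset("ANTHROPIC_API_KEY", env.LLM_API_KEY);
    setIfUnset("ANTHROPIC_MODEL", env.LLM_MODEL); // assumed variable name
    return out;
  }

  // Every OpenAI-compatible provider shares the OPENAI_* namespace.
  setIfUnset("CLAUDE_CODE_USE_OPENAI", "1");
  setIfUnset("OPENAI_API_KEY", env.LLM_API_KEY);                                   // assumed name
  setIfUnset("OPENAI_BASE_URL", env.LLM_BASE_URL || PROVIDER_BASE_URLS[provider]); // assumed name
  setIfUnset("OPENAI_MODEL", env.LLM_MODEL);                                       // assumed name
  return out;
}

console.log(mapNeutralEnv({ LLM_PROVIDER: "deepseek", LLM_API_KEY: "sk-test" }).OPENAI_BASE_URL);
```

Filling in only unset names is what lets an existing setup keep its legacy variables untouched while new setups use the four neutral ones.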
Note
SKIP_PERMISSIONS=1 makes stubs/globals.ts inject --dangerously-skip-permissions into argv so automated/headless runs don't stop on permission prompts.
LocoAgent talks to any OpenAI-compatible API through a built-in translation shim (src/services/api/openaiShim.ts), keeping the rest of the system provider-agnostic.
| Provider | Base URL | Notes |
|---|---|---|
| OpenRouter | https://openrouter.ai/api/v1 | Access 200+ models with one key |
| DeepSeek | https://api.deepseek.com | Thinking mode (reasoning_content) fully supported |
| OpenAI | https://api.openai.com/v1 | GPT-4o, o-series, etc. |
| Ollama | http://localhost:11434/v1 | Local models |
| LM Studio | http://localhost:1234/v1 | Local models |
| Anthropic | (native SDK) | Set ANTHROPIC_API_KEY only |
| AWS Bedrock | (native SDK) | AWS credentials |
| Google Vertex AI | (native SDK) | GCP credentials |
Pick any of these with LLM_PROVIDER + LLM_BASE_URL (e.g. LLM_PROVIDER=custom, LLM_BASE_URL=https://openrouter.ai/api/v1). The neutral front door maps to the shim automatically; advanced users can still set CLAUDE_CODE_USE_OPENAI=1 and the OPENAI_* vars directly.
If a request fails with unable to get local issuer certificate or untrusted root, your network is re-signing TLS with its own CA. Export that gateway's root certificate to a .pem file and point NODE_EXTRA_CA_CERTS at it in .env. LocoAgent then trusts it on every provider path, including the DeepSeek/OpenAI shim. If the provider host is outright blocked on that network, switch networks (e.g. a phone hotspot) or pick a provider that is allowed.
LocoAgent controls a real Chrome browser over CDP (Chrome DevTools Protocol) using agent-browser.
Social platforms detect and block headless browsers and API automation. LocoAgent runs through a real, full Chrome (same engine, same fingerprint), so its traffic looks like your own browsing. It uses a dedicated, isolated, persistent profile (separate from your everyday Chrome): you log into your accounts once and the session sticks, while your normal browsing is never disturbed.
# One-time: launch the isolated Chrome with CDP (same command on Windows / macOS / Linux).
# It never kills your normal Chrome and never wipes your session.
bun run setup-chrome
# First run only: log into X / your socials in the window that opens; the session persists.
# Re-running just reconnects. To wipe the isolated profile and log in fresh:
bun run setup-chrome --reset
# Multi-platform: launch one target, or every target at once (one Chrome each).
bun run setup-chrome --target linkedin
bun run setup-chrome --all
agent-browser open https://x.com/home # Navigate
agent-browser snapshot -i # Perceive: interactive elements with @ref IDs
agent-browser click @e5 # Act: click a like button
agent-browser fill @e3 "Great research!" # Act: type into a reply box
agent-browser screenshot result.png # Verify: capture the result
The full agent-browser CLI reference is embedded in the agent's system prompt, so it knows every command natively; skills and workflows reference operations by name rather than re-explaining them.
Skills are operation playbooks loaded on demand via slash commands. Loading one injects a complete manual into the agent's context, enabling composite task execution in a single pass.
| Platform | Command | Operations | Coverage |
|---|---|---|---|
| X.com (Twitter) | /x-com | 37 | Browse · Engagement · Content creation · Social graph · Profile · Navigation · Lists |
# Interactive: load the skill, then give a task
> /x-com open home timeline, like first 3 posts about AI, reply to the best one
# Headless
bun start -p "/x-com like 5 posts about 'large language models', then follow the authors"
Create skills/<platform>/SKILL.md with YAML frontmatter:
---
description: "LinkedIn platform operations playbook"
allowed-tools:
- Bash
user-invocable: true
---
# LinkedIn Operations
## 1. Navigation
...
## 2. Engagement
...
The skill is auto-discovered at startup and becomes available as /linkedin. Design each operation as a self-contained section with preconditions, agent-browser commands, a verification step, and known pitfalls (see skills/x-com/SKILL.md for the established format).
Run several social platforms at the same time, each in its own isolated Chrome, on its own CDP port, behind its own proxy. A single registry, config/browser-targets.json, is the source of truth:
{
"targets": {
"x": { "cdpPort": 9222, "proxy": "http://127.0.0.1:6738" },
"linkedin": { "cdpPort": 9223, "proxy": null },
"reddit": { "cdpPort": 9224, "proxy": null }
}
}
setup-chrome --all/--target, the workflow engine, and doctor --check-cdp all read from it: add a platform once and every tool picks it up.
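As an illustration of how a tool might consume this registry, the sketch below looks up a platform and merges its target into a workflow's config. The interfaces mirror only the two fields shown in the JSON above (the real file may carry more, such as profile or device), and `resolveTarget` and `injectTarget` are hypothetical helpers, not functions from the LocoAgent codebase.

```typescript
// Shape of the registry as shown above; any further fields are not modeled here.
interface BrowserTarget { cdpPort: number; proxy: string | null }
interface TargetRegistry { targets: Record<string, BrowserTarget> }

// Inlined for the example; a real tool would read config/browser-targets.json.
const registry: TargetRegistry = {
  targets: {
    x: { cdpPort: 9222, proxy: "http://127.0.0.1:6738" },
    linkedin: { cdpPort: 9223, proxy: null },
    reddit: { cdpPort: 9224, proxy: null },
  },
};

// Hypothetical helper: fail loudly on an unknown platform instead of guessing a port.
function resolveTarget(reg: TargetRegistry, platform: string): BrowserTarget {
  if (!Object.prototype.hasOwnProperty.call(reg.targets, platform)) {
    const known = Object.keys(reg.targets).join(", ");
    throw new Error(`Unknown platform "${platform}" (known: ${known})`);
  }
  return reg.targets[platform];
}

// Hypothetical helper: the target's fields are merged over the workflow's own config.
function injectTarget(
  reg: TargetRegistry,
  workflow: { platform: string; config: Record<string, unknown> },
): Record<string, unknown> {
  return { ...workflow.config, ...resolveTarget(reg, workflow.platform) };
}

console.log(injectTarget(registry, { platform: "linkedin", config: { maxPosts: 5 } }));
```

Spreading the target last means a stray cdpPort in a workflow definition cannot override the registry, which matches the advice to set only the platform.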
bun run setup-chrome --all # one isolated Chrome per platform
bun run doctor --check-cdp # probe every target's CDP port
The engine reads each workflow's "platform" field and injects the matching target (cdpPort, profile, proxy, device) into the executor, so you never hard-code a port. A per-platform file lock then keeps same-platform runs serial (one active tab per profile) while different platforms run concurrently:
# x + x → serialized; linkedin → in parallel with them. Automatically.
bun run workflow orchestrate --ids hf-papers-to-x,x-search-reply,linkedin-search-reply
Workflows are deterministic browser-automation pipelines that run without any LLM in the control flow (an LLM may still be called as a single step). The agent acts as a supervisor: it can inspect status and start/stop runs, while execution stays scripted and reproducible.
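The "same platform serial, different platforms parallel" rule behind workflow orchestrate can be pictured as one lane per platform. The sketch below groups workflows into lanes in memory and runs the lanes concurrently; this is a stand-in for illustration, since the engine itself is described as using a per-platform file lock. `WorkflowDef`, `planLanes`, and `runLanes` are hypothetical names.

```typescript
interface WorkflowDef { id: string; platform: string }

// Group workflow IDs by platform, preserving the order they were requested in.
function planLanes(workflows: WorkflowDef[]): Map<string, string[]> {
  const lanes = new Map<string, string[]>();
  for (const wf of workflows) {
    const lane = lanes.get(wf.platform) ?? [];
    lane.push(wf.id);
    lanes.set(wf.platform, lane);
  }
  return lanes;
}

// Run every lane concurrently; inside a lane, run one workflow at a time.
async function runLanes(
  lanes: Map<string, string[]>,
  run: (id: string) => Promise<void>,
): Promise<void> {
  await Promise.all(
    Array.from(lanes.values()).map(async (ids) => {
      for (const id of ids) await run(id); // serial within a platform
    }),
  );
}

const plan = planLanes([
  { id: "hf-papers-to-x", platform: "x" },
  { id: "x-search-reply", platform: "x" },
  { id: "linkedin-search-reply", platform: "linkedin" },
]);
console.log(Array.from(plan.entries()));
```

An in-process plan like this only coordinates runs started together. A file lock also covers runs started from separate processes, such as a daemon and a manual run.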
| Workflow | ID | Schedule | Description |
|---|---|---|---|
| HuggingFace Daily Papers | hf-daily-papers | daily | Fetch top papers (titles, abstracts, thumbnails) and save to local data files |
| HF Papers → X.com | hf-papers-to-x | daily | Full pipeline: fetch HF papers → download thumbnails → post each as an image + text tweet |
| X.com Search & AI Reply | x-search-reply | hourly | Search X.com Latest → read each post → generate a reply via LLM → post reply |
| LinkedIn Search & AI Comment | linkedin-search-reply | hourly | Search LinkedIn Latest → read each post → generate a comment via LLM → post comment |
bun run workflow list # List all workflows + status
bun run workflow run --id hf-papers-to-x # Run once (blocking)
bun run workflow start --id hf-papers-to-x # Run once (background)
bun run workflow daemon --id x-search-reply --interval 3 # Run every 3 minutes
bun run workflow orchestrate --ids a,b,c # Multi-platform: same-platform serial, cross-platform parallel
bun run workflow stop --id x-search-reply # Stop at next checkpoint
bun run workflow reset --id x-search-reply # Clear stopped state → idle
bun run workflow status # Status of all workflows
bun run workflow history --id hf-papers-to-x # Execution history
Step 1. Definition (workflows/<id>.json):
{
"id": "my-workflow",
"name": "My Custom Workflow",
"description": "What this workflow does",
"schedule": "daily",
"platform": "x",
"executor": "executors/my-workflow.ts",
"config": { "searchQuery": "ai agent", "maxPosts": 5 }
}
Tip
Set "platform" (e.g. x / linkedin / reddit) and never hard-code cdpPort. The engine injects the target's cdpPort, profile, proxy, and device from config/browser-targets.json into config at run time, and locks the platform for the run.
Step 2. Executor (workflows/executors/my-workflow.ts):
#!/usr/bin/env bun
import { execSync } from 'node:child_process'
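// The engine passes the merged config as: --config <json>
const configArg = process.argv.find((_, i, a) => a[i - 1] === '--config')
if (!configArg) {
  console.error('[my-workflow] missing --config <json>')
  process.exit(1)
}
const config = JSON.parse(configArg)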
function ab(cmd: string): string {
return execSync(`agent-browser --cdp ${config.cdpPort} ${cmd}`, {
encoding: 'utf-8', timeout: 30000,
}).trim()
}
console.error('[my-workflow] Step 1: ...') // logs → stderr
// ... your automation logic using ab() ...
console.log(JSON.stringify({ stepsCompleted: 1, stepsTotal: 1 })) // summary → last line of stdout
Step 3. Test:
bun run workflow run --id my-workflow
Important
Executor contract: accept --config <json>, log to stderr, and emit a single JSON object ({stepsCompleted, stepsTotal}) as the last line of stdout. Missing or malformed output marks the run as failed.
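From the engine's side, the contract amounts to reading the last non-empty line of stdout and rejecting anything that is not the expected JSON object. The following is a rough sketch of that check, not the engine's actual code; `parseExecutorOutput` and the `RunResult` type are hypothetical, and treating blank trailing lines as ignorable is an assumption.

```typescript
interface RunSummary { stepsCompleted: number; stepsTotal: number }
type RunResult =
  | { ok: true; summary: RunSummary }
  | { ok: false; reason: string };

// Hypothetical validator for an executor's captured stdout.
function parseExecutorOutput(stdout: string): RunResult {
  // Assumption: blank lines after the summary are ignored.
  const lines = stdout.split(/\r?\n/).map((l) => l.trim()).filter((l) => l.length > 0);
  const last = lines[lines.length - 1];
  if (last === undefined) return { ok: false, reason: "empty stdout" };

  let parsed: unknown;
  try {
    parsed = JSON.parse(last);
  } catch {
    return { ok: false, reason: "last line is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, reason: "last line is not a JSON object" };
  }

  const { stepsCompleted, stepsTotal } = parsed as Record<string, unknown>;
  if (typeof stepsCompleted !== "number" || typeof stepsTotal !== "number") {
    return { ok: false, reason: "missing stepsCompleted / stepsTotal" };
  }
  return { ok: true, summary: { stepsCompleted, stepsTotal } };
}

console.log(parseExecutorOutput('{"stepsCompleted":1,"stepsTotal":1}\n'));
```

This is also why progress messages belong on stderr: anything an executor prints to stdout after the summary would become the new last line and fail the run.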
Full guide: docs/workflow-development-guide.md covers deduplication, the checkpoint/stop protocol, LLM integration, and daemon mode.
Persistent memory across sessions. The agent checks the log before acting and records every action afterward, preventing duplicate likes, follows, and replies. This dedup contract is the core of the agent's identity.
# Check before acting (exit 0 = already done → skip; exit 1 = not done → proceed)
bun run scripts/log-operation.ts check \
--platform x --action like --url "https://x.com/.../status/123"
# Record after a successful action
bun run scripts/log-operation.ts add \
--platform x --action like --url "https://x.com/.../status/123" \
--status success --note "AI agents research post"
# View recent operations
bun run scripts/log-operation.ts recent --limit 20
# 30-day summary (auto-injected into the system prompt at startup)
bun run scripts/log-operation.ts summary --days 30
State lives in persona/operation-log.json (human-readable JSON). The 30-day summary is injected into every session's system prompt so the agent always knows its recent history.
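The check and summary commands can be thought of as two small queries over a list of entries. The sketch below assumes an entry shape built from the CLI flags above plus a timestamp; the real schema of persona/operation-log.json is not documented here, so `OperationEntry`, `isAlreadyDone`, and `summarize` are hypothetical, as is the choice to count only successful entries as done.

```typescript
// Assumed entry shape, derived from the CLI flags; the real file may differ.
interface OperationEntry {
  platform: string;
  action: string;
  url: string;
  status: "success" | "failed";
  timestamp: string; // ISO 8601
  note?: string;
}

// "check": has this exact action already succeeded on this URL?
function isAlreadyDone(log: OperationEntry[], platform: string, action: string, url: string): boolean {
  return log.some(
    (e) => e.status === "success" && e.platform === platform && e.action === action && e.url === url,
  );
}

// "summary": count successful actions per platform:action within the last `days` days.
function summarize(log: OperationEntry[], days: number, now: Date): Record<string, number> {
  const cutoff = now.getTime() - days * 24 * 60 * 60 * 1000;
  const counts: Record<string, number> = {};
  for (const e of log) {
    if (e.status !== "success" || Date.parse(e.timestamp) < cutoff) continue;
    const key = `${e.platform}:${e.action}`;
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}

// Example data; the URL is a placeholder, not a real post.
const exampleLog: OperationEntry[] = [
  { platform: "x", action: "like", url: "https://x.com/example/status/123", status: "success", timestamp: "2025-01-10T12:00:00Z" },
];
console.log(isAlreadyDone(exampleLog, "x", "like", "https://x.com/example/status/123"));
```

Keying on platform, action, and URL together means liking a post does not block replying to it later, while a second like on the same post is skipped.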
Replace ad-hoc prompts with structured daily/weekly task execution.
## Daily Tasks
1. Engage with relevant content (like posts matching topic queries)
2. Monitor own project mentions
3. Leave 1 technical comment on the most relevant post
## Weekly Tasks (Monday)
4. Follow 3-5 relevant researchers
5. Post 1 original tweet about recent research findings
## Session Constraints
| Action | Max per session |
|----------|:---------------:|
| Likes | 10 |
| Comments | 2 |
| Follows | 5 |
| Posts | 1 |
bun run run-tasks # Execute today's tasks
bun run run-tasks:dry # Preview the generated prompt without running
bun run run-tasks -- --platform x # Restrict to one platform
Note
persona/ is gitignored and absent from a fresh clone. The agent runs fine without it, just without persona, task, and operation-history context in the prompt.
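The session constraints in the task file are plain per-action caps. Whether run-tasks enforces them in code or only states them in the generated prompt is not specified here, so the following is just one possible way to model them; `SessionBudget` and its methods are hypothetical, and the numbers are copied from the Session Constraints table.

```typescript
type ActionKind = "like" | "comment" | "follow" | "post";

// Caps copied from the Session Constraints table.
const SESSION_CAPS: Record<ActionKind, number> = { like: 10, comment: 2, follow: 5, post: 1 };

// Hypothetical per-session counter: one instance per agent session.
class SessionBudget {
  private used: Record<ActionKind, number> = { like: 0, comment: 0, follow: 0, post: 0 };

  // Returns true and counts the action if the cap allows it; false otherwise.
  tryConsume(kind: ActionKind): boolean {
    if (this.used[kind] >= SESSION_CAPS[kind]) return false;
    this.used[kind] += 1;
    return true;
  }

  remaining(kind: ActionKind): number {
    return SESSION_CAPS[kind] - this.used[kind];
  }
}

const budget = new SessionBudget();
budget.tryConsume("comment");
console.log(budget.remaining("comment"));
```

A check like this complements the operation log: the log prevents repeating one action on one target, while the budget limits how many distinct actions a single session performs.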
--print mode is a black box. The trajectory monitor watches the session log and prints live execution status.
# Terminal 1 โ start the monitor
bun run tail
# Terminal 2 โ run the agent
bun start -p "/x-com open timeline, like first post"
--- New Task ---
/x-com open timeline, like first post
[6:30:47 PM] Bash: agent-browser connect 9222
[6:30:47 PM] Result: Done
[6:31:10 PM] Bash: agent-browser open https://x.com/home
[6:31:27 PM] Bash: agent-browser snapshot -i -c -s 'article'
[6:31:44 PM] Agent: Found first post, like button ref=e136
[6:31:44 PM] Bash: agent-browser click e136
[6:31:45 PM] Result: Done
bun run tail:history # Replay latest session from the beginning
bun run tail:list # List recent sessions
bun run tail <id> # Watch a specific session
A cross-platform preflight check and onboarding aid. Run it before your first session or whenever something feels off.
bun run doctor # Check Bun, agent-browser, Chrome, .env
bun run doctor --check-cdp # ...also probe every platform target's CDP port
It detects your host OS (Windows / macOS / Linux), resolves the Chrome binary path, probes each target in config/browser-targets.json, and reports anything missing or misconfigured.
locoagent/
├── src/                         # Vendored Claude Code CLI source; treat as a dependency
│   ├── entrypoints/cli.tsx      # CLI entry point
│   ├── services/api/            # Multi-provider LLM shim (openaiShim / codexShim)
│   ├── services/mcp/            # MCP server management
│   ├── tools/                   # ~40 tool implementations
│   ├── commands/                # ~90 slash commands
│   ├── components/ · hooks/     # Ink/React terminal UI
│   ├── query.ts                 # Agentic loop engine
│   └── constants/prompts.ts     # The seam: injects LocoAgent state into the prompt
├── scripts/                     # LocoAgent-specific tooling
│   ├── setup-chrome.ts          # Chrome + CDP launcher (cross-platform)
│   ├── doctor.ts                # Health check / onboarding
│   ├── log-operation.ts         # Operation-log CLI (dedup)
│   ├── run-tasks.ts             # Task scheduler
│   ├── tail-agent.ts            # Live trajectory monitor
│   ├── workflow-engine.ts       # Workflow lifecycle manager
│   └── lib/                     # Platform layer: host · device · config · target lock
├── config/
│   └── browser-targets.json     # Per-platform target registry (cdpPort · proxy · profile)
├── skills/<platform>/SKILL.md   # Platform operation playbooks (→ /<platform>)
├── workflows/
│   ├── <id>.json                # Workflow definitions
│   ├── executors/<id>.ts        # Scripted pipelines
│   └── state.json               # Runtime state (gitignored)
├── persona/                     # Persona, tasks, operation log (gitignored)
├── docs/                        # Public docs (workflow guide, cross-platform guide)
├── stubs/                       # Preloaded globals + local package stubs
├── .env                         # Local config (auto-loaded)
└── package.json
Tip
The LocoAgent layer is small and lives outside src/. The seam is src/constants/prompts.ts, which shells out to inject persona, tasks, operation-log summary, and workflow status into every session's system prompt.
| Component | Technology |
|---|---|
| Runtime | Bun (Node not supported) |
| Language | TypeScript (TSX) |
| UI | React + Ink terminal renderer |
| CLI | Commander.js |
| Browser automation | agent-browser + Chrome CDP |
| LLM integration | Anthropic SDK + OpenAI-compatible shim |
| Extension protocol | MCP (Model Context Protocol) |
Contributions welcome! High-impact areas:
- New platform skills: LinkedIn, Reddit, Instagram playbooks
- New workflows: automated content pipelines (development guide)
- New tools: extend agent capabilities
- Bug fixes: especially browser-automation edge cases
Branch → change → bun run typecheck → commit (feat: / fix: / docs:) → PR with What / Why / How / Testing. See CONTRIBUTING.md for the full guide.
Note
There is no unit-test suite for the agent itself. Verify changes with bun run typecheck and a real bun start -p "..." run. The one exception is the platform layer: bun test scripts runs its unit tests.
MIT © LocoreMind
Built by LocoreMind