---
title: HubSpot AI Agent Data Cleanup Workflow
type: knowledge
created: '2026-04-05'
updated: '2026-04-05'
source_docs:
- raw/2026-03-26-crm-working-call-cai-quarra-133075441.md
tags:
- hubspot
- ai-agent
- claude
- crm
- data-cleanup
- workflow
- api
layer: 2
client_source: null
industry_context: null
transferable: true
---

# HubSpot AI Agent Data Cleanup Workflow

A reusable pattern for using an AI agent (Claude) to automate HubSpot field consolidation, contact reassignment, and bulk data updates: tasks that would otherwise take hours of manual UI work.

## Overview

HubSpot accounts accumulate duplicate fields, inconsistent data, and orphaned properties over time. Cleaning these up manually through the UI is tedious and error-prone. An AI agent with HubSpot API access can automate the bulk of this work — including updating forms and workflows that reference old fields — in minutes rather than hours.

This pattern was developed and validated during a live cleanup of a client HubSpot account (~3,000 contacts), consolidating duplicate State, Country, and Juicing fields, reassigning contacts from an archived user, and merging Lead Source data.

**Evidence:** [[clients/citrus/index]] — completed field consolidation across 9 forms and 2 workflows in a single session.

---

## Prerequisites

- HubSpot Personal Access Token (PAT) with full scopes
- AI agent workspace (Claude) with HubSpot MCP or API integration configured
- A structured list of cleanup tasks (see Input Format below)

### Why a Personal Access Token?

PATs have proven more reliable than other HubSpot token types for agent-driven API work. When creating the token, grant every available scope, because the agent needs to read properties, update forms, and modify workflows within a single session.

> ⚠️ **Caution:** Full-scope tokens can make irreversible changes. Archive fields rather than deleting them. The agent's ability to revert changes is limited — think before you push.

---

## The Four-Step Workflow

### 1. Instruct — Provide Structured Input

Give the agent a clear, structured list of tasks. **JSON or CSV is strongly preferred** over prose notes or Word documents.

- JSON: parsed instantly, no overhead
- CSV: nearly as good, widely exportable
- PDF: acceptable (agent uses OCR)
- Word/Google Docs: slowest — agent must strip formatting before reaching data

**Don't** dump raw notes. Translate your intent into discrete, unambiguous tasks before handing off.

Example task format:
```json
[
  {
    "task": "consolidate_fields",
    "keep": "state",
    "archive": ["state_text", "state_or_region"],
    "migrate_data": true
  }
]
```
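
Before handing a task file to the agent, it can help to check that every entry is complete and unambiguous. The following is a minimal sketch, assuming the task list uses only the keys shown in the example above; `REQUIRED_KEYS` and `validate_tasks` are hypothetical names, not part of any HubSpot or agent tooling.

```python
import json

# Assumption: every task carries the same keys as the example task above.
REQUIRED_KEYS = {"task", "keep", "archive", "migrate_data"}


def validate_tasks(raw: str) -> list[dict]:
    """Parse a JSON task list and reject incomplete or contradictory entries."""
    tasks = json.loads(raw)
    if not isinstance(tasks, list):
        raise ValueError("task file must be a JSON array")
    for i, task in enumerate(tasks):
        if not isinstance(task, dict):
            raise ValueError(f"task {i} is not a JSON object")
        missing = REQUIRED_KEYS - task.keys()
        if missing:
            raise ValueError(f"task {i} is missing keys: {sorted(missing)}")
        # A field cannot be both the canonical field and an archived one.
        if task["keep"] in task["archive"]:
            raise ValueError(f"task {i} archives the field it keeps")
    return tasks
```

Other task types (reassignment, CSV updates) would likely need their own required keys; the check above only covers the consolidation shape.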

### 2. Collaborate — Let the Agent Assess First

Rather than dictating every step, ask the agent to **look and report** before acting:

> "Look at all state-related contact properties and tell me what you see."

The agent will surface things you didn't know to ask about — like how many contacts have data in each field, which forms reference the field, and what API constraints apply. This prevents you from giving instructions that create downstream breakage.

Let the agent suggest its approach, then confirm or redirect. It will flag implications (e.g., "archiving this field will break 2 workflows") that you might miss.

### 3. Push — Don't Accept "Manual Step" Too Quickly

The agent defaults to declaring tasks manual when it hits an obstacle. **Push back.** In practice, most "manual" declarations come from the agent giving up after one failed attempt or sticking to a single API version when another would work.

Common pushback patterns:
- *"I can't update forms via the API"* → Push it to try the V3 `PATCH` endpoint instead of V4
- *"This requires manual UI work"* → Ask it to attempt automation anyway; it often succeeds
- *"I don't have scope for this"* → Verify the token scopes, then push again

Know when to stop: if the agent fails 2–3 times on the same task with different approaches, it's likely a genuine API limitation (e.g., forms created in HubSpot's new editor may require a different API tier).

### 4. Document — Create a Markdown Summary Periodically

Ask the agent to summarize completed work and write it to a file:

> "Summarize everything we've done so far and create a Markdown document."

This serves two purposes:
1. **Reference** — you have a record of what changed and why
2. **Continuity** — in a future session, the agent can find and read that document rather than requiring you to re-explain context

Do this every 30–60 minutes during a long session, or after each major task group.

---

## Common HubSpot Cleanup Tasks

### Field Consolidation

**Pattern:** Multiple fields capturing the same data (one text, one dropdown; or three variants of the same question).

**Agent approach:**
1. Identify all related fields and their contact coverage
2. Designate one canonical field (prefer existing HubSpot standard fields like `state`, `country`)
3. Migrate data from deprecated fields into the canonical field
4. Update all forms and workflows referencing deprecated fields
5. Archive deprecated fields (do not delete — archiving preserves workflow integrity)
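
Steps 4 and 5 depend on knowing which forms and workflows still reference a deprecated field. As a rough sketch of that check, assume the agent has already collected a mapping from each form or workflow name to the property names it references; the `usage` mapping and both function names are hypothetical.

```python
def archive_blockers(field_name: str, usage: dict[str, list[str]]) -> list[str]:
    """Return the forms/workflows that still reference field_name."""
    return sorted(asset for asset, fields in usage.items() if field_name in fields)


def archive_report(deprecated: list[str], usage: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each deprecated field to its remaining blockers (empty list = safe)."""
    return {field: archive_blockers(field, usage) for field in deprecated}
```

A field with an empty blocker list is ready to archive; anything else should have its dependencies updated first.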

**Backfilling:** The agent can often backfill missing data from adjacent fields. For example, missing `country` values can be inferred from HubSpot's IP country data. This isn't perfect (VPNs introduce noise) but is highly accurate at the country level.

**Known constraint:** HubSpot's API enforces a 3-field-per-group limit on forms. If a form has more fields in a group, the agent must split them — this adds complexity but is automatable.
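
Splitting an oversized group is a simple chunking operation. A minimal sketch, assuming fields are represented by their internal names and that order within the form should be preserved (`split_field_group` is a hypothetical helper):

```python
MAX_FIELDS_PER_GROUP = 3  # the per-group limit described above


def split_field_group(fields: list[str], size: int = MAX_FIELDS_PER_GROUP) -> list[list[str]]:
    """Split one oversized field group into consecutive groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [fields[i:i + size] for i in range(0, len(fields), size)]
```

For example, a group of seven fields becomes three groups of three, three, and one.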

### Contact Reassignment

**Pattern:** Contacts owned by an archived or departed user need reassignment.

**Agent approach:** Bulk-update the `hubspot_owner_id` field across all affected contacts in batches. For ~3,000 contacts, this completes in under a minute.
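
The batching itself can be sketched as below. This only builds request bodies and makes no network calls. It assumes the payload shape used by HubSpot's CRM v3 batch update endpoint for contacts (`POST /crm/v3/objects/contacts/batch/update`) and a cap of 100 records per request; verify both against the current API reference before relying on them. The function name is hypothetical.

```python
from typing import Iterator

BATCH_SIZE = 100  # assumed per-request cap for batch updates


def owner_update_batches(
    contact_ids: list[str], new_owner_id: str, batch_size: int = BATCH_SIZE
) -> Iterator[dict]:
    """Yield one request body per batch, each setting hubspot_owner_id."""
    for start in range(0, len(contact_ids), batch_size):
        chunk = contact_ids[start:start + batch_size]
        yield {
            "inputs": [
                {"id": cid, "properties": {"hubspot_owner_id": new_owner_id}}
                for cid in chunk
            ]
        }
```

At this batch size, roughly 3,000 contacts translate to about 30 requests, which is consistent with the job finishing quickly.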

### Bulk Field Updates via CSV

**Pattern:** You have a cleanup CSV mapping contact IDs to corrected field values (e.g., Lead Source, Lead Source Detail).

**Agent approach:** Parse the CSV, batch-update contacts via the API. This is faster and more reliable than HubSpot's native import for targeted field updates.
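
The parsing step might look like the sketch below. It assumes a header row with a `contact_id` column and one column per HubSpot internal property name (for example `lead_source`); these column names are illustrative, not taken from the original cleanup file. Blank cells are skipped so that an empty CSV value never overwrites existing data.

```python
import csv
import io


def csv_to_updates(csv_text: str, id_column: str = "contact_id") -> list[dict]:
    """Turn cleanup CSV rows into per-contact update records."""
    updates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        contact_id = (row.get(id_column) or "").strip()
        if not contact_id:
            continue  # a row without an ID cannot be matched to a contact
        props = {
            key: value.strip()
            for key, value in row.items()
            # Skip the ID column, overflow cells (key None), and blank values.
            if key and key != id_column and isinstance(value, str) and value.strip()
        }
        if props:
            updates.append({"id": contact_id, "properties": props})
    return updates
```

The resulting records can then be grouped into batches before being sent to the API.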

### Lead Source Merging

**Pattern:** Two fields capturing the same intent (`Inbound Lead Source` and `Lead Source`).

**Agent approach:** Read values from the deprecated field, write them to the canonical field where the canonical field is empty, then archive the deprecated field.
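
The "write only where empty" rule is easy to get wrong, so here is a minimal sketch of it. It assumes contacts have been fetched as dicts with an `id` and a `properties` mapping, and it uses hypothetical internal property names. Contacts where both fields hold different values are reported separately rather than overwritten.

```python
def plan_merge(
    contacts: list[dict], deprecated: str, canonical: str
) -> tuple[list[dict], list[str]]:
    """Return (updates to apply, IDs of contacts with conflicting values)."""
    updates, conflicts = [], []
    for contact in contacts:
        props = contact.get("properties", {})
        old = (props.get(deprecated) or "").strip()
        current = (props.get(canonical) or "").strip()
        if old and not current:
            # Canonical field is empty: safe to copy the deprecated value over.
            updates.append({"id": contact["id"], "properties": {canonical: old}})
        elif old and current and old != current:
            # Both populated with different values: leave for human review.
            conflicts.append(contact["id"])
    return updates, conflicts
```

Reviewing the conflict list before archiving the deprecated field avoids silently discarding the values that differ.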

---

## Safety Mechanisms

| Risk | Mitigation |
|---|---|
| Irreversible field deletion | Always archive, never delete |
| Breaking forms/workflows | Agent checks field usage before archiving; updates dependencies first |
| Data loss during migration | Agent reads source field values before writing to destination |
| Runaway batch operations | Agent sets internal timeouts (typically 10 min); background tasks can be monitored |
| Context loss in long sessions | Periodic Markdown summaries written to disk; agent can re-read them |

### On Reverting Changes

Reverting API-driven changes is possible but not trivial. The agent can attempt rollbacks if you catch an error quickly, but this is not a reliable safety net. **Think before you push**, especially for:
- Archiving fields used in active workflows
- Bulk-updating contact owner assignments
- Merging fields with overlapping but non-identical values

---

## Working with the Agent: Practical Tips

**Avoid interrupting mid-task.** If you hit Enter while the agent is executing, it may read your message as part of the current task and get confused. Wait for it to complete a step before adding new instructions.

**Red output = errors, not failure.** When you see red output, the agent has hit an error and is usually self-correcting. Multiple red blocks on the same task signal genuine difficulty, so consider whether to push further or accept a manual step.

**Background tasks.** For operations expected to take more than a few minutes, ask: *"Can you run this in the background?"* The agent will spawn a sub-process and you can continue with other tasks.

**Workspace hygiene.** AI agent workspaces accumulate files over time, slowing startup and consuming context memory. Periodically ask the agent to clean up unnecessary files. Be conservative — over-aggressive cleanup can delete records of past work.

**JSON/CSV for inputs.** When providing task lists, field mappings, or contact data, always use JSON or CSV. Avoid pasting prose notes — translate them into structured format first.

---

## Known API Limitations (HubSpot-Specific)

- **Form editor versions:** Forms created in HubSpot's new form editor may require a different API tier and cannot always be updated programmatically. These become genuine manual steps.
- **API version compatibility:** HubSpot does not maintain backward compatibility between API versions (V3 → V4). The agent may need to fall back to older endpoints.
- **Field archiving restrictions:** Fields used in active forms or workflows cannot be archived until those dependencies are removed. The agent handles this automatically if you push it to do so.
- **Meeting link forms:** Forms embedded in individual rep meeting links (e.g., Calendly-style HubSpot meeting links) often cannot be updated via API and require manual UI edits.

---

## Related

- [[clients/citrus/index]]
- [[knowledge/hubspot/hubspot-crm-field-management]]
- [[knowledge/ai-agents/claude-agent-workflow-patterns]]