---
title: HubSpot Database Cleanup — Hunter.io & Deduplication
type: article
created: '2026-03-24'
updated: '2026-03-24'
source_docs:
- raw/2026-03-24-weekly-call-w-sebastian-132388501.md
tags:
- aviary
- hubspot
- database-cleanup
- hunter-io
- crm
- deduplication
layer: 2
client_source: AviaryAI
industry_context: saas
transferable: false
---

# HubSpot Database Cleanup — Hunter.io & Deduplication

## Overview

Aviary's HubSpot database contains a large number of contacts with missing or incomplete data, which degrades the effectiveness of ABM and nurture campaigns. A three-step cleanup process was initiated to enrich, deduplicate, and prune the contact database.

This work is a prerequisite for reliable email volume in both the [[wiki/clients/current/aviary/abm-email-automation|ABM automation]] and the planned [[wiki/clients/current/aviary/nurture-sequence|generic nurture sequence]].

## Cleanup Process

The cleanup runs in three sequential steps:

1. **Email Enrichment via Hunter.io**
   All contacts lacking an email address are exported and run against [Hunter.io](https://hunter.io) to attempt email discovery. Any addresses found are written back to the contact record.

2. **Duplicate Merging**
   If Hunter.io returns an email address that already exists on another contact record, writing it back creates a duplicate. These duplicates are identified and merged automatically within the same job run.

3. **Archiving Incomplete Contacts**
   Contacts with neither a company association nor an email address after enrichment are archived. These records have no actionable marketing value and add noise to segmentation and reporting.
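
The three steps above can be sketched as a single pass over the contact list. This is a minimal illustration, not the actual job: the `Contact` fields, the `clean_contacts` function, and the `find_email` callback (a stand-in for the Hunter.io lookup) are all hypothetical names. The merge rule is also an assumption: the record that already held the address is kept, and the newly enriched record is folded into it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Contact:
    # Hypothetical, simplified shape of a contact record.
    id: str
    email: Optional[str] = None
    company: Optional[str] = None
    archived: bool = False
    merged_into: Optional[str] = None


def clean_contacts(
    contacts: List[Contact],
    find_email: Callable[[Contact], Optional[str]],
) -> List[Contact]:
    """Enrich, merge duplicates, then archive incomplete contacts."""
    # Index existing addresses so a discovered email can be matched to its owner.
    owner_by_email: Dict[str, Contact] = {}
    for c in contacts:
        if c.email:
            owner_by_email.setdefault(c.email.lower(), c)

    for c in contacts:
        if c.email:
            continue  # Only contacts lacking an email are processed.

        # Step 1: attempt email discovery (stand-in for the Hunter.io lookup).
        found = find_email(c)
        if found:
            key = found.lower()
            owner = owner_by_email.get(key)
            if owner is not None:
                # Step 2: the address already exists elsewhere, so merge
                # this record into the one that owns it (assumed rule).
                c.merged_into = owner.id
                if owner.company is None:
                    owner.company = c.company
            else:
                c.email = found
                owner_by_email[key] = c
            continue

        # Step 3: no email after enrichment and no company, so archive.
        if c.company is None:
            c.archived = True

    return contacts
```

Matching on the lowercased address is a design choice in this sketch so that `Ana@Example.com` and `ana@example.com` count as the same contact. A contact with a company but no discovered email is left in place, since only records missing both fields are archived.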

## Scale & Status

As of the 2026-03-24 sync:

- **20,000+ contacts** in HubSpot lacked email addresses and were queued for processing.
- A batch of **2,000–4,000 contacts** was processed in the most recent run.
- The job runs slowly due to volume; it was noted as stalled and needs to be **restarted**.
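
For a rough sense of how many more runs the backlog may need, the figures above can be plugged into a quick calculation. This is a back-of-the-envelope sketch only. It assumes every run handles a batch similar in size to the most recent one, treats 20,000 as a lower bound on the queue, and does not subtract the batch already processed, since the sync notes do not say whether the 20,000 figure includes it.

```python
import math

CONTACTS_QUEUED = 20_000               # lower bound ("20,000+") from the sync
BATCH_MIN, BATCH_MAX = 2_000, 4_000    # size range of the most recent run


def runs_needed(total: int, batch_size: int) -> int:
    """Number of runs required to cover `total` contacts at `batch_size` per run."""
    return math.ceil(total / batch_size)


best_case = runs_needed(CONTACTS_QUEUED, BATCH_MAX)
worst_case = runs_needed(CONTACTS_QUEUED, BATCH_MIN)
print(f"Estimated runs to clear the queue: {best_case} to {worst_case}")
```

Under these assumptions the queue needs somewhere between 5 and 10 runs, so a stalled job that is not restarted promptly delays the whole cleanup.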

## Action Items

- [ ] Restart the Hunter.io enrichment + merge/archive job (@Mark Hope)

## Context & Motivation

Poor data quality was directly limiting campaign reach. Without valid email addresses, contacts cannot enter ABM or nurture sequences. Archiving dead records also reduces list noise and improves lifecycle stage reporting accuracy.

As cleanup progresses, more contacts become eligible for the ABM campaign and the cold nurture drip, which is expected to raise email send volume significantly in the near term.

## Related

- [[wiki/clients/current/aviary/abm-email-automation|ABM Email Automation — SQL Lifecycle Stage Fix]]
- [[wiki/clients/current/aviary/nurture-sequence|Nurture Sequence Launch]]
- [[wiki/clients/current/aviary/_index|Aviary Client Overview]]