What 10,000 AI-Sent Emails Taught Us About Cold Outreach
| Metric | Result | Source |
|---|---|---|
| Reply rate lift from specific personalization | 4.1x vs. name-only templates | LeadClaw 10,000-email study |
| Overall reply rate | 4.4% (vs. 1–2% industry average) | LeadClaw 10,000-email study |
| Share of replies from second follow-up | 41% of all replies | LeadClaw 10,000-email study |
| Best-performing subject line open rate | 38% (property-specific reference) | LeadClaw 10,000-email study |
We Let the AI Run
Most cold email "data" comes from surveys and estimates. We have something better: 10,000 emails actually sent by an AI on behalf of real service businesses, tracked end-to-end.
Every email was written by AI. Every subject line chosen by AI. Every send time decided by AI. Every follow-up triggered by AI.
Here's what happened.
The Setup
The 10,000 emails covered five service business verticals: plumbing, roofing, HVAC, commercial cleaning, and landscaping. All sent to commercial prospects — property managers, building owners, facility directors — across 12 US cities.
We tracked every metric: opens, replies, positive replies, unsubscribes, and bounces. And we tracked the underlying email attributes — subject line pattern, email length, personalization level, send time — so we could see what the AI was getting right and wrong.
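As a rough illustration of that tracking, one record per email might look like the sketch below. The class and field names are hypothetical, not LeadClaw's actual schema, and the reply rate here is assumed to be computed over all sent emails rather than delivered ones.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass(frozen=True)
class EmailRecord:
    # Attributes of the email itself (hypothetical field names)
    vertical: str
    subject_pattern: str
    personalization_level: str
    word_count: int
    sent_at: datetime
    # Outcomes tracked for each send
    bounced: bool = False
    opened: bool = False
    replied: bool = False
    positive_reply: bool = False
    unsubscribed: bool = False


def reply_rate(records: Iterable[EmailRecord]) -> float:
    """Share of sent emails that got any reply (assumption: denominator is all sends)."""
    records = list(records)
    if not records:
        return 0.0
    return sum(r.replied for r in records) / len(records)
```

Grouping records by `subject_pattern` or `personalization_level` before calling `reply_rate` would give the kind of per-attribute breakdowns shown in the findings.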
Finding 1: Personalization Beat Templates by 4x
This isn't a surprise to anyone who's run cold email before. But the magnitude surprised us.
Emails where the AI found and referenced a specific detail about the prospect — a property address, a job listing they'd posted, a recent news item about their business — had a 4.1x higher reply rate than emails using only the prospect's name and company.
The data across the full 10,000:
| Personalization Level | Reply Rate |
|---|---|
| Name + company only | 1.2% |
| Industry-specific language | 2.8% |
| Specific property or business reference | 5.0% |
The bottom line: one sentence that proves you did 30 seconds of research is worth more than three perfectly crafted paragraphs of generic pitch.
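The lift figures can be checked directly from the table. This is a small sketch using the rounded reply rates shown above; the variable names are ours.

```python
# Reply rates (percent) copied from the personalization table
reply_rate_pct = {
    "Name + company only": 1.2,
    "Industry-specific language": 2.8,
    "Specific property or business reference": 5.0,
}

baseline = reply_rate_pct["Name + company only"]

# Lift of each level relative to the name-only baseline
lift = {level: rate / baseline for level, rate in reply_rate_pct.items()}

for level, multiple in lift.items():
    print(f"{level}: {multiple:.1f}x")
```

With the rounded table values the top tier comes out at about 4.2x, in line with the 4.1x headline figure, which was presumably computed from unrounded rates.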
What "Specific" Looked Like
The AI pulled personalization signals from a few places: Google Maps listings, LinkedIn company pages, public business filings, and the prospect's own website.
The best-performing references were concrete and recent. "I noticed your building at 400 Commerce Street recently added three new tenants" performed better than "I work with property managers in the Chicago area."
Specificity = credibility. Credibility = replies.
Finding 2: The AI Learned to Write Shorter Over Time
We didn't pre-set an email length target. We let the AI write at whatever length it thought was appropriate and tracked the results.
In the first 1,000 emails, the average length was 148 words. The AI was writing like most cold emailers — thorough, explanatory, trying to cover every objection upfront.
By email 5,000, the average had dropped to 89 words. By email 9,000, it was 67 words.
The AI kept seeing that shorter emails got more replies and adjusted. What started as a medium-length email tool became a short-email machine on its own.
This is the part that interests us most: the AI had no preset instruction to write shorter. It just noticed what worked and changed.
Finding 3: Subject Lines Changed More Than Anything Else
We started with 8 subject line patterns. Here's how they performed across the full run:
| Pattern | Open Rate | Example |
|---|---|---|
| Property-specific reference | 38% | "HVAC service for 200 Oak Street" |
| Direct service + city | 31% | "Commercial cleaning — downtown Phoenix" |
| First name + question | 28% | "Quick question, David" |
| Company-specific | 26% | "Vendors for Riverside Properties" |
| Benefit-first | 22% | "Fill your Q3 maintenance calendar" |
| Generic intro | 17% | "Wanted to reach out" |
| Emoji opener | 14% | "👋 Quick thought for your team" |
| Urgency-based | 11% | "Limited availability for July" |
Property-specific subjects won by a wide margin. But here's the more interesting finding: the AI started A/B testing subject lines mid-campaign, allocating more sends to winners and fewer to losers.
By the end of the 10,000 emails, 73% of sends were using the top two subject line patterns. The AI had effectively run its own mini-optimization loop.
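We have not described the exact allocation method here, so treat the following as one plausible way to build such a loop rather than what the AI actually ran. It is a minimal Thompson sampling sketch: each pattern keeps a Beta posterior over its open rate, and each send goes to the pattern with the highest sampled rate. The simulation uses the final open rates from the table as stand-in "true" rates, which is an assumption made purely for illustration.

```python
import random


def thompson_allocate(true_open_rates, total_sends, seed=0):
    """Simulate send allocation across subject line patterns with Thompson sampling."""
    rng = random.Random(seed)
    opens = {p: 0 for p in true_open_rates}
    sends = {p: 0 for p in true_open_rates}
    for _ in range(total_sends):
        # Sample a plausible open rate per pattern from Beta(opens + 1, non-opens + 1)
        draws = {
            p: rng.betavariate(opens[p] + 1, sends[p] - opens[p] + 1)
            for p in true_open_rates
        }
        chosen = max(draws, key=draws.get)
        sends[chosen] += 1
        # Simulated outcome: the email is opened with the pattern's assumed true rate
        if rng.random() < true_open_rates[chosen]:
            opens[chosen] += 1
    return sends


# Assumed "true" open rates, taken from the table above for illustration only
table_open_rates = {
    "Property-specific reference": 0.38,
    "Direct service + city": 0.31,
    "First name + question": 0.28,
    "Company-specific": 0.26,
    "Benefit-first": 0.22,
    "Generic intro": 0.17,
    "Emoji opener": 0.14,
    "Urgency-based": 0.11,
}

allocation = thompson_allocate(table_open_rates, total_sends=10_000)
for pattern, count in sorted(allocation.items(), key=lambda kv: -kv[1]):
    print(f"{pattern}: {count} sends")
```

A loop like this shifts volume toward winners gradually instead of waiting for a fixed-size test to finish, which matches the behavior described above: more sends to winners, fewer to losers.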
Finding 4: Tuesday and Wednesday Mornings Consistently Won
We let the AI pick send times based on prospect location and timezone.
Across all 10,000 emails, Tuesday and Wednesday 8–10 AM consistently outperformed other windows:
| Day | Average Reply Rate |
|---|---|
| Monday | 2.9% |
| Tuesday | 4.3% |
| Wednesday | 4.6% |
| Thursday | 3.7% |
| Friday | 2.1% |
Wednesday sends drew about 2.2x the reply rate of Friday sends (4.6% versus 2.1%). The AI learned this and shifted volume mid-campaign: by week 6, only 8% of sends were going out on Fridays, versus 24% at the start.
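Scheduling into that window means working in the prospect's local time. The sketch below shows one way to pick the next Tuesday or Wednesday 8 to 10 AM slot; the function and constant names are hypothetical, and a fixed UTC offset stands in for a real timezone to keep the example self-contained. In practice something like `zoneinfo.ZoneInfo("America/Chicago")` would be needed to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone, tzinfo

PREFERRED_WEEKDAYS = {1, 2}  # Tuesday and Wednesday (Monday is 0)
WINDOW_START_HOUR = 8
WINDOW_END_HOUR = 10


def next_send_time(now_utc: datetime, prospect_tz: tzinfo) -> datetime:
    """Return the next moment inside a Tue/Wed 8-10 AM window, in prospect local time."""
    local_now = now_utc.astimezone(prospect_tz)

    # Already inside a preferred window: send right away
    if (local_now.weekday() in PREFERRED_WEEKDAYS
            and WINDOW_START_HOUR <= local_now.hour < WINDOW_END_HOUR):
        return local_now

    # Otherwise aim for the start of the next preferred window
    candidate = local_now.replace(
        hour=WINDOW_START_HOUR, minute=0, second=0, microsecond=0
    )
    if candidate <= local_now:
        candidate += timedelta(days=1)
    while candidate.weekday() not in PREFERRED_WEEKDAYS:
        candidate += timedelta(days=1)
    return candidate


# Example: a prospect at UTC-6 (assumed fixed offset), checked on a Friday at noon local time
central = timezone(timedelta(hours=-6))
friday_noon_local = datetime(2024, 1, 5, 18, 0, tzinfo=timezone.utc)
print(next_send_time(friday_noon_local, central))  # the following Tuesday at 08:00 local
```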
Finding 5: The Second Follow-Up Was the Best Email
This finding showed up in our broader data too, but it was stark in the AI campaign.
Of all replies generated across the 10,000 emails:
- 24% came from the initial email
- 18% came from the first follow-up (day 3)
- 41% came from the second follow-up (day 7)
- 17% came from the third follow-up (day 14)
The second follow-up alone generated nearly as many replies as the initial email and the first follow-up combined (41% versus 42%).
The AI's second follow-ups also read differently. Not "just checking in." Not "bumping this to the top of your inbox." The AI typically used one of two approaches:
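The cadence and reply shares above can be written down as a small schedule. This sketch assumes the day numbers count from the initial send; the names are ours.

```python
from datetime import date, timedelta
from itertools import accumulate

# (step, days after the initial send, share of all replies in percent)
SEQUENCE = [
    ("initial email", 0, 24),
    ("first follow-up", 3, 18),
    ("second follow-up", 7, 41),
    ("third follow-up", 14, 17),
]


def schedule(first_send: date):
    """Dates for each step, assuming offsets are measured from the initial send."""
    return [(name, first_send + timedelta(days=offset)) for name, offset, _ in SEQUENCE]


# Running total of replies captured after each step
cumulative_share = list(accumulate(share for _, _, share in SEQUENCE))

for (name, _, _), total in zip(SEQUENCE, cumulative_share):
    print(f"after {name}: {total}% of replies")
```

Read this way, a sequence that stopped after the first follow-up would have captured 42% of the replies, and one that stopped after the second would have captured 83%.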
- A new angle — instead of repeating the original pitch, it introduced a different reason to talk (a seasonal timing point, a different service line, a different outcome for their specific property type)
- A simpler ask — instead of "would you like to schedule a call," the second follow-up often asked a yes/no question that was easier to respond to quickly
Both approaches outperformed "checking in" by 2–3x.
Finding 6: Industry Vertical Mattered a Lot
We expected some variation between trades. We didn't expect this much:
| Vertical | Overall Reply Rate | Positive Reply Rate |
|---|---|---|
| Commercial cleaning | 5.8% | 3.1% |
| Plumbing | 4.9% | 2.5% |
| HVAC | 4.6% | 2.3% |
| Roofing | 3.9% | 1.8% |
| Landscaping | 3.4% | 1.5% |
Commercial cleaning had the highest performance across the board. The leading theory: commercial cleaning decisions are made frequently (monthly or quarterly contract renewals), and the person who handles vendor relationships is typically reachable and responsive by email.
Roofing and landscaping, both more project-based and seasonal, had lower reply rates. Our read is that the emails were not worse; the prospect's urgency simply fluctuates with the calendar.
What We'd Do Differently
Three things stand out looking back.
1. We should have narrowed targeting faster. The early emails cast a wide net — all commercial property types in a city. By the time the AI had narrowed down to the most responsive segments (mid-size property management companies with 5–20 buildings), we'd already spent 2,000 sends on lower-quality targets.
2. The AI took too long to kill underperforming subject lines. The pattern "Wanted to reach out" had a 17% open rate by email 500. It stayed in rotation until email 2,200 before volume dropped to near-zero. A harder cutoff threshold would have redirected those sends sooner.
3. We underestimated the warmup curve. The first 1,500 emails had a deliverability rate of 83% — some domains were still warming up. Emails that don't land in the inbox don't get replied to. We should have built more warmup runway before pushing volume.
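The harder cutoff from point 2 could take many forms. One option, sketched below, retires a subject line pattern once the upper bound of its open-rate confidence interval falls below the lower bound of the best pattern's interval. The Wilson score interval, the `min_sends` guard, and the sample counts are all our assumptions for illustration, not the rule the campaign used.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95% confidence)."""
    if trials == 0:
        return (0.0, 1.0)
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (center - half, center + half)


def patterns_to_retire(stats, min_sends=100):
    """stats maps pattern -> (opens, sends). Returns patterns clearly worse than the best."""
    eligible = {
        pattern: wilson_interval(opens, sends)
        for pattern, (opens, sends) in stats.items()
        if sends >= min_sends  # too little data: keep testing
    }
    if not eligible:
        return []
    best_lower = max(lower for lower, _ in eligible.values())
    return [pattern for pattern, (_, upper) in eligible.items() if upper < best_lower]


# Hypothetical early-campaign counts, chosen to match the table's open rates
early_stats = {
    "Property-specific reference": (38, 100),
    "Direct service + city": (31, 100),
    "Generic intro": (17, 100),
    "Urgency-based": (2, 20),
}
print(patterns_to_retire(early_stats))
```

With counts like these, the generic intro is flagged after 100 sends, while patterns that are still statistically close to the leader, or that have too few sends to judge, stay in rotation.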
The Core Lesson
AI cold email isn't magic. It's still cold outreach — people still have to want what you're selling.
But the ability to personalize at scale, run A/B tests automatically, optimize send timing, and write follow-ups that don't sound like copy-paste templates — that's a real advantage over manual outreach.
The 10,000-email run ended with a 4.4% overall reply rate and a 2.2% positive reply rate. For context, the average cold email campaign across all senders runs around 1–2% reply rate. We beat that by 2–4x.
And the AI was still improving at email 10,000.
Start your own campaign and see what AI-personalized outreach looks like for your service business.