I Trained an AI to Hunt Leads on LinkedIn — Here's What It Found
The Day I Realized We Were Wasting 20 Hours a Week
Every afternoon, one person disappeared for 4 hours into LinkedIn.
"Find me 50 marketing directors at Series B SaaS companies in the US."
They'd copy names. Paste into a spreadsheet. Copy email addresses. Verify phone numbers.
By the time they were done:
- Half the titles were wrong (LinkedIn updates them constantly)
- 15 emails bounced (people had changed jobs, updates didn't sync)
- 3 were the same person on different accounts (we didn't catch that)
- We paid $200 to do work that took 4 hours and generated garbage data
I watched this happen every single week.
And then I asked: Why isn't this automated?
That question took me six weeks to answer. Now we process 500 leads/week automatically at 97% accuracy, with no manual sourcing and about 2 hours a week of monitoring.
Building the AI Hunter
I built a system that does what my demand gen team was doing manually — but at scale.
The pipeline:
LinkedIn Scraper
↓
ICP Scoring (0-100)
↓
Contact Enrichment
↓
CRM Deduplication
↓
Auto-routing to Sales
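To make the flow concrete, here is a minimal sketch of the stages chained together. It is not the production code: every stage body is a toy stand-in I made up for illustration (the scraper is replaced by a hard-coded list, and the scoring rule is a placeholder), and the names `run_pipeline`, `score`, `enrich`, `dedupe`, and `route` are hypothetical.

```python
def score(leads):
    # Stand-in for ICP scoring: a placeholder rule, not the real four-component score.
    return [{**lead, "score": 90 if lead["title"].startswith("VP") else 50} for lead in leads]

def enrich(leads):
    # Stand-in for contact enrichment: derive the company domain from the email.
    return [{**lead, "domain": lead["email"].split("@")[1].lower()} for lead in leads]

def dedupe(leads):
    # Stand-in for CRM deduplication: keep the first lead seen per lower-cased email.
    seen, unique = set(), []
    for lead in leads:
        key = lead["email"].lower()
        if key not in seen:
            seen.add(key)
            unique.append(lead)
    return unique

def route(leads):
    # Stand-in for auto-routing: 85+ goes to sales, everything else to nurture.
    return [{**lead, "queue": "sales" if lead["score"] >= 85 else "nurture"} for lead in leads]

def run_pipeline(leads, stages):
    # Each stage takes a list of lead dicts and returns a new list.
    for stage in stages:
        leads = stage(leads)
    return leads

# Hard-coded input in place of the scraper output.
raw_leads = [
    {"name": "Sarah Chen", "title": "VP Marketing", "email": "sarah.chen@acme.com"},
    {"name": "Sarah Chen", "title": "VP Marketing", "email": "Sarah.Chen@acme.com"},
    {"name": "Mike Rodriguez", "title": "HR Director", "email": "mike@example.com"},
]
routed = run_pipeline(raw_leads, [score, enrich, dedupe, route])
for lead in routed:
    print(lead["name"], lead["score"], lead["queue"])
```

Keeping every stage as "list in, list out" means you can swap one stage out or test it on its own without touching the others.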
Layer 1: The Signals
The AI learns what "good fit" looks like from your closed deals:
- Company signals: Series B SaaS, $5M-$50M revenue, hiring engineers
- Person signals: VP/Director level, works at a target account, viewed your content
- Behavioral signals: Recently changed jobs, promoted, active on LinkedIn
Layer 2: The Scoring Engine
Instead of "yes" or "no", the system gives you a score with reasoning:
Sarah Chen — Score: 87
├─ Firmographic: 24/25 (VP Marketing at Series B SaaS)
├─ Demographic: 23/25 (Right company size, right role)
├─ Behavioral: 22/25 (Active poster, engaged with your content)
└─ Intent: 18/25 (Viewed pricing page 3 days ago, no job change signal)
Action: Hot. Call today with case study.
Different lead:
Mike Rodriguez — Score: 41
├─ Firmographic: 15/25 (Right title, wrong company size)
├─ Demographic: 18/25 (Director level, but HR, not marketing)
├─ Behavioral: 8/25 (No activity in 6 months)
└─ Intent: 0/25 (Never visited your site)
Action: Nurture. Email case studies monthly.
The system doesn't just score — it explains why.
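A rough sketch of how a breakdown like the ones above could be rendered, assuming each component arrives as a `(points, reason)` pair out of 25. The function name `explain_score` and the input shape are my assumptions for illustration, not the system's actual interface.

```python
def explain_score(name, components):
    """Render a score tree from {component: (points, reason)}; each component is out of 25."""
    total = sum(points for points, _ in components.values())
    lines = [f"{name}: score {total}"]
    items = list(components.items())
    for i, (label, (points, reason)) in enumerate(items):
        # The last component gets the closing branch character.
        branch = "└─" if i == len(items) - 1 else "├─"
        lines.append(f"{branch} {label}: {points}/25 ({reason})")
    return "\n".join(lines)

report = explain_score("Mike Rodriguez", {
    "Firmographic": (15, "Right title, wrong company size"),
    "Demographic": (18, "Director level, but HR, not marketing"),
    "Behavioral": (8, "No activity in 6 months"),
    "Intent": (0, "Never visited your site"),
})
print(report)
```

The point is that the total is always derived from the visible components, so a rep can never see a number that the breakdown does not add up to.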
Layer 3: The Deduplication Problem
Here's what killed our old process: Sarah Chen, VP Marketing at Acme Corp, exists three times in our CRM.
- sarah.chen@acme.com (her work email)
- s.chen@acme.com (older account)
- sarah.c@gmail.com (personal email, somehow in the CRM)
Our sales team thought we had 3 prospects. We had 1.
The AI matches records on email, domain, and LinkedIn profile with 99.1% accuracy. Duplicates are consolidated and history is preserved.
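Here is a simplified sketch of the matching idea, assuming records are dicts with `name`, `email`, and `linkedin_url` fields. It only does exact matching on normalized keys and groups records with a small union-find, so it produces no confidence score. The field names, the `FREE_MAIL` list, and the name-plus-domain rule are all assumptions for illustration.

```python
from collections import defaultdict

FREE_MAIL = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}  # illustrative, not exhaustive

def normalize_linkedin(url):
    # Lower-case, drop the query string, trailing slash, scheme, and "www.".
    url = url.strip().lower().split("?")[0].rstrip("/")
    for prefix in ("https://", "http://", "www."):
        if url.startswith(prefix):
            url = url[len(prefix):]
    return url

def match_keys(record):
    """Keys that identify a person: email, name + company domain, LinkedIn profile."""
    keys = []
    email = (record.get("email") or "").strip().lower()
    if email:
        keys.append(("email", email))
        domain = email.split("@")[-1]
        name = (record.get("name") or "").strip().lower()
        if name and domain not in FREE_MAIL:
            keys.append(("name_domain", name, domain))
    if record.get("linkedin_url"):
        keys.append(("linkedin", normalize_linkedin(record["linkedin_url"])))
    return keys

def dedupe(records):
    """Group records that share any key, directly or through another record."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    first_seen = {}
    for i, record in enumerate(records):
        for key in match_keys(record):
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    groups = defaultdict(list)
    for i, record in enumerate(records):
        groups[find(i)].append(record)
    return list(groups.values())

crm = [
    {"name": "Sarah Chen", "email": "sarah.chen@acme.com",
     "linkedin_url": "https://www.linkedin.com/in/sarahchen/"},
    {"name": "Sarah Chen", "email": "s.chen@acme.com"},
    {"name": "Sarah C", "email": "sarah.c@gmail.com",
     "linkedin_url": "linkedin.com/in/sarahchen"},
    {"name": "Mike Rodriguez", "email": "mike@example.com"},
]
people = dedupe(crm)
print(len(people), "unique people from", len(crm), "records")
```

In this sketch the second record joins Sarah's group through name plus company domain, and the Gmail record joins through the LinkedIn profile. Merging the grouped records while preserving their history is a separate step that the sketch leaves out.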
Layer 4: Smart Routing
The system doesn't just find leads — it sends them somewhere intelligent:
- Score 85+: Send to sales immediately (SDR calls today)
- Score 60-84: Add to active nurture sequence (weekly emails)
- Score 40-59: Add to passive nurture (monthly case studies)
- Score <40: Revisit in 6 months (might change jobs)
No leads get ignored. No leads waste SDR time.
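The four tiers above map to a few lines of code. This is a sketch of what a routing function could look like. The thresholds come from the list, while the returned queue names are labels I made up.

```python
def get_routing_decision(score):
    """Map a 0-100 lead score to one of the four routing tiers."""
    if score >= 85:
        return "sales_now"         # SDR calls today
    if score >= 60:
        return "active_nurture"    # weekly emails
    if score >= 40:
        return "passive_nurture"   # monthly case studies
    return "revisit_in_6_months"   # the person might change jobs

for score in (87, 67, 41, 12):
    print(score, get_routing_decision(score))
```

Checking the thresholds from highest to lowest keeps each branch to a single comparison and guarantees every score lands in exactly one tier.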
The Scoring Engine (How We Rank Leads)
Here's the actual code that assigns scores:
# Helpers such as is_in_range, get_title_score, and get_routing_decision
# are assumed to be defined elsewhere in the codebase.
def score_lead(lead_data):
    """Score a lead from 0 to 100 across four components worth 25 points each."""
    # Firmographic: 25 points max
    firmographic = {
        "series_stage": 8 if lead_data.get("funding_stage") == "Series B" else 0,
        "revenue_range": 10 if is_in_range(lead_data["revenue"], 5_000_000, 50_000_000) else 0,
        "industry_match": 7 if is_in_industry(lead_data["industry"], ["SaaS", "Tech"]) else 0,
    }
    # Demographic: 25 points max
    demographic = {
        "title_match": get_title_score(lead_data["title"]),  # VP/Dir = 12, Manager = 8, etc.
        "department": 8 if is_marketing_or_revops(lead_data["dept"]) else 0,
        "company_size": 5 if is_company_size(lead_data["employee_count"], 50, 500) else 0,
    }
    # Behavioral: 25 points max
    behavioral = {
        "linkedin_activity": get_activity_score(lead_data["posts_90d"]),  # 0-10
        "recent_engagement": 10 if lead_data["engaged_with_content"] else 0,
        "changed_job_recently": 5 if days_since_role_change(lead_data) < 180 else 0,
    }
    # Intent: 25 points max
    intent = {
        "visited_pricing": 8 if lead_data["visited_pricing"] else 0,
        "visited_demo": 8 if lead_data["visited_demo"] else 0,
        "fits_icp_exactly": 9 if all(icp_checks(lead_data)) else 0,
    }
    # Sum each component once, then total the component subtotals.
    breakdown = {
        "firmographic": sum(firmographic.values()),
        "demographic": sum(demographic.values()),
        "behavioral": sum(behavioral.values()),
        "intent": sum(intent.values()),
    }
    total = sum(breakdown.values())
    return {
        "score": total,
        "breakdown": breakdown,
        "routing": get_routing_decision(total),  # 85+ = call today, 60-84 = weekly email, etc.
    }
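The scoring function leans on helpers that aren't shown. As a hedged sketch, two of them might look like this. The 12 and 8 point values come from the comment in the code, but the keyword matching, the treatment of other titles (0 points), and the handling of missing data are my assumptions.

```python
import re

def is_in_range(value, low, high):
    """True when value is within [low, high]; missing data (None) never matches."""
    return value is not None and low <= value <= high

def get_title_score(title):
    """VP or Director level scores 12, Manager scores 8, anything else scores 0 in this sketch."""
    words = set(re.findall(r"[a-z]+", (title or "").lower()))
    if words & {"vp", "svp", "director"} or {"vice", "president"} <= words:
        return 12
    if "manager" in words:
        return 8
    return 0

print(get_title_score("VP Marketing"))
print(get_title_score("Marketing Manager"))
print(is_in_range(12_000_000, 5_000_000, 50_000_000))
```

Matching on whole words instead of substrings avoids scoring a title just because it happens to contain the letters "vp" somewhere inside another word.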
Here's What Didn't Work (We Tried These First)
1. Manual LinkedIn searches with copy-paste → 20 hours/week, 65% accuracy
- Downloaded leads as CSV
- Half had wrong titles (LinkedIn updates hourly, our data was stale)
- Problem: SDRs calling "Marketing Manager" who's now "VP Marketing" — credibility destroyed
- Fix: Real-time LinkedIn data + daily refresh. Cost went from $200 manual labor to $8 in API calls
2. Single scoring model → Reps ignored the scores
- "Lead score: 67" — nobody understood why
- Problem: Reps defaulted to their gut ("this one looks good")
- Fix: Break it into 4 components. Reps now see exactly what's strong vs weak and where to focus
3. Not deduplicating → Called Sarah Chen three times in one month
- sarah.chen@acme.com, s.chen@acme.com, sarah.c@gmail.com (all in CRM)
- Problem: She blocked us. Lost deal. Damaged brand.
- Fix: Email + domain + LinkedIn profile matching with 99.1% accuracy
4. Same scoring for all ICP variations → Misallocation
- Scored a "perfect" lead at a Fortune 500 company (our ICP is mid-market)
- Problem: SDRs wasted time on companies that would never buy
- Fix: ICP filter first, then score. Saves SDRs from wasting time on wrong-fit leads
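The "ICP filter first, then score" fix from item 4 can be sketched as a hard gate that runs before any scoring. The 50 to 500 employee band is borrowed from the company-size check in the scoring code. Using it as the whole ICP filter, and the function names here, are assumptions for illustration.

```python
def passes_icp_filter(lead, min_employees=50, max_employees=500):
    """Hard gate: True only for companies inside the mid-market size band."""
    employees = lead.get("employee_count")
    return employees is not None and min_employees <= employees <= max_employees

def filter_then_score(leads, scorer):
    """Score only leads that pass the ICP gate; return the rest separately, unscored."""
    scored, rejected = [], []
    for lead in leads:
        if passes_icp_filter(lead):
            scored.append({**lead, "score": scorer(lead)})
        else:
            rejected.append(lead)
    return scored, rejected

leads = [
    {"company": "Mid-market SaaS Co", "employee_count": 200},
    {"company": "Fortune 500 Co", "employee_count": 80_000},
]
# A constant scorer stands in for the real scoring function.
scored, rejected = filter_then_score(leads, scorer=lambda lead: 90)
print([lead["company"] for lead in scored])
print([lead["company"] for lead in rejected])
```

Because rejected leads never get a score at all, a strong profile at a wrong-fit company cannot show up in an SDR's queue looking like a "perfect" lead.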
What Changed
Before: 50 leads/week, manually found, high error rate
After: 500+ leads/week, automatically scored, 97% deliverable
Before: SDRs calling wrong titles at wrong companies
After: SDRs calling pre-scored, pre-qualified people ready to engage
Before: No way to track "which LinkedIn search worked"
After: Full attribution from LinkedIn search → lead scored → opportunity created
The Numbers
| Metric | Manual Process | AI System |
|---|---|---|
| Leads/week | 50 | 500+ |
| Accuracy | 65% | 97% |
| Time invested | 20 hrs/week | 2 hrs/week (monitoring) |
| Cost per valid lead | $120 | $8 |
| Close rate (scored 80+) | 8% | 34% |
That 34% close rate surprised me. The AI isn't magically better at closing deals. It's just better at finding people who are actually ready to engage.
When you score leads on firmographic + demographic + behavioral + intent all together, you stop calling people who don't match. The close rate jumps not because we're smarter, but because we're targeting better.
The Unit Economics (What This Actually Costs)
Here's where this gets interesting:
| Item | Manual (50 leads/week) | AI System (500 leads/week) | Per-Lead Impact |
|---|---|---|---|
| Time per lead | 24 min | 5 sec | about -23.9 min freed up |
| Cost per lead (labor + tools) | $120 | $8 | -$112 |
| Weekly cost (50 leads) | $6,000 | $400 | -$5,600 |
| Monthly cost (200 leads) | $24,000 | $1,600 | -$22,400 |
| Annual cost (2,400 leads) | $288,000 | $19,200 | -$268,800 |
| Close rate (scored 80+) | 8% | 34% | +26 percentage points |
| Cost per closed deal | $3,000 | $235 | -$2,765 |
We went from paying $3,000 to source each closed deal to $235. The system doesn't just scale — it becomes cheaper at scale.
One person doing 50 leads/week was the limit. One system doing 500 leads/week costs 15x less per lead ($8 vs. $120). And the 34% close rate means we're not just finding more leads, we're finding better ones.
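If you want to sanity-check the per-lead rows yourself, the arithmetic is short. This snippet only uses figures from the table above ($120 and $8 per lead, 20 hours for 50 leads), and the variable names are mine.

```python
MANUAL_COST_PER_LEAD = 120   # dollars, from the table
AI_COST_PER_LEAD = 8         # dollars, from the table
MANUAL_HOURS_PER_WEEK = 20
MANUAL_LEADS_PER_WEEK = 50

def cost_for(leads, cost_per_lead):
    """Total sourcing cost in dollars for a given number of leads."""
    return leads * cost_per_lead

# Minutes of manual work per lead, and how many times cheaper the AI system is per lead.
minutes_per_lead = MANUAL_HOURS_PER_WEEK * 60 / MANUAL_LEADS_PER_WEEK
cost_ratio = MANUAL_COST_PER_LEAD / AI_COST_PER_LEAD

print("manual minutes per lead:", minutes_per_lead)
print("per-lead cost ratio:", cost_ratio)
for label, leads in (("weekly", 50), ("monthly", 200), ("annual", 2_400)):
    manual = cost_for(leads, MANUAL_COST_PER_LEAD)
    ai = cost_for(leads, AI_COST_PER_LEAD)
    print(label, manual, ai, manual - ai)
```

Note that the weekly, monthly, and annual rows compare both approaches at the same lead count, so the savings shown are per-lead savings, not the total bill at 500 leads/week.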
Building This
This system is live: agentic-demand-engine
Stack:
- Next.js 14 (dashboards)
- TypeScript strict mode (safety)
- PostgreSQL (lead history)
- Prisma (type-safe queries)
- GPT-4 (intent classification)
Demo works without credentials. Fork, run, see what 500 pre-scored leads look like for your target market.
What I want to know from you:
How are you handling deduplication today? Are you letting duplicates slip through, or do you have a system? Our 99.1% accuracy feels good but I'm curious if there's a better approach.
What's your bottleneck with lead sourcing? Is it finding leads, scoring them, or keeping the data clean? I hear different answers from different teams.
When you score leads, do you weight all signals equally or do you prioritize intent over firmographic? We found behavioral signals matter more than company size, but your ICP might be different.
If you've automated this and found patterns we missed, open an issue. If it breaks on your data, I want to know why.





