Automation Pipeline
Everything that happens from the moment a consultant submits the form to leads landing in HeyReach (LinkedIn) or Instantly (Email).
LinkedIn Path (HeyReach)
Steps 1-4 → Push to HeyReach
No email needed: leads are pushed by LinkedIn URL with personalized messages as custom variables.
Email Path (Instantly)
Steps 1-6 → Push to Instantly
Same as LinkedIn path + 2 extra steps: find work email from LinkedIn, verify it's deliverable via MillionVerifier.
Entry Point
Google Form
Consultant fills out their name, LinkedIn, ICP, areas of expertise, case studies, tool choice (HeyReach or Instantly), and uploads a CSV of their LinkedIn connections.
Google Apps Script
On form submit, a script fires automatically: it converts the CSV file to base64 and POSTs everything to /api/webhook on Vercel.
Webhook: /api/webhook
Vercel · Next.js API Route
Instant
Step 1
Validates the webhook secret so no one else can trigger it
Step 2
Parses the base64 CSV and extracts first name, last name, LinkedIn URL, company, and position per row
Step 3
Saves the submission + all leads to Supabase, then fires the Inngest background pipeline
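Step 2 of the webhook can be sketched as below. This is a minimal illustration, not the route's actual code: the `CsvLead` shape, the function names, and the header names ("First Name", "Last Name", "URL", "Company", "Position") are assumptions about the uploaded file, and quoted fields that span several lines are not handled.

```typescript
// Hypothetical lead shape; field and column names are assumptions for illustration.
interface CsvLead {
  firstName: string;
  lastName: string;
  linkedinUrl: string;
  company: string;
  position: string;
}

// Split one CSV line into fields, honoring double-quoted fields and "" escapes.
function splitCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') {
        current += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields.map((f) => f.trim());
}

function parseLeadsCsv(csvBase64: string): CsvLead[] {
  // base64 -> bytes -> UTF-8 text
  const bytes = Uint8Array.from(atob(csvBase64), (c) => c.charCodeAt(0));
  const text = new TextDecoder("utf-8").decode(bytes);
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== "");
  if (lines.length < 2) return [];

  const header = splitCsvLine(lines[0]).map((h) => h.toLowerCase());
  const cell = (row: string[], name: string): string => {
    const i = header.indexOf(name);
    return i >= 0 && i < row.length ? row[i] : "";
  };

  return lines
    .slice(1)
    .map((line) => splitCsvLine(line))
    .map((row) => ({
      firstName: cell(row, "first name"),
      lastName: cell(row, "last name"),
      linkedinUrl: cell(row, "url"),
      company: cell(row, "company"),
      position: cell(row, "position"),
    }))
    // Assumption: rows without a LinkedIn URL cannot be processed, so drop them.
    .filter((lead) => lead.linkedinUrl !== "");
}
```

Missing company or position values come back as empty strings here, which matches the later scraping step filling them in from the profile.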
Background Pipeline: Inngest
Why Inngest? Vercel serverless functions time out after 60 seconds. Inngest runs the pipeline as a fan-out of parallel background jobs with no time limit. Leads are split into batches of 20, up to 5 batches run simultaneously, and each step is checkpointed, so if anything fails it retries from where it stopped. Steps 1-4 run for both paths. Steps 5-6 only run for Instantly.
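The batching described above amounts to splitting the lead list into fixed-size chunks. A minimal sketch follows; `chunk` is a hypothetical helper name, and the concurrency cap of 5 is assumed to be enforced by Inngest itself rather than by this function.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With 45 leads and a batch size of 20 this yields three batches of 20, 20, and 5 leads.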
01
LinkedIn Profile Scraping
Apify · apimaestro/linkedin-profile-detail
Both paths
Each batch of 20 leads fires off 20 Apify runs simultaneously (fire-and-forget). Inngest then sleeps for 2 minutes while Apify scrapes the profiles. After the sleep, results are fetched and saved.
- Extracts: headline, about section, full experience array, current company
- Fire-and-forget start → 2 minute sleep → fetch results, so no Vercel function is held open
- If position or company was missing from the CSV, it gets filled from the scraped profile data
- 1 Apify run per lead: with 5 concurrent batches, up to 100 Apify runs can be active at once
IF: Scrape fails → lead is marked scrape_status: failed → auto-disqualified in Step 3 (no AI call wasted)
Each batch of 20 takes ~3-4 minutes (start + 2m sleep + process). With 5 batches in parallel: 3,000 leads = 150 batches ÷ 5 = 30 rounds of ~3-4 minutes each, or roughly 1.5-2 hours.
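The timing estimate can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes batches run in full rounds of 5 and that every batch takes 3-4 minutes; the function and field names are illustrative.

```typescript
interface ScrapeEstimate {
  batches: number;
  rounds: number;
  minMinutes: number;
  maxMinutes: number;
}

// Rough wall-clock estimate for the scraping step.
function estimateScrapeTime(
  leadCount: number,
  batchSize = 20,
  concurrency = 5,
  minutesPerBatch: [number, number] = [3, 4],
): ScrapeEstimate {
  const batches = Math.ceil(leadCount / batchSize);
  const rounds = Math.ceil(batches / concurrency);
  return {
    batches,
    rounds,
    minMinutes: rounds * minutesPerBatch[0],
    maxMinutes: rounds * minutesPerBatch[1],
  };
}
```

For 3,000 leads this gives 150 batches, 30 rounds, and 90 to 120 minutes.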
02
Experience Summarization
OpenAI · gpt-5-mini
Both paths
The raw experience array from Apify is a deeply nested JSON object. We compress it into a clean ~150-word plain text summary before passing it to the qualification step.
- Input: raw scraped_experiences JSON array from Apify
- Output: one line per role, formatted as "Title at Company (dates)"
- Reduces token cost by ~80% for qualification, makes the model's job easier
- 3x retry with backoff on failure; if all 3 fail, returns empty string
IF: Only runs for leads where scrape_status = done AND scraped_experiences exists. If scrape failed → this step is skipped for that lead.
03
Lead Qualification
OpenAI · gpt-5-mini · per lead
Both paths
gpt-5-mini gets the full LinkedIn profile data (headline, about, work history) alongside the consultant's ICP and areas of expertise. It decides if this lead is worth reaching out to. No default rules: the model judges purely on what it sees vs the ICP. If the data doesn't support a match, it disqualifies.
Title Check
Decision-maker title relevant to the consultant's expertise? VP+, Director+, C-suite, SVP, Partner, MD, Head of [function]. ICs, analysts, associates → disqualified.
Company Check
Does the company match the consultant's ICP? The model reads the ICP field and applies it. If the profile suggests a small consultancy but ICP says Fortune 500 → disqualified.
Active Check
Currently employed in this role? Checks headline, about, and experience for current employment signals. If they left the role → disqualified.
All 3 pass → Qualified
Lead moves to name cleaning → message generation → push.
Any 1 fails → Disqualified
Lead is skipped. No cleaning, no messages, never pushed.
IF: Only runs for leads with summarized_experience. If no summary (scrape failed) → auto-disqualified with reason 'No experience data found'.
FAIL: If OpenAI API errors after 3 retries → lead goes to the Failed tab (not disqualified). It is never pushed.
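The three outcomes of this step (qualified, disqualified, failed) can be summarized as a small decision function. The sketch below is illustrative only: the type and field names are hypothetical, and the three booleans stand in for whatever the model actually returns.

```typescript
type QualificationOutcome = "qualified" | "disqualified" | "failed";

interface CheckResults {
  titleOk: boolean;
  companyOk: boolean;
  activeOk: boolean;
}

interface QualificationInput {
  summarizedExperience: string | null;
  // null means the model call errored after all retries (assumed representation)
  checks: CheckResults | null;
}

function decideOutcome(input: QualificationInput): { outcome: QualificationOutcome; reason?: string } {
  // No summary (for example, the scrape failed): disqualify without calling the model.
  if (!input.summarizedExperience) {
    return { outcome: "disqualified", reason: "No experience data found" };
  }
  // API error after retries: the lead lands in the Failed tab, not in Disqualified.
  if (input.checks === null) {
    return { outcome: "failed", reason: "Qualification call failed after retries" };
  }
  const failedChecks: string[] = [];
  if (!input.checks.titleOk) failedChecks.push("title");
  if (!input.checks.companyOk) failedChecks.push("company");
  if (!input.checks.activeOk) failedChecks.push("active");
  return failedChecks.length === 0
    ? { outcome: "qualified" }
    : { outcome: "disqualified", reason: "Failed checks: " + failedChecks.join(", ") };
}
```

Keeping "failed" separate from "disqualified" lets a retry pick up API errors later without re-judging leads the model already rejected.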
04
Clean Names + Generate Messages
OpenAI gpt-5-mini (clean) · Claude Sonnet 4.6 (messages)
Both paths
A. Name + Company Cleaning: OpenAI gpt-5-mini
First Name → clean_first_name
Removes middle initials: "Michael J." → "Michael"
Removes suffixes: "Robert Jr." → "Robert"
Removes handles, special characters, extra formatting
Company → clean_company_name
Strips legal suffixes: "Acme Corp LLC" → "Acme"
Removes Inc, Ltd, Holdings, Group, Solutions, etc.
Normalizes casing and removes social handles
IF: Only runs for qualified leads. If disqualified → skipped entirely, no API call wasted.
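The pipeline does this cleaning with gpt-5-mini, not with fixed rules. The rule-based sketch below is only an approximation meant to make the expected outputs concrete; the suffix lists and function names are assumptions, and casing normalization is deliberately left out.

```typescript
// Illustrative, incomplete suffix lists (assumptions, not the pipeline's rules).
const NAME_SUFFIXES = new Set(["jr", "sr", "ii", "iii", "iv"]);
const LEGAL_SUFFIXES = new Set([
  "llc", "inc", "ltd", "corp", "corporation", "co",
  "holdings", "group", "solutions", "gmbh", "plc", "llp",
]);

function cleanFirstName(raw: string): string {
  const tokens = raw
    .replace(/@\w+/g, " ") // drop social handles
    .replace(/[^A-Za-z\u00C0-\u024F\s.'-]/g, " ") // drop digits, emoji, other symbols
    .trim()
    .split(/\s+/)
    .filter((t) => t !== "");
  // Keep the first token that is neither a single-letter initial nor a name suffix.
  const keep = tokens.find((t) => {
    const bare = t.replace(/\./g, "");
    return bare.length > 1 && !NAME_SUFFIXES.has(bare.toLowerCase());
  });
  return keep ? keep.replace(/\.+$/, "") : "";
}

function cleanCompanyName(raw: string): string {
  const tokens = raw
    .replace(/@\w+/g, " ") // drop social handles
    .trim()
    .split(/\s+/)
    .filter((t) => t !== "");
  // Strip trailing legal suffixes one at a time, but never the last remaining word.
  while (tokens.length > 1) {
    const last = tokens[tokens.length - 1].toLowerCase().replace(/[^a-z]/g, "");
    if (!LEGAL_SUFFIXES.has(last)) break;
    tokens.pop();
  }
  return tokens.join(" ").replace(/[\s,.&]+$/, "");
}
```

A model handles edge cases these rules miss (two-word first names, brands whose name ends in "Group"), which is presumably why the pipeline uses one.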
B. Message Template Generation: Claude Sonnet 4.6 · once per submission
Claude writes 3 message templates using the consultant's name, expertise, ICP, and case studies. This runs once per consultant, not once per lead. The templates contain {clean_first_name} and {clean_company_name} as placeholders.
Touch 1 · Warm reconnect
Opens with "Hey {clean_first_name}, it's been a while...". Update about joining VAI Consulting. ONE specific case study result with a real number. Closes with "Thought of you. Would love to catch up." Never a pitch, never asks for a call.
Touch 2 · 50-70 words · Follow-up
References a second result or rescue/turnaround story from the case studies. Connects it to what the lead's company might be dealing with. Soft close: "if any of that is on your radar at {clean_company_name}, happy to chat."
Touch 3 · 25-40 words · Close-file
"{clean_first_name}, last note from me -" format. Keeps the door open with zero pressure. Ends with "Hope things are going well at {clean_company_name}!"
Substitution: per lead
For each qualified lead, the placeholders are replaced with their actual cleaned values. The full final messages (with real names) are saved to the lead record and pushed as 1st_Message, 2nd_Message, 3rd_Message custom variables to HeyReach or Instantly.
IF: Substitution only runs for leads with clean_first_name. If company name is missing, the template gracefully strips 'at {clean_company_name}' from the messages.
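The per-lead substitution, including the missing-company case, might look like the sketch below. `fillTemplate` is a hypothetical name, and returning `null` when there is no clean first name is an assumed way to signal "skip this lead".

```typescript
function fillTemplate(
  template: string,
  cleanFirstName: string | null,
  cleanCompanyName: string | null,
): string | null {
  // Substitution only runs for leads that have a clean first name.
  if (!cleanFirstName) return null;

  let text = template;
  if (cleanCompanyName) {
    text = text.split("{clean_company_name}").join(cleanCompanyName);
  } else {
    // No company: drop "at {clean_company_name}", then any leftover placeholder.
    text = text
      .replace(/\s+at \{clean_company_name\}/g, "")
      .split("{clean_company_name}")
      .join("");
  }
  return text
    .split("{clean_first_name}")
    .join(cleanFirstName)
    .replace(/ {2,}/g, " ") // collapse double spaces left by removals
    .trim();
}
```

Using `split`/`join` instead of `String.prototype.replace` with a string replacement avoids surprises when a company name contains `$`.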
05
Find Work Emails from LinkedIn
Apify · x_guru/linkedin-email-scraper-no-cookies
Instantly only
All qualified LinkedIn URLs from the batch are sent in one API call to the x_guru email scraper. It returns work emails and personal emails for each profile. All found emails are saved; verification in Step 6 picks the best valid one.
- Batch input: sends all 20 URLs (or however many qualified) in a single Apify run
- Returns per lead: work_email + personal_emails array (e.g. corporate, gmail, old company emails)
- ALL found emails saved to the all_emails column, not just one
- Typical hit rate: 50-70% of profiles will have at least one email found
- If zero emails found → lead stays qualified but cannot be pushed (missing email)
IF: This entire step is SKIPPED if tool_choice = heyreach. Only runs for Instantly submissions, only for qualified leads.
Example output per lead
work_email: "nathan.bell.81@gmail.com"
personal_emails: ["nathan.bell.81@gmail.com", "pecosbell@aol.com", "nathan@digitaltrends.com"]
06
Verify Emails
MillionVerifier · API v3
Instantly only
Every email found in Step 5 is sent to MillionVerifier, not just the "best guess." This way, even if the primary email is invalid, we can still find a valid alternative. After verification, the system picks the best valid email using a priority system.
Email selection priority (from valid emails only)
1st: Corporate email matching the lead's current company domain, e.g. sarah@pepsi.com if company is PepsiCo
2nd: Any corporate email (non-personal domain), e.g. sarah@oldcorp.com (corporate but not current employer)
3rd: Any valid email including personal, e.g. sarah.j@gmail.com (last resort, but verified deliverable)
- MillionVerifier returns: "ok" (valid), "catch_all", "unknown", "invalid", "disposable"
- Only "ok" results are accepted; everything else is treated as invalid
- 3x retry with backoff per email; if all retries fail, email marked as "error"
- Rate limit: 160 requests/second, so even 5,000 emails are verified in ~30 seconds
IF: Skipped if tool_choice = heyreach. Only runs for leads where all_emails is not null (emails were found in Step 5). If zero emails are valid → email_verified = false, lead cannot be pushed.
Why verify ALL emails? An old corporate email might be invalid. A personal Gmail might be the only one that works. By checking everything, we maximize the chance of finding a working email for each lead.
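The selection priority can be expressed as a ranking over the emails that verified as "ok". In this sketch the personal-domain list is a short illustrative sample, and the lead's current company domain is passed in as a parameter because the document does not say how it is derived; all names are hypothetical.

```typescript
type VerifierResult = "ok" | "catch_all" | "unknown" | "invalid" | "disposable" | "error";

interface VerifiedEmail {
  email: string;
  result: VerifierResult;
}

// Assumption: a small sample of personal domains; a real list would be longer.
const PERSONAL_DOMAINS = new Set([
  "gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "aol.com", "icloud.com",
]);

function domainOf(email: string): string {
  return email.slice(email.lastIndexOf("@") + 1).toLowerCase();
}

// Lower rank is better: 0 = current company, 1 = other corporate, 2 = personal.
function rankEmail(email: string, companyDomain: string | null): number {
  const domain = domainOf(email);
  if (companyDomain && domain === companyDomain.toLowerCase()) return 0;
  if (!PERSONAL_DOMAINS.has(domain)) return 1;
  return 2;
}

function pickBestEmail(emails: VerifiedEmail[], companyDomain: string | null): string | null {
  // Only "ok" counts as valid; catch_all, unknown, etc. are treated as invalid.
  const valid = emails.filter((e) => e.result === "ok").map((e) => e.email);
  if (valid.length === 0) return null;
  return valid.reduce((best, candidate) =>
    rankEmail(candidate, companyDomain) < rankEmail(best, companyDomain) ? candidate : best,
  );
}
```

A `null` return corresponds to email_verified = false: the lead stays qualified but cannot be pushed to Instantly.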
Push to HeyReach (LinkedIn)
Manual trigger · Dashboard → Push page
LinkedIn path
Manually triggered from the dashboard. You need to duplicate the master campaign in HeyReach, assign a sender account, and set send times first.
Lead gets pushed if ALL true
- qualified = true
- clean_first_name exists
- All 3 messages generated
- push_status = pending (not already pushed)
Lead is skipped if ANY apply
- Disqualified leads
- Missing clean first name
- Missing any of the 3 messages
- Already pushed
HeyReach payload per lead
profileUrl, firstName, lastName, companyName
+ customUserFields:
1st_Message, 2nd_Message, 3rd_Message
Rate limit: 200ms delay between batches of 20 leads · API: POST /list/AddLeadsToListV2
Push to Instantly (Email)
Manual trigger · Dashboard → Push page
Email path
Manually triggered from the dashboard. Select an Instantly campaign, then push. Only leads with a verified email address are included.
Lead gets pushed if ALL true
- qualified = true
- clean_first_name exists
- All 3 messages generated
- push_status = pending
- email_verified = true (extra, Email path only)
- work_email exists (extra, Email path only)
Lead is skipped if ANY apply
- Disqualified leads
- Missing clean first name
- Missing any of the 3 messages
- Already pushed
- No email found
- Email found but failed verification
Instantly payload per lead
email, first_name, last_name, company_name
+ custom_variables:
1st_Message, 2nd_Message, 3rd_Message
linkedin_url, clean_first_name, clean_company_name
Rate limit: 500 leads per batch · 6,000 req/min · API: POST /api/v2/lead/add
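The push conditions for both paths reduce to one eligibility check. The sketch below returns the reasons a lead would be skipped (an empty list means it can be pushed); the `messages` field and the reason strings are hypothetical, while the other field names follow the lists above.

```typescript
type Tool = "heyreach" | "instantly";

interface PushCandidate {
  qualified: boolean;
  clean_first_name: string | null;
  messages: Array<string | null>; // hypothetical field holding the 3 generated messages
  push_status: string; // "pending" until the lead has been pushed
  email_verified: boolean;
  work_email: string | null;
}

function skipReasons(lead: PushCandidate, tool: Tool): string[] {
  const reasons: string[] = [];
  if (!lead.qualified) reasons.push("disqualified");
  if (!lead.clean_first_name) reasons.push("missing clean first name");
  const hasAllMessages = lead.messages.length === 3 && lead.messages.every((m) => !!m);
  if (!hasAllMessages) reasons.push("missing message");
  if (lead.push_status !== "pending") reasons.push("already pushed");
  // The two email conditions apply to the Instantly path only.
  if (tool === "instantly") {
    if (!lead.work_email) reasons.push("no email found");
    else if (!lead.email_verified) reasons.push("email failed verification");
  }
  return reasons;
}
```

Returning reasons rather than a boolean makes it easy to show on the dashboard why a given lead was left out of a push.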
Dependency Chain: What Blocks What
Scrape fails → no summary → auto-disqualified → no clean names → no messages → can't push
If a lead fails at any step, it drops out of the pipeline from that step onward, but it does not block other leads. Each lead is processed independently within its batch.
Full pipeline at a glance
6 · Emails · Apify x_guru · email only
7 · Verify · MillionVerifier · email only
8 · Push · HeyReach / Instantly
Costs & Rate Limits
| Service | API / Actor | Rate Limit | Cost | Path |
|---|---|---|---|---|
| Apify (scrape) | apimaestro/linkedin-profile-detail | Free: 25 concurrent, Scale: 128 | ~$0.005/lead | Both |
| Apify (email) | x_guru/linkedin-email-scraper-no-cookies | Same account limit | ~$0.005/lead | Instantly |
| OpenAI | gpt-5-mini | Tier 4: ~10K RPM | ~$0.01/lead | Both |
| Claude | claude-sonnet-4-6 | Standard | ~$0.02/submission | Both |
| MillionVerifier | v3 API | 160 req/sec | ~$0.004/email | Instantly |
| HeyReach | AddLeadsToListV2 | 10 req/2sec | Included | LinkedIn |
| Instantly | /api/v2/lead/add | 6,000 req/min | Included | Email |
| Inngest | Background jobs | 50K runs/month (free) | Free | Both |
| Supabase | PostgreSQL | 500MB (free) | Free | Both |
Cost estimates are approximate. 5,000 leads (LinkedIn path) ≈ $75. 5,000 leads (Email path) ≈ $100. Claude cost is per consultant, not per lead.
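The totals can be rebuilt from the per-unit costs in the table. The sketch below is a rough calculator under stated assumptions: on the Email path, the number of email lookups (qualified leads) and the number of addresses verified are inputs, since both depend on the qualification rate and the hit rate. Names are illustrative.

```typescript
// Per-unit costs in USD, taken from the table above.
const UNIT_COSTS = {
  apifyScrapePerLead: 0.005,
  apifyEmailPerLead: 0.005,
  openAiPerLead: 0.01,
  verifyPerEmail: 0.004,
  claudePerSubmission: 0.02,
};

interface CostInput {
  leads: number;
  path: "linkedin" | "email";
  emailLookups?: number; // qualified leads sent to the email scraper (Email path)
  emailsVerified?: number; // addresses sent to MillionVerifier (Email path)
}

function estimateCostUsd(input: CostInput): number {
  let total =
    input.leads * (UNIT_COSTS.apifyScrapePerLead + UNIT_COSTS.openAiPerLead) +
    UNIT_COSTS.claudePerSubmission;
  if (input.path === "email") {
    total += (input.emailLookups ?? 0) * UNIT_COSTS.apifyEmailPerLead;
    total += (input.emailsVerified ?? 0) * UNIT_COSTS.verifyPerEmail;
  }
  return Math.round(total * 100) / 100; // round to cents
}
```

For 5,000 leads on the LinkedIn path this returns 75.02. On the Email path, assuming for illustration 2,000 qualified lookups and 3,000 verified addresses, it returns 97.02, in line with the ≈ $100 figure.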
Things to Watch
Apify concurrent run limit
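With 5 batches × 20 leads = up to 100 concurrent Apify runs at peak. If you're on the free Apify plan (25 concurrent limit), you need to reduce batch concurrency from 5 in the code. Note that 2 batches can still peak at 40 runs, so a concurrency of 1 is the only setting guaranteed to stay under 25. Scale plan (128 limit) handles it fine.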
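A worst-case check for this limit is simple division. The sketch assumes every run in a batch is active at the same moment, which is an upper bound; in practice some runs finish before others start. The function name is hypothetical.

```typescript
// Largest number of batches that can run at once without exceeding
// the account's concurrent-run limit, assuming all runs in a batch overlap.
function maxSafeBatchConcurrency(concurrentRunLimit: number, batchSize = 20): number {
  return Math.floor(concurrentRunLimit / batchSize);
}

// Peak number of simultaneous runs for a given batch concurrency.
function peakRuns(batchConcurrency: number, batchSize = 20): number {
  return batchConcurrency * batchSize;
}
```

With a limit of 25 the safe concurrency is 1, and with a limit of 128 it is 6, so the default of 5 (peak 100 runs) fits the Scale plan.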
OpenAI billing
Each lead costs ~3-4 API calls (summarize + qualify + clean name + clean company). 5,000 leads ≈ 20,000 API calls ≈ $50-100 depending on response lengths. Monitor usage at platform.openai.com.
Supabase storage
Free tier = 500MB. The csv_base64 column stores the full original CSV file. Multiple large batches will eat storage. Monitor in Supabase dashboard → Database → Usage.
Inngest monthly runs
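Free tier = 50,000 runs/month. Each consultant with 5K leads uses ~252 runs (1 coordinator + 250 batches + 1 finalizer). Counting function runs alone, that allows about 198 consultants with 5K leads each per month (50,000 ÷ 252). A more conservative ceiling of ~49 consultants applies if usage is metered per step rather than per run (roughly 4 steps per batch, an assumption to verify on the Inngest usage page).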
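The run budget can be checked with the arithmetic below. It counts function runs only (1 coordinator + 1 per batch + 1 finalizer); if the plan meters individual steps instead, the real capacity is lower. Function names are illustrative.

```typescript
// Runs used by one submission: coordinator + one run per batch + finalizer.
function runsPerSubmission(leads: number, batchSize = 20): number {
  return 1 + Math.ceil(leads / batchSize) + 1;
}

// How many equally sized submissions fit into a monthly run allowance.
function submissionsPerMonth(monthlyRunLimit: number, leads: number, batchSize = 20): number {
  return Math.floor(monthlyRunLimit / runsPerSubmission(leads, batchSize));
}
```

For 5,000 leads, `runsPerSubmission` gives 252, and a 50,000-run allowance fits 198 such submissions when only runs are counted.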
MillionVerifier credits
Credits never expire. Current balance shown in the MillionVerifier dashboard. Each email verification uses 1 credit. With the "verify all" approach, a lead with 3 emails uses 3 credits.
Questions? Open the dashboard to check a live run. Each submission shows its status: scraping → qualifying → generating → complete.