Modern mailbox providers classify mail vs spam using sender reputation (commonly estimated at 60-70% of the weight, though providers publish no official figures), authentication results (SPF, DKIM, DMARC), per-user engagement behavior, and content signals, in that order. Pure content scoring (the SpamAssassin model) is mostly historical. For senders, this means reputation and authentication matter far more than subject-line tweaks.
Mail vs Spam: How ISPs Actually Decide
The mail vs spam decision used to be mostly about content. SpamAssassin would score your message on a hundred or so heuristics — capitalized words, exclamation points, money phrases — and tag anything above a threshold as spam. Filtering today is barely about content at all. Reputation, authentication, and per-user engagement do the heavy lifting.
This matters for senders because most "stop landing in spam" advice still circulates as if it's 2010. Removing the word "free" from your subject line will not fix your placement problem. Fixing your DKIM alignment or your complaint rate probably will.
The modern mail and spam classifier
A simplified version of what Gmail, Microsoft, and Yahoo are actually doing:
| Signal layer | Weight | What goes in |
|---|---|---|
| Sender reputation | High | Domain and IP reputation scored continuously |
| Authentication | High | SPF pass, DKIM pass, DMARC alignment |
| Per-user engagement | High | Opens, replies, deletes, marks-as-spam by this recipient |
| Aggregate complaints | Medium | Mark-as-spam rate from all users for this sender |
| Content patterns | Medium-low | Link reputation, attachment types, message structure |
| Origin signals | Low | Reverse DNS, hosting reputation, TLS quality |
Note what's not in the top tier: subject line wording, emoji, image-to-text ratio, the word "free." These factor in at the margin but are not the decision boundary.
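To make the layering concrete, here is a toy sketch of how a blended score behaves when content is only one low-weight input. The numeric weights are hypothetical stand-ins for the High / Medium / Low labels in the table; providers do not publish weights, and real classifiers are machine-learned rather than a simple weighted average.

```python
# Hypothetical numeric stand-ins for the qualitative weights in the table.
# Providers do not publish real weights; these numbers are illustrative only.
LAYER_WEIGHTS = {
    "sender_reputation": 3.0,     # High
    "authentication": 3.0,        # High
    "per_user_engagement": 3.0,   # High
    "aggregate_complaints": 2.0,  # Medium
    "content_patterns": 1.5,      # Medium-low
    "origin_signals": 1.0,        # Low
}


def blended_score(signals: dict[str, float]) -> float:
    """Weighted average of per-layer scores, each in [0, 1] (1 = looks legitimate)."""
    total_weight = sum(LAYER_WEIGHTS.values())
    weighted = sum(LAYER_WEIGHTS[name] * signals[name] for name in LAYER_WEIGHTS)
    return weighted / total_weight


# Flawless copy from a sender with poor reputation and no authentication.
perfect_content_bad_sender = blended_score({
    "sender_reputation": 0.2,
    "authentication": 0.0,
    "per_user_engagement": 0.2,
    "aggregate_complaints": 0.3,
    "content_patterns": 1.0,
    "origin_signals": 0.5,
})

# Clumsy, "spammy-looking" copy from a trusted, authenticated sender.
clumsy_content_good_sender = blended_score({
    "sender_reputation": 0.9,
    "authentication": 1.0,
    "per_user_engagement": 0.9,
    "aggregate_complaints": 0.9,
    "content_patterns": 0.3,
    "origin_signals": 0.8,
})

print(perfect_content_bad_sender < clumsy_content_good_sender)
```

Under these assumed weights, the trusted sender with clumsy copy still outscores the untrusted sender with perfect copy, which is the point of the table.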
How sender reputation works
Every major mailbox provider maintains a reputation score for your sending domain (and for your IPs, though domain weight has grown over time). Reputation is essentially a moving average of how recipients have responded to your mail.
Signals that raise reputation:
- Recipients opening, clicking, replying, marking as not-spam
- Low bounce rates
- Consistent volume (no spikes)
- Authentication passing
- Long history at the provider
Signals that drop reputation:
- Mark-as-spam clicks
- Sending to spam traps (recycled or pristine addresses that should never get mail)
- High bounce rates
- Authentication failures
- Sudden volume increases
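The "moving average" idea can be sketched as an exponential moving average over per-send outcomes. This is an illustration only: the function name, the 0-to-1 outcome scale, and the smoothing factor `alpha` are assumptions, not anything a provider documents.

```python
def update_reputation(current: float, send_outcome: float, alpha: float = 0.1) -> float:
    """Exponential moving average of send outcomes.

    current      -- reputation so far, in [0, 1]
    send_outcome -- how recipients responded to the latest send, in [0, 1]
                    (1 = opens and replies, 0 = complaints and trap hits)
    alpha        -- assumed smoothing factor; smaller means a longer memory
    """
    return (1 - alpha) * current + alpha * send_outcome


# A new sender starting from a neutral score and sending 20 well-received campaigns.
reputation = 0.5
for outcome in [0.9] * 20:
    reputation = update_reputation(reputation, outcome)

print(round(reputation, 3))
```

A plain average like this reacts to good and bad sends at the same speed. Real systems likely react faster to strongly negative events such as spam-trap hits, which this sketch does not model.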
Google Postmaster Tools shows your reputation in four tiers (Bad, Low, Medium, High). The Google Postmaster Tools guide walks through how to read the dashboard.
Practitioner note: Reputation is slow to build and slow to repair, but it can collapse quickly. The single worst pattern I see is a brand running a "list reactivation campaign" to dormant addresses: one bad send can drop you from High to Low overnight, and it takes 4 to 6 weeks of disciplined sending to climb back.
Where authentication fits
SPF, DKIM, and DMARC don't directly determine whether mail is spam. They determine whether the receiver can trust the sender identity. Without that trust, every other reputation signal is unreliable — the receiver can't be sure who actually sent the message.
The practical rule: unauthenticated mail (or mail that fails DMARC alignment) is treated with suspicion regardless of content. Mail that authenticates cleanly gets the benefit of reputation history.
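This is why Gmail and Yahoo's bulk sender requirements, enforced since February 2024, made SPF, DKIM, and DMARC mandatory for bulk senders (Gmail draws the line at roughly 5,000 messages per day to Gmail accounts). Without authentication, providers can't reliably score reputation.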
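The alignment rule is the part most senders get wrong, so here is a minimal sketch of it. Per RFC 7489, DMARC passes when either SPF or DKIM both passes and is aligned with the domain in the visible From: header. The class and function names are hypothetical, and the `org_domain` shortcut (last two labels) is an assumption for brevity; a real implementation consults the Public Suffix List.

```python
from dataclasses import dataclass


def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption for illustration only. A real implementation uses the
    Public Suffix List, because this shortcut is wrong for suffixes like co.uk.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def aligned(auth_domain: str, from_domain: str, strict: bool = False) -> bool:
    """Strict alignment needs an exact match; relaxed compares organizational domains."""
    if strict:
        return auth_domain.lower() == from_domain.lower()
    return org_domain(auth_domain) == org_domain(from_domain)


@dataclass
class AuthResult:
    from_domain: str   # domain in the visible From: header
    spf_pass: bool
    spf_domain: str    # envelope sender (MAIL FROM) domain that SPF checked
    dkim_pass: bool
    dkim_domain: str   # d= value of the DKIM signature


def dmarc_pass(r: AuthResult) -> bool:
    """DMARC passes if SPF or DKIM both passes and aligns with the From: domain."""
    spf_ok = r.spf_pass and aligned(r.spf_domain, r.from_domain)
    dkim_ok = r.dkim_pass and aligned(r.dkim_domain, r.from_domain)
    return spf_ok or dkim_ok


# SPF and DKIM both pass, but only for the sending platform's own domain.
via_esp_defaults = AuthResult(
    from_domain="example.com",
    spf_pass=True, spf_domain="bounce.esp-example.net",
    dkim_pass=True, dkim_domain="esp-example.net",
)

# Same send after DKIM signing is set up on a subdomain of the brand's domain.
with_custom_dkim = AuthResult(
    from_domain="example.com",
    spf_pass=True, spf_domain="bounce.esp-example.net",
    dkim_pass=True, dkim_domain="mail.example.com",
)

print(dmarc_pass(via_esp_defaults), dmarc_pass(with_custom_dkim))
```

The first case shows why "SPF pass, DKIM pass" in your headers is not enough: both checks succeed, yet neither vouches for the domain the recipient actually sees.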
Per-user behavioral filtering
The biggest shift in filtering over the last five years is per-user models. Even with strong domain reputation, individual users see different placement based on their past behavior.
A simplified version of the per-user signal:
- User A: opens every newsletter, occasionally clicks. Future mail → Inbox.
- User B: never opens, deletes within seconds. Future mail → Spam after 5-10 messages.
- User C: marked sender as spam once. All future mail → Spam permanently (until manually whitelisted).
This is why "engagement-based segmentation" works. Sending only to recipients with recent opens or clicks keeps you in the high-engagement bucket per user, which lifts aggregate placement.
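A minimal sketch of engagement-based segmentation, assuming you can export a last-open-or-click timestamp per address. The function name and data shape are hypothetical, and the 90-day default is borrowed from the priorities list later in this article rather than from any provider rule.

```python
from datetime import datetime, timedelta


def active_recipients(
    last_engaged: dict[str, datetime],
    now: datetime,
    window_days: int = 90,
) -> list[str]:
    """Return addresses whose last open or click falls inside the window."""
    cutoff = now - timedelta(days=window_days)
    return sorted(addr for addr, ts in last_engaged.items() if ts >= cutoff)


# Made-up sample data for illustration.
last_engaged = {
    "a@example.com": datetime(2026, 4, 20),   # engaged recently
    "b@example.com": datetime(2025, 11, 1),   # dormant for months
    "c@example.com": datetime(2026, 2, 15),   # inside the 90-day window
}

segment = active_recipients(last_engaged, now=datetime(2026, 5, 1))
print(segment)
```

Addresses outside the window are not deleted here, only left out of the send. What to do with them (a re-permission flow, a sunset policy) is a separate decision.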
Practitioner note: If you're stuck in spam for one provider and inbox elsewhere, the cause is almost always engagement collapse at that provider. Sending less, to fewer addresses, to people who actually want the mail, fixes this faster than any content change.
Where spam comes from
To understand mail vs spam classification, it helps to know what filters are protecting against:
- Botnets and compromised hosts — automated sending from infected machines. High volume, low IP reputation, often fails authentication entirely.
- Snowshoe spam — distributes sending across many low-volume IPs/domains to evade per-source reputation. Often uses freshly registered domains.
- Phishing — targeted impersonation of legitimate brands. DMARC at p=reject is the primary defense.
- Unsolicited bulk mail from "legitimate" senders — purchased lists, scraped contacts, dormant subscribers reactivated without permission. This is the category most well-meaning marketers accidentally fall into.
Filters are calibrated against all four. The reason your legitimate marketing mail gets caught by spam filters is that the pattern (bulk, commercial, low engagement) overlaps with the patterns spam uses.
What this means for senders
Concrete priorities for staying out of the spam folder:
- Authentication first. Get SPF, DKIM, and DMARC right. See DMARC setup guide.
- Manage complaints. Stay below 0.10% aggregate, well under Gmail's 0.30% threshold.
- Engagement-segment your list. Send actively to active recipients. Stop sending to people who haven't engaged in 90+ days without re-permission.
- Watch your volume curve. No sudden spikes without warmup. See IP warming guide for ramp patterns.
- Don't fixate on content. Subject lines and copy matter for opens and conversions — not for placement.
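The complaint thresholds above are easy to monitor with a few lines of arithmetic. This sketch assumes the rate is complaints divided by delivered messages; providers may use a different denominator (for example, mail delivered to active inboxes), so treat your own number as an approximation. The function names and status labels are hypothetical.

```python
TARGET_RATE = 0.001   # 0.10%: the level to stay under
HARD_LIMIT = 0.003    # 0.30%: Gmail's threshold cited above


def complaint_rate(complaints: int, delivered: int) -> float:
    """Complaints as a fraction of delivered messages."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return complaints / delivered


def complaint_status(rate: float) -> str:
    """Classify a complaint rate against the two thresholds."""
    if rate < TARGET_RATE:
        return "ok"
    if rate < HARD_LIMIT:
        return "warning"
    return "critical"


# 20 complaints on 10,000 delivered messages is 0.20%.
rate = complaint_rate(complaints=20, delivered=10_000)
print(f"{rate:.2%}", complaint_status(rate))
```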
When to worry about content
Content does matter when it triggers specific filter patterns:
- Links to domains with poor reputation
- Mismatch between visible URL and link target
- HTML that resembles phishing templates
- Attachments of risky file types (.exe, .scr, macro-enabled Office files such as .docm)
- All-image emails with no text body
But for routine marketing or transactional mail, content is rarely the root cause of spam placement. If your placement dropped after a normal send to a normal list, look at reputation and engagement first.
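Of the content triggers listed, the visible-URL mismatch is the easiest to check before you send. Below is a rough pre-send lint using only Python's standard library. It is a sketch of the idea, not how any provider's filter works: it only compares hostnames, and only when the link text itself looks like a URL.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list = []
        self._href: Optional[str] = None
        self._text: list = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def mismatched_links(html: str) -> list:
    """Return (visible text, href) pairs where the text shows a different host."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href, text in collector.links:
        # Only compare when the visible text itself looks like a URL.
        if not text.lower().startswith(("http://", "https://", "www.")):
            continue
        shown = text if "://" in text else "http://" + text
        shown_host = (urlparse(shown).hostname or "").removeprefix("www.")
        target_host = (urlparse(href).hostname or "").removeprefix("www.")
        if shown_host and target_host and shown_host != target_host:
            flagged.append((text, href))
    return flagged


sample = (
    '<p><a href="https://tracking.example.net/c/123">https://shop.example.com/sale</a></p>'
    '<p><a href="https://shop.example.com/sale">Shop the sale</a></p>'
)
print(mismatched_links(sample))
```

Note that the first link in the sample is the pattern click-tracking produces when the link text is a literal URL. Using descriptive link text, as in the second link, avoids the mismatch entirely.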
If you're trying to figure out why specific campaigns are landing in spam and need an engineer-level audit, book a consultation. I'll pull your DMARC reports, Postmaster Tools data, and engagement segments and tell you the actual root cause.
Sources
- Google — How Gmail's Spam Filter Works
- Microsoft Learn — Anti-Spam Protection in Exchange Online Protection
- RFC 7489: DMARC
- M3AAWG Sender Best Common Practices v3
- Apache SpamAssassin Project Documentation
- Yahoo Postmaster — Best Practices
v1.0 · May 2026
Frequently Asked Questions
Where does spam email come from?
Most spam originates from compromised hosts, botnets, snowshoe networks (many low-volume IPs to evade reputation), and legitimate bulk senders whose hygiene has collapsed. Phishing is a distinct subcategory: typically targeted, often using lookalike domains. Authentication (SPF, DKIM, DMARC) does not by itself separate legitimate senders from spammers, since spammers can authenticate their own domains, but it gives receivers a verified identity to attach reputation to.
What's the difference between mail and spam?
From a technical standpoint, there is no inherent difference — both are SMTP messages. The distinction is made by the receiving mailbox provider based on sender reputation, authentication, content, and per-user behavior. The same message can land in the inbox at one provider and spam at another.
How do ISPs decide what's spam?
Major mailbox providers (Gmail, Microsoft, Yahoo) use machine-learning classifiers trained on billions of messages. Inputs include sender domain/IP reputation, SPF/DKIM/DMARC pass rates, complaint rates, per-user engagement, content patterns, link reputation, and origin signals. Reputation and engagement dominate; pure content scoring is secondary.
Can good email be marked as spam?
Yes — false positives happen. Common causes: missing or misaligned authentication, sudden volume spikes, sending from shared IPs with poor neighbors, content that matches phishing patterns (urgent CTAs + financial language), or per-user filters from past behavior. Legitimate transactional mail gets flagged occasionally too.
Why does the same email go to inbox for some recipients and spam for others?
Per-user behavioral models. Gmail and Microsoft both score sender-recipient relationships individually. A user who has opened your last 10 messages will likely see the next one in the inbox; a user who deleted your last 10 without opening will likely see it in spam. Aggregate reputation sets the baseline; per-user behavior fine-tunes from there.