A/B testing is where email marketing stops being guesswork and starts being a data-driven growth engine. Yet most ecommerce brands either don't test at all, or test randomly without a framework — changing too many variables at once and drawing conclusions from insignificant sample sizes. Done right, systematic A/B testing compounds over time: a 5% improvement here and a 10% improvement there can double your email revenue within a year.
Here's the complete guide to A/B testing emails for ecommerce, from what to test to how to scale your winners.
What to Test (In Priority Order)
Not all tests are created equal. Focus on the variables that have the biggest impact on revenue first, then work your way down to refinements.
1. Subject Lines
Subject lines determine whether your email gets opened at all. They're the highest-leverage test you can run. Test these variables:
- Length: Short (3-5 words) vs. long (8-12 words). Short subject lines often win on mobile.
- Emoji vs. no emoji: Results vary by audience. Test it — don't assume.
- Personalization: First name vs. no first name. "[Name], your cart is waiting" vs. "Your cart is waiting."
- Urgency: "Last chance: 24 hours left" vs. "Our sale is still going." Urgency typically wins but can fatigue audiences over time.
- Curiosity: "You won't believe what's back" vs. "Our bestseller is back in stock." Curiosity drives opens but can hurt trust if overused.
2. Send Times
When you send matters more than most brands realize. Test morning (8-10 AM) vs. afternoon (1-3 PM) vs. evening (7-9 PM). Also test days of the week — Tuesday and Thursday are conventional wisdom, but your audience may be different. The key insight: optimal send time varies by segment. Your VIP customers may engage at different times than your broader list.
3. Offers and Incentives
This directly impacts revenue. Test:
- Discount type: Percentage off (20%) vs. dollar amount ($15 off). For orders under $100, dollar amounts often feel bigger. For orders over $100, percentages tend to win.
- Free shipping vs. discount: Free shipping frequently outperforms a dollar-equivalent discount.
- Gift with purchase vs. discount: GWP can protect margins while driving similar conversion rates.
4. Email Design and Layout
- Single-column vs. multi-column: Single-column is better for mobile; multi-column can showcase more products on desktop.
- Image-heavy vs. text-focused: Heavily designed emails look premium but text-based emails often outperform for deliverability and click-through.
- Long vs. short emails: Test a 3-section email against a single-hero-image-with-CTA email. Shorter often wins for campaigns; longer works for educational flows.
5. Calls to Action (CTAs)
- Button text: "Shop Now" vs. "Get Yours" vs. "Claim Your Discount." More specific CTAs usually outperform generic ones.
- Button color: Sounds trivial, but high-contrast buttons can improve click rates by 10-20%.
- Number of CTAs: One primary CTA vs. multiple options. For product launches, single-focus wins. For newsletters, multiple CTAs work.
6. Content and Copy
- Social proof: Reviews and testimonials vs. product features. Social proof almost always improves conversion.
- Storytelling vs. direct: A brand story angle vs. straight product promotion. Test this in your welcome series.
- Personalization depth: Product recommendations based on browsing history vs. generic bestsellers.
Statistical Significance Explained Simply
Here's the part most brands get wrong. Statistical significance means your test result is likely real and not just random chance. In practical terms, you need enough people in each test group for the results to be trustworthy.
The standard threshold is 95% confidence, meaning there's only a 5% chance you'd see a difference this large from random luck alone. To reach 95% confidence, you typically need:
- For open rate tests (subject lines): At least 1,000 recipients per variant, ideally 2,500+
- For click rate tests (design, CTAs): At least 2,500 recipients per variant, ideally 5,000+
- For conversion/revenue tests: At least 5,000 recipients per variant, ideally 10,000+
If your list is smaller than these numbers, focus on testing high-impact variables (subject lines) where differences are large enough to detect with smaller samples. Don't test button colors on a 2,000-person list — you'll never get meaningful data.
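If you want to sanity-check these thresholds against your own baseline rates, a standard two-proportion power calculation does the job. Here's a minimal sketch using only Python's standard library; the 20% baseline open rate and the lift sizes in the examples are illustrative assumptions, not figures from any specific brand:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.8):
    """Approximate recipients needed per variant to detect a lift
    from rate p1 to rate p2, using the normal-approximation formula
    for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)

# A big lift (20% -> 24% open rate) needs roughly 1,700 per variant.
print(sample_size_per_variant(0.20, 0.24))

# A subtle lift (20% -> 21%) needs roughly 25,600 per variant --
# which is why small lists should only test high-impact changes.
print(sample_size_per_variant(0.20, 0.21))
```

Notice how the required sample size explodes as the expected lift shrinks. That's the math behind the advice above: on a small list, only test changes bold enough to produce a large difference.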
The A/B Testing Framework
Follow this 5-step process for every test:
- Step 1 — Hypothesize: "I believe [change] will [improve metric] because [reasoning]." Write it down. This prevents post-hoc rationalization.
- Step 2 — Isolate: Change ONE variable at a time. If you change the subject line AND the design, you won't know which caused the difference.
- Step 3 — Split evenly: Use a 50/50 split for maximum statistical power. If you're nervous about a radical change, use 80/20 — but know it'll take longer to reach significance.
- Step 4 — Wait: Don't call the test early. Let campaigns run for at least 24 hours; for flows, let the test run until both variants have accumulated enough volume to reach significance.
- Step 5 — Document and implement: Record every test result in a shared document. Winners become your new default. Losers are valuable data too.
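Step 4's "wait for significance" doesn't have to mean eyeballing a dashboard. A two-proportion z-test tells you whether the gap between variants has cleared the 95% bar. A minimal sketch, standard library only; the open counts in the example are made up to show the check in action:

```python
import math
from statistics import NormalDist

def significance(successes_a, n_a, successes_b, n_b):
    """Two-sided two-proportion z-test on e.g. opens or clicks.
    Returns the p-value; below 0.05 means ~95% confidence that
    the difference between variants is real, not luck."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# 520 vs. 470 opens out of 2,500 recipients each: p is roughly 0.08,
# just short of 95% confidence -- keep the test running.
p = significance(520, 2500, 470, 2500)
print(f"p-value: {p:.3f}")
```

If the p-value stays above 0.05, that's your signal to keep waiting (or accept that the variants genuinely perform the same), not to pick the variant that happens to be ahead.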
Common A/B Testing Mistakes
- Testing too many things at once: You changed the subject line, hero image, and CTA. Open rates went up. Which variable caused it? You have no idea.
- Calling winners too early: Variant A is winning after 2 hours with 200 opens. You declare it the winner. But early openers behave differently than afternoon openers — the result may reverse.
- Ignoring revenue: Variant A had higher open rates but Variant B generated more revenue. Always track the metric that matters most — and for ecommerce, that's revenue.
- Not testing flows: Most brands only A/B test campaigns. Your flows send year-round and have huge cumulative volume. A winning subject line on your abandoned cart flow impacts every future abandoner.
- Testing once and forgetting: Audiences evolve. A winning subject line formula from 6 months ago may not win today. Retest your key assumptions quarterly.
Setting Up A/B Tests in Klaviyo
Klaviyo's built-in A/B testing is solid for campaigns. Go to Campaigns, create a new campaign, and toggle on "A/B test this campaign." You can test subject lines, content, or send times. Set your sample size (we recommend 20-25% of your list per variant), define the winning metric and wait time, and the winning variant automatically sends to the remainder.
For flows, Klaviyo supports conditional splits that function as A/B tests. Add a conditional split, set it to a random sample, and route 50% of recipients to each branch. Each branch can have different emails, timing, or content. Monitor performance in the flow analytics and manually route 100% to the winner once you reach significance.
Key Takeaway
A/B test in priority order: subject lines first, then offers, then design and CTAs. Always isolate one variable, wait for statistical significance (95% confidence), and track revenue as your primary success metric. Document every test result and implement winners immediately. The brands that test systematically don't just improve — they compound their advantages over time.
Ready to Scale Your Email Revenue?
Get a free audit of your current email program and see exactly where the opportunities are.
Get Your Free Audit