I Built a Landing Page Conversion Auditor Using Roboflow’s API
- Link to project on Universe (if available): https://universe.roboflow.com/pramanarendra/web-ui-element-detection-ver-3 (pre-trained model I used)
- Project description: A Python tool that screenshots any landing page, uses Roboflow’s API to detect UI elements (buttons, headings, images, forms, text blocks), and scores the page against conversion best practices — outputting a clean HTML report with a letter grade.
- How you used Roboflow: Used the Web UI Element Detection Ver 3 model from Universe via the serverless hosted API to detect and classify page elements from screenshots. The structured JSON response (element class, position, size, confidence) feeds directly into a scoring engine.
I’m a growth marketer who works with automation and data pipelines. I wanted to see if I could use computer vision to solve a real marketing problem: auditing landing pages for conversion best practices — automatically.
The Problem
Conversion rate optimization usually means manually reviewing landing pages — checking for CTAs above the fold, visual hierarchy, form placement, hero images, etc. It’s time-consuming and subjective. I wanted to automate the first pass.
The Solution
A Python script that:
- Screenshots any landing page using Playwright (headless browser)
- Sends the screenshot to Roboflow’s API to detect UI elements: buttons, headings, images, forms, text blocks, links
- Scores the page against conversion best practices using the detection data
- Generates an HTML report with the grade, element counts, and detailed findings
The Stack
- Roboflow API: the Web UI Element Detection Ver 3 model from Universe detects page elements
- Playwright: headless browser that captures full-page and above-the-fold screenshots
- Python: scoring logic and report generation
- ~200 lines of code total across 4 files
How Detection Works
I send the above-the-fold screenshot to Roboflow’s serverless API. The model returns structured JSON with every detected element — its class (button, heading, image, etc.), position (x, y coordinates), size (width, height), and confidence score.
For example, on Roboflow’s homepage, the API detected:
- 5 buttons (CTAs) at 89-93% confidence
- 1 heading at 78% confidence
- 2 images (including the hero) at 61-81% confidence
- 6 text blocks and 1 link
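To make the response shape concrete, here is a minimal sketch of working with the predictions list. The sample values are made up for illustration and are not real API output; only the field names follow the response described above. In Roboflow's object detection responses, `x` and `y` are the center of the bounding box, so the sketch converts them to corner coordinates. `to_corners` and `count_by_class` are hypothetical helper names, not part of the project.

```python
from collections import Counter

# Hypothetical sample in the shape of the API's "predictions" list.
sample_predictions = [
    {"class": "button", "x": 720.0, "y": 420.0, "width": 160.0, "height": 48.0, "confidence": 0.91},
    {"class": "heading", "x": 720.0, "y": 300.0, "width": 800.0, "height": 90.0, "confidence": 0.78},
    {"class": "button", "x": 1300.0, "y": 40.0, "width": 120.0, "height": 36.0, "confidence": 0.89},
]


def to_corners(pred):
    """Convert a center-based box (x, y, width, height) to (left, top, right, bottom)."""
    half_w = pred["width"] / 2
    half_h = pred["height"] / 2
    return (pred["x"] - half_w, pred["y"] - half_h, pred["x"] + half_w, pred["y"] + half_h)


def count_by_class(predictions, min_confidence=0.4):
    """Count detections per class, ignoring anything below min_confidence."""
    return Counter(p["class"] for p in predictions if p["confidence"] >= min_confidence)


print(count_by_class(sample_predictions))
print(to_corners(sample_predictions[0]))
```

The per-class counts are the kind of numbers listed above, and the corner form is handy if you want to draw boxes on the screenshot.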
The Scoring Engine
I built 7 rules that use the detection data to score conversion readiness:
| Rule | What It Checks | Max Points |
|---|---|---|
| CTA Presence | Are there call-to-action buttons? | 20 |
| CTA Placement | Are CTAs above the fold? | 20 |
| Headline | Is there a clear headline above the fold? | 15 |
| Hero Image | Is there a large visual element? | 10 |
| Lead Capture | Is there a form? | 15 |
| Text Hierarchy | Are headings and text blocks balanced? | 10 |
| Navigation | Are there navigation elements? | 10 |
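The placement rules in the table come down to comparing an element's vertical position with the viewport height. A minimal sketch of that check, assuming `y` is the center of the detected box and using the 900px viewport and 0.6 cutoff that appear in score.py; `in_top_fraction` is a hypothetical helper name and the two detections are invented:

```python
def in_top_fraction(pred, viewport_height=900, fraction=0.6):
    """True if the element's center lies in the top `fraction` of the viewport."""
    return pred["y"] < viewport_height * fraction


# Hypothetical detections, not real API output.
hero_cta = {"class": "button", "y": 420}
footer_cta = {"class": "button", "y": 860}

print(in_top_fraction(hero_cta))    # center at 420 is above the 540px cutoff
print(in_top_fraction(footer_cta))  # center at 860 is below the 540px cutoff
```

Because the check uses the center point, a tall element that starts near the top can still fail it if most of its box sits lower on the page.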
Results: Auditing roboflow.com
I ran the auditor on Roboflow’s own homepage. Results:
Score: 90/100 — Grade: A
- Multiple CTAs found (5 buttons detected)
- CTA above the fold (4 buttons visible without scrolling)
- Headline above the fold (clear value proposition)
- Hero image present (1010x395px)
- No form detected (consider adding a lead capture form or email signup)
- Good text hierarchy (1 heading, 6 text blocks)
- Navigation elements present
The only gap: no lead capture form on the homepage. The page relies on CTA buttons (“Get Started” / “Request a Demo”) to drive conversions instead, which is a valid approach for a product-led growth company — but a form could capture visitors who aren’t ready to sign up yet.
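As a quick check on the arithmetic, the 90 can be rebuilt from the findings. The per-rule points here are read off score.py, where a missing form still earns 5 of the 15 lead capture points; the dictionary names are just for this sketch.

```python
# Maximum points per rule, from the scoring table.
max_points = {
    "CTA Presence": 20,
    "CTA Placement": 20,
    "Headline": 15,
    "Hero Image": 10,
    "Lead Capture": 15,
    "Text Hierarchy": 10,
    "Navigation": 10,
}

# Points earned on roboflow.com: full credit everywhere except lead capture.
earned = dict(max_points, **{"Lead Capture": 5})

total = sum(earned.values())
print(f"{total}/{sum(max_points.values())}")
```

So a page without a form tops out at 90, which is exactly the cutoff for an A in `get_grade`.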
What I Learned
Roboflow as a building block. The API is essentially a new input type for automation pipelines. I’m used to pulling text from APIs and databases — Roboflow lets me pull structured data from images instead. The pattern is the same: input → process → decision → output.
Pre-trained models on Universe are powerful. I didn’t train anything custom. I found an existing model that detected web UI elements, tested it in the playground, and built around it. Time from signup to working prototype was about an hour.
Where this could go next:
- Batch-process competitor pages and compare scores
- Add more scoring rules (color contrast, whitespace ratio, mobile responsiveness)
- Schedule automated audits and send weekly reports to Slack
- Train a custom model on high-converting vs. low-converting pages to improve detection
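The first idea needs very little extra code once each page has been scored. A rough sketch, assuming you already have one result dictionary per URL in the shape `score_page` returns; `rank_pages` is a hypothetical helper and the URLs and scores are placeholders, not real audits:

```python
def rank_pages(results_by_url):
    """Sort audited pages from highest to lowest score."""
    return sorted(results_by_url.items(), key=lambda item: item[1]["score"], reverse=True)


# Placeholder results (only the keys used here), not real audits.
audits = {
    "https://example.com": {"score": 70, "grade": "C"},
    "https://example.org": {"score": 90, "grade": "A"},
    "https://example.net": {"score": 55, "grade": "F"},
}

for url, result in rank_pages(audits):
    print(f"{result['score']:>3}  {result['grade']}  {url}")
```

Filling `audits` would mean looping the screenshot, detect, and score steps over a list of URLs, with a separate screenshot file per page.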
Full Code
**screenshot.py** — Captures the page
```python
from playwright.sync_api import sync_playwright


def take_screenshots(url):
    """Save above-the-fold and full-page screenshots of `url`."""
    with sync_playwright() as p:
        browser = p.chromium.launch()  # headless by default
        # The 900px viewport height is what the scorer treats as the fold
        page = browser.new_page(viewport={"width": 1440, "height": 900})
        page.goto(url)
        page.wait_for_timeout(3000)  # give late-loading content time to render

        # Above the fold: what visitors see first
        page.screenshot(path="above_fold.png")
        print("Saved above_fold.png")

        # Full page
        page.screenshot(path="full_page.png", full_page=True)
        print("Saved full_page.png")

        browser.close()


if __name__ == "__main__":
    take_screenshots("https://roboflow.com")
```
**detect.py** — Sends screenshot to Roboflow and runs the scorer
```python
from roboflow import Roboflow

from score import score_page
from report import generate_report

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace().project("web-ui-element-detection-ver-3")
model = project.version(2).model

# confidence=40 drops detections the model is less than 40% sure about
detections = model.predict("above_fold.png", confidence=40).json()
predictions = detections["predictions"]

print(f"Found {len(predictions)} elements:\n")
for pred in predictions:
    print(f"  {pred['class']} - {pred['confidence']:.0%} confidence")
    print(f"    Position: x={pred['x']}, y={pred['y']}")
    print(f"    Size: {pred['width']}x{pred['height']}\n")

# Score the detections, print a summary, then write the HTML report
results = score_page(predictions)

print(f"\n{'='*40}")
print(f"CONVERSION SCORE: {results['score']}/{results['max_score']} (Grade: {results['grade']})")
print(f"{'='*40}\n")

for icon, title, detail in results["findings"]:
    print(f"  {icon} {title}")
    print(f"    {detail}\n")

generate_report("https://roboflow.com", results)
```
**score.py** — Conversion scoring engine
```python
def score_page(predictions, page_height=900):
    """Score a page's detections against seven conversion rules (100 points max).

    `predictions` is the "predictions" list from the Roboflow response.
    Each item's "y" is the vertical center of its bounding box, in pixels.
    """
    score = 0
    max_score = 100
    findings = []

    # Group detections by class
    buttons = [p for p in predictions if p["class"] == "button"]
    headings = [p for p in predictions if p["class"] == "heading"]
    images = [p for p in predictions if p["class"] == "image"]
    forms = [p for p in predictions if p["class"] == "form"]
    links = [p for p in predictions if p["class"] == "link"]
    texts = [p for p in predictions if p["class"] == "text"]

    # CTA buttons present (max 20 pts)
    if len(buttons) >= 2:
        score += 20
        findings.append(("✅", "Multiple CTAs found", f"{len(buttons)} buttons detected"))
    elif len(buttons) == 1:
        score += 10
        findings.append(("⚠️", "Only one CTA found", "Consider adding a secondary CTA"))
    else:
        findings.append(("❌", "No CTA buttons detected", "Critical: add a clear call-to-action"))

    # CTA above the fold (max 20 pts): button center in the top 60% of the viewport
    buttons_above_fold = [b for b in buttons if b["y"] < page_height * 0.6]
    if buttons_above_fold:
        score += 20
        findings.append(("✅", "CTA above the fold", f"{len(buttons_above_fold)} button(s) visible without scrolling"))
    else:
        findings.append(("❌", "No CTA above the fold", "Move your primary CTA higher on the page"))

    # Clear headline (max 15 pts): heading center in the top half of the viewport
    headings_above_fold = [h for h in headings if h["y"] < page_height * 0.5]
    if headings_above_fold:
        score += 15
        findings.append(("✅", "Headline above the fold", "Clear value proposition visible immediately"))
    elif headings:
        score += 5
        findings.append(("⚠️", "Headline exists but below the fold", "Move headline higher for immediate impact"))
    else:
        findings.append(("❌", "No headline detected", "Add a clear, prominent headline"))

    # Hero image (max 10 pts): any image larger than 200x150px
    large_images = [i for i in images if i["width"] > 200 and i["height"] > 150]
    if large_images:
        score += 10
        findings.append(("✅", "Hero image present", f"Strong visual element ({large_images[0]['width']}x{large_images[0]['height']}px)"))
    elif images:
        score += 5
        findings.append(("⚠️", "Images found but none are hero-sized", "Consider a larger visual element"))
    else:
        findings.append(("❌", "No images detected", "Add visual content to increase engagement"))

    # Form / lead capture (max 15 pts): a missing form still earns 5 pts
    if forms:
        score += 15
        findings.append(("✅", "Lead capture form detected", "Great for conversion"))
    else:
        score += 5
        findings.append(("⚠️", "No form detected", "Consider adding a lead capture form or email signup"))

    # Visual hierarchy (max 10 pts)
    if headings and texts:
        score += 10
        findings.append(("✅", "Good text hierarchy", f"{len(headings)} heading(s) and {len(texts)} text block(s) detected"))
    elif texts:
        score += 5
        findings.append(("⚠️", "Text present but no clear heading hierarchy", "Use distinct heading sizes"))
    else:
        findings.append(("❌", "No text elements detected", "Page may be too image-heavy"))

    # Navigation (max 10 pts): links, or more than two buttons as a proxy
    if links or len(buttons) > 2:
        score += 10
        findings.append(("✅", "Navigation elements present", "Users can explore the site"))
    else:
        findings.append(("⚠️", "Limited navigation detected", "Ensure users can find key pages"))

    return {
        "score": score,
        "max_score": max_score,
        "grade": get_grade(score),
        "findings": findings,
        "element_counts": {
            "buttons": len(buttons),
            "headings": len(headings),
            "images": len(images),
            "forms": len(forms),
            "links": len(links),
            "text_blocks": len(texts),
        },
    }


def get_grade(score):
    """Map a 0-100 score to a letter grade."""
    if score >= 90: return "A"
    if score >= 80: return "B"
    if score >= 70: return "C"
    if score >= 60: return "D"
    return "F"
```
**report.py** — HTML report generator
```python
from datetime import datetime


def generate_report(url, score_results, screenshot_path="above_fold.png"):
    """Write report.html from the dictionary returned by score_page."""
    # One colored card per finding: green for pass, amber for warning, red for fail
    findings_html = ""
    for icon, title, detail in score_results["findings"]:
        color = "#10b981" if icon == "✅" else "#f59e0b" if icon == "⚠️" else "#ef4444"
        findings_html += f"""
<div style="padding: 16px; margin-bottom: 12px; border-left: 4px solid {color}; background: #fafafa; border-radius: 0 8px 8px 0;">
<strong>{icon} {title}</strong>
<p style="margin: 4px 0 0; color: #555;">{detail}</p>
</div>"""

    counts = score_results["element_counts"]
    grade = score_results["grade"]
    score = score_results["score"]
    grade_color = {"A": "#10b981", "B": "#3b82f6", "C": "#f59e0b", "D": "#f97316", "F": "#ef4444"}.get(grade, "#666")

    # Doubled braces ({{ }}) are literal braces inside the f-string
    html = f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Landing Page Conversion Audit - {url}</title>
<style>
* {{ margin: 0; padding: 0; box-sizing: border-box; }}
body {{ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif; color: #1a1a1a; max-width: 800px; margin: 0 auto; padding: 40px 20px; }}
.header {{ text-align: center; margin-bottom: 40px; }}
.header h1 {{ font-size: 28px; margin-bottom: 8px; }}
.header p {{ color: #666; font-size: 14px; }}
.grade-box {{ text-align: center; margin: 32px 0; padding: 32px; background: linear-gradient(135deg, #f8f9fa, #e9ecef); border-radius: 16px; }}
.grade {{ font-size: 72px; font-weight: 800; color: {grade_color}; }}
.score-text {{ font-size: 20px; color: #555; margin-top: 8px; }}
.screenshot {{ width: 100%; border-radius: 12px; border: 1px solid #ddd; margin: 24px 0; }}
.section-title {{ font-size: 20px; font-weight: 700; margin: 32px 0 16px; }}
.counts {{ display: grid; grid-template-columns: repeat(3, 1fr); gap: 12px; margin: 24px 0; }}
.count-box {{ text-align: center; padding: 16px; background: #f8f9fa; border-radius: 8px; }}
.count-num {{ font-size: 28px; font-weight: 700; color: #6c3ce9; }}
.count-label {{ font-size: 12px; color: #888; text-transform: uppercase; margin-top: 4px; }}
.footer {{ text-align: center; margin-top: 48px; padding-top: 24px; border-top: 1px solid #eee; color: #999; font-size: 13px; }}
</style>
</head>
<body>
<div class="header">
<h1>Landing Page Conversion Audit</h1>
<p>Analyzed: {url}</p>
<p>Date: {datetime.now().strftime("%B %d, %Y")}</p>
</div>
<div class="grade-box">
<div class="grade">{grade}</div>
<div class="score-text">{score} / {score_results['max_score']} points</div>
</div>
<img src="{screenshot_path}" alt="Page screenshot" class="screenshot">
<div class="section-title">Elements Detected</div>
<div class="counts">
<div class="count-box"><div class="count-num">{counts['buttons']}</div><div class="count-label">Buttons / CTAs</div></div>
<div class="count-box"><div class="count-num">{counts['headings']}</div><div class="count-label">Headings</div></div>
<div class="count-box"><div class="count-num">{counts['images']}</div><div class="count-label">Images</div></div>
<div class="count-box"><div class="count-num">{counts['forms']}</div><div class="count-label">Forms</div></div>
<div class="count-box"><div class="count-num">{counts['links']}</div><div class="count-label">Links</div></div>
<div class="count-box"><div class="count-num">{counts['text_blocks']}</div><div class="count-label">Text Blocks</div></div>
</div>
<div class="section-title">Detailed Findings</div>
{findings_html}
<div class="footer">
<p>Powered by <strong>Roboflow</strong> computer vision API + Python</p>
<p>Built by Sharice Wells</p>
</div>
</body>
</html>"""

    # UTF-8 so the emoji icons are written correctly on every platform
    with open("report.html", "w", encoding="utf-8") as f:
        f.write(html)
    print("Report saved as report.html")
```