TL;DR: I built an open-source Python library that uses semantic embeddings to heal broken UI test locators including cases where every existing tool silently fails. It's called CANVAS and it's on PyPI.
The Test That Broke Everything
Picture this. Your team just shipped a major redesign. The dev who did it was careful — accessibility labels intact, same user flows, same button functions. Just a new component library. Ant Design replacing hand-rolled HTML.
You run your test suite the next morning.
317 failures.
Not because anything broke for users. Because every locator your tests relied on — IDs, class names, XPath selectors — pointed at a DOM that no longer exists. The #submit-btn is gone. The .form-input class now belongs to every single input on the page. The button text moved into a nested <span>.
Your self-healing tool? It tried. It fell back through its attribute chain. It computed tree edit distances. And then it gave up, because when every input has class="ant-input" and no IDs exist, there is nothing structural left to grab onto.
I've watched this happen too many times across too many projects. So I built something different.
The Problem With How Self-Healing Works Today
Most self-healing test frameworks — including the popular open-source Healenium — work by asking: what does this element look like in the DOM?
They record attributes at test-creation time: the ID, the class name, the XPath position, the tag hierarchy. When a locator breaks, they search the new DOM for the candidate that most closely matches those recorded attributes.
This works fine for conservative changes. An ID rename. A class tweak. A button moving one level up in the hierarchy.
It completely falls apart when a design-system migration:
- Removes all IDs (they were implementation details anyway)
- Assigns one shared class to every element of the same type
- Moves elements across DOM containers
- Restructures the entire page hierarchy
Because at that point, the structural signals are gone. And structural signals are all those tools have.
The Insight That Changes Everything
Here's what I noticed: accessibility metadata survives redesigns by design.
When a developer migrates from raw HTML to Ant Design, they're changing the visual presentation and the DOM structure. But they're not changing what the button does. And if they're doing their job properly, the aria-label on the "Submit Order" button still says "Submit Order". The section heading above the card inputs still says "Payment Details". The landmark region around the form is still a form.
These signals are stable because they're authored for a different consumer — screen readers and assistive technology — that the redesign doesn't touch.
So instead of asking "what does this element look like?", I built something that asks: "what does this element mean?"
How CANVAS Works
CANVAS (Context-Aware Navigation and Visual Anchoring System for Selectors) is a Python library that treats locator healing as a semantic retrieval problem.
At record time, for each element you want to track, CANVAS extracts a semantic descriptor — a structured representation of the element's functional identity:
from canvas_heal import SemanticDescriptor, IntentEmbedder, ConfidenceGatedResolver
# Record an intent against your V1 page
descriptor = extract_from_playwright(page, "#submit-btn")
# Captures: role="button", label="Submit Order",
# heading="Checkout", landmark="form"
That descriptor gets serialised into a natural-language intent string:
button labeled 'Submit Order' inside form under heading 'Checkout'
Then it gets encoded into a 384-dimensional dense vector using a locally-running sentence-transformer model (all-MiniLM-L6-v2). No API calls. No cloud. Runs entirely on your machine.
At resolve time, when your test runs against the redesigned page, CANVAS:
- Extracts the same descriptor for every candidate element in the new DOM
- Embeds all candidates in one batch pass
- Finds the closest semantic match by cosine similarity
- Routes through a three-tier confidence gate
That last part is important. Most self-healing tools make a binary decision: heal or fail. CANVAS has three outcomes:
| Score | Status | What happens |
|---|---|---|
| ≥ 0.92 | HEALED | Test continues automatically |
| 0.75 – 0.92 | NEEDS_CONFIRMATION | Test flagged for human review |
| < 0.75 | FAILED | Test aborts with a clear error |
This matters because silent false positives — where a tool heals to the wrong element — are worse than test failures. A test that passes on the wrong button is a test that gives you false confidence.
The Ant Design Migration: Where Everything Else Fails
Let me show you the scenario that motivated all of this.
V1: Raw HTML checkout
<input id="email-input" class="email-field"
aria-label="Email address" type="email">
<input id="card-input" class="card-field"
aria-label="Card number" type="text">
<button id="submit-btn">Place Order</button>
V2: After Ant Design migration
<input class="ant-input" aria-label="Email address" type="email">
<input class="ant-input" aria-label="Card number" type="text">
<span class="ant-btn-text">Place Order</span>
V3: CTA button moves to sticky footer
<!-- Form no longer contains the submit button -->
<footer class="sticky-footer">
<button class="ant-btn" aria-label="Place Order">
<span class="ant-btn-text">Place Order</span>
</button>
</footer>
Here's what happens to Healenium-style selectors at each stage:
-
V1→V2: ID selectors break immediately (IDs gone). Class selectors can't distinguish inputs (all are
ant-input). Every attribute-based tool fails here. - V2→V3: XPath structural selectors break because the button left the form container. Even if you survived V2, you don't survive V3.
Here's what CANVAS does:
Intent: ds_email_input
V2 confidence: 0.963 → HEALED ✓
Intent: ds_card_input
V2 confidence: 0.984 → HEALED ✓
Intent: ds_submit_btn
V2 confidence: 1.000 → HEALED ✓ (aria-label identical)
V3 confidence: ≥0.75 → HEALED ✓ (moved outside form, still healed)
All five intents heal across both migration stages. The aria-label is the anchor that survives everything.
And when I deliberately test the failure case — a button whose text changed from "Place Order" to "Complete Purchase" with no aria-label — CANVAS correctly returns FAILED (0.619) rather than silently healing to the wrong element. The confidence gate works in both directions.
Getting Started
pip install canvas-heal
Record your intents against your current page:
canvas-heal rerecord \
--url https://yourapp.com/checkout \
--selector "#submit-btn" \
--name checkout_submit \
--db tests/intents.db
Use in your pytest suite with zero boilerplate — the pytest plugin auto-loads:
def test_checkout(page, canvas_resolver):
result = canvas_resolver.resolve("checkout_submit", candidates)
if result.status == "HEALED":
page.click(result.selector)
elif result.status == "NEEDS_CONFIRMATION":
pytest.skip(f"Intent uncertain (confidence={result.confidence:.3f}), needs review")
Works with both Playwright and Selenium. Shadow DOM and iframes supported. Multilingual UI labels supported via model swap.
What Makes This Different
| CANVAS | Healenium | Commercial tools | |
|---|---|---|---|
| Healing approach | Semantic embeddings | XPath tree edit distance | Attribute ensembles |
| Survives shared-class migrations | ✅ | ❌ | ❌ |
| False positive protection | Three-tier confidence gate | None | Varies |
| Runs locally | ✅ No API key | ✅ | ❌ Cloud required |
| Shadow DOM | ✅ | Partial | Varies |
| Open source | MIT | Apache 2.0 | ❌ |
The Research Behind It
I've also written this up as a research paper currently under submission to arXiv (cs.SE). The paper includes a formal evaluation methodology, comparison against baseline approaches, and the full adversarial test suite design. I'll update this post with the arXiv link once it's live.
The test suite has 93 passing tests across unit, adversarial, and two real-world pilot scenarios. The adversarial suite includes 10 scenarios designed specifically to catch false positives — because a self-healing tool that over-heals is more dangerous than one that fails honestly.
What's Next
The roadmap includes PostgreSQL backend for parallel CI, a REST service mode so teams share one model instance instead of each agent downloading 90 MB, TypeScript and Java SDKs, and an automatic intent discovery tool that crawls your page and suggests intents without a manual record pass.
The GitHub backlog is fully public if you want to follow along or contribute.
Try It
-
PyPI:
pip install canvas-heal - GitHub: github.com/mandavillivijay/canvas
- Docs: See the README for full CLI reference and pytest integration guide
If you've ever lost a day to locator maintenance after a redesign, I'd love to hear whether this solves your problem. Issues, PRs, and hard failure cases all welcome.
Tags: #testing #playwright #selenium #python #testautomation #opensource










