Subtitle: An iOS performance investigation I actually did, end-to-end through Claude Code, with MCP-driven simulator control,
xctrace Time Profiler, and the Memory Graph CLI. What the AI handled, what it didn't, and how my workflow changed.
Hey folks, let me walk you through an afternoon I had debugging an iOS perf ticket I'd been postponing for three weeks. Motivations, obstacles, dead ends, and wins included.
Quick setup: I work on an app that uses SwiftUI heavily. The SavedItems tab was getting slow after about 15 location detail screens. Probably a memory leak. Probably something related to SwiftUI navigation. Probably "easy" once I actually sat down to look. Spoiler: none of those hypotheses were correct.
The investigation took an entire afternoon and resulted in three independent corrections and three pull requests. And most of my time at the keyboard was spent reviewing what Claude had just done, not typing.
This text is not an account of how AI changed someone's life (please don't). It's a description of a workflow that I believe many mobile application engineers haven't yet experienced, with its friction points and parts that didn't work. If you work with iOS at a company, as an indie developer, or as a freelancer, this could be a valuable investment of a few hours (or minutes) in tools that will yield benefits when something on the device malfunctions.
The workflow, drawn out
The unusual feature of this diagram is that everything except the Developer node and the artifacts is accessible from a single chat session. Claude reads .memgraph files via leaks, controls the simulator through XcodeBuildMCP, and opens PRs via gh. My job is to (a) capture artifacts that require physical access to a device, (b) review proposed changes, and (c) keep the LLM honest.
The setup
Three tools, all pre-configured:
- Claude Code in the terminal, pointing to the iOS repository.
- XcodeBuildMCP, a Model Context Protocol server that exposes roughly 60 Apple development tools to the LLM (build a workspace, run in the simulator, tap UI elements, capture screenshots, attach LLDB, stream logs). Installed via npm, with a single line of MCP client configuration.
- xctrace, the CLI for Instruments. Records Time Profiler traces on a physical device and exports them to XML, which the LLM can parse. Plus the things macOS gives you for free: leaks, heap, vmmap, atos, xcrun simctl. The CLI tooling for iOS performance is a lot better than people think.
How it actually went
I'm going to describe what happened more or less in the order in which it occurred. Not a polished retrospective. The genuine story, including the dead end where I got lost for two hours before getting back on track.
Step 1. Reframing the problem (5 minutes)
I started the conversation with a short sentence: "Opening SavedItems location details is slow after about 15 cycles." Claude consulted existing memory entries from previous sessions (project-specific information about the source code, file layout, and naming conventions) and immediately pushed back on the framing: "Slow how? Memory leak, instance buildup? Or, in wall-clock time, does each opening seem to take longer?"
This is a useful first step. "Slow" is a symptom; the problem is either retention (objects accumulating) or computation (work taking too long), and each case calls for different tools. I described the symptom in observable terms: "after 15 openings, the next one takes about 6 seconds, and the Memory Graph shows 12 live instances of DetailViewModel after 12 cycles". That settled the order: investigate the leak first, since it was the simpler problem and we had ground truth (the Memory Graph had already shown the instances weren't being released).
Step 2. Reproducing on the simulator, hands-off (10 minutes)
This is where MCP earns its keep. Claude built and ran the app on the iOS Simulator via a single tool called:
mcp__xcodebuildmcp__build_run_sim()
Then it drove the app: it tapped the Saved Items tab using the accessibility-driven touch commands, tapped a location card, tapped back, tapped another card, and repeated. It captured screenshots between steps so it could check (and so I could see in the chat history) what state the app was in. About 12 open/close cycles, with no interference from me.
With MCP running, I monitored the Xcode console and copied the relevant output to the chat. It's a bit worse than being fully automated, but it works.
By minute 30, we had a confirmed reproduction and ROOT CYCLE candidates from a Memory Graph file I'd exported and dropped on the Desktop.
Step 3. leaks ~/Desktop/x.memgraph (10 minutes that ended the leak)
Here's the part that ended a lot of speculation in two minutes. I exported a .memgraph from Xcode (Debug → View Debugging → Capture View Hierarchy → Memory Graph → File → Export Memory Graph). Saved it to ~/Desktop/example-leaks.memgraph. Sent the path to Claude.
It ran:
leaks ~/Desktop/example-leaks.memgraph 2>&1 | grep -E "ROOT CYCLE|DetailViewModel" | head -40
And produced the chain:
ROOT CYCLE: SwiftUI._DictionaryStorage<AnyHashable, WeakBox<AnyLocationBase>>
  → TagIndexProjection<Int>
  → ForEachState<MediaGalleryItem...>
  → Closure context (.onImageSliderTap)
  → ._viewModel.wrappedValue → DetailViewModel
  → ._coordinator → DetailsCoordinator
That's a SwiftUI internal observation graph that holds a closure capturing self from inside a photo carousel ForEach with a .tag(Int) modifier on items. The closure was onImageSliderTap, passed into MediaCarouselHeaderView. It captured self strongly, which retained the entire view's @ObservedObject viewModel and @State coordinator backings forever.
I'd never seen TagIndexProjection<Int> before. Wouldn't have guessed .tag() caused this. The CLI told me directly. Without leaks, I would have spent another four hours auditing closures.
The fix was 15 lines. I hoisted handlePhotoTap to a static function and captured [weak viewModel, weak coord = self.coordinator] instead of relying on implicit self. Re-captured a fresh .memgraph. Zero ROOT CYCLEs containing my classes. Done.
Step 4. The dead end I'd have walked into anyway (90 minutes. Pause here.)
After the leak fix, the screen still felt slow. The next obvious hypothesis: fullScreenCover tears down the SavedItems' SwiftUI tree on dismiss, the parent re-renders a 50-cell grid with AsyncImages, that's why the next open is laggy.
Easy test: swap .fullScreenCover(item:) for .sheet(item:). One-line change. Ran it on the simulator, captured a fresh Memory Graph. Same slowness. Comparable counts. Hypothesis rejected.
This is where the workflow paid off in a way I want to call out specifically: my hypothesis was wrong, and finding out cost 10 minutes instead of half a day. I changed one line; Claude recompiled, ran the simulator, captured the artifact, performed the analysis, and handed me the results.
Reverting the change was easy (Git is wonderful). Total cost of the error: less than fifteen minutes. With a manual workflow, this experiment would have required an hour of work.
Step 5. Time Profiler, comparison-first (45 minutes)
The pivot: if presentation isn't the bottleneck, the cost has to be in the work each open does. I needed CPU samples.
xctrace Time Profiler against my iPhone, attached to the running app:
xcrun xctrace record \
--template 'Time Profiler' \
--device <UDID> \
--attach DemoApp \
--time-limit 90s \
--output ~/Desktop/saveditems-tti-device.trace
I drove the device manually for 90 seconds. Same flow, six places opened and closed, plus scrolling. Then a second pass on Browse (the fluid baseline) for comparison.
I exported the time-profile schema of each .trace to XML (this part of xctrace works via --xpath, unlike the Leaks data) and handed the files to Claude, which wrote a small Python analyzer to count inclusive main-thread frames and generated this side-by-side comparison:
| Frame in DemoApp binary | Browse | SavedItems |
|---|---|---|
| GraphQLClient.init | 7.7% | 23.7% |
| NetworkConnectivityChecker.init → CTTelephonyNetworkInfo.init | low | 18.5% |
| *Grid.body.getter | 9.9% | 19.8% |
| ActionsFactory.SavedItemsContext.make | n/a | 18.9% |
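The folding logic behind that table is simple enough to sketch. A minimal version in pure Python — the XML parsing of the xctrace export is elided here, since the layout varies by schema and version, and the sample stacks are toy stand-ins, not real trace data:

```python
from collections import defaultdict

def inclusive_percentages(samples):
    """Fold stack samples into per-frame *inclusive* time shares.

    samples: list of (stack, weight_ms) pairs, where stack is a list of
    frame names with the leaf last. Each frame is charged once per sample
    it appears in (inclusive), so parents accumulate their children's cost.
    """
    total = sum(weight for _, weight in samples)
    per_frame = defaultdict(float)
    for stack, weight in samples:
        for frame in set(stack):  # dedupe recursive frames within one stack
            per_frame[frame] += weight
    return {f: round(100 * w / total, 1) for f, w in per_frame.items()}

# Toy stacks standing in for parsed xctrace rows:
samples = [
    (["main", "Grid.body.getter", "GraphQLClient.init"], 30),
    (["main", "Grid.body.getter"], 20),
    (["main", "idle"], 50),
]
shares = inclusive_percentages(samples)
# GraphQLClient.init: 30.0, Grid.body.getter: 50.0, main: 100.0
```

The top-20 table above is just this dict, sorted and filtered to frames inside the app binary.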
According to the potential-freezes schema, SavedItems had 35 freezes longer than 250 ms in 90 seconds, totaling 21.97 seconds of freeze time. Browse (another part of the app that uses the same components and some of the same structures; I used it as the comparison baseline) had 6 freezes totaling 2.87 seconds. During normal use, the SavedItems main thread was hung 24% of the time.
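Those numbers reduce to simple arithmetic over the freeze durations. A hedged sketch with made-up durations (the threshold and window match the ones above):

```python
def summarize_freezes(durations_ms, threshold_ms=250, window_s=90):
    """Summarize main-thread freezes the way the potential-freezes numbers
    are reported: count over threshold, total hang time, and the share of
    the observation window spent hung."""
    hangs = [d for d in durations_ms if d > threshold_ms]
    total_s = sum(hangs) / 1000
    return {
        "count": len(hangs),
        "total_hang_s": round(total_s, 2),
        "hung_pct": round(100 * total_s / window_s, 1),
    }

# e.g. three hangs over threshold in a 90 s window:
stats = summarize_freezes([3000, 120, 600, 900])
# {'count': 3, 'total_hang_s': 4.5, 'hung_pct': 5.0}
```

(21.97 s of hang over a 90 s window is where the 24% figure comes from.)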
The stack told a clean story. Every GridItemView body recompute was building fresh ItemActionsViewModel instances per cell, each one allocating a fresh GraphQLClient, each one allocating a fresh CTTelephonyNetworkInfo (a CoreTelephony class with a documented 30 to 100ms allocation cost on iOS). Multiply by N visible cells Γ every recompute. Main thread freezes everywhere.
Claude grepped the relevant factory file and found the smoking gun: four out of five *Context.make enums in ActionsFactory.swift use ViewModelCache.shared.getOrCreateViewModel(...). The fifth, SavedItemsContext.make, bypasses the cache and creates new VMs unconditionally.
The fix: about 50 lines mirroring the existing cache pattern in BrowseContext.make. Then I recaptured a Time Profiler trace to validate:
| Metric | Before | After |
|---|---|---|
| Hangs >250ms | 35 | 0 |
| Total hang time | 21.97s | 0s |
| GraphQLClient.init | 23.7% | 7.9% (parity) |
| CTTelephonyNetworkInfo.init | 18.5% | 6.1% (parity) |
Step 6. Stacked PRs and app-wide cleanup (30 minutes)
By this point, three independent fixes had emerged: the leak (one PR), the cache parity (a second PR stacked on the first), and a third, app-wide change. Even after parity, both SavedItems and Browse were spending about 6% of main-thread time in CTTelephonyNetworkInfo.init, because the convenience init of GraphQLClient created a new NetworkConnectivityChecker each time. That checker should be a single shared instance across the entire application, not just in SavedItems.
Claude:
- Used gh pr create --base feature/leak-fix-001 to stack PR #4406 on top of PR #4405.
- Opened PR #4407 against dev directly (an independent change).
- Wrote each PR description with the before/after tables embedded. When the Time Profiler validation showed the cache fix alone was sufficient (the originally planned 5-step migration was no longer needed), it dropped the unnecessary steps from the PR scope.
I reviewed each PR and each commit message before it went out. The total typing I did on those was about 200 characters of confirmation. Everything was draft-ready.
What changed for me
A few things stand out, and they're not all positive.
The not-so-good first.
The CLI for iOS perf investigation is great, but it's brittle. xctrace --template Leaks --attach silently produces empty data due to a libmalloc not initialized error that you only see if you dig into a SQLite file inside the trace bundle. Some custom logging SDKs don't show in simctl log stream. SourceKit gets confused after tuist generate and reports false-positive errors. I had to know about all of these. Claude doesn't always. And the time savings depend on my catching the wrong path before going deep.
On the other hand, a lot of intuition in mobile engineering is also wrong, and the LLM is faster than I am at testing wrong intuitions. My .fullScreenCover → .sheet hypothesis was a wrong path that I would have followed even further without the cheap experimentation cycle. The retention cycle hypothesis I started with ("audit each closure looking for [weak self]") was also wrong.
The good.
Treating each artifact (.memgraph, .trace, screenshot) as a programmable input changes the loop. Memory Graph isn't "open Xcode and stare at the sidebar". It's leaks ~/path.memgraph 2>&1 | grep ROOT CYCLE piped through Python that an LLM can write inline. Time Profiler isn't "scrub through the timeline in Instruments GUI". It's xctrace export --xpath '/trace-toc/run/data/table[@schema="time-profile"]' and a 30-line parser. Once the artifacts are in CLI form, the LLM is genuinely useful.
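To make that concrete, here's the kind of throwaway filter an LLM writes inline over leaks output. A sketch — the chain layout in the sample is illustrative, not verbatim leaks output, and real output varies by macOS version:

```python
import re

def root_cycles(leaks_output, classes):
    """Group `leaks <path>.memgraph` output into ROOT CYCLE chains and
    keep only the chains that mention one of our own classes."""
    cycles, current = [], None
    for line in leaks_output.splitlines():
        if "ROOT CYCLE" in line:
            if current:
                cycles.append(current)
            current = [line.strip()]
        elif current is not None and line.strip():
            current.append(line.strip())  # continuation of the current chain
        elif current is not None:
            cycles.append(current)        # blank line ends the chain
            current = None
    if current:
        cycles.append(current)
    pattern = re.compile("|".join(map(re.escape, classes)))
    return [c for c in cycles if any(pattern.search(l) for l in c)]

sample = """\
ROOT CYCLE: SwiftUI._DictionaryStorage
  --> Closure context
  --> DetailViewModel

ROOT CYCLE: SomeFrameworkInternal
  --> CacheNode
"""
hits = root_cycles(sample, ["DetailViewModel", "DetailsCoordinator"])
# one cycle survives the filter, and it names DetailViewModel
```

Twenty lines like these turn "stare at the sidebar" into something an agent can run, diff, and re-run after every fix.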
The LLM is especially good at the boring parts. Writing a Python parser to fold thousands of stack frames into a top-20 inclusive table is exactly the kind of task it's fast at. Producing a side-by-side comparison table for a PR description with consistent formatting? Same. The stuff that's not intellectually hard but is attention-tax hard.
I keep memory entries in ~/.claude/projects/<repo>/memory/ for project-specific facts: the ViewModelCache pattern, the CTTelephonyNetworkInfo allocation-cost trap, and the SwiftUI TagIndexProjection pitfall. Next time someone (me or a colleague who picks up the workflow) starts a similar investigation, the LLM begins with that context instead of rediscovering it.
I also wrote a slash command, /perf-investigate, that captures the workflow as a checklist and rejects the natural temptations: don't propose architectural changes before a .memgraph or .trace exists, don't use xctrace --template Leaks --attach because it doesn't work, weak-capture only the closure proven by the memgraph to be the cycle root (not all of them). The slash command is the discipline that keeps me out of dead ends.
What I'd do differently
Three things.
Capture the "fluid" baseline first. When the symptom is "X feels slow", capture Time Profiler on X and on a sibling feature that's known to be fluid. The comparison is ten times more informative than the absolute numbers. I almost skipped the Browse baseline. That comparison was what made the cache-miss diagnosis irrefutable.
Resist sizing the ticket to the size of the original plan. I scoped the migration as a five-step refactor up front. The Time Profiler showed that step 1 alone closed the gap, and steps 2 to 5 were dropped. If your plan is "do A, then B, then C, then validate", validate after A and re-plan. Don't let the size of the original plan anchor the actual scope.
Memory Graph CLI is underused even by people who use the GUI version daily.
The Memory Graph debugger in Xcode is well known, but most devs never realize there are leaks, heap, and vmmap CLI tools that operate on .memgraph files and are fully scriptable. Combine that with an LLM in the loop, and you get a feedback cycle that most teams haven't tried.
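For example, one thing scriptability buys you that the GUI doesn't: a before/after instance-count check that an agent (or CI) can run after a fix. A minimal sketch over per-class counts you'd parse out of heap output — class names and numbers here are illustrative, not from the investigation above:

```python
from collections import Counter

def diff_instance_counts(before, after, suspects):
    """Compare per-class live-instance counts from two memgraph captures
    and flag suspects whose count never dropped back toward baseline."""
    still_leaking = {}
    for cls in suspects:
        b, a = before.get(cls, 0), after.get(cls, 0)
        if a >= b and b > 1:  # count never went down: the fix didn't take
            still_leaking[cls] = (b, a)
    return still_leaking

before = Counter({"DetailViewModel": 12, "DetailsCoordinator": 12, "ImageCache": 1})
after  = Counter({"DetailViewModel": 1,  "DetailsCoordinator": 12, "ImageCache": 1})
flagged = diff_instance_counts(before, after, ["DetailViewModel", "DetailsCoordinator"])
# flags DetailsCoordinator: still 12 alive after the fix
```

That's the whole idea of the feedback cycle: the "did the fix work?" question becomes a function you can assert on.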
I Packaged the Workflow
After this investigation, I sat down and transformed the manual parts into an MCP server, memorydetective. The first cut used 12 tools to cover the workflow above. By v1.8, it had grown to 31 tools, 34 catalog resources, and 5 Investigation Prompts covering the Instruments ecosystem.
v1.8 in particular was born from a real regression. On macOS 26.x, leaks --outputGraph aborts with "Failed to get DYLD info for task" whenever the target was not launched with MallocStackLogging=1. The new bootAndLaunchForLeakInvestigation wraps build + boot + install + launch with the env var pre-propagated, so capture works out of the box, and captureMemgraph now returns a structured workaroundNotice pointing to the recordTimeProfile (Allocations) fallback when the regression hits anyway. The agent decides; the tools just stop lying about the failure mode.
- memgraph analysis: analyzeMemgraph, findCycles, classifyCycle (which would have gotten the TagIndexProjection cycle in 30 seconds, with a fix hint), findRetainers, diffMemgraphs, countAlive, reachableFromCycle
- cycle-semantic CI gating: verifyFix, compareTracesByPattern
- xctrace coverage: analyzeHangs, analyzeAnimationHitches, analyzeAllocations, analyzeAppLaunch, analyzeTimeProfile
- macOS unified logging: logShow, logStream
- capture + boot/launch: recordTimeProfile, captureMemgraph, bootAndLaunchForLeakInvestigation (single-call build + boot + launch with MallocStackLogging=1)
- verify-fix loop: replayScenario (drives the simulator via tap/swipe/wait/type with a repeat count for leaks that only appear after N iterations), captureScenarioState (composite before/after snapshot: memgraph + screenshot + accessibility tree)
- discovery: getInvestigationPlaybook, listTraceDevices, listTraceTemplates
- retain-cycle visualization (Mermaid/DOT): renderCycleGraph
- leak detection in XCUITest for CI: detectLeaksInXCUITest (experimental)
- Swift source bridge via SourceKit-LSP: swiftGetSymbolDefinition, swiftFindSymbolReferences, swiftSearchPattern, swiftGetSymbolsOverview, swiftGetHoverInfo
The cycle catalog covers SwiftUI (including Swift 6 / @Observable / SwiftData / NavigationStack), Combine, Swift Concurrency (including AsyncSequence-on-self), UIKit, Core Animation, Core Data, the Coordinator pattern, RxSwift, and Realm. Every classification carries a staticAnalysisHint pointing to the SwiftLint rule that would catch it statically, or an explicit gap warning when no such rule exists, plus a fixTemplate with a Swift before/after snippet that can be adapted directly.
It's Apache 2.0, it's on npm (memorydetective@1.8.0), and it works with Claude Code, Claude Desktop, Cursor, Cline, Kiro, and (experimentally) GitHub Copilot Agent mode.
Two ways to install. The classic MCP path:
npm install -g memorydetective
// ~/.claude/settings.json
{ "mcpServers": { "memorydetective": { "command": "memorydetective" } } }
Or, if you're on Claude Code, the same workflow ships as a one-command plugin install (no JSON edit, no global npm):
/plugin marketplace add carloshpdoc/memorydetective-plugin
/plugin install memorydetective@memorydetective-plugin
This plugin wraps the same MCP server memorydetective@^1.8 (pulled via npx under the hood) and also includes a slash command /perf-investigate with the built-in discipline checklist (don't propose architectural changes before memgraph or trace exists, don't trust xctrace --template Leaks --attach, weak-capture only the closure proven as the root of the cycle, etc.). Same workflow, less typing.
Then you ask the LLM something like "run leaks on ~/Desktop/myapp.memgraph and tell me what's leaking". The agent calls analyzeMemgraph → classifyCycle and you receive a structured diagnosis with a fix hint. Or you can use it via the shell: memorydetective analyze ~/Desktop/myapp.memgraph.
What honestly isn't solid yet in v1.8.0: sample-level Time Profile analysis is still fragile (xctrace export of the time-profile schema crashes on heavy, non-symbolized traces; the tool surfaces a workaround). Hang and animation-hitch analysis are rock-solid. The leaks --outputGraph regression on macOS 26.x is mitigated via bootAndLaunchForLeakInvestigation but not fully resolved (the underlying task_for_pid failure needs a fix from Apple); when the workaround fails, captureMemgraph surfaces a structured fallback to Allocations.
Memory Graph capture works for Mac apps + iOS simulator but not on physical devices (limitation of leaks(1), I can't fix it). replayScenario and the captureScenarioState UI tree sub-capture have a soft dependency on axe (brew install cameroncooke/axe/axe); the rest of the plugin works without it. detectLeaksInXCUITest is shipped but marked as experimental until real production runs validate the orchestration. The CHANGELOG is honest about all of this. See CHANGELOG.md.
GitHub: github.com/carloshpdoc/memorydetective. PRs welcome, especially new cycle patterns from real production leaks you've found.
If you want to try this
A short afternoon of setup gets you the whole workflow:
- Install either via npm + MCP config (npm install -g memorydetective plus one line of JSON in your client) or, on Claude Code, via the one-line plugin install: /plugin marketplace add carloshpdoc/memorydetective-plugin then /plugin install memorydetective@memorydetective-plugin.
- Install Claude Code in the terminal if you haven't. Point it at one of your iOS projects.
- (Optional, but recommended) Install XcodeBuildMCP for the simulator-driving parts. Pairs nicely.
- Spend 20 minutes learning Memory Graph + leaks on a .memgraph you generate yourself. Pick a known retain cycle in your codebase, or build a tiny class A { var b: B } cycle in a playground and confirm leaks finds it. Then run memorydetective analyze on it and watch the cycle get classified.
- Pick a real perf ticket. Don't reach for the Xcode UI first.
The setup is small. The workflow is genuinely faster. The unfair advantage is that most of the iOS engineers I know haven't even tried this loop yet.
Wrapping up
Looking back, the afternoon broke down like this:
Wins
- Three independent perf fixes shipped in one afternoon (one leak, one cache parity, one app-wide singleton).
- Cost of being wrong dropped from "half a day per hypothesis" to "10 minutes per hypothesis". That single shift mattered more than any individual fix.
- Memory Graph + the leaks CLI gave me a precise retain chain in seconds, instead of four hours of closure-auditing.
- PR housekeeping (descriptions, before/after tables, stack management) was off my plate.
Tradeoffs
- The CLI tooling around iOS perf is brittle. Some templates silently produce empty data (xctrace --template Leaks --attach is the worst offender). You have to know the workarounds, and the LLM doesn't always.
- The LLM will happily propose wholesale [weak self] refactors that don't fix the leak (the real cycle is usually in one closure, not all of them) and can even introduce bugs of their own: closures that become silent no-ops, lost asynchronous work, races. Worse, re-strongification via guard let self can recreate the cycle in a different form. Discipline (the /perf-investigate slash command) is what keeps you out of this hole: weak-capture only the closure the memgraph proves is the cycle root, not all of them.
- Some custom logging SDKs route around os_log, which means MCP-driven log capture won't see them. You fall back to pasting from Xcode.
What I'd repeat
- Capture a "fluid" sibling feature as a baseline before reading any absolute numbers.
- Validate after step 1 of any plan, then re-plan. Don't let the original size of a ticket anchor the real scope.
- Treat artifacts (.memgraph, .trace) as programmable inputs, not GUI-only files.
And you, what does your iOS perf investigation flow look like today? Are you using the Memory Graph CLI, or staring at the Xcode sidebar? I'd love to hear what tooling actually moves the needle for you.
Thanks for reading, and until next time.
References
Memory Graph CLI (leaks, heap, vmmap, malloc_history, and .memgraph files)
- Apple Developer: Gathering information about memory use
- WWDC21: Detect and diagnose memory issues
- Apple Developer Forums: Generating Memgraph with leaks tool
- man leaks(1)
- Halodoc: Memgraph, detection of memory issues on iOS
[weak self] (mechanics, perf overhead, and re-strongification pitfalls)
A deep technical companion piece on the actual leak (a SwiftUI TagIndexProjection<Int> cycle through _DictionaryStorage<AnyHashable, WeakBox<AnyLocationBase>>, with the full retain chain and the wrong wholesale-[weak self] refactor I tried first) is coming next week. Different audience: pure technical, no AI workflow.













