How I audit a website with Claude (the exact process)
This is the exact process I used to audit a live WordPress lead-gen site with Claude: inventory from the sitemap, crawl every URL, build a per-page SEO table, test tracking two ways, submit a real test lead, probe security from the outside, then fix the safe stuff the same day with verification. No tool did this for me. Claude did the fetching, parsing, and verifying; I made the judgment calls.
## the process
- 1
Set up a workspace and ground rules before Claude touches anything
Create one folder for the audit and tell Claude to keep three files in it: FINDINGS.md for every issue, seo-pages.md for the per-page data table, and FINAL-REPORT.md for the deliverable. Give Claude the rules up front: take a database backup before any change, only make reversible changes, and log every finding in a fixed format (Issue, Severity, Location, Root Cause, Recommended Fix, Expected Impact) with a Critical/High/Medium/Low severity scale. If the site is WordPress, connect Claude to a browser tab logged into wp-admin so it can read Site Health and settings screens directly. This structure is what makes a multi-hour session produce a coherent report instead of a chat log.
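The fixed finding format is easy to enforce if every entry goes through one small structure. The following is a minimal Python sketch of that idea, not the tooling used in the audit; the `Finding` class, its field names, and `sort_findings` are illustrative assumptions that mirror the six fields and the four-level severity scale described above.

```python
from dataclasses import dataclass

# Severity scale from the ground rules, ordered most to least urgent.
SEVERITIES = ("Critical", "High", "Medium", "Low")


@dataclass
class Finding:
    """One entry for FINDINGS.md (hypothetical structure for illustration)."""
    issue: str
    severity: str
    location: str
    root_cause: str
    recommended_fix: str
    expected_impact: str

    def __post_init__(self) -> None:
        # Reject anything outside the agreed scale so the log stays sortable.
        if self.severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {SEVERITIES}")

    def to_markdown(self) -> str:
        """Render the finding as a small markdown section."""
        return "\n".join([
            f"### {self.issue}",
            f"- Severity: {self.severity}",
            f"- Location: {self.location}",
            f"- Root Cause: {self.root_cause}",
            f"- Recommended Fix: {self.recommended_fix}",
            f"- Expected Impact: {self.expected_impact}",
        ])


def sort_findings(findings):
    """Return findings ordered Critical first, Low last."""
    return sorted(findings, key=lambda f: SEVERITIES.index(f.severity))
```

Appending `finding.to_markdown()` to FINDINGS.md after each check keeps the file in one shape from the first entry onward.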
- 2
Inventory the site from the sitemap
Have Claude fetch sitemap_index.xml and every child sitemap, then list every URL on the site. Anomalies show up before you audit a single page: I found a misspelled slug, duplicate pages with -2 and -new suffixes, a stray page-builder template published as a real page, and nav links pointing to pages the sitemap did not list. Each anomaly goes straight into the findings file as something to investigate. This list also becomes your crawl target for the next steps.
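A sketch of the inventory step in Python, assuming the sitemaps follow the standard sitemaps.org XML schema and that you have already fetched each file's text. The function names and the suffix pattern (which only covers the `-2` and `-new` style duplicates mentioned above) are illustrative, and the pattern is a heuristic that will also flag legitimate slugs ending in a number.

```python
import re
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Heuristic: slugs ending in -<digits> or -new often indicate duplicate pages.
SUSPECT_SLUG = re.compile(r"-(?:\d+|new)/?$")


def parse_sitemap(xml_text):
    """Return (kind, locs) where kind is 'index' or 'urlset'.

    For an index, locs are child sitemap URLs to fetch next;
    for a urlset, locs are page URLs.
    """
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag.endswith("sitemapindex") else "urlset"
    locs = [el.text.strip() for el in root.findall(".//sm:loc", NS) if el.text]
    return kind, locs


def suspicious_urls(urls):
    """Flag URLs whose slug looks like a leftover duplicate."""
    return [u for u in urls if SUSPECT_SLUG.search(u)]
```

Anything `suspicious_urls` returns is a prompt to investigate, not a finding by itself.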
- 3
Fingerprint the real stack, not the advertised one
Pull the Info tab of WordPress Site Health (Tools > Site Health > Info) through the logged-in browser: WP, PHP, and database versions, the database charset, and hosting clues like mu-plugins and cache drop-in files. Then fetch a page anonymously and read the raw response headers to find out what is actually serving and caching pages. On my site the headers showed a platform edge cache doing all the work while a LiteSpeed Cache plugin emitted directives the nginx server ignored entirely. Finish by listing every active plugin and flagging duplicate-function pairs: I had two consent managers, two snippet managers, two backup tools, and an overlapping SEO stack all active at once.
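To make the header check repeatable, it helps to reduce each response to the handful of headers that say who served and cached it. This is a small Python sketch under assumptions: the header list is my own selection of commonly seen names (standard ones like `Server`, `Via`, `Age`, and `Cache-Control`, plus vendor headers such as `X-Cache`, `CF-Cache-Status`, and `X-LiteSpeed-Cache`), and your host may use different ones.

```python
# Headers that commonly reveal the serving and caching layer.
# This list is an assumption for illustration; extend it for your host.
CACHE_HEADERS = (
    "server",
    "via",
    "age",
    "cache-control",
    "x-cache",
    "cf-cache-status",
    "x-litespeed-cache",
)


def cache_fingerprint(headers):
    """Return only the cache-relevant headers, with lowercased names."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered[name] for name in CACHE_HEADERS if name in lowered}
```

Comparing the fingerprint of an anonymous fetch against what the caching plugin claims to do is a quick way to spot a plugin whose directives are being ignored.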
- 4
Crawl every URL for errors and redirect chains
Claude fetches every URL from the inventory (64 in my case) and records the HTTP status and full redirect chain for each. Anything that is not a clean 200 in one hop is a finding, including sitemap URLs that 301 somewhere else, which means the sitemap is advertising addresses to Google that have already moved. On a handful of key pages, also check the browser console and network tab for JavaScript errors and failed requests, and be careful to separate site-origin errors from noise caused by your own browser extensions. My crawl came back with zero broken pages, which is itself worth stating in the report.
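Recording the full chain means following redirects by hand instead of letting the HTTP client do it silently. Below is a minimal Python sketch using only the standard library; `fetch_once`, `trace_redirects`, and `is_clean` are hypothetical names, the user agent string is made up, and the code is an illustration of the technique, not the crawler from the audit.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url, timeout=10.0):
    """Make one GET request without following redirects.

    Returns (status, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": "site-audit-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow redirects manually and return [(url, status), ...] per hop."""
    chain = []
    current = url
    for _ in range(max_hops):
        status, location = fetch(current)
        chain.append((current, status))
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            current = urljoin(current, location)
        else:
            break
    return chain


def is_clean(chain):
    """True only for a single hop that returned 200."""
    return len(chain) == 1 and chain[0][1] == 200
```

The `fetch` parameter exists so the chain logic can be exercised with a fake fetcher before it is pointed at a live site.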
- 5
Build a per-page SEO table from raw HTML
For every URL, Claude fetches the raw HTML anonymously and parses out: title and its length, meta description and length, canonical, the robots directive, H1 count, schema types in the JSON-LD, Open Graph tags, body word count, and which tracking scripts appear. It writes all of it into one markdown table so patterns jump out: I could see at a glance that three core money pages were noindexed, about 20 pages had zero or multiple H1s, and a set of templated city pages were thin near-duplicates. For the duplicate suspicion, Claude wrote a small script that compared pages with n-gram similarity instead of eyeballing it. One warning: edge caches can serve stale HTML, so re-verify any surprising robots value with a logged-in fetch (logged-in requests usually bypass the page cache) or a cache-buster query before logging it as fact. If you have Semrush connected over MCP, pull its data on the same pages to corroborate what your hand-rolled crawl found.
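As an illustration of the parsing step, here is a Python sketch that pulls a subset of those fields (title, description, canonical, robots, H1 count) out of raw HTML with the standard-library `html.parser` and formats one markdown table row. The class name `PageFacts`, the function `seo_row`, and the column order are assumptions; schema types, Open Graph tags, word count, and tracking scripts are left out to keep it short.

```python
from html.parser import HTMLParser


class PageFacts(HTMLParser):
    """Collect a few on-page SEO facts from raw HTML (illustrative subset)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.description = ""
        self.robots = ""
        self.canonical = ""
        self.h1_count = 0
        self._in_title = False
        self._title_done = False  # ignore later <title> tags, e.g. inside SVG

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "title" and not self._title_done:
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta":
            name = a.get("name", "").lower()
            if name == "description":
                self.description = a.get("content", "")
            elif name == "robots":
                self.robots = a.get("content", "")
        elif tag == "link" and a.get("rel", "").lower() == "canonical":
            self.canonical = a.get("href", "")

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._title_done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def seo_row(url, html):
    """Return one markdown table row:
    URL | title | title length | description length | canonical | robots | H1 count
    """
    facts = PageFacts()
    facts.feed(html)
    facts.close()
    title = facts.title.strip()
    cells = [
        url,
        title,
        str(len(title)),
        str(len(facts.description)),
        facts.canonical,
        facts.robots or "(none)",
        str(facts.h1_count),
    ]
    # Escape pipes so a title containing "|" does not break the table.
    return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"
```

Writing one row per URL into seo-pages.md under a matching header row gives the at-a-glance view described above.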
- 6
Audit tracking two ways and reconcile against the ad platforms
Check tracking at runtime in the browser: are gtag, fbq, and clarity actually defined, and which measurement IDs loaded. Then cross-check against an anonymous raw-HTML scan, because consent gating can make a properly installed pixel look absent at runtime. Finally, compare what is installed against what the ad platforms expect: my site had two GA4 properties firing at once, no Meta pixel in the markup at all, and when I dug into the pixel plugin its configured ID did not match the dataset in Events Manager. Each mismatch is a separate finding because each has a different owner and fix.
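The raw-HTML side of that cross-check can be sketched in Python as a scan for tracking IDs followed by a set comparison against what the ad platforms expect. The regular expressions below are heuristics I am assuming for illustration (GA4 measurement IDs in the `G-XXXXXXXXXX` form, and Meta pixel IDs passed to `fbq('init', ...)`); they will miss IDs injected by a tag manager at runtime, and the function names are hypothetical.

```python
import re

# GA4 measurement IDs look like G-XXXXXXXXXX (heuristic pattern).
GA4_ID = re.compile(r"\bG-[A-Z0-9]{6,}\b")

# Meta pixel IDs appear as the second argument of fbq('init', '<digits>').
META_INIT = re.compile(r"""fbq\(\s*['"]init['"]\s*,\s*['"](\d+)['"]""")


def tracking_ids(html):
    """Return the tracking IDs found in raw HTML, grouped by platform."""
    return {
        "ga4": set(GA4_ID.findall(html)),
        "meta_pixel": set(META_INIT.findall(html)),
    }


def reconcile(found, expected):
    """Compare found IDs with expected ones.

    Returns, per platform, IDs that are present but unexpected
    and IDs that are expected but missing.
    """
    report = {}
    for platform in sorted(set(found) | set(expected)):
        have = found.get(platform, set())
        want = expected.get(platform, set())
        report[platform] = {
            "unexpected": sorted(have - want),
            "missing": sorted(want - have),
        }
    return report
```

Each non-empty `unexpected` or `missing` list maps to its own entry in the findings file, since each usually has a different owner.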
- 7
Submit a real, clearly labeled test lead
Fill out the live contact form with a name like "TEST - Automated Audit (please delete)", a valid-format dummy phone number, and an email inbox you control. Record exactly what happens: the confirmation message word for word, whether a notification lands in an inbox someone actually watches, and whether the lead shows up in the CRM. Read the consent text literally; mine still contained the CRM template placeholders "Insert Business Name" and "YOUR COMAPNY NAME", typo included, at the exact moment a prospect decides whether to trust the site. Skip anything that books a real calendar slot, and note the test lead in the report so it gets deleted.
- 8
Probe security from the outside
From an anonymous session, have Claude hit the standard WordPress exposure points: /wp-json/wp/v2/users for user enumeration, a POST to xmlrpc.php, /readme.html, and ?author=1. Check response headers for HSTS, X-Frame-Options, X-Content-Type-Options, and Referrer-Policy. Inside wp-admin, count how many accounts are administrators; six of my seven users were, while the dashboard showed tens of thousands of blocked login attempts. Most of what this turns up is fixable with one reversible code snippet, but log it all first.
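The header check and the list of exposure points are simple enough to pin down in code. This is a minimal Python sketch, assuming you already have the response headers as a dictionary from an anonymous fetch; the function names are illustrative, and the paths and header names are the ones listed above.

```python
# Security headers to look for, keyed by lowercase header name.
REQUIRED_HEADERS = {
    "strict-transport-security": "HSTS",
    "x-frame-options": "X-Frame-Options",
    "x-content-type-options": "X-Content-Type-Options",
    "referrer-policy": "Referrer-Policy",
}

# Standard WordPress exposure points to request anonymously.
# xmlrpc.php is probed with a POST; the others with a GET.
EXPOSURE_PATHS = (
    "/wp-json/wp/v2/users",
    "/xmlrpc.php",
    "/readme.html",
    "/?author=1",
)


def missing_security_headers(headers):
    """Return labels of the security headers absent from a response."""
    present = {name.lower() for name in headers}
    return [label for key, label in REQUIRED_HEADERS.items() if key not in present]


def exposure_urls(base_url):
    """Build the full list of URLs to probe for a given site root."""
    root = base_url.rstrip("/")
    return [root + path for path in EXPOSURE_PATHS]
```

A header being present does not mean its value is sensible, so read the values too before marking anything as fine.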
- 9
Fix the safe items the same day, and verify every fix live
Take the database backup, then work through the reversible items: flipping noindex off the money pages, noindexing the stray template, correcting the pixel ID, activating the security hardening snippet. Verify each fix by re-fetching the live page and reading the actual output, never by trusting the plugin settings screen, and remember the cache warning: hit the URL twice or use a cache-buster so you are not reading a stale copy. Anything that touches analytics ownership, consent tooling, paid subscriptions, plugin removal, or content strategy goes on a separate owner-decision list instead of getting fixed silently. That split is the difference between an audit the owner trusts and one that breaks something.
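A cache-buster is just a throwaway query parameter that makes the URL unique. Here is a small Python sketch using `urllib.parse`; the parameter name `audit_cb` is an arbitrary choice for illustration, and note the assumption that the cache keys on the query string, which some edge caches are configured not to do.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_bust(url, token=None, param="audit_cb"):
    """Return the URL with a unique query parameter appended.

    Existing query parameters are preserved; a previous cache-buster
    with the same name is replaced instead of duplicated.
    """
    parts = urlsplit(url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    # Default token is the current Unix time, which changes every second.
    query.append((param, token if token is not None else str(int(time.time()))))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

If the cache-busted fetch and the plain fetch disagree, the plain one is most likely a stale copy, and the fix should not be marked verified until both show the new output.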
- 10
Write the final report in a shape someone will actually act on
The deliverable is short: an executive summary that says what is healthy and what the real problems are, a table of fixes already applied with how each was verified, a remaining action plan grouped Critical/High/Medium/Low, and a methodology and limitations section. The limitations part matters; mine admitted that PageSpeed scores were unavailable because Google rate-limited anonymous requests, and that true mobile rendering could not be force-emulated. Point each remaining item back to its detailed entry in FINDINGS.md rather than repeating it. The full findings file is the appendix; the report is what gets read.
## hard-won tips
- Edge caches lie. The first response after a change can be stale, so verify every fix with a second fetch or a cache-buster query before declaring it done.
- Give Claude the finding format (Issue, Severity, Location, Root Cause, Fix, Impact) before the first check. Retrofitting structure onto a hundred chat messages is miserable.
- Never trust a plugin's word that it is doing something. Read the actual response headers and rendered HTML; my caching plugin and my SEO plugin were both being overridden by things they did not know about.
- Keep two lists from the start: fix-now (reversible, low-risk) and owner-decision (analytics, consent, paid tools, content strategy). Fixing silently in the second category is how audits break sites.
- Label all test data so it is unmissable, write it into the report, and delete it when you are done.