When validation fails — fixing it

Validation failed and you're back on Review & Edit looking at a bounce-back banner. This article walks through reading the failure summary, deciding between manual edits and the AI Wizard, the most common failure modes, and how to re-validate.

Written By Alan Gandy

Last updated 3 months ago

If the autograder check fails — at Tier 1 (sandbox) or Tier 2 (replica) — your assignment lands back on Review & Edit with a yellow or red bounce-back banner at the top. This article walks through what the banner is telling you and how to fix it.

Anatomy of the bounce-back banner

The banner shows four things:

Which tier failed — Tier 1 (sandbox solution check) or Tier 2 (replica workflow check)
A summary of what went wrong — e.g., "Test #3 failed: expected 42 but got 43"
A list of failing tests — each with actual vs. expected output
Two action buttons:
- Edit manually — dismisses the banner so you can edit the relevant tab directly
- Fix with AI Wizard — opens the AI Wizard with the failure summary pre-loaded

The session's status is back to generated (not validated), so the Continue to Deploy button stays locked until you fix the issue and re-run validation.

Decide: edit manually, or AI Wizard?

Both are valid. Here's how to decide:

Edit manually when…

The failure is obvious (one test's expected output is wrong, you can see the fix)
You want to make a structural change (delete a test, add a test, refactor the solution)
You don't trust the AI on this particular fix
You're under time pressure and the AI Wizard would round-trip through review

AI Wizard when…

The failure isn't obvious from the summary alone
The fix would require touching multiple artifacts (e.g., solution + test + instructions all need updating)
You want a suggested diff you can review before applying
The failure mode is one you've seen before and you know the AI handles it well

How the AI Wizard works

Click Fix with AI Wizard in the bounce-back banner. A panel opens showing:

Failure summary (read-only) — what went wrong
Proposed changes — a diff of what the AI wants to change in starter, solution, tests, or instructions
Two buttons:
- Apply changes — accepts the diff, applies it to the artifacts, and re-runs validation automatically
- Discard — closes the wizard without changing anything

The wizard uses one credit per shot (one round-trip to the AI). If you discard and try again, that costs another credit, so use it deliberately.

Common failure modes — quick triage

"Test #N failed: expected X, got Y"

Most common Tier 1 failure.

If actual output is "almost right" (extra space, missing newline) — the test's comparison is too strict. Switch from Exact match to Output contains in the Tests tab, or fix the trailing whitespace in the solution.
If actual output is structurally different — the solution has a bug. Open the Solution tab and fix it.
If you'd rather change what's expected — open the Tests tab, edit the test's Expected Output to match what the solution actually produces.
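
To see why an "almost right" output trips a strict comparison, here is a minimal Python sketch. The function names exact_match and output_contains are hypothetical stand-ins for the two comparison modes named above; this is not CodeTeach's actual comparison code.

```python
def exact_match(actual: str, expected: str) -> bool:
    """Strict comparison: every character, including whitespace, must agree."""
    return actual == expected


def output_contains(actual: str, expected: str) -> bool:
    """Lenient comparison: the expected text only has to appear somewhere."""
    return expected in actual


# print(42) emits "42\n", but the test author typed "42" as the expected output.
actual = "42\n"
expected = "42"

print(exact_match(actual, expected))           # False
print(output_contains(actual, expected))       # True
print(exact_match(actual.rstrip(), expected))  # True
```

Note the trade-off: a contains-style check would also accept "142", so fixing the stray whitespace in the solution is the safer choice when the exact value matters.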

"Solution timed out (30 seconds)"

Solution ran longer than the sandbox timeout. Causes:

Infinite loop
Reading from stdin when no input (or no end-of-file) ever arrives, so the read blocks or loops forever
Recursive call without a base case
O(n²) or worse on a test input larger than expected

Open the Solution tab, fix the underlying issue, re-validate.
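
One way the stdin cause shows up in a Python solution is a read loop that never checks for end-of-file. The sketch below assumes a solution that sums one integer per line. The function name sum_lines is illustrative, and io.StringIO stands in for sys.stdin so the example runs without real input.

```python
import io


def sum_lines(stream) -> int:
    """Sum one integer per line, stopping when the stream reaches EOF."""
    total = 0
    while True:
        line = stream.readline()
        if line == "":
            # readline() returns "" at EOF; without this check the loop
            # would spin forever and hit the sandbox timeout.
            break
        if line.strip():
            total += int(line)
    return total


# io.StringIO is a stand-in for sys.stdin in this example.
print(sum_lines(io.StringIO("1\n2\n3\n")))  # 6
```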

"Solution crashed: ImportError" / "ModuleNotFoundError"

Solution depends on a package that isn't installed in the sandbox. Two paths:

Easiest: rewrite the solution to use only the standard library
Harder: add the missing package via a pip install step in the workflow YAML (Tier 2 territory; not for the faint of heart)
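
As an illustration of the "standard library only" path, suppose the solution imported a third-party package just to compute an average. That package (numpy here) is an assumed example, not something this article says CodeTeach generates. A standard-library version could look like this:

```python
import statistics

# Before (hypothetical): needs a package the sandbox may not have installed.
#   import numpy as np
#   result = np.mean(values)


def average(values: list[float]) -> float:
    """Mean of a non-empty list using only the standard library."""
    return statistics.mean(values)


print(average([2.0, 4.0, 9.0]))  # 5.0
```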

"Workflow syntax error" / "Invalid YAML"

Tier 2 failure. The deterministic workflow builder normally produces valid YAML, but if the AI regenerated the workflow during a fix attempt, syntax errors can creep in.

CodeTeach automatically attempts one deterministic mechanical fix. If that fails, you're bounced to Review with the YAML pre-loaded for editing, though most instructors will use the AI Wizard here rather than edit YAML by hand.

"Diff with process substitution failed in /bin/sh"

Tier 2 failure. The real GitHub Actions grader runs commands via /bin/sh (which is dash on Ubuntu, not bash). Bash-only constructs such as diff <(cmd) expected.txt are a syntax error in dash, so the test fails before any comparison runs. The fix rewrites the test command using POSIX-portable syntax (cmd | diff - expected.txt).

CodeTeach's mechanical-fix layer catches this automatically; you should rarely see it.
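
The rewrite described above can be sketched as a small Python function. This is an illustration of the idea only, not CodeTeach's actual mechanical-fix code. The name to_posix is hypothetical, and the pattern handles only the simple form diff <(COMMAND) FILE, where COMMAND contains no parentheses and FILE contains no spaces.

```python
import re

# Matches the simple form: diff <(COMMAND) FILE
# Assumes COMMAND has no parentheses and FILE has no spaces.
_PROCESS_SUBST = re.compile(r"^diff\s+<\(([^()]+)\)\s+(\S+)$")


def to_posix(command: str) -> str:
    """Rewrite `diff <(cmd) file` as `cmd | diff - file`; leave other commands unchanged."""
    match = _PROCESS_SUBST.match(command.strip())
    if match is None:
        return command
    inner, expected_file = match.groups()
    # Piping into `diff -` keeps the same argument order: actual first, expected second.
    return f"{inner.strip()} | diff - {expected_file}"


print(to_posix("diff <(python3 main.py) expected.txt"))
# python3 main.py | diff - expected.txt
```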

"Permission denied" on workflow

Workflow file is missing the permissions block (checks: write, actions: read, contents: read). The deterministic builder always emits this, so seeing it usually means a regenerated workflow lost the block. Click Fix with AI Wizard.
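
If you want to confirm that a regenerated workflow lost the block before spending a credit, a rough check along these lines can help. It is a naive line scan, not a real YAML parser, and the function name and sample workflow text are made up for illustration. The only thing taken from this article is the three required permission entries.

```python
REQUIRED_PERMISSIONS = {"checks": "write", "actions": "read", "contents": "read"}


def missing_permissions(workflow_text: str) -> list[str]:
    """Return the required permission keys that are absent or have the wrong value.

    Naive scan: looks at `key: value` lines anywhere in the file, so treat the
    result as a hint rather than proof.
    """
    found = {}
    for line in workflow_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep:
            found[key.strip()] = value.strip()
    return [k for k, v in REQUIRED_PERMISSIONS.items() if found.get(k) != v]


# Hypothetical workflow fragments for the example.
good = "name: Autograding\npermissions:\n  checks: write\n  actions: read\n  contents: read\n"
bad = "name: Autograding\non: push\n"

print(missing_permissions(good))  # []
print(missing_permissions(bad))   # ['checks', 'actions', 'contents']
```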

For the full catalogue of error messages, see Common validation failures and what they mean.

Re-running validation after a fix

After you apply changes (manually or via wizard):

Make sure all your edits are saved (every change auto-saves; the wizard auto-saves what it applied).
Open the Instructions tab. You don't need to re-approve; the previous approval still stands.
Click Run autograder check again.

If validation passes this time, status flips to validated and Continue to Deploy unlocks.

If it fails again with a different error, repeat. The most common cause of repeat failures is fixing one symptom and revealing another.

If it fails the same way twice despite changes, something deeper is wrong — open the Tests tab and consider whether the test itself is what needs to go.

What if I just want to ship a "mostly working" assignment?

You can't deploy until validation hits 100%. Two safety valves exist:

Delete the failing test. If a test is failing because it's testing the wrong thing, just delete it from the Tests tab. The autograder doesn't enforce coverage minimums.

Download as ZIP. The Download Assignment ZIP option (visible in the Deploy step of any validated assignment) packages the artifacts as a zip file. You can hand-deploy without going through CodeTeach's deploy flow — bypassing the validation gate. Use this only if you know what you're doing — the assignment may have undetected issues.

When the AI Wizard makes things worse

The AI Wizard usually helps, but not always. Patterns to watch for:

The wizard keeps proposing the same change that doesn't work. Click Discard and edit manually instead.
The wizard's proposed diff is huge. Anything more than a few lines should be reviewed carefully, because large diffs sometimes break things you didn't realise were intentional.
The wizard rewrites your custom edits. If you've manually polished the solution and the wizard wants to replace large chunks of it, click Discard and manually apply only the piece that addresses the failing test.

Trust your eyes. The AI is a starting point, not the final answer.

Lesson capture

When a bounce-back is followed by a successful re-validation, CodeTeach quietly records what went wrong and what fixed it. These "human-validated lessons" feed into future AI generations so the same failure pattern is less likely to recur. You don't have to do anything to participate — it's automatic.

Where to go next

Just need to know what an error message means? → Common validation failures and what they mean is the catalogue.
Validation now passes? → Deploying your assignment.
Validation passed but you're not sure if it should have? → Read your tests carefully on the Tests tab. The autograder only catches what you tell it to catch.
