Windows compatibility: skill-creator scripts fail (subprocess PATHEXT, cp1252 encoding, select on pipes) #1061

@just2majic

Summary

Running python -m scripts.run_loop (or run_eval directly) on native Windows Python 3.14 hits three compatibility issues, all rooted in Unix-first assumptions in scripts/run_eval.py and adjacent files. The third is the blocker: until it is fixed, the optimizer cannot evaluate any query on Windows.

Environment

  • OS: Windows 11 Pro (10.0.26200)
  • Python: 3.14.3 (native Windows, not WSL)
  • Claude CLI: installed via npm at ~/AppData/Roaming/npm/claude.cmd
  • skill-creator: latest from anthropics/skills (commit shipped in current main as of 2026-04-28)

Issue 1 — subprocess.Popen(["claude", ...]) raises [WinError 2]

Files: scripts/run_eval.py (line 71) and scripts/improve_description.py (line 26).

Cause: With shell=False, subprocess.Popen on Windows does not search PATHEXT, so wrappers with extensions such as .cmd or .bat are not found by their bare name. The npm-installed Claude CLI on Windows is claude.cmd, so bare "claude" doesn't resolve.

Fix (one-liner, platform-conditional):

cmd = [
    "claude.cmd" if os.name == "nt" else "claude",
    "-p", query,
    ...
]
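As an alternative to hard-coding the .cmd suffix, the executable could be resolved with shutil.which, which does consult PATHEXT on Windows. This is only a sketch: resolve_cli and its path parameter are hypothetical names for illustration, not part of skill-creator.

```python
import shutil


def resolve_cli(name="claude", path=None):
    """Return the full path of `name`, or raise if it is not found.

    Hypothetical helper. shutil.which() tries the PATHEXT extensions on
    Windows, so "claude" can resolve to ".../claude.cmd" there, while on
    Unix it resolves to the plain executable. `path` overrides the search
    path (defaults to the PATH environment variable).
    """
    resolved = shutil.which(name, path=path)
    if resolved is None:
        raise FileNotFoundError(f"{name!r} was not found on the search path")
    return resolved
```

The result would then replace the first element of cmd, for example cmd = [resolve_cli("claude"), "-p", query, ...]. It also fails early with a clear message when the CLI is not installed, rather than surfacing [WinError 2] from Popen.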

Issue 2 — Path.write_text(...) raises UnicodeEncodeError

Files: scripts/run_loop.py (lines 151, 278, 313, 317, 321), scripts/run_eval.py (line 68), scripts/improve_description.py (line 189), scripts/generate_report.py (line 319), and two open(..., "w") calls in scripts/aggregate_benchmark.py.

Cause: Python on Windows defaults Path.write_text() to the locale codec (cp1252 on most installs), which can't encode characters like ✗ that the eval reports use for failed assertions.

Reproduction (excerpt from optimizer log):

File "C:\...\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '✗' in position 10183

Fix: Add encoding="utf-8" to all .write_text() and open(..., "w") calls in scripts that produce HTML/Markdown reports. ~10 sites.
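A minimal sketch of the change at one call site, together with a check that shows why cp1252 fails on this character. The names write_report and can_encode are hypothetical and only illustrate the pattern; the real call sites are the ones listed above.

```python
from pathlib import Path


def write_report(path: Path, text: str) -> None:
    """Write text as UTF-8 regardless of the platform's locale codec."""
    path.write_text(text, encoding="utf-8")


def can_encode(text: str, codec: str) -> bool:
    """Return True if `text` can be encoded with `codec`."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # cp1252 has no mapping for U+2717, which is what the traceback reports.
    print(can_encode("\u2717", "cp1252"))
    print(can_encode("\u2717", "utf-8"))
```

Any code that reads these files back should pass encoding="utf-8" as well, otherwise the same locale default applies on the read side. Setting the PYTHONUTF8=1 environment variable (Python's UTF-8 mode) is a possible workaround for users until the scripts are patched.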

Issue 3 (BLOCKING) — select.select([process.stdout], [], [], 1.0) raises [WinError 10038]

File: scripts/run_eval.py line 108.

Cause: Python's select.select() on Windows only operates on sockets, not on subprocess pipe file descriptors. The streaming-detection logic that polls claude -p's stdout for early-exit on skill triggering is fundamentally Unix-only as written.

Reproduction (every query in the optimizer fails immediately with):

Warning: query failed: [WinError 10038] An operation was attempted on something that is not a socket

This blocks all of run_eval, run_loop, and improve_description (which depends on eval data).

Proposed fix — thread-and-queue stdout draining. Works identically on Windows and Unix; preserves the early-exit semantics. Replace the select.select-based loop with:

import queue
import threading

def _drain_stdout_to_queue(stream, q):
    """Read subprocess stdout in chunks and push to a queue.

    Used instead of select.select() because select() on Windows only works on
    sockets, not subprocess pipe FDs. A thread-and-queue pattern works on both
    Windows and Unix. Pushes None as a sentinel on EOF.
    """
    try:
        while True:
            chunk = stream.read1(8192) if hasattr(stream, "read1") else stream.read(8192)
            if not chunk:
                break
            q.put(chunk)
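    except Exception:
        # Treat any read error (e.g. pipe closed after the process is killed)
        # as EOF; the finally block still delivers the sentinel.
        pass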
    finally:
        q.put(None)


# In run_single_query, after subprocess.Popen(...):
chunk_queue: queue.Queue = queue.Queue()
reader_thread = threading.Thread(
    target=_drain_stdout_to_queue,
    args=(process.stdout, chunk_queue),
    daemon=True,
)
reader_thread.start()

# Main loop:
while time.time() - start_time < timeout:
    try:
        chunk = chunk_queue.get(timeout=1.0)
    except queue.Empty:
        if process.poll() is not None:
            break
        continue
    if chunk is None:  # EOF sentinel
        break
    buffer += chunk.decode("utf-8", errors="replace")
    # ... existing line-buffered JSON parsing logic unchanged ...
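The pattern above can be exercised outside the optimizer with a small standalone script. The sketch below is an illustration under assumptions, not the actual run_eval code: output_contains and _drain are hypothetical names, and a Python child process stands in for claude -p. It returns as soon as a marker string appears on the child's stdout and kills the child if it is still running.

```python
import queue
import subprocess
import sys
import threading
import time


def _drain(stream, q):
    """Push chunks read from `stream` onto `q`; None marks EOF."""
    try:
        while True:
            chunk = stream.read1(8192)
            if not chunk:
                break
            q.put(chunk)
    finally:
        q.put(None)


def output_contains(cmd, marker, timeout=10.0):
    """Run `cmd`; return True as soon as `marker` appears on its stdout."""
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL
    )
    q = queue.Queue()
    reader = threading.Thread(target=_drain, args=(process.stdout, q), daemon=True)
    reader.start()
    buffer = ""
    found = False
    deadline = time.monotonic() + timeout
    try:
        while time.monotonic() < deadline:
            try:
                chunk = q.get(timeout=0.5)
            except queue.Empty:
                continue  # nothing yet; re-check the deadline
            if chunk is None:  # EOF sentinel from the reader thread
                break
            buffer += chunk.decode("utf-8", errors="replace")
            if marker in buffer:
                found = True  # early exit: no need to wait for the child
                break
    finally:
        if process.poll() is None:
            process.kill()
        process.wait()
        reader.join(timeout=1.0)
        if not reader.is_alive():
            process.stdout.close()
    return found


if __name__ == "__main__":
    child = "import time; print('TRIGGER', flush=True); time.sleep(30)"
    print(output_contains([sys.executable, "-c", child], "TRIGGER"))
```

Because the blocking read happens in the reader thread, the main loop only ever waits on queue.Queue.get, which behaves the same on Windows and Unix. The demo uses time.monotonic for the deadline; the snippet above keeps time.time to match the existing loop.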

Verified locally: with this port (plus issues 1 and 2 fixed), python -m scripts.run_loop runs successfully on native Windows Python 3.14, completing the full optimization loop with iterating descriptions and the train/test split.

Willing to submit a PR

I have all three fixes applied locally and the optimizer running end-to-end on Windows. Happy to open a PR with the changes if that's useful — the changes are small and platform-conditional where appropriate (issues 1 and 3 don't change Unix behavior).
