Date: 2026-04-27
Milestone: WebR follow-up (Session 4 of WebR follow-up sessions)
Predecessors:
- 2026-04-20 WebR Integration (Session 1 — typed framework + lm_robust)
- 2026-04-22 WebR Opaque Nodes (Session 2 — opaque path end-to-end)
The opaque-nodes session unlocked paste-as-written for ~70 unique R
functions by emitting webr-opaque nodes for unrecognized
assignments. The single remaining wall is file I/O inside the
WebR worker: read_xlsx("../Data/MasterData.xlsx"),
readRDS("results/m1.rds"),
haven::read_sav("survey.sav"), even
read.csv("foo.csv") on uploaded files all fail with “file
not found” because the WebR worker’s filesystem is empty — uploads only
ever reach the TS-side dataset registry.
Today’s binary-input papers (INTERLYSE-RUN-STATUS papers #2/#3/#4)
require a manual workaround: convert
.xlsx/.rds/.sav/.dta
to CSV outside the app, place the CSV in examples/, and
point the R code at it. This is a hard barrier: any paper using a binary
input format fails on first paste, and there’s no in-app path
forward.
Today’s worker also has no story for files written by
R: ggsave(), write.csv(),
writeLines(), etc. produce files in the worker’s VFS that
die silently when the worker terminates. Replication papers routinely
write tables and figures to disk; users have no way to retrieve
them.
This session bridges both directions. Uploaded files (binary and
otherwise) get mirrored into the WebR worker’s VFS so opaque R code can
read them; files written by the worker get surfaced in an artifacts
panel so users can download them. The is.data.frame probe
added in session 2 already auto-marshals worker results to TS Datasets
when applicable, so the moment read_xlsx(...) succeeds
inside R, the data flows downstream into the existing pipeline
machinery.
A future “editor mode” UI (Scripts/Data side-panels for authoring, separate from replication tree-view) will sit on top of the same workspace store this session establishes; that UI work is split into a follow-up spec.
In scope:
- A WorkspaceStore holding a path-keyed Map<string, Uint8Array> for all uploaded files (regardless of type).
- Binary files (.xlsx/.xls/.rds/.rdata/.sav) extracted from ZIPs into the workspace store (today they’re discarded).
- UploadZone accepts these binary types directly.
- Files mirrored into the worker VFS at /workspace/<original-zip-relative-path>.
- WebRRequest carries a cwd string; the worker prefixes eval with setwd(cwd).
- originFile threading: AnalysisCall → node params → WebRRequest.cwd.
- ZIP uploads replace /workspace/; single-file uploads append.
- Artifact tracking via originalUploads: Set<string>; after each Run, walk /workspace/ and surface anything not in the set.
- Acceptance: a binary-input paper runs as pasted (e.g. a .xlsx input).

Out of scope:
- WorkspaceView editor UI (Scripts/Data side-panels for from-scratch authoring) — separate spec, builds on this workspace store.
- originFile + existing sourceSpan provide the data; the UI feature is a separate milestone.
- Full setwd() emulation (line 72 of BACKLOG) — the per-script CWD scheme covers automatic CWD-from-origin, but explicit setwd("subdir") calls in user R code are not yet rewritten/intercepted. Tracked separately.
- Absolute paths (e.g. read.csv("/Users/author/data.csv")) — out of scope; only relative paths resolve, and absolute paths produce a clear error.
- Reset WebR session button (kill+respawn worker without losing workspace) — would benefit from this work but is its own UX feature.
- An output-path shim (redirecting e.g. ggsave("foo.pdf") into /workspace/output/) — track-by-exclusion handles arbitrary write paths without intercepts; no shim needed.

Files touched:
- src/ui/store/workspace.ts [new] — path-keyed Uint8Array store + sync queue; originalUploads set; lifecycle ops
- src/ui/store/files.ts [modified] — build WorkspaceStore from the extracted ZIP; feed binary files (today discarded)
- src/ui/components/toolbar/upload-zone.tsx [modified] — accept .xlsx/.xls/.rds/.rdata/.sav; wipe-workspace confirmation
- src/core/zip/extractor.ts [modified] — extract binary file bytes (today excluded); drop the per-file size cap
- src/core/parsers/shared/analysis-call.ts [modified] — add originFile?: string
- src/core/parsers/file-registry.ts [modified] — thread originFile into the recognizer call
- src/core/parsers/r/recognizer.ts [modified] — pass originFile through to AnalysisCall
- src/core/pipeline/types.ts [modified] — add originFile?: string to webr-typed and webr-opaque params
- src/core/pipeline/mapper.ts [modified] — carry originFile from AnalysisCall to node
- src/core/webr/protocol.ts [modified] — add cwd?: string to the WebRRequest dispatch-typed and dispatch-opaque variants
- src/core/webr/dispatch.ts [modified] — plumb cwd through the dispatcher API
- src/workers/webr-worker.ts [modified] — handle FS-write requests; setwd(cwd) before each eval
- src/workers/worker-manager.ts [modified] — sync workspace bytes on worker init; incrementally sync on new uploads; compute cwd from originFile; post-Run artifact discovery
- src/ui/store/artifacts.ts [new] — discovered artifacts; preview cache; download orchestration
- src/ui/components/panels/artifacts-panel.tsx [new] — collapsible artifacts panel
// src/ui/store/workspace.ts
interface WorkspaceState {
files: Map<string, Uint8Array>; // path → bytes (path-keyed flat map; '/' in keys forms tree)
originalUploads: Set<string>; // paths present after last upload (for artifact diff)
syncedToWebR: Set<string>; // subset of files that have been pushed to /workspace/
totalSize: number; // running sum for the 1.5GB cap
addFiles: (entries: Array<{ path: string; bytes: Uint8Array }>) => void;
wipe: () => Promise<void>; // also wipes WebR /workspace/
removeFile: (path: string) => void;
getPendingSync: () => Array<{ path: string; bytes: Uint8Array }>;
markSynced: (paths: string[]) => void;
markUnsynced: () => void; // called when WebR worker is recreated
}

The store is the single source of truth for “what’s in the
workspace.” Both the existing TS-side dataset registry (parsed
CSVs/DTAs) and the new VFS sync read from it. CSV bytes live here even
though parsed Datasets exist elsewhere — the duplication is bounded (CSV
bytes only, not parsed columns) and lets opaque R code resolve
read.csv("foo.csv") without surprise.
Three triggers, one code path
(workerManager.syncWorkspaceToWebR):
Worker boot (ensureWebRWorker):
after init-ready, iterate
workspace.getPendingSync() and post FS-write messages
before resolving the webrReady promise. Status stays at
loading until sync completes; only then transitions to
ready. Callers of ensureWebRWorker() can
therefore assume the FS is populated when the promise resolves.
New file added while worker is up:
workspace.addFiles(...) notifies the worker-manager, which
posts FS-write messages immediately. No status change (worker stays
ready).
Workspace wipe: post a
wipe-workspace message; worker calls
webR.FS.unlink() over each path under
/workspace/. After ack, mark all files unsynced; subsequent
ensureWebRWorker/addFiles calls will
re-sync.
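All three triggers can funnel through the same small helper. A minimal sketch, assuming a store exposing getPendingSync/markSynced (as in the WorkspaceState interface above) and a postFsWrite callback that resolves on the worker’s fs-ack; WorkspaceLike and postFsWrite are illustrative names, not the real worker-manager API:

```typescript
interface PendingFile { path: string; bytes: Uint8Array; }

interface WorkspaceLike {
  getPendingSync(): PendingFile[];
  markSynced(paths: string[]): void;
}

// One code path for all three triggers: worker boot, incremental upload,
// and re-sync after a wipe. postFsWrite posts a single batched fs-write
// message and resolves when the worker acks it.
async function syncWorkspaceToWebR(
  workspace: WorkspaceLike,
  postFsWrite: (entries: PendingFile[]) => Promise<void>,
): Promise<string[]> {
  const pending = workspace.getPendingSync();
  if (pending.length === 0) return []; // nothing staged: no message at all
  await postFsWrite(pending);
  const paths = pending.map((e) => e.path);
  workspace.markSynced(paths); // already-synced files are never re-sent
  return paths;
}
```

Because markSynced runs only after the ack, a failed post leaves the files pending and the next trigger retries them.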
Protocol additions (all four FS operations consolidated here for
reference; fs-list and fs-read are used in
§3.5):
// src/core/webr/protocol.ts — additions to WebRRequest
| { type: 'fs-write'; id: string; entries: Array<{ path: string; bytes: Uint8Array }> }
| { type: 'fs-wipe'; id: string }
| { type: 'fs-list'; id: string; root: string }
| { type: 'fs-read'; id: string; path: string }
// additions to WebRResponse
| { type: 'fs-ack'; id: string; written?: string[]; error?: string }
| { type: 'fs-list-result'; id: string; entries: Array<{ path: string; size: number; mtime: number }> }
| { type: 'fs-read-result'; id: string; bytes?: Uint8Array; error?: string }

For fs-write, the worker creates parent directories as
needed (webR.FS.mkdir recursive), then writes each file
with webR.FS.writeFile. Paths in the request are
workspace-relative (e.g., code/01-prep.R); the worker
prepends /workspace/ to form absolute paths.
The fs-list root argument is
/workspace; the worker walks recursively and returns one
entry per file (not directory). fs-read round-trips bytes
for downloads/previews.
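The mkdir-then-write step reduces to computing each path’s parent chain. An illustrative helper (not the actual worker code); the worker would pass each returned directory to webR.FS.mkdir before the webR.FS.writeFile:

```typescript
// Parent directories to create, shallowest first, before writing a
// workspace-relative path under the given root.
function dirsToCreate(relPath: string, root = '/workspace'): string[] {
  const parts = relPath.split('/').slice(0, -1); // drop the filename
  const dirs: string[] = [];
  let current = root;
  for (const part of parts) {
    current = `${current}/${part}`;
    dirs.push(current);
  }
  return dirs;
}
```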
// src/core/parsers/shared/analysis-call.ts — modified
interface AnalysisCall {
// ... existing fields
originFile?: string; // path relative to workspace root, e.g. "code/01-prep.R"
}

FileRegistry.processFiles() already iterates per-file;
the recognizer just needs to know the current entry.path
and stamp it onto every emitted call. For inline-paste / single-file
mode where there’s no meaningful “origin file” (just the editor’s tab
content), originFile is left undefined.
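The stamp itself is a one-liner inside that walk. A sketch, with recognize and the entry shape standing in for the real recognizer API:

```typescript
interface AnalysisCallLite { fn: string; originFile?: string; }

// Stamp the current entry's path onto every call it produced. Inline-paste
// callers skip this walk entirely, so originFile stays undefined there.
function stampOriginFiles(
  entries: Array<{ path: string; source: string }>,
  recognize: (source: string) => AnalysisCallLite[],
): AnalysisCallLite[] {
  return entries.flatMap((entry) =>
    recognize(entry.source).map((call) => ({ ...call, originFile: entry.path })),
  );
}
```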
// src/core/pipeline/types.ts — extend params
interface WebRTypedParams { /* ... */ originFile?: string; }
interface WebROpaqueParams { /* ... */ originFile?: string; }

// src/workers/worker-manager.ts — derive cwd before dispatch
function cwdFor(originFile: string | undefined): string {
if (!originFile) return '/workspace';
const slash = originFile.lastIndexOf('/');
return slash < 0 ? '/workspace' : `/workspace/${originFile.slice(0, slash)}`;
}

Each WebRRequest carries cwd; the worker
prefixes eval with setwd(cwd) (sticky — no restoration).
Cost: one extra evalRVoid per dispatch — negligible
compared to the actual eval.
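The derivation is easy to pin down with executable checks; cwdFor below is copied from the worker-manager sketch above, exercised on the inline-paste, ZIP-root, one-level, and nested cases:

```typescript
function cwdFor(originFile: string | undefined): string {
  if (!originFile) return '/workspace';
  const slash = originFile.lastIndexOf('/');
  return slash < 0 ? '/workspace' : `/workspace/${originFile.slice(0, slash)}`;
}

console.assert(cwdFor(undefined) === '/workspace');                      // inline paste
console.assert(cwdFor('analysis.R') === '/workspace');                   // ZIP root
console.assert(cwdFor('code/run.R') === '/workspace/code');              // one level deep
console.assert(cwdFor('code/sub/03-merge.R') === '/workspace/code/sub'); // nested
```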
After a Run completes (all in-flight dispatches settled),
worker-manager posts an fs-list request (see protocol
additions in §3.3). The worker walks /workspace/
recursively (webR.FS.readdir + stat for each entry),
returns one entry per file. Worker-manager diffs the result paths
against workspace.originalUploads; any path not in the set
is an artifact. Each artifact in the store carries: path, size, mtime,
mime-type guess (from extension).
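The diff itself is small. A sketch, where the mime table is illustrative rather than the app’s actual mapping:

```typescript
interface FsEntry { path: string; size: number; mtime: number; }
interface Artifact extends FsEntry { mime: string; }

// Illustrative extension → mime table; unknown extensions fall through
// to application/octet-stream.
const MIME_BY_EXT: Record<string, string> = {
  csv: 'text/csv', svg: 'image/svg+xml', png: 'image/png',
  pdf: 'application/pdf', tex: 'text/x-tex', txt: 'text/plain',
};

// Anything the fs-list walk found that was not uploaded is an artifact.
function discoverArtifacts(listing: FsEntry[], originalUploads: Set<string>): Artifact[] {
  return listing
    .filter((entry) => !originalUploads.has(entry.path))
    .map((entry) => {
      const ext = entry.path.split('.').pop()?.toLowerCase() ?? '';
      return { ...entry, mime: MIME_BY_EXT[ext] ?? 'application/octet-stream' };
    });
}
```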
The artifacts store keeps the latest snapshot. The UI panel renders it grouped by parent directory; new-since-previous-Run paths get a “new” dot for one render cycle (cleared on next Run start).
Downloads: clicking an artifact triggers an fs-read
request → bytes round-trip back to TS → Blob +
<a download> synthetic link. Previews for
text/CSV/SVG/PNG/PDF under 5MB use the same round-trip but render inline
(text: pre-wrap; SVG: inline; PNG: img src=data URI; CSV: small table
with first 50 rows; PDF: object embed).
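The preview gate is a pure decision on extension and size. A sketch under the 5MB cap stated above; the set of extensions treated as plain text is an assumption, not the app’s confirmed list:

```typescript
type PreviewKind = 'text' | 'csv' | 'svg' | 'png' | 'pdf' | 'none';

const PREVIEW_BYTE_LIMIT = 5 * 1024 * 1024; // 5MB cap from the spec

function previewKind(path: string, size: number): PreviewKind {
  if (size > PREVIEW_BYTE_LIMIT) return 'none'; // download-only
  const ext = path.split('.').pop()?.toLowerCase() ?? '';
  if (ext === 'csv') return 'csv';  // small table, first 50 rows
  if (ext === 'svg') return 'svg';  // inline
  if (ext === 'png') return 'png';  // img with data URI
  if (ext === 'pdf') return 'pdf';  // <object> embed
  if (['txt', 'log', 'md', 'tex', 'r'].includes(ext)) return 'text'; // pre-wrap
  return 'none'; // e.g. .rds and other binaries: download only
}
```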
A ZIP dropped on the UploadZone when the workspace already has
files: show a modal with “This will replace your current workspace.
Continue?” and Cancel/Replace buttons. On Replace, call
workspace.wipe() (which wipes WebR FS too) before
extracting the new ZIP. On Cancel, abort the upload.
Single-file uploads (any type) skip the prompt and append. Editor mode (future spec) will create files programmatically; same append path.
The originalUploads set is recomputed at the end of
every upload completion (i.e.,
originalUploads = new Set(workspace.files.keys())). After
the wipe-and-replace flow, the new uploads are the new originals;
previously-discovered artifacts are gone (they were in the wiped VFS)
and the artifacts panel resets.
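That bookkeeping can be sketched as a pure state transition (names hypothetical; the real store also has to clear syncedToWebR and post the fs-wipe message before extraction):

```typescript
interface WorkspaceSnapshot {
  files: Map<string, Uint8Array>;
  originalUploads: Set<string>;
  artifacts: string[]; // discovered artifact paths
}

// Wipe-and-replace: old files, originals, and artifacts all go; the new
// uploads become the new originals for the next artifact diff.
function replaceWorkspace(
  uploads: Array<{ path: string; bytes: Uint8Array }>,
): WorkspaceSnapshot {
  const files = new Map(uploads.map((u) => [u.path, u.bytes] as [string, Uint8Array]));
  return {
    files,
    originalUploads: new Set(files.keys()),
    artifacts: [], // previously discovered artifacts lived in the wiped VFS
  };
}
```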
workspace.addFiles(...) synchronously stages bytes. What happens next depends on worker state:
- ready: post fs-write immediately. No UI status change.
- loading/installing: queue; sync runs as part of ensureWebRWorker’s init tail.
- not booted: the onPipelineChange prewarm path triggers boot, and sync runs as part of that boot’s init tail. If the user’s pipeline never hits a webr-typed/webr-opaque node, WebR never boots, and the bytes sit in the TS workspace store unused — acceptable, costs only what the user uploaded.

The key invariant: when the webrReady promise resolves, all currently-staged files are present in /workspace/. Dispatches downstream of ensureWebRWorker() await it, so they can safely assume their inputs are readable.
On a failed write, fs-ack carries error; worker-manager surfaces it as a toast and marks the failed file unsynced. Subsequent dispatches that depend on the file fail with a “file not found” R error, which propagates as a normal opaque dispatch error.

On a path collision between an uploaded file and an R-written file, last write wins (webR.FS.writeFile overwrites). Document this as a soft invariant; not enforced.

If sync fails during worker boot, the webrReady promise rejects; status goes to error; the user has to re-upload. Documented limitation; rare.

CWD derivation cases:
- originFile is undefined (single-file paste, inline editor): cwd = /workspace.
- Script at ZIP root (analysis.R directly in the ZIP root): originFile = "analysis.R"; lastIndexOf(‘/’) < 0; cwd = /workspace.
- Single-level script: originFile = "code/run.R"; cwd = /workspace/code.
- Nested script: originFile = "code/sub/03-merge.R"; cwd = /workspace/code/sub.

Artifacts persist in /workspace/ until the worker dies (page reload) or the workspace is wiped.

Steady state per file: bytes exist in TS-side
WorkspaceStore and in WebR’s WASM heap
simultaneously (Copy semantics, not Transferable). For typical
replication packages with 50–200MB of data files, this is 50–200MB extra
vs. today. Modern browsers handle this comfortably; the existing 1.5GB
total cap covers it.
Per-file budget: none (the per-file cap is removed). The total cap stays at 1.5GB.
Future optimization (out of scope): switch to
Transferable ArrayBuffer when the dataset-marshal Transferable backlog
item lands. Both paths share the same postMessage envelope
at that point.
[Upload] button — accept attribute extended:
".zip,.R,.r,.csv,.dta,.xlsx,.xls,.rds,.rdata,.sav"
Drop and click handlers route binary types into
workspace.addFiles(...) directly (no parsing) and trigger
sync if WebR is up.
Plain modal, two buttons: - “Replace workspace” (primary, destructive) - “Cancel” (secondary)
Body: lists the files currently in the workspace that will be removed (collapsed to “X files (Y MB)” if >5).
Sits in the right sidebar (alongside the existing properties/results panels), collapsed by default. Header shows artifact count and total size. When expanded:
▼ Artifacts (3 files, 1.2 MB)
output/
▸ tables/main.tex [download]
▸ figs/coef-plot.svg [preview] [download]
results/
▸ m1-summary.csv [preview] [download]
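The grouping for that render can be a small pure function. A sketch that buckets by top-level directory, as in the tree above (root-level files land in an unnamed bucket):

```typescript
// 'output/tables/main.tex' → bucket 'output/' with label 'tables/main.tex'.
function groupByTopDir(paths: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const path of paths) {
    const slash = path.indexOf('/');
    const dir = slash < 0 ? '' : path.slice(0, slash + 1);
    const label = slash < 0 ? path : path.slice(slash + 1);
    const bucket = groups.get(dir);
    if (bucket) bucket.push(label);
    else groups.set(dir, [label]);
  }
  return groups;
}
```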
Preview opens an inline overlay (SVG inline; PNG via data URI;
text/CSV with row truncation; PDF via <object>).
Artifacts are orthogonal to typed pipeline outputs; they don’t appear in the DAG. (A future “promote artifact to dataset” feature is out of scope — the auto-marshal probe already handles the case where R loads a file and assigns to a binding, which is the normal data-input path.)
Testing:
- workspace.ts lifecycle (add/wipe/append, originalUploads recomputation, totalSize accounting); CWD derivation from originFile (4 cases above).
- Files are readable under /workspace/... after ensureWebRWorker() resolves.
- read_xlsx("../Data/foo.xlsx") from originFile = "code/run.R" → assert the dispatch succeeds and the result is a marshaled Dataset (uses the real WebR worker, gated behind the same env flag as the existing webr integration tests).
- Artifact discovery: R writes output/x.csv; run; assert the artifact panel store contains output/x.csv exactly once and not in originalUploads.

Compatibility:
- webR.FS is part of the standard webr npm package — no new dependencies.
- Requests without cwd still work for backward compat in tests, but production code always sets cwd.
- AnalysisCall.originFile is optional — existing recognizer call sites without the field continue to work; only the FileRegistry walk is updated to populate it.
- The MAX_FILE_SIZE constant is removed from extractZip; only MAX_TOTAL_DATA_SIZE remains.
- The unsupportedDataFiles list in ClassifiedFiles becomes empty (since binaries are now first-class) — keep the field for protocol compatibility but document its emptiness.

Risks / open questions:
- Oversized writes surface an error via fs-ack; status goes to error; the user can wipe + re-upload smaller. No silent corruption.
- Upload/write collision (e.g. output/x.csv in the upload, then R writes to the same path): documented soft invariant; uploaded copy wins on first sync, R-written copy wins after the first Run. Not exposed as a config; can revisit if observed.
- Everything is gated behind the same VITE_DISABLE_WEBR=1 env flag as the rest of WebR. With WebR disabled, the workspace store still exists (CSVs still parse TS-side; single-file binaries are simply unusable but don’t error on upload — they sit in the store forever).
- Explicit setwd("subdir") in user R code → no rewrite this session.