Session isolation in a GxP environment: what Kubernetes-backed Workbench actually buys you

June 10, 2026

We keep having the same conversation with pharma IT leads. They are evaluating Posit Workbench, they like the product, and then the Quality Team asks the main question: "How do we know one user's session can't affect another's?" On a traditional shared server, the honest answer is: "We can't guarantee that. We write policies and guidelines and hope that users follow them."

That approach works fine for internal tooling. It stops working when an FDA inspector asks for proof that a specific analysis ran in a controlled, documented environment. 21 CFR Part 11 is explicit about individual accountability. If two statisticians share a runtime and one of them upgrades a package mid-study, you can't prove the other person's results weren't affected. A gentleman's agreement about package management is not evidence an inspector will accept.

Our approach is to run Posit Workbench on Kubernetes. That solves the problem at the infrastructure level instead of relying on process controls and hoping for the best. Session isolation is one layer of a modern statistical computing environment; this piece focuses on the runtime.

How Kubernetes-backed session isolation works in Posit Workbench

We're assuming you're already familiar with Posit Workbench. One of its components, called the Launcher, is an essential part of this solution. When a user clicks "New Session," the Launcher creates a Kubernetes pod, rather than a process on a shared server. Each pod runs its own container with dedicated memory and CPU allocation. Once the user finishes their work and closes the session, the pod is deleted. The next session starts from a clean image.

What does "isolation" mean in this context precisely? Each session pod operates within its own process namespace, meaning users cannot see each other's processes. The filesystem is based on a read-only container image with ephemeral changes, and resource ceilings are enforced by Kubernetes. If a job attempts to consume more memory than the pod permits, Kubernetes terminates it. Other sessions remain unaffected.
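To make the per-session pod concrete, here is a minimal sketch of the kind of manifest involved, written as a plain Python dict. It is not the manifest the Launcher actually generates: the function name, label keys, naming scheme, and registry URL are assumptions for illustration. Only the general shape (one container per session, with CPU and memory requests and limits) follows from the description above.

```python
def build_session_pod(user: str, image: str, cpus: int, memory_gib: int) -> dict:
    """Return a Kubernetes Pod manifest (as a plain dict) for one user session."""
    resources = {"cpu": str(cpus), "memory": f"{memory_gib}Gi"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "generateName": f"session-{user}-",  # hypothetical naming scheme
            "labels": {"app": "workbench-session", "user": user},  # hypothetical labels
        },
        "spec": {
            "restartPolicy": "Never",  # a finished session is not restarted
            "containers": [
                {
                    "name": "session",
                    "image": image,
                    # requests == limits: the scheduler reserves this amount and
                    # the container cannot exceed it; going over the memory
                    # limit terminates this container only.
                    "resources": {
                        "requests": dict(resources),
                        "limits": dict(resources),
                    },
                }
            ],
        },
    }


# Hypothetical registry and tag, smallest profile size mentioned in this article.
pod = build_session_pod("alice", "registry.example.com/study-a:3f2c1ab", cpus=1, memory_gib=2)
print(pod["spec"]["containers"][0]["resources"]["limits"])
```

Setting requests equal to limits is one reasonable choice for this use case, since it makes the allocation a fixed quantity rather than a range.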

Resource profiles: predictable CPU and memory allocation

Users do not select arbitrary CPU and memory values. Instead, we define multiple named resource profiles, ranging from 1 CPU / 2 GB to 30 CPU / 240 GB, which users choose from a dropdown menu. Custom resource requests are disabled. This approach ensures that every session's resource allocation is a known, logged quantity.
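The idea of a fixed profile list can be sketched in a few lines of Python. This is an illustration of the concept, not Workbench's configuration format. The profile names and the middle entry are hypothetical; only the smallest (1 CPU / 2 GB) and largest (30 CPU / 240 GB) sizes come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceProfile:
    name: str
    cpus: int
    memory_gb: int


# Only the first and last sizes are taken from the article; the names and the
# middle entry are made up for illustration.
PROFILES = {
    p.name: p
    for p in (
        ResourceProfile("small", 1, 2),
        ResourceProfile("medium", 4, 16),  # hypothetical
        ResourceProfile("xlarge", 30, 240),
    )
}


def resolve_profile(name: str) -> ResourceProfile:
    """Look up a named profile; anything outside the fixed list is rejected."""
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(f"unknown resource profile: {name!r}") from None


print(resolve_profile("xlarge"))
```

Because a session can only ever be launched with a name from this list, logging the profile name is enough to reconstruct the exact CPU and memory allocation later.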

These profiles map to different node pools. Smaller interactive sessions run on shared general-purpose nodes. When a user selects a larger profile, the cluster autoscaler provisions a dedicated memory-optimized instance. The user simply picks a size from a menu without needing to understand the underlying infrastructure. Every session launch is logged with the exact profile selected, and that entry becomes part of the compliance record.

Larger sessions can run on spot instances, which reduces the overall cost. The tradeoff is that the node may be reclaimed by the cloud provider. For interactive work, this is generally acceptable since the session reconnects automatically. For long-running batch jobs, this requires more careful consideration.

Study-specific container images for package version control

This is an aspect that often gets overlooked. We do not run a single generic R environment. Each study receives its own container image with the specific R and Python package versions it requires. These images are built from git-tracked Dockerfiles, linted with Hadolint, scanned for vulnerabilities with Trivy, tagged with the build SHA, and pushed to a private container registry. Two different studies can use two different versions of the same package without conflict because they run in entirely separate containers.
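As a rough sketch of what tagging with the build SHA could look like, the helper below composes an image reference from a registry, a study name, and a git SHA. The registry host, the study names, and the rule that tags must be hex SHAs are all assumptions for illustration; the actual pipeline may name images differently.

```python
HEX_DIGITS = set("0123456789abcdef")


def image_reference(registry: str, study: str, build_sha: str) -> str:
    """Compose an image reference tagged with the commit that built it."""
    # Reject mutable tags such as "latest": a tag that can move cannot serve
    # as evidence of what actually ran.
    if not build_sha or not set(build_sha) <= HEX_DIGITS:
        raise ValueError("build_sha must be a lowercase hex git SHA")
    return f"{registry}/{study}:{build_sha}"


# Two hypothetical studies pinned to different builds of their own images.
STUDY_IMAGES = {
    "study-a": image_reference("registry.example.com", "study-a", "3f2c1ab"),
    "study-b": image_reference("registry.example.com", "study-b", "9d04e7c"),
}
print(STUDY_IMAGES["study-a"])
```

The point of the SHA tag is traceability: given a tag, the Dockerfile revision that produced the image can be found in git.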

Users cannot install packages into the base image from within a session. If a study requires a new package, the request goes through the image build pipeline, which produces a new scanned and tagged image. The profile configuration is then updated accordingly. The image list per cluster tier is fixed. Users receive only the images that have been validated for their environment.

Where user files persist: EFS-backed home directories

Ephemeral pods raise a practical question: where does a user's work persist when the session ends? Each user's home directory is stored on EFS (Elastic File System), which is mounted into every session pod they launch. When a session terminates, the home directory remains intact. When a new session starts, it mounts the same home directory.

This provides persistence without shared state. Each user's home directory is separate from every other user's. The EFS filesystem is backed up through AWS Backup with retention policies that satisfy GxP requirements. The session container is disposable, but the underlying data is not.
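One possible way to wire this up, shown here as a hedged sketch, is a single EFS-backed PersistentVolumeClaim mounted into each pod with a per-user subPath. The article does not say whether the real deployment uses this layout; the claim name, the volume name, and the /home/<user> path are assumptions. The `persistentVolumeClaim`, `mountPath`, and `subPath` fields themselves are standard Kubernetes pod spec fields.

```python
def home_volume(user: str, claim_name: str = "workbench-home") -> tuple[dict, dict]:
    """Return (volume, volumeMount) pod spec fragments for a user's home directory."""
    volume = {
        "name": "home",
        # Hypothetical claim assumed to be backed by the EFS filesystem.
        "persistentVolumeClaim": {"claimName": claim_name},
    }
    mount = {
        "name": "home",
        "mountPath": f"/home/{user}",
        # subPath exposes only this user's directory inside the pod,
        # not the whole filesystem.
        "subPath": user,
    }
    return volume, mount


volume, mount = home_volume("alice")
print(mount)
```

With this layout the pod can be deleted freely, while the directory behind the mount outlives every session.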

A shared storage volume is also available for collaborative work, but access to it is explicit. Files do not leak between users by default.

Audit trails for 21 CFR Part 11 compliance

Workbench captures R console input and output, session lifecycle events, user identity, container image used, and resource profile selected, all in structured JSON format. These logs are then shipped to CloudWatch. In the validated environment, log retention is set to seven years in accordance with FDA record-keeping requirements, and the log groups are encrypted with a dedicated KMS key.
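For illustration, a session lifecycle record in structured JSON might look like the output of the sketch below. The field names and the event name are hypothetical and are not Workbench's actual log schema; the sketch only shows the kind of information listed above travelling together in one record.

```python
import json
from datetime import datetime, timezone


def session_event(event: str, user: str, session_id: str,
                  image: str, profile: str, at: datetime) -> str:
    """Serialize one session lifecycle event as a single JSON line."""
    if at.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    record = {
        "timestamp": at.astimezone(timezone.utc).isoformat(),
        "event": event,            # e.g. "session_start" (hypothetical name)
        "user": user,
        "session_id": session_id,
        "image": image,            # container image the session ran in
        "profile": profile,        # named resource profile selected
    }
    # sort_keys gives a stable field order, which makes records easy to diff.
    return json.dumps(record, sort_keys=True)


line = session_event(
    "session_start", "alice", "s-001",
    "registry.example.com/study-a:3f2c1ab", "small",
    at=datetime(2026, 6, 10, 9, 0, tzinfo=timezone.utc),
)
print(line)
```

Requiring a timezone-aware timestamp avoids ambiguity about when an event happened, which matters when records are read years later.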

Separately, CloudTrail captures S3 data events at the object level. This provides two independent audit trails that do not depend on users documenting anything manually. When an inspector asks what environment a particular analysis ran in, the session log provides the image tag and resource profile. The container image manifest provides every package version. The S3 audit trail shows which data files were accessed. All of this is immutable and timestamped.
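Answering the inspector's question then comes down to a lookup over the session log. The sketch below assumes JSON-lines records with hypothetical fields `session_id`, `event`, `image`, and `profile`; the real log schema and the way logs are queried (for example through CloudWatch) will differ.

```python
import json


def environment_for(session_id: str, log_lines: list[str]) -> dict:
    """Return the image and profile recorded when the given session started."""
    for line in log_lines:
        record = json.loads(line)
        if record.get("session_id") == session_id and record.get("event") == "session_start":
            return {"image": record["image"], "profile": record["profile"]}
    raise LookupError(f"no session_start record for {session_id!r}")


# Hypothetical sample records, one JSON object per line.
sample_logs = [
    '{"event": "session_start", "session_id": "s-001", "user": "alice", '
    '"image": "registry.example.com/study-a:3f2c1ab", "profile": "small"}',
    '{"event": "session_start", "session_id": "s-002", "user": "bob", '
    '"image": "registry.example.com/study-b:9d04e7c", "profile": "xlarge"}',
]
print(environment_for("s-002", sample_logs))
```

From the returned image tag, the image manifest gives the package versions, so the chain from analysis to environment does not rely on anyone's notes.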

Operational considerations and tradeoffs

We should be transparent about the operational cost. This architecture requires managing EKS clusters, node groups with autoscaling, EFS volumes, a container image build pipeline with security scanning, Helm charts, and ArgoCD for deployment synchronization. A shared Linux server with RStudio installed is significantly simpler to operate. Managing this as infrastructure as code is what keeps it operable rather than a pile of hand-maintained configuration.

Cold starts are a real consideration for the user experience. When a user selects a large resource profile and the cluster autoscaler needs to provision a new node, the wait can be two to three minutes. Maintaining warm nodes during business hours mitigates this, but adds ongoing cost regardless of utilization.

Additionally, when issues arise, the platform team needs to work with Kubernetes events and pod logs rather than standard server diagnostics. This requires a skill set that most pharma IT organizations are still developing.

Why architectural isolation beats process-based controls

The alternative is having to explain to an auditor why we cannot prove that two concurrent sessions did not interfere with each other. With per-session pods, immutable container images, and seven years of structured audit logs, that question answers itself. The isolation is architectural. It does not depend on anyone following a process or exercising caution. We have run this pattern in production for pharma teams; one scalable Posit deployment built entirely as code shows it operating at scale.

For teams running exploratory analytics without regulatory requirements, this level of infrastructure is unnecessary. For any organization where an inspector will eventually ask for proof that the analysis environment was controlled, documented, and isolated from other concurrent work, building that assurance into the infrastructure from the start is considerably easier than attempting to retrofit it later.

This is part of our Data in the SCE series on building compliant, usable clinical data infrastructure. See also designing clinical data storage for submission and exploration, on the three-layer data lake behind this runtime. For a broader picture of how the runtime, validation, storage, and audit layers fit together, see our guide to the modern statistical computing environment or the Modern SCE for Pharma ebook. If you are working through similar GxP infrastructure decisions and want a second pair of eyes, talk to our pharma team.