How We Handle Upload Session State in Our Apps
When we ship a chunked upload feature at 137Foundry, the part that takes the most design care is the upload session state. The chunked-protocol layer (whether Tus or S3 multipart) handles the wire-level resumability. The session state layer handles everything around it: authorization, quotas, progress tracking, cleanup, and the user-visible upload status across the application.
This is a walk-through of how we structure that state, what the tradeoffs are, and what we have changed over time.
What "Session State" Actually Means
An upload session is the bookkeeping for one in-flight upload. The state record tracks:
The upload ID (a UUID we control).
The owning user ID (for authorization).
The destination path (where the file will live after assembly).
The declared content type and size from the client.
The current state (pending, uploading, processing, ready, rejected, expired).
Timestamps (created, last activity, completed).
A reference to the underlying protocol session (Tus session ID, S3 multipart upload ID, or equivalent).
The session record is what the application server uses to authorize chunk operations, report progress to the user, and clean up after abandoned uploads. It is separate from the actual file content, which lives in object storage.
Where the State Lives
We store session state in our primary application database (Postgres in most projects). The alternative is a key-value store (Redis), which is faster but loses the data on restart unless persistence is configured.
Postgres wins for us because the session state is small (a few hundred bytes per session), the query rate is low (one query per chunk on average, which is well within database capacity), and the consistency guarantees match what we want (no lost session records, transactional updates when state transitions).
For very high-throughput upload systems, Redis or a dedicated KV store can make sense. The session state becomes a hot path and the in-memory access pattern is much cheaper than database round-trips. Most of our projects do not reach that scale, so the Postgres pattern stays.
State Transitions
The session record moves through a small number of states. The transitions we use:
pending -> uploading: The client sends the first chunk. Once any chunk has been received, the session is actively uploading.
uploading -> processing: The client signals completion (or the protocol detects all chunks have arrived). The application server validates the chunk count, triggers post-upload processing (virus scan, content validation, metadata extraction), and the user sees a "processing" indicator.
processing -> ready: All post-upload processing has completed successfully. The file is now accessible to the user.
processing -> rejected: Post-upload processing flagged the file as invalid (the virus scan found a threat, the content type did not match, or a format check failed). The file is moved to a quarantine location or deleted, the session is marked rejected, and the user is notified.
pending or uploading -> expired: No activity for the configured expiration window (we use 7 days). The session is marked expired and the partial data is cleaned up.
Each transition writes to the database with a timestamp. The audit trail is useful for debugging support tickets and for understanding upload patterns over time.
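The transition list above can be sketched as a small lookup table. This is a minimal illustration, not our production code: the Session class, the transition function, and the in-memory history list are hypothetical stand-ins for a database row and its audit trail.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Allowed transitions, copied from the list above. Terminal states map to an empty set.
ALLOWED_TRANSITIONS = {
    "pending": {"uploading", "expired"},
    "uploading": {"processing", "expired"},
    "processing": {"ready", "rejected"},
    "ready": set(),
    "rejected": set(),
    "expired": set(),
}


class InvalidTransition(Exception):
    """Raised when a requested state change is not in ALLOWED_TRANSITIONS."""


@dataclass
class Session:
    # Hypothetical in-memory stand-in for one upload_sessions row.
    state: str = "pending"
    history: list = field(default_factory=list)  # (from_state, to_state, timestamp)


def transition(session: Session, new_state: str, now: Optional[datetime] = None) -> None:
    """Move the session to new_state and record the change with a timestamp."""
    if new_state not in ALLOWED_TRANSITIONS[session.state]:
        raise InvalidTransition(f"{session.state} -> {new_state}")
    now = now or datetime.now(timezone.utc)
    session.history.append((session.state, new_state, now))
    session.state = new_state
```

Keeping the table as data means a new state is a one-line change, and any transition not listed fails loudly instead of silently corrupting the record.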
Per-User Quotas
The session state record is where per-user upload quotas are enforced. Before creating a new session, we check the user's current usage against their plan limits.
The check happens at two levels:
Storage quota: the total size of all files this user owns. If the new upload would push the user past their limit, we refuse to create the session.
Concurrent upload quota: the number of in-progress uploads for this user. We limit this (typically to 5 to 10 concurrent uploads per user) to prevent abuse and to keep our upload-session table from growing unbounded.
For users on enterprise plans with higher limits, the quota values are pulled from the user's plan tier. For users on free or trial plans, the limits are tighter to encourage upgrade. The session state layer is where these business rules attach to the underlying upload infrastructure.
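As a rough sketch of the two-level check, assuming the caller has already looked up the user's current usage and active session count: PlanLimits, check_quota, and the reason strings are illustrative names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanLimits:
    # Hypothetical per-plan limits; real values would come from the user's plan tier.
    max_storage_bytes: int
    max_concurrent_uploads: int


def check_quota(limits: PlanLimits, used_bytes: int, active_uploads: int, declared_size: int):
    """Return (allowed, reason) for a request to create a new upload session."""
    # Storage quota: would the declared size push the user past the limit?
    if used_bytes + declared_size > limits.max_storage_bytes:
        return False, "storage_quota_exceeded"
    # Concurrent upload quota: is the user already at the in-progress limit?
    if active_uploads >= limits.max_concurrent_uploads:
        return False, "too_many_concurrent_uploads"
    return True, "ok"
```

The function only reports which limit was hit. What happens next (refusing the session, or expiring an older stale one) stays with the caller.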
Progress Reporting
The session state record tracks the current upload offset (in bytes). Every time the client successfully uploads a chunk, the offset advances. The application server can report progress to other clients (e.g., a multi-window user, a real-time UI on another tab) by reading the current offset.
For real-time progress display, we use server-sent events or WebSockets to push offset updates to interested clients. This is mostly used for collaboration features (showing another team member that an upload is in progress, displaying the progress in a shared workspace).
For single-user progress display, the upload progress is reported by the client itself based on the bytes it has sent. The server-side offset is mainly for cross-client visibility and for resume support.
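For the server-sent events case, a progress update is just a small text frame. The sketch below shows one way to build it. The event name "progress" and the payload fields are assumptions chosen for illustration. Only the frame layout (field lines ended by a blank line) comes from the SSE format itself.

```python
import json


def progress_event(upload_id: str, current_offset: int, declared_size: int) -> str:
    """Format one progress update as a server-sent event frame."""
    # Guard against zero-byte uploads, which are complete by definition.
    if declared_size == 0:
        percent = 100.0
    else:
        percent = round(100.0 * current_offset / declared_size, 1)
    payload = {
        "upload_id": upload_id,
        "offset": current_offset,
        "size": declared_size,
        "percent": percent,
    }
    # An SSE frame is a set of "field: value" lines terminated by a blank line.
    return f"event: progress\ndata: {json.dumps(payload)}\n\n"
```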
Resume Support
The session record makes resume possible. When a client reconnects after a network drop and wants to resume an upload, the flow is:
Client queries the application server for the session's current state.
Application server checks authorization (this user owns this session) and returns the current offset.
Client resumes upload from the reported offset.
For Tus-based uploads, the resume protocol is built into the wire format per the Tus specification: a HEAD request returns the current offset in the Upload-Offset header. For S3 multipart uploads, the equivalent is ListParts, which returns the parts that have been uploaded so far.
The session state record also allows resume across longer time gaps. A user who closes the browser and returns days later can resume the upload because the session record persists. The Tus session and the S3 multipart upload may have a shorter lifetime than the session record, but the application database remains the source of truth for what state the upload was in.
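The resume flow can be sketched in two small functions. Both are illustrative: resume_offset works on a plain dict standing in for the session row, and s3_resume_point takes a list shaped like the Parts entries of an S3 ListParts response (PartNumber and Size). Counting only the contiguous run of parts starting at part 1 is an assumption that fits a client uploading parts in order. S3 itself accepts parts in any order.

```python
def resume_offset(session: dict, requesting_user_id: str) -> int:
    """Authorize the caller, then return the byte offset to resume from."""
    if session["user_id"] != requesting_user_id:
        raise PermissionError("session belongs to another user")
    if session["state"] not in ("pending", "uploading"):
        raise ValueError(f"cannot resume a session in state {session['state']}")
    return session["current_offset"]


def s3_resume_point(parts: list) -> tuple:
    """Return (next_part_number, bytes_already_uploaded) from a ListParts-style listing."""
    by_number = {p["PartNumber"]: p["Size"] for p in parts}
    next_part, uploaded = 1, 0
    # Walk forward from part 1 and stop at the first gap.
    while next_part in by_number:
        uploaded += by_number[next_part]
        next_part += 1
    return next_part, uploaded
```

With parts 1, 2, and 4 present, the second function reports part 3 as the next one to send, so part 4 is simply sent again later or left in place, depending on the client.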
Cleanup of Abandoned Uploads
Without cleanup, the upload table and the storage location fill up with partial uploads from users who never finished. We run two cleanup mechanisms.
The first is a scheduled job that scans for sessions in pending or uploading state with no activity for more than 7 days. These get marked expired and the partial chunks are deleted from storage.
The second is a per-user limit rather than a scheduled scan. If a user has more than the maximum number of concurrent uploads (10 in our default config), the oldest stale session is expired to make room. This catches users who repeatedly start uploads without finishing them.
The cleanup is observability-friendly. We log every expiration with the user ID, file size, and stale duration. The data is useful for identifying users with broken upload flows (frequent abandons) and for capacity planning.
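The selection logic behind both cleanup paths is small enough to sketch. The function names and the dict-shaped session rows are hypothetical, and a real job would run the equivalent query against the database instead of filtering a list in memory.

```python
from datetime import datetime, timedelta, timezone

ACTIVE_STATES = ("pending", "uploading")
EXPIRY_WINDOW = timedelta(days=7)  # the 7-day window described above


def stale_sessions(sessions: list, now: datetime, window: timedelta = EXPIRY_WINDOW) -> list:
    """Sessions still in an active state with no activity inside the window."""
    cutoff = now - window
    return [
        s for s in sessions
        if s["state"] in ACTIVE_STATES and s["last_activity_at"] < cutoff
    ]


def sessions_to_evict(user_sessions: list, max_concurrent: int = 10) -> list:
    """For one user, the least recently active sessions beyond the concurrency limit."""
    active = sorted(
        (s for s in user_sessions if s["state"] in ACTIVE_STATES),
        key=lambda s: s["last_activity_at"],
    )
    excess = len(active) - max_concurrent
    return active[:excess] if excess > 0 else []
```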
Authorization at Every Step
The session record is the anchor for authorization. Every operation against the upload (chunk uploads, status queries, completion calls, cancellation) checks that the requesting user matches the session's owner.
For multi-tenant applications, the session also records the tenant. Cross-tenant access is blocked at two layers: the session record's owner check rejects any cross-tenant attempt before it reaches the storage layer, and the object storage paths include the tenant ID, so no two tenants share a path.
The OWASP file upload cheat sheet covers the broader authorization considerations. The pattern we use enforces ownership at the session level (one user owns a session) and tenant isolation at the storage path level (no two tenants can share storage paths).
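A minimal sketch of the two layers, assuming the request has already been authenticated and carries a user ID and tenant ID. The Forbidden exception, the authorize function, and the tenants/.../uploads/... key layout are all illustrative choices, not a fixed convention.

```python
class Forbidden(Exception):
    """Raised when the caller may not touch this upload session."""


def authorize(session: dict, user_id: str, tenant_id: str) -> None:
    """Reject the request unless the caller owns the session within the same tenant."""
    # Check the tenant first so a cross-tenant attempt never reaches the owner check.
    if session["tenant_id"] != tenant_id:
        raise Forbidden("tenant mismatch")
    if session["user_id"] != user_id:
        raise Forbidden("not the session owner")


def storage_key(tenant_id: str, upload_id: str) -> str:
    """Build an object storage key that is always prefixed with the tenant ID."""
    # Hypothetical layout: every key lives under its tenant's prefix.
    return f"tenants/{tenant_id}/uploads/{upload_id}"
```

Calling authorize at the top of every chunk, status, completion, and cancellation handler keeps the ownership rule in one place.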
Recovery From Server-Side Failures
If our application server crashes mid-upload, the session state record survives because it is in the database. When the server comes back up, in-progress sessions can continue from where they left off. The chunk-level resume protocol (Tus HEAD, S3 ListParts) recovers the actual upload state.
If the database itself fails, the upload sessions are unavailable until recovery completes. We use Postgres' standard replication and failover patterns for this. The mean time to recovery is low enough that most uploads can pause and resume successfully.
If the storage backend (S3) has an outage, the chunks cannot be uploaded. The session stays in uploading state, and the client retries with exponential backoff. Once S3 recovers, the uploads resume. This pattern survives multi-hour S3 outages cleanly, which has been useful during the few S3 region-wide events we have seen.
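The client-side retry behaviour can be sketched as a capped exponential backoff. The function names, the default values, the "full jitter" randomisation, and the use of ConnectionError as the retryable error are assumptions for illustration. A real client would match whatever errors its HTTP library raises.

```python
import random
import time


def backoff_delays(max_attempts=6, base=1.0, cap=60.0, jitter=True, rng=random.random):
    """Yield one delay per retry: base * 2**attempt, capped, optionally jittered."""
    for attempt in range(max_attempts):
        delay = min(cap, base * 2 ** attempt)
        # Full jitter spreads retries out so clients do not all return at once.
        yield delay * rng() if jitter else delay


def upload_with_retry(send_chunk, delays, sleep=time.sleep):
    """Call send_chunk until it succeeds, sleeping between failed attempts."""
    for delay in delays:
        try:
            return send_chunk()
        except ConnectionError:
            sleep(delay)
    # Delays exhausted: make one final attempt and let any error propagate.
    return send_chunk()
```

To ride out a multi-hour outage, the cap matters more than the base: once the delay reaches the cap, the client keeps polling at a steady interval for as long as the caller supplies delays.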
Schema We Use Now
The Postgres schema for upload sessions, simplified for readability:
CREATE TABLE upload_sessions (
    id UUID PRIMARY KEY,
    user_id UUID NOT NULL REFERENCES users(id),
    tenant_id UUID NOT NULL,
    destination_path TEXT NOT NULL,
    declared_size BIGINT NOT NULL,
    declared_content_type TEXT NOT NULL,
    current_offset BIGINT NOT NULL DEFAULT 0,
    state TEXT NOT NULL CHECK (state IN ('pending', 'uploading', 'processing', 'ready', 'rejected', 'expired')),
    protocol TEXT NOT NULL CHECK (protocol IN ('tus', 's3_multipart')),
    protocol_session_id TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    last_activity_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMPTZ
);

CREATE INDEX upload_sessions_user_active
    ON upload_sessions (user_id, last_activity_at)
    WHERE state IN ('pending', 'uploading');

CREATE INDEX upload_sessions_cleanup
    ON upload_sessions (last_activity_at)
    WHERE state IN ('pending', 'uploading');
The two indices support the most common query patterns: finding active uploads for a user (for quota checks and progress reporting) and finding stale uploads for cleanup. Both are partial indices that only include the relevant states, which keeps them small.
What We Changed Over Time
Looking back at the upload-session schema we shipped in the first project versus now:
We added the tenant_id column once we built our first multi-tenant product. The original single-tenant schema was missing it and we had to migrate.
We added the protocol and protocol_session_id columns when we started using S3 multipart alongside Tus. The original schema only knew about Tus session IDs.
We added the processing state when we added asynchronous virus scanning. Before that, sessions went straight from uploading to ready once the bytes were assembled, and the scan results were not represented in the state machine.
We added the last_activity_at index for cleanup once the table grew enough that scanning it became slow. The original cleanup query was a full table scan that worked fine for the first year and started to hurt by month 18.
Each addition was driven by a specific operational pain point. Building this from the start would have saved some migration work, but we did not know about most of these pain points until we hit them.
Where 137Foundry Helps
Upload pipelines are one of the recurring engineering surfaces we ship for clients. The 137Foundry services page covers the broader engineering work that surrounds these. The 137Foundry web development services page is the specific landing point for upload work. The full architectural framing is in the longer guide on resumable file uploads.
A Closing Note on Boring Code
The upload session state code is some of the most boring code we write. It is also some of the most important. Mistakes here produce hard-to-debug bugs in production months after launch. Getting the state machine right, the indices right, the cleanup right, and the authorization right at the start saves a lot of pain later.
For teams building this for the first time, the time investment in the session state layer pays back many times over the lifetime of the product. The chunked protocol gets attention because the wire format is interesting. The session state often gets shortchanged because the schema is mundane. The boring schema is what makes the interesting protocol actually work.















