# OIDC Authentication Implementation - Executive Summary

## Overview

This plan details the complete implementation of OIDC authentication for Claude Web UI using Authentik as the identity provider. The implementation adds secure, production-ready authentication with group-based access control while preserving the existing WebSocket architecture.

## Current State

**Backend:**
- Single-file Express server (`backend/server.js`, 999 lines)
- WebSocket-based Claude Code CLI interface
- No authentication or authorization
- In-memory session storage

**Frontend:**
- React application with SessionContext for state management
- Direct WebSocket connections
- localStorage for session persistence
- No authentication UI

## Target State

**Authentication:**
- OIDC provider: Authentik (auth.sneakercloud.de)
- Authorization flow: Authorization Code with PKCE
- Token storage: server-side in Redis, referenced by an httpOnly session cookie (not readable by JavaScript)
- Session backend: Redis (persistent, scalable)

**Authorization:**
- Groups: `agent-admin` (full access), `agent-users` (standard access)
- Protected REST endpoints
- Authenticated WebSocket connections
- User-isolated sessions

## Implementation Approach

### 5 Phases, 11-16 Days

#### Phase 1: Foundation (2-3 days)
- Configure Authentik OIDC Provider/Application
- Refactor monolithic backend into modules
- Add dependencies (express-session, openid-client, Redis)
- Implement Redis session store

#### Phase 2: Authentication Flow (2-3 days)
- Implement OIDC client wrapper
- Create auth routes (login, callback, logout)
- Configure secure session cookies
- Handle token refresh

#### Phase 3: API Protection (2-3 days)
- Create authentication middleware
- Protect all REST endpoints
- Secure WebSocket connections
- Associate sessions with users

#### Phase 4: Frontend UI (3-4 days)
- Create AuthContext for authentication state
- Build login page component
- Add protected route wrapper
- Integrate with existing SessionContext
- Add user menu/logout

#### Phase 5: Production Hardening (2-3 days)
- Security enhancements (CSRF, rate limiting, CSP)
- Logging and monitoring
- Error handling and edge cases
- Documentation
- Testing

## Key Technical Decisions

### 1. Backend Refactoring

**Decision:** Split monolithic `server.js` into a modular structure

**Rationale:**
- Improves maintainability
- Separates concerns
- Easier testing
- Supports future growth

**Structure:**
```
backend/
├── server.js           # Entry point
├── config/auth.js      # Auth configuration
├── middleware/         # Auth & session middleware
├── routes/             # API, auth, WebSocket routes
└── utils/              # OIDC client wrapper
```

### 2. Session Storage

**Decision:** Redis with express-session

**Rationale:**
- Persistent across restarts
- Scalable to multiple backend instances
- Fast session lookups (<10ms)
- Industry standard

**Alternative Considered:** In-memory sessions
- **Rejected:** Lost on restart, not scalable

### 3. Token Storage

**Decision:** httpOnly session cookie, with encrypted tokens stored in Redis

**Rationale:**
- httpOnly keeps the cookie out of reach of JavaScript, which limits token theft via XSS
- Secure flag ensures the cookie is sent over HTTPS only
- sameSite=lax mitigates CSRF on cross-site requests
- Encryption at rest limits exposure if Redis is compromised

**Alternative Considered:** localStorage
- **Rejected:** Vulnerable to XSS

### 4. WebSocket Authentication

**Decision:** Cookie-based session validation on upgrade

**Rationale:**
- Cookies are automatically included in the upgrade request
- Consistent with REST API auth
- No need for a separate token mechanism

**Implementation:**
- Parse cookies from upgrade request headers
- Load session from Redis
- Validate user before accepting connection
- Reject unauthorized upgrades with an HTTP 401 response; WebSocket close code 1008 (policy violation) can only be sent on a connection that is already established, for example when a session expires mid-connection

### 5. OIDC Library

**Decision:** `openid-client` (certified OIDC client)

**Rationale:**
- Certified by the OpenID Foundation
- Actively maintained
- Built-in PKCE support
- Automatic discovery

**Alternative Considered:** `passport-openidconnect`
- **Rejected:** More complex, passport overhead not needed

## Security Considerations

### Authentication
- Authorization Code flow with PKCE
- State parameter for CSRF protection
- Nonce validation in ID token
- Token signature verification
- HTTPS-only in production

### Session Management
- httpOnly cookies (no JavaScript access)
- Secure flag (HTTPS-only)
- sameSite=lax (CSRF mitigation)
- 24-hour expiry with refresh
- Encrypted tokens at rest

### API Protection
- All endpoints require authentication
- Group-based authorization
- Rate limiting on auth endpoints
- Origin validation on WebSocket upgrade
- Content Security Policy headers

### Token Handling
- Access tokens encrypted in Redis
- Refresh tokens encrypted in Redis
- Auto-refresh 5 minutes before expiry
- Tokens cleared on logout
- No tokens in logs

## Group-Based Access Control

### agent-users (Standard Access)
- View hosts and projects
- Create and manage own Claude sessions
- Upload files to own sessions
- View own session history
- Standard WebSocket access

### agent-admin (Full Access)
- All `agent-users` permissions
- View all users' sessions (future)
- Access admin endpoints (future)
- Modify system configuration (future)

**Note:** Initial implementation focuses on authentication and basic group enforcement. Fine-grained permissions can be expanded post-MVP.
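
The two-group model above could be enforced with a small Express-style middleware. The following is a minimal sketch, not the plan's actual implementation: the helper name `requireGroup`, the session shape `req.session.user.groups`, and the 401/403 response bodies are all assumptions made for illustration.

```javascript
// Hypothetical helper: returns Express-style middleware that admits a request
// only if the session user belongs to at least one of the allowed groups.
// Assumes the OIDC callback handler stored { groups: [...] } on req.session.user.
function requireGroup(...allowedGroups) {
  return function groupGuard(req, res, next) {
    const user = req.session && req.session.user;
    if (!user) {
      // No authenticated session at all.
      return res.status(401).json({ error: 'authentication required' });
    }
    const groups = Array.isArray(user.groups) ? user.groups : [];
    const allowed = groups.some((g) => allowedGroups.includes(g));
    if (!allowed) {
      // Authenticated, but not a member of any permitted group.
      return res.status(403).json({ error: 'insufficient group membership' });
    }
    return next();
  };
}

// agent-admin includes all agent-users permissions, so standard routes accept both.
const requireUser = requireGroup('agent-users', 'agent-admin');
// Reserved for the admin endpoints marked "(future)" above.
const requireAdmin = requireGroup('agent-admin');
```

With Express, such guards could be mounted per route prefix, for example `app.use('/api', requireUser)`; the `/api` prefix here is likewise an assumption. Listing both groups on standard routes keeps the "admin inherits user permissions" rule explicit rather than relying on every admin also being added to `agent-users` in Authentik.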
## Integration Points

### Authentik Configuration

```yaml
Provider:
  Name: Claude Web UI
  Type: OAuth2/OIDC
  Client Type: Confidential
  Flow: Authorization Code
  Scopes: openid, profile, email, groups

Application:
  Name: Claude Web UI
  Provider: [linked]
  Launch URL: https://agents.sneakercloud.de
  Redirect URI: https://agents.sneakercloud.de/auth/callback
```

### Environment Variables

```bash
# OIDC
OIDC_ISSUER=https://auth.sneakercloud.de/application/o/claude-web-ui/
OIDC_CLIENT_ID=
OIDC_CLIENT_SECRET=
OIDC_REDIRECT_URI=https://agents.sneakercloud.de/auth/callback

# Session
SESSION_SECRET=
SESSION_MAX_AGE=86400000  # 24 hours
REDIS_URL=redis://redis:6379

# App
NODE_ENV=production
FRONTEND_URL=https://agents.sneakercloud.de
```

### Docker Compose Updates
- Add Redis service
- Mount Redis volume for persistence
- Add environment variables
- Configure service dependencies

## Risk Assessment

### High Risk

**WebSocket Authentication Complexity**
- Mitigation: Thorough testing, fallback to polling
- Contingency: Feature flag to disable auth temporarily

**Authentik Downtime**
- Mitigation: Session caching, graceful degradation
- Contingency: Allow continued use of valid sessions

**Session Store Failure (Redis)**
- Mitigation: Redis persistence, backup strategy
- Contingency: Fallback to in-memory (degraded mode)

### Medium Risk

**Token Refresh Failures**
- Mitigation: Retry logic, clear error messages
- Contingency: Force re-login

**CORS/Cookie Issues**
- Mitigation: Proper domain configuration, testing
- Contingency: Documented troubleshooting steps

### Low Risk

**Group Mapping Errors**
- Mitigation: Default to `agent-users` if no groups
- Contingency: Manual group assignment in Authentik

## Testing Strategy

### Unit Tests
- OIDC client wrapper
- Authentication middleware
- Session management functions

### Integration Tests
- Full login flow
- Token exchange
- Session persistence
- WebSocket authentication

### Manual Tests
- Login/logout flow
- Session refresh
- Token expiry handling
- Group-based access
- Multiple concurrent users
- Admin vs user permissions

### Security Tests
- XSS prevention (httpOnly cookies)
- CSRF protection
- Unauthorized API access
- Session hijacking attempts
- Token replay attacks

## Rollback Plan

**Feature Flag:** `AUTH_ENABLED` environment variable

**Rollback Steps:**
1. Set `AUTH_ENABLED=false`
2. Restart backend service
3. Authentication middleware is skipped
4. Frontend shows app without login
5. Investigate and fix issues
6. Re-enable with `AUTH_ENABLED=true`

**Data Preservation:**
- Sessions stored in Redis persist
- User data not lost
- Can re-enable seamlessly

## Success Criteria

### Functional
- [ ] Unauthenticated users cannot access app
- [ ] Login redirects to Authentik and back successfully
- [ ] WebSocket connections authenticated
- [ ] Sessions persist across browser refresh
- [ ] Logout clears session completely
- [ ] Group-based access control enforced

### Performance
- [ ] Auth middleware overhead <100ms
- [ ] Session lookup <10ms
- [ ] No WebSocket latency impact
- [ ] Token refresh transparent to user

### Security
- [ ] Zero XSS vulnerabilities
- [ ] Zero unauthorized access
- [ ] Tokens encrypted at rest
- [ ] HTTPS-only in production
- [ ] Rate limiting effective

### User Experience
- [ ] Login flow <5 seconds
- [ ] Clear error messages
- [ ] No unnecessary re-authentication
- [ ] Seamless session refresh

## Future Enhancements

**Post-MVP Features:**
- Multi-factor authentication via Authentik
- API keys for CLI/programmatic access
- Session management UI (view/revoke sessions)
- Audit log for all actions
- Fine-grained RBAC expansion
- User preferences/settings storage
- SSO with other internal services

## Dependencies

### NPM Packages
- `express-session@^1.18.0` - Session management
- `connect-redis@^7.1.0` - Redis session store
- `redis@^4.6.0` - Redis client
- `openid-client@^5.6.0` - OIDC client
- `cookie-parser@^1.4.6` - Cookie parsing

### Infrastructure
- Redis (session storage)
- Authentik (identity provider)
- Docker (containerization)

### External Services
- Authentik @ auth.sneakercloud.de
- Redis instance (new or existing)

## Documentation Deliverables

1. **AUTHENTICATION.md** - System overview and architecture
2. **SETUP.md** - Step-by-step Authentik configuration
3. **CONFIGURATION.md** - Environment variables reference
4. **TROUBLESHOOTING.md** - Common issues and solutions
5. **README.md** - Updated with authentication section

## Timeline Summary

| Phase | Duration | Key Deliverables |
|-------|----------|------------------|
| 1: Foundation | 2-3 days | Authentik setup, backend refactor, Redis |
| 2: Auth Flow | 2-3 days | OIDC routes, token handling, callbacks |
| 3: API Protection | 2-3 days | Middleware, protected endpoints, WebSocket auth |
| 4: Frontend UI | 3-4 days | AuthContext, login page, user menu |
| 5: Hardening | 2-3 days | Security, logging, testing, docs |
| **Total** | **11-16 days** | Production-ready OIDC authentication |

## Approval & Next Steps

**This Plan:**
- Provides complete roadmap for OIDC implementation
- Addresses all security requirements
- Maintains existing functionality
- Enables phased rollout
- Includes rollback strategy

**Next Steps:**
1. Review and approve plan
2. Set up Authentik provider/application
3. Begin Phase 1 implementation
4. Schedule checkpoints after each phase
5. Plan production deployment

**Questions/Concerns:**
- Authentik already configured or needs setup?
- Redis instance available or needs provisioning?
- Preferred timeline/priority adjustments?
- Additional requirements or constraints?