feat: gdrive export and encryption service integration #5250
… DB schema
- Add user_oauth_token table to store encrypted OAuth refresh tokens per provider
- Add TokenEncryptionService using jose4j AES-256-GCM for encrypting auth blobs
- Add AuthConfig.encryptionSecretKey reading from auth.encryption.256-bit-secret
- Add GoogleDriveAuthResource with /connect, /callback, and /token endpoints
- Add GoogleAuthResource config endpoint exposing client ID and redirect URI
- Add DriveTokenIssueResponse and GoogleAuthConfigResponse HTTP models
- Wire GoogleDriveAuthResource into TexeraWebApplication and GuestAuthFilter
- Add google.client-id, client-secret, and app-domain to UserSystemConfig
- Update k8s values with new config keys
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…nd error case
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #5250 +/- ##
============================================
- Coverage 49.11% 49.07% -0.04%
- Complexity 2378 2380 +2
============================================
Files 1051 1050 -1
Lines 40342 40301 -41
Branches 4277 4266 -11
============================================
- Hits 19812 19777 -35
+ Misses 19373 19368 -5
+ Partials 1157 1156 -1
xuang7 left a comment
Thanks for the PR! Left a few comments. Please follow the formatting instructions in the
contributing guide and fix the formatting issues.
    logger.error("Google token exchange failed in callback", e)
    Response.status(Response.Status.BAD_GATEWAY).build()
  case e: Exception =>
    logger.error("Unexpected error in OAuth callback", e)
We could consider returning an error message to the opener window and closing the OAuth popup.
  @QueryParam("reauth") @DefaultValue("false") reauth: Boolean
): Response = {
  val user = sessionUser.getUser
  val state = JwtAuth.jwtToken(jwtClaims(user, TOKEN_EXPIRE_TIME_IN_MINUTES))
We should avoid using the normal session JWT as the OAuth state. Since it is still a valid login token before expiration, it may be safer to use a dedicated short-lived OAuth state token instead.
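For context on where the state value travels: the /connect endpoint under review returns a Google authorization URL, and the state parameter is echoed back to /callback. A minimal sketch of such a URL builder is below. The object name, method signature, and the scope argument are hypothetical and not taken from the PR; the query parameters themselves are standard Google OAuth 2.0 parameters, with prompt=consent added only on re-authorization.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper, not the PR's implementation.
object GoogleAuthUrlSketch {
  private val AuthEndpoint = "https://accounts.google.com/o/oauth2/v2/auth"

  def build(
      clientId: String,
      redirectUri: String,
      scope: String, // assumption: the caller decides which Drive scope to request
      state: String, // opaque value echoed back by Google on the callback
      reauth: Boolean
  ): String = {
    val base = Seq(
      "client_id" -> clientId,
      "redirect_uri" -> redirectUri,
      "response_type" -> "code",
      "scope" -> scope,
      "access_type" -> "offline", // ask Google for a refresh token
      "state" -> state
    )
    // prompt=consent forces the consent screen so a refresh token is re-issued.
    val params = if (reauth) base :+ ("prompt" -> "consent") else base
    val query = params
      .map { case (k, v) => s"$k=${URLEncoder.encode(v, StandardCharsets.UTF_8)}" }
      .mkString("&")
    s"$AuthEndpoint?$query"
  }
}
```

Because the state ends up in a URL that passes through the browser and Google, anything placed in it should be safe to expose, which is the concern raised in the comment above.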
try {
  val blob = mapper.readTree(TokenEncryptionService.decrypt(record.getAuthBlob))
  val refreshToken = blob.get("refreshToken").asText()
Suggest using path("refreshToken").asText("") here to avoid a possible NPE when the field is missing.
@xuang7 I can add this, but since this is wrapped in a try-catch, I feel the current error is fine and more clearly defined than getting "", sending a request to Google, and getting an error there.
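The two options discussed in this thread can be compared side by side. The sketch below assumes Jackson's ObjectMapper, as in the snippet under review; the object and method names are hypothetical. With Jackson, get returns null for a missing field, while path returns a MissingNode whose asText("") yields the supplied default.

```scala
import com.fasterxml.jackson.databind.ObjectMapper

// Hypothetical names; illustrates the trade-off only.
object RefreshTokenLookupSketch {
  private val mapper = new ObjectMapper()

  // Strict: fail fast with a descriptive error when the field is absent.
  def strict(json: String): String = {
    val node = mapper.readTree(json).get("refreshToken") // null when the field is missing
    if (node == null) throw new IllegalStateException("auth blob has no refreshToken field")
    node.asText()
  }

  // Lenient: path() never returns null, so a missing field becomes "".
  def lenient(json: String): String =
    mapper.readTree(json).path("refreshToken").asText("")
}
```

The strict variant keeps the failure local and explicit; the lenient variant avoids the NPE but needs a follow-up emptiness check before any request is sent to Google.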
…ogleDriveAuthResource
OAuth state is now a UUID stored in a ConcurrentHashMap with a 10-minute TTL, consumed exactly once on callback. Removes JwtParser/JwtAuth dependency from the Drive resource and avoids encoding user info in the callback URL.
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
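A one-time state store along the lines this commit describes might look roughly like the following. This is a sketch under assumptions, not the code in the PR: the class name, the PendingState case class, the Int user ID, and the opportunistic cleanup of expired entries are all hypothetical choices made for illustration.

```scala
import java.time.{Clock, Duration, Instant}
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

// Hypothetical names throughout; not the PR's actual code.
final case class PendingState(userId: Int, expiresAt: Instant)

final class OAuthStateStore(
    ttl: Duration = Duration.ofMinutes(10),
    clock: Clock = Clock.systemUTC()
) {
  private val pending = new ConcurrentHashMap[String, PendingState]()

  /** Issue an opaque random state value bound to the given user. */
  def issue(userId: Int): String = {
    val now = clock.instant()
    // Drop expired entries so abandoned OAuth flows do not accumulate.
    pending.values().removeIf(p => !p.expiresAt.isAfter(now))
    val state = UUID.randomUUID().toString
    pending.put(state, PendingState(userId, now.plus(ttl)))
    state
  }

  /** Consume a state: remove() is atomic, so each value is accepted at most once. */
  def consume(state: String): Option[Int] =
    Option(pending.remove(state))
      .filter(_.expiresAt.isAfter(clock.instant()))
      .map(_.userId)
}
```

One design consequence worth noting: an in-memory map only works when /connect and /callback are served by the same process, so a deployment with multiple replicas would need a shared store instead.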
Removed random secret key for eSecretKey
Added default asText("") to avoid NPE
…_token
- Add DELETE /api/auth/google/drive/disconnect to remove stored OAuth token
- Add created_at and updated_at columns to user_oauth_token table
- Set updated_at on token refresh in callback
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
What changes were proposed in this PR?
Adds the backend required for Google Drive OAuth integration.
Schema changes: Adds a new user_oauth_token table (sql/updates/23.sql) to store encrypted OAuth tokens per provider. The provider column (google_drive, etc.) is intentionally generic so future integrations (AWS, Microsoft) can reuse the same table without a schema change. The auth blob is stored as a JWE-encrypted JSON string rather than a raw token.
Token encryption: Adds TokenEncryptionService using jose4j AES-256-GCM (DIRECT key management) to encrypt auth blobs at rest. The encryption key is read from auth.encryption.256-bit-secret in auth.conf, with AUTH_ENCRYPTION_SECRET as the env-var override. This follows the same pattern as the existing JWT secret key.
New endpoints (GoogleDriveAuthResource):
- GET /api/auth/google/drive/connect: Returns a Google OAuth authorization URL for the frontend to open in a popup. Accepts a reauth query param; when true, sets prompt=consent to force Google to re-issue a refresh token (used when a previous token has returned invalid_grant). Requires REGULAR or ADMIN role.
- GET /api/auth/google/drive/callback: Called by Google's OAuth redirect. Not role-gated (no Authorization header is present on a browser redirect). Authenticates the user via a short-lived JWT in the state query parameter, exchanges the code for tokens, encrypts the auth blob, and upserts into user_oauth_token.
- GET /api/auth/google/drive/token: Decrypts the stored auth blob, uses the refresh token to fetch a short-lived access token from Google, and returns it to the frontend. Returns no_refresh_token if no record exists, or invalid_grant if Google rejects the refresh token. Requires REGULAR or ADMIN role.
- GET /api/auth/google/config: Exposes clientId and redirectUri to the frontend so the Drive service doesn't need to hardcode them.
Config: Adds google.client-id, google.client-secret, and app-domain to UserSystemConfig and user-system.conf. These must be configured on the Texera GCP project before Drive integration will work.
Any related issues, documentation, discussions?
Closes #4240 (partial — frontend in follow-up PRs)
Google Documentation to enable Google Picker: https://developers.google.com/workspace/drive/picker/guides/overview
How was this PR tested?
- sbt "Auth/testOnly org.apache.texera.auth.TokenEncryptionServiceSpec": 2 unit tests covering encrypt/decrypt round-trip and invalid-input error case
- sbt amber/compile
- /callback endpoint was tested manually via the full OAuth flow in a local dev environment
Was this PR authored or co-authored using generative AI tooling?
Commit messages and some implementation co-authored with Claude Sonnet 4.6
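For readers unfamiliar with the encryption scheme named in the summary (jose4j, AES-256-GCM content encryption, DIRECT key management), an encrypt/decrypt round trip could be sketched as below. The object name and method signatures are hypothetical and this is not the PR's TokenEncryptionService; how the 256-bit key is derived from the configured secret is also left out, so the key is simply passed in as an AesKey.

```scala
import org.jose4j.jwe.{
  ContentEncryptionAlgorithmIdentifiers,
  JsonWebEncryption,
  KeyManagementAlgorithmIdentifiers
}
import org.jose4j.keys.AesKey

// Hypothetical sketch; key loading from configuration is intentionally omitted.
object TokenEncryptionSketch {

  /** Encrypt a JSON auth blob into a compact JWE string. The key must be 32 bytes. */
  def encrypt(plaintext: String, key: AesKey): String = {
    val jwe = new JsonWebEncryption()
    // "dir": the shared key is used directly as the content-encryption key.
    jwe.setAlgorithmHeaderValue(KeyManagementAlgorithmIdentifiers.DIRECT)
    jwe.setEncryptionMethodHeaderParameter(ContentEncryptionAlgorithmIdentifiers.AES_256_GCM)
    jwe.setKey(key)
    jwe.setPayload(plaintext)
    jwe.getCompactSerialization
  }

  /** Decrypt a compact JWE string; throws a JoseException on malformed or tampered input. */
  def decrypt(compact: String, key: AesKey): String = {
    val jwe = new JsonWebEncryption()
    jwe.setKey(key)
    jwe.setCompactSerialization(compact)
    jwe.getPayload
  }
}
```

A round trip with a 32-byte test key, for example `new AesKey(Array.fill[Byte](32)(7))`, should return the original string from decrypt, while input that is not a valid compact JWE should raise an exception, which mirrors the two unit-test cases listed above.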