[INS-455] Unify common logic in Atlassian Data Center detectors#4907
[INS-455] Unify common logic in Atlassian Data Center detectors#4907mustansir14 wants to merge 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 7c19046. Configure here.
| &billomat.Scanner{}, | ||
| &bingsubscriptionkey.Scanner{}, | ||
| &bitbar.Scanner{}, | ||
| &bitbucketdatacenter.Scanner{}, |
There was a problem hiding this comment.
Bitbucket detector silently added to default scanner list
Medium Severity
The bitbucketdatacenter.Scanner{} is being newly registered in the default detector list (both import and scanner instantiation are pure additions with no corresponding removal). The detector previously existed as code but was not included in the default scan engine. This PR's stated goal is to "unify common logic," but silently enabling a previously-unregistered detector in production scans could produce unexpected new findings and has performance implications. The addition to noCloudEndpointDetectors in the test file confirms this is a net-new registration.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit 7c19046. Configure here.
|
This is completely irrelevant to the goal of this PR. But can you please add similar logic in |
MuneebUllahKhan222
left a comment
There was a problem hiding this comment.
Couple of changes that need to be addressed, Other than that the PR is good to go.
| @@ -170,35 +136,13 @@ func (s Scanner) FromData(ctx context.Context, verify bool, data []byte) (result | |||
|
|
|||
| func verifyPAT(ctx context.Context, client *http.Client, baseURL, token string) (bool, error) { | |||
| endpoint := strings.TrimRight(baseURL, "/") + "/rest/api/user/current" | |||
There was a problem hiding this comment.
No need to do TrimRight here because we are already doing that in FindEndpoints function.
| u, err := detectors.ParseURLAndStripPathAndParams(baseURL) | ||
| if err != nil { | ||
| return false, err | ||
| } | ||
| u.Path = "rest/api/1.0/projects" | ||
| q := u.Query() | ||
| q.Set("limit", "1") | ||
| u.RawQuery = q.Encode() |
There was a problem hiding this comment.
We can get rid of all this and do
baseURL + "/rest/api/1.0/projects?limit=1"
| u, err := detectors.ParseURLAndStripPathAndParams(baseURL) | ||
| if err != nil { | ||
| return false, nil, err | ||
| } | ||
| u.Path = "/rest/api/2/myself" |
There was a problem hiding this comment.
We can get rid of all this and do
baseURL + "/rest/api/2/myself"


Background
Three detectors target Atlassian Data Center (self-hosted) products:
JiraDataCenterPAT,ConfluenceDataCenter, andBitbucketDataCenter. Reviewing them side-by-side revealed significant duplication: identical structural-validation logic, the same bearer-auth HTTP plumbing, and the same URL-extraction pipeline repeated verbatim in each file.What changed
New shared package:
pkg/detectors/atlassiandatacenter/Following the pattern established by
pkg/detectors/aws/, the three detectors are now co-located under a common parent that also houses shared utilities incommon.go.GetDCTokenPat(prefixes []string) *regexp.RegexpReturns the compiled PAT regex for Jira/Confluence DC tokens (44-char base64 strings whose decoded form is
<numeric-id>:<random-bytes>). Previously each detector hand-rolled the same pattern with inline literal strings.IsStructuralPAT(candidate string) boolValidates that a base64 candidate decodes to the
<digits>:<bytes>structure. This function was copy-pasted verbatim between Jira and Confluence.GetURLPat(prefixes []string) *regexp.RegexpReturns the compiled URL regex for self-hosted Atlassian instance URLs. Each detector calls this once at package init time and stores the result in a package-level var, preserving the same compile-once behaviour as before.
go-re2compiles via CGo/FFI so doing this per-chunk would be a meaningful regression.FindEndpoints(data string, urlPat *regexp.Regexp, resolve func(...string) []string) []stringExtracts keyword-scoped URLs from a chunk using the pre-compiled
urlPat, passes them throughs.Endpoints(which merges configured endpoints), deduplicates, and strips trailing slashes. All three detectors had their own multi-step version of this pipeline.MakeVerifyRequest(ctx, client, fullURL, token string) (bool, map[string]any, error)Sends a Bearer-authenticated GET and interprets the response:
200→(true, decoded-JSON-body, nil),401→(false, nil, nil), other →(false, nil, error). Previously each detector built the request inline and contained its own copy of the200 / 401 / defaultstatus-code switch. Jira readsbody["displayName"]andbody["emailAddress"]from the returned map; Confluence and Bitbucket discard it.Detector changes
Each detector now imports the shared package and delegates to it:
isStructuralPAT(Jira + Confluence)atlassiandatacenter.IsStructuralPATatlassiandatacenter.GetDCTokenPat(keywords)atlassiandatacenter.FindEndpoints(...)atlassiandatacenter.MakeVerifyRequest(...)The URL regex is also standardised across all three detectors. Confluence and Bitbucket previously used an unbounded
\d+port pattern and allowed hostnames starting with.or-; all three now use[a-zA-Z0-9][a-zA-Z0-9.\-]*with\d{1,5}for the port (matching the stricter Jira original).A
keywordspackage-level variable is introduced in Jira (Confluence and Bitbucket already had one) so the keyword list is defined once and reused by the regex,FindEndpoints, andKeywords().TestIsStructuralPATmigratedThe structural-PAT test previously lived in
jiradatacenterpat_test.goand called the private function. It is now incommon_test.goand covers tokens from all three products.common_test.goalso adds tests forGetDCTokenPat,FindEndpoints, andMakeVerifyRequest.What did not change
Scannertype,Keywords(),Type(),Description(), andFromData()logic.displayName,emailAddress) is unchanged.invalidHostscache anderrNoHostsentinel are unchanged.?limit=1query parameter is unchanged.Corpora Tests
Checklist:
make test-community)?make lintthis requires golangci-lint)?Note
Medium Risk
Mostly a refactor, but it changes shared URL matching/endpoint extraction and verification plumbing across three detectors, which could impact detection/verification behavior and false positive/negative rates.
Overview
Consolidates duplicated logic across Atlassian Data Center detectors by introducing a new shared
atlassiandatacenterhelper package for PAT/URL regexes, endpoint extraction, structural PAT validation, and a common Bearer-auth verification request helper.Updates the Jira/Confluence/Bitbucket DC detectors to delegate to these shared helpers (including a standardized, slightly stricter URL pattern), removes their inlined structural/HTTP code, and relocates detector imports in
defaults.go/tests (plus registering Bitbucket DC as a no-cloud-endpoint detector). Adds a comprehensivecommon_test.gocovering the new helpers and moves structural-PAT tests out of Jira-specific tests.Reviewed by Cursor Bugbot for commit 7c19046. Bugbot is set up for automated code reviews on this repo. Configure here.