feat(users): add GDPR data provider registry #3894

Open
SirYadav1 wants to merge 1 commit into pierreb-devkit:master from SirYadav1:fix/gdpr-tasks-data-provider

@SirYadav1 SirYadav1 commented Jun 17, 2026

What

Adds a config-free, import-safe leaf registry that optional modules can use to self-register GDPR data providers. This is the foundation piece needed for the GDPR export and erasure endpoints (issue #3884).

Why

Following the same pattern as organizations/lib/orgRemoval.registry.js, but using a Map keyed by stable string keys instead of a Set of function identities. This prevents double-registration when inline arrows are used in *.init.js files.
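The difference can be sketched in a few lines. This is an illustration only: `byIdentity`, `byKey`, and `initOnce` are hypothetical names, not code from either registry. Two inline arrows with the same body are still distinct function objects, so a Set keeps both, while a Map keyed by a string keeps a single entry.

```javascript
// Illustrative only: these names are hypothetical, not taken from the PR.
const byIdentity = new Set();
const byKey = new Map();

// Simulates an init file being evaluated twice, with an inline arrow each time.
const initOnce = () => {
  byIdentity.add(async () => ({}));                // a new function object on every call
  byKey.set('tasks', { erase: async () => ({}) }); // the same string key on every call
};

initOnce();
initOnce();

console.log(byIdentity.size); // 2: each arrow has its own identity
console.log(byKey.size);      // 1: the second set() overwrote the first
```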

Changes

  • New file: modules/users/lib/dataProvider.registry.js

    • registerDataProvider({ key, axis, retention, export, erase }) — validates all inputs, throws TypeError on bad params
    • runDataExport(payload) — runs all providers sequentially, returns { data, modules }
    • runDataErasure(payload) — runs all providers sequentially, errors propagate (fail-closed)
    • getProviders() — returns a read-only Map for inspection
    • _reset() — test helper to clear all registrations
  • New file: modules/users/tests/dataProvider.registry.test.js

    • 15 unit tests covering registration, validation, export, erasure, error propagation, key dedup, and reset
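For orientation, here is a minimal sketch of what a registry with this surface might look like. It is reconstructed from the description above and the error messages asserted in the test suite, not copied from the PR's file; the `AXES` and `RETENTIONS` constants are assumptions, and the `export` keywords are omitted so the snippet runs as a plain script.

```javascript
// Sketch reconstructed from the PR description; not the actual file contents.
const providers = new Map();
const AXES = ['user', 'org'];               // assumed from the test error message
const RETENTIONS = ['delete', 'anonymize']; // assumed from the test error message

/**
 * Register (or overwrite) a provider under a stable string key.
 * @param {Object} provider - { key, axis, retention, export, erase }
 * @returns {void}
 */
const registerDataProvider = ({ key, axis, retention, export: exportFn, erase } = {}) => {
  if (typeof key !== 'string' || !key) {
    throw new TypeError('registerDataProvider: key must be a non-empty string');
  }
  if (!AXES.includes(axis)) {
    throw new TypeError('registerDataProvider: axis must be "user" or "org"');
  }
  if (!RETENTIONS.includes(retention)) {
    throw new TypeError('registerDataProvider: retention must be "delete" or "anonymize"');
  }
  if (typeof exportFn !== 'function') {
    throw new TypeError('registerDataProvider: export must be a function');
  }
  if (typeof erase !== 'function') {
    throw new TypeError('registerDataProvider: erase must be a function');
  }
  providers.set(key, { axis, retention, export: exportFn, erase });
};

// Runs every provider's export in insertion order, one at a time.
const runDataExport = async (payload) => {
  const data = {};
  const modules = [];
  for (const [key, provider] of providers) {
    data[key] = await provider.export(payload);
    modules.push(key);
  }
  return { data, modules };
};

// Fail-closed: the first erase that throws rejects the whole run,
// and providers registered after it are never called.
const runDataErasure = async (payload) => {
  const results = {};
  for (const [key, provider] of providers) {
    results[key] = await provider.erase(payload);
  }
  return { results };
};

const getProviders = () => new Map(providers); // copies the container only
const _reset = () => providers.clear();        // test helper
```

Awaiting inside a `for...of` loop, rather than using `Promise.all`, is what makes the runs sequential and lets an erasure failure stop the remaining providers.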

Testing

All 15 tests pass. You can run them with:

npm test -- --testPathPattern=dataProvider.registry

Notes

This registry intentionally imports no services — it follows the same pattern as orgRemoval.registry.js to avoid import cycles. Individual modules (tasks, uploads, etc.) will self-register their providers in their respective *.init.js files once this is merged.
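A self-registration call from a module's init file might look roughly like the sketch below. Everything here is hypothetical: `TasksService` and its methods stand in for whatever the tasks module actually exposes, `initTasks` stands in for the body of a `tasks.init.js`, and the two-line registry at the top is only a stand-in so the snippet runs on its own.

```javascript
// Stand-in for the registry so this sketch is self-contained (assumption).
const providers = new Map();
const registerDataProvider = ({ key, ...rest }) => providers.set(key, rest);

// Hypothetical tasks service; the real module's API is not shown in this PR.
const TasksService = {
  listByUser: async (userId) => [{ id: 1, userId }],
  deleteByUser: async (userId) => ({ deleted: 1, userId }),
};

// What a hypothetical modules/tasks/tasks.init.js might do at load time.
const initTasks = () => {
  registerDataProvider({
    key: 'tasks',
    axis: 'user',
    retention: 'delete',
    export: ({ userId }) => TasksService.listByUser(userId),
    erase: ({ userId }) => TasksService.deleteByUser(userId),
  });
};

initTasks();
initTasks(); // running init twice still leaves a single 'tasks' entry
console.log(providers.size); // 1
```

The dependency points from the tasks module to the registry and never back, which is what keeps the registry a leaf.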

Closes #3883

Adds a config-free, import-safe leaf registry that optional modules
can use to self-register GDPR data providers. This is the foundation
for the GDPR export and erasure endpoints.

Key features:
- Map-keyed by stable string key (prevents double-registration)
- Validates key, axis, retention, export, and erase parameters
- Sequential execution with error propagation (fail-closed)
- _reset() helper for testing

Closes pierreb-devkit#3883
coderabbitai Bot commented Jun 17, 2026

Walkthrough

A new Map-keyed GDPR data provider registry module is created at modules/users/lib/dataProvider.registry.js. It provides registerDataProvider, runDataExport, runDataErasure, getProviders, and _reset. A companion Jest suite at modules/users/tests/dataProvider.registry.test.js exercises all behaviors with 293 lines of tests.

Changes

GDPR Data Provider Registry

| Layer / File(s) | Summary |
| --- | --- |
| Registry storage, registration, and runners (modules/users/lib/dataProvider.registry.js) | Introduces a Map-backed provider store keyed by stable string key. registerDataProvider validates key, axis, retention, and requires export/erase as functions before storing. runDataExport iterates providers in insertion order, awaits each export, and returns { data, modules }. runDataErasure iterates providers sequentially, awaits each erase with fail-closed semantics, and returns { results }. getProviders returns a copied Map; _reset clears the store. A default export bundles all functions. |
| Jest test suite (modules/users/tests/dataProvider.registry.test.js) | Covers valid registration, TypeError for all invalid inputs (key, axis, retention, non-function export/erase), key-dedup overwrite semantics, single- and multi-provider export with ordered modules list, empty-registry no-throw, single and multi-provider erasure, fail-closed error propagation on erasure throw, sequential execution order, and _reset clearing the registry. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

  • #3883 — This PR directly implements the full scope described in that issue: the Map-keyed dataProvider.registry.js leaf module with registerDataProvider, runDataExport, runDataErasure, _reset, and the complete unit suite including key-dedup, error-propagate-and-abort, zero-provider no-throw, and non-fn rejection cases.
  • #3885 — That issue depends on this registry as a foundation: tasks.init.js will call registerDataProvider({ key: 'tasks', ... }) from the module introduced here.
  • #3884 — That issue depends on this registry: its controller implementation will import and invoke runDataExport and runDataErasure from this module.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(users): add GDPR data provider registry' clearly and concisely summarizes the main change: adding a new GDPR data provider registry module. |
| Linked Issues check | ✅ Passed | The implementation fully satisfies all functional requirements from #3883: Map-keyed registry with validation, sequential export/erasure execution, error propagation, key deduplication, test helpers, and comprehensive unit tests (15 passing). |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the GDPR registry foundation: two new files (implementation + tests) with no modifications to existing modules, remaining within #3883's stated scope. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The pull request description comprehensively covers all required template sections including summary, scope, validation, guardrails, and notes for reviewers. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@modules/users/lib/dataProvider.registry.js`:
- Around line 24-25: The validation check in the registerDataProvider function
currently only checks if the key is a non-empty string, but it allows
whitespace-only strings like '   ' to pass through. Update the validation
condition to also reject keys that consist only of whitespace characters by
using the trim() method to remove leading and trailing whitespace and then
checking if the resulting string is empty, ensuring that provider keys are truly
meaningful and non-whitespace.
- Line 86: The `getProviders()` function creates a shallow copy of the Map, but
the provider objects inside remain mutable references that can be altered by
callers, allowing modification of `axis`, `retention`, `export`, or `erase`
properties outside of the intended registry control. Modify `getProviders()` to
return a deeply immutable version of the Map by either cloning each provider
object and freezing them, or by using Object.freeze() on cloned copies of the
provider objects before returning them in the new Map. This ensures that callers
cannot mutate the provider objects and bypass the registration control flow.

In `@modules/users/tests/dataProvider.registry.test.js`:
- Around line 10-293: Add JSDoc headers with `@param` and `@returns` documentation
to all functions introduced or modified in this test file to comply with
repository documentation standards. For each test callback function, inline
async function (such as exportFn, eraseFn, tasksExport, uploadsExport,
tasksErase, uploadsErase, eraseFn1, eraseFn2, etc.), and helper function like
_reset, registerDataProvider, getProviders, runDataExport, and runDataErasure,
include JSDoc blocks that clearly document what parameters the function accepts
and what it returns, even for test files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: bcf10f0b-6273-46ae-ac17-c70b86abb31a

📥 Commits

Reviewing files that changed from the base of the PR and between f9fbedd and 91fcb47.

📒 Files selected for processing (2)
  • modules/users/lib/dataProvider.registry.js
  • modules/users/tests/dataProvider.registry.test.js

Comment on lines +24 to +25
if (typeof key !== 'string' || !key) {
  throw new TypeError('registerDataProvider: key must be a non-empty string');

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject whitespace-only provider keys.

At Line 24, ' ' passes validation and becomes a valid registry key. This makes module manifests/audit output ambiguous and defeats the “non-empty string” intent.

Suggested fix
-  if (typeof key !== 'string' || !key) {
+  if (typeof key !== 'string' || !key.trim()) {
     throw new TypeError('registerDataProvider: key must be a non-empty string');
   }
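The effect of the suggested `trim()` check can be seen in isolation with a small sketch. `assertValidKey` and `rejects` are hypothetical helpers introduced here for illustration; the PR performs the check inline inside registerDataProvider.

```javascript
// Hypothetical helper isolating the key check; the PR inlines it in registerDataProvider.
const assertValidKey = (key) => {
  // typeof is checked first, so trim() is only ever called on a string.
  if (typeof key !== 'string' || !key.trim()) {
    throw new TypeError('registerDataProvider: key must be a non-empty string');
  }
};

// Returns true when the key is rejected with a TypeError.
const rejects = (key) => {
  try {
    assertValidKey(key);
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
};

console.log(rejects(''));      // true
console.log(rejects('   '));   // true: whitespace-only keys are now rejected
console.log(rejects(123));     // true
console.log(rejects('tasks')); // false
```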

* @description Get all registered providers (read-only view for testing/inspection).
* @returns {Map<string, Object>}
*/
export const getProviders = () => new Map(providers);

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

getProviders() leaks mutable provider objects despite “read-only view”.

At Line 86, new Map(providers) only copies the map container. The provider objects are still shared references, so callers can mutate axis, retention, export, or erase and silently alter runtime behavior outside registerDataProvider.

Suggested fix
-export const getProviders = () => new Map(providers);
+export const getProviders = () => new Map(
+  Array.from(providers.entries(), ([key, provider]) => [key, Object.freeze({ ...provider })]),
+);
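The leak and the proposed fix can be compared side by side in a short sketch. The store below is a simplified stand-in holding only `axis` and `retention`, and the two getter names are hypothetical, chosen to contrast the current and suggested behavior.

```javascript
// Simplified stand-in for the registry's internal store (assumption).
const providers = new Map([
  ['tasks', { axis: 'user', retention: 'delete' }],
]);

// Current behavior: a new Map container that shares the same provider objects.
const getProvidersShallow = () => new Map(providers);

// Suggested behavior: each caller gets a frozen clone of every provider.
const getProvidersFrozen = () => new Map(
  Array.from(providers.entries(), ([key, provider]) => [key, Object.freeze({ ...provider })]),
);

getProvidersShallow().get('tasks').axis = 'org';
console.log(providers.get('tasks').axis); // 'org': the registry was mutated from outside

providers.get('tasks').axis = 'user'; // restore for the second half of the demo

const frozen = getProvidersFrozen().get('tasks');
try {
  frozen.axis = 'org'; // throws in strict mode, silently ignored in sloppy mode
} catch (err) {
  // TypeError under strict mode (for example in ES modules)
}
console.log(providers.get('tasks').axis); // 'user': the store is unaffected
console.log(Object.isFrozen(frozen));     // true
```

Note that `Object.freeze` is itself shallow; that is sufficient here only because the provider fields are strings and functions rather than nested objects.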

Comment on lines +10 to +293
describe('DataProvider Registry', () => {
  beforeEach(() => {
    _reset();
  });

  describe('registerDataProvider', () => {
    it('should register a valid provider', () => {
      const exportFn = async () => ({});
      const eraseFn = async () => ({});

      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: exportFn,
        erase: eraseFn,
      });

      const providers = getProviders();
      expect(providers.size).toBe(1);
      expect(providers.has('tasks')).toBe(true);
    });

    it('should throw TypeError for empty key', () => {
      expect(() => registerDataProvider({
        key: '',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: async () => ({}),
      })).toThrow('registerDataProvider: key must be a non-empty string');
    });

    it('should throw TypeError for non-string key', () => {
      expect(() => registerDataProvider({
        key: 123,
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: async () => ({}),
      })).toThrow('registerDataProvider: key must be a non-empty string');
    });

    it('should throw TypeError for invalid axis', () => {
      expect(() => registerDataProvider({
        key: 'tasks',
        axis: 'invalid',
        retention: 'delete',
        export: async () => ({}),
        erase: async () => ({}),
      })).toThrow('registerDataProvider: axis must be "user" or "org"');
    });

    it('should throw TypeError for invalid retention', () => {
      expect(() => registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'invalid',
        export: async () => ({}),
        erase: async () => ({}),
      })).toThrow('registerDataProvider: retention must be "delete" or "anonymize"');
    });

    it('should throw TypeError for non-function export', () => {
      expect(() => registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: 'not a function',
        erase: async () => ({}),
      })).toThrow('registerDataProvider: export must be a function');
    });

    it('should throw TypeError for non-function erase', () => {
      expect(() => registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: 'not a function',
      })).toThrow('registerDataProvider: erase must be a function');
    });

    it('should overwrite provider with same key (key-dedup)', () => {
      const exportFn1 = async () => ({ version: 1 });
      const eraseFn1 = async () => ({});
      const exportFn2 = async () => ({ version: 2 });
      const eraseFn2 = async () => ({});

      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: exportFn1,
        erase: eraseFn1,
      });

      registerDataProvider({
        key: 'tasks',
        axis: 'org',
        retention: 'anonymize',
        export: exportFn2,
        erase: eraseFn2,
      });

      const providers = getProviders();
      expect(providers.size).toBe(1);
      expect(providers.get('tasks').axis).toBe('org');
      expect(providers.get('tasks').retention).toBe('anonymize');
    });
  });

  describe('runDataExport', () => {
    it('should run single provider export', async () => {
      const exportFn = async (payload) => ({
        tasks: [{ id: 1, title: 'Test Task' }],
        userId: payload.userId,
      });

      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: exportFn,
        erase: async () => ({}),
      });

      const result = await runDataExport({ userId: 'user123' });

      expect(result.data.tasks).toEqual({
        tasks: [{ id: 1, title: 'Test Task' }],
        userId: 'user123',
      });
      expect(result.modules).toEqual(['tasks']);
    });

    it('should run multiple providers sequentially', async () => {
      const order = [];

      const tasksExport = async () => {
        order.push('tasks');
        return { tasks: [] };
      };

      const uploadsExport = async () => {
        order.push('uploads');
        return { uploads: [] };
      };

      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: tasksExport,
        erase: async () => ({}),
      });

      registerDataProvider({
        key: 'uploads',
        axis: 'user',
        retention: 'delete',
        export: uploadsExport,
        erase: async () => ({}),
      });

      const result = await runDataExport({ userId: 'user123' });

      expect(order).toEqual(['tasks', 'uploads']);
      expect(result.modules).toEqual(['tasks', 'uploads']);
    });

    it('should return empty data for zero providers', async () => {
      const result = await runDataExport({ userId: 'user123' });

      expect(result.data).toEqual({});
      expect(result.modules).toEqual([]);
    });
  });

  describe('runDataErasure', () => {
    it('should run single provider erase', async () => {
      const eraseFn = async (payload) => ({
        deleted: 5,
        userId: payload.userId,
      });

      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: eraseFn,
      });

      const result = await runDataErasure({ userId: 'user123' });

      expect(result.results.tasks).toEqual({
        deleted: 5,
        userId: 'user123',
      });
    });

    it('should propagate errors (fail-closed)', async () => {
      const eraseFn1 = async () => {
        throw new Error('Provider 1 failed');
      };

      const eraseFn2 = async () => ({
        deleted: 3,
      });

      registerDataProvider({
        key: 'failing',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: eraseFn1,
      });

      registerDataProvider({
        key: 'success',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: eraseFn2,
      });

      await expect(runDataErasure({ userId: 'user123' }))
        .rejects.toThrow('Provider 1 failed');
    });

    it('should run providers sequentially', async () => {
      const order = [];

      const tasksErase = async () => {
        order.push('tasks');
        return { deleted: 1 };
      };

      const uploadsErase = async () => {
        order.push('uploads');
        return { deleted: 2 };
      };

      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: tasksErase,
      });

      registerDataProvider({
        key: 'uploads',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: uploadsErase,
      });

      await runDataErasure({ userId: 'user123' });

      expect(order).toEqual(['tasks', 'uploads']);
    });
  });

  describe('_reset', () => {
    it('should clear all registered providers', () => {
      registerDataProvider({
        key: 'tasks',
        axis: 'user',
        retention: 'delete',
        export: async () => ({}),
        erase: async () => ({}),
      });

      expect(getProviders().size).toBe(1);

      _reset();

      expect(getProviders().size).toBe(0);
    });
  });
});

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Add JSDoc headers for all new/modified functions in this test file.

Multiple functions introduced in this file (test callbacks and helper functions) do not include JSDoc blocks with @param and @returns, which violates the repository JS documentation rule.

As per coding guidelines, "**/*.js: Every function must have a JSDoc header with @param and @returns" and "**/*.js: Every new or modified function must have a JSDoc header...".
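As one possible reading of that rule, a test helper such as `exportFn` could carry a header like the one below. This is a sketch only: the type annotations and wording are assumptions about what the repository's guideline expects, inferred from the payload shape used in the test suite.

```javascript
/**
 * Export stub used by a test case.
 * @param {{ userId: string }} payload - Identifies the user whose data is exported.
 * @returns {Promise<{ tasks: Array<Object>, userId: string }>} The exported task data.
 */
const exportFn = async (payload) => ({
  tasks: [{ id: 1, title: 'Test Task' }],
  userId: payload.userId,
});
```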

Source: Coding guidelines

Development

Successfully merging this pull request may close these issues.

🔒 GDPR registry leaf — modules/users/lib/dataProvider.registry.js (Map-keyed)