feat: refactor data masking module #589

Merged
LordofAvernus merged 16 commits into main from commit/data-masking on Apr 2, 2026

@winfredLIN
Collaborator

Related issue

https://github.com/actiontech/dms-ee/issues/736
link https://github.com/actiontech/dms-ee/pull/765

assign @LordofAvernus

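Describe your changes

Implements the phase 1 requirements of the data masking module. Main features:

  • Automatically scan databases for sensitive data fields
  • Configure masking rules for fields
  • Apply masking to two modules: data export and CloudBeaver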

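Testing

The product acceptance and self-test sections of ⏩ Progress: Data Masking Phase 1

Checklist (complete after submitting the PR)

Tip

Before assigning a reviewer, confirm and complete the following items, then mark each as done with ✅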


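  • I have completed self-testing
  • I have logged enough detail to support diagnosis
  • I have added the implementation plan to the related issue
  • I have added the test impact scope to the related issue
  • I have confirmed the compatibility of the change; if it is incompatible, label the issue not_compatible
  • I have confirmed whether the documentation needs updating; if so, label the issue need_update_doc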

…ities

- CloudBeaver integrates data masking through dependency injection
- Introduced `SQLResultMasker` interface for masking SQL results during execution.
- Updated `CloudbeaverUsecase` to utilize the new `SQLResultMasker` for enhanced data privacy.
- Refactored `buildTaskIdAssocDataMasking` to accept a `taskMaskingContext` for improved masking context management.
- Integrated masking task checks within the GraphQL distributor to ensure sensitive data is appropriately handled during operations.
- Enhanced the initialization of `CloudbeaverService` to include the new SQL result masking functionality.
- Added a permission for configuring masking tasks
- Introduced a new operation permission for masking audit with UID `700038`.
- Updated the permission name for desensitization to "配置脱敏任务" ("Configure masking tasks").
- Enhanced localization support by adding descriptions and names for the new masking audit permission.
- Refactored permission handling to accommodate the new permission type in various service components.
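
The dependency-injection approach described for `SQLResultMasker` can be sketched roughly as follows. This is only an illustration: the `MaskRow` method, `noopMasker`, `starMasker`, `Usecase`, and `NewUsecase` are hypothetical names, since the real method set of `SQLResultMasker` and the constructor of `CloudbeaverUsecase` are not shown in this conversation.

```go
package main

// SQLResultMasker is a hypothetical shape for the interface named in the PR;
// the real method set is not shown in this conversation.
type SQLResultMasker interface {
	MaskRow(row []string) []string
}

// noopMasker leaves rows untouched, e.g. for builds without masking support.
type noopMasker struct{}

func (noopMasker) MaskRow(row []string) []string { return row }

// starMasker replaces the values at the given column indexes with "***".
type starMasker struct {
	sensitive map[int]bool
}

func (m starMasker) MaskRow(row []string) []string {
	out := make([]string, len(row))
	for i, v := range row {
		if m.sensitive[i] {
			out[i] = "***"
		} else {
			out[i] = v
		}
	}
	return out
}

// Usecase stands in for CloudbeaverUsecase and receives its masker from the caller.
type Usecase struct {
	masker SQLResultMasker
}

// NewUsecase injects the masker; a nil masker falls back to the no-op one.
func NewUsecase(m SQLResultMasker) *Usecase {
	if m == nil {
		m = noopMasker{}
	}
	return &Usecase{masker: m}
}

// Render passes every result row through the injected masker.
func (u *Usecase) Render(rows [][]string) [][]string {
	out := make([][]string, 0, len(rows))
	for _, row := range rows {
		out = append(out, u.masker.MaskRow(row))
	}
	return out
}
```

Because the use case depends only on the interface, a build without masking support could pass a no-op implementation and leave the calling code unchanged.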
… repository

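- Removed the masking on/off switch from the data source configuration and moved it to masking tasks; masking is now considered enabled when the data source has a masking task enabled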
- Removed `IsMaskingSwitch` from `DBService`, `BizDBServiceArgs`, and related methods to streamline the configuration.
- Introduced `MaskingTaskRepo` interface for managing masking task existence and status.
- Updated `DBServiceUsecase` to include `maskingTaskRepo` for enhanced masking task management.
- Adjusted various service methods to reflect the removal of masking switch logic, ensuring cleaner code and improved maintainability.
- Introduced a new method `GetURL` in the `ProxyTarget` struct to safely retrieve the URL as a string.
- The method checks for a nil URL and returns an empty string if it is not set, enhancing the usability of the `ProxyTarget` type.
…k repository

- Dependency injection
- Added `maskingConfigRepo` and `maskingTaskRepo` to the `DataExportWorkflowUsecase` for improved data masking capabilities.
- Updated the constructor for `DataExportWorkflowUsecase` to include new dependencies, facilitating better management of masking configurations and tasks.
- Introduced a new file `data_masking_ce.go` to define methods related to data masking, returning errors for unsupported operations in the current context.
- Refactored `DMSService` to initialize the new masking configuration repository and update the data masking use case, ensuring a cohesive integration of data masking functionalities.
…g templates

- Added and adjusted API definitions
- Removed `IsEnableMasking` field from `DBService`, `UpdateDBService`, and `ImportDBService` to streamline configuration.
- Introduced new API endpoints for managing masking rules and templates, enhancing data masking capabilities.
- Added methods for listing, adding, updating, and deleting masking templates and sensitive data discovery tasks in the `DMSController`.
- Updated routing to accommodate new masking functionalities, ensuring better organization and maintainability of the codebase.
- Revised descriptions for existing endpoints to enhance clarity, including responses for listing masking rules and approval requests.
- Introduced new endpoints for managing masking approval requests, including listing pending requests and processing decisions.
- Added definitions for new request and response structures related to masking templates and sensitive data discovery tasks in the Swagger documentation.
- Improved organization of API routes under the `/masking` namespace for better maintainability.
- Removed legacy code that is no longer needed
- Removed `data_masking_ce.go`, `data_masking_ee.go`, and associated configuration files to streamline the codebase.
- Eliminated `data_masking_conf_ee.yml` and `data_masking_rule_in_ee.yml` as part of the cleanup process.
- Deleted `masking_ce.go`, `masking_ee.go`, and related service files to enhance maintainability and focus on core functionalities.
…overy task repositories

- Added dependency injection interfaces and no-op implementations
- Added `DataExportMaskingConfigRepo` interface to define data masking configuration methods.
- Implemented `SensitiveDataDiscoveryTaskRepo` with methods for checking task existence and listing task statuses.
- Enhanced `dataMaskingUsecase` struct to include a discovery task use case for better task management.
- Changed the x-go-package path in Swagger files to reflect the new structure.
- Removed unused properties from ListMemberRoleWithOpRange definition to streamline the API.
- Added new properties for database service host and port in ListSensitiveDataDiscoveryTasksData to enhance data discovery capabilities.
- Updated related Go structs to include new fields for database service host and port, ensuring consistency across the codebase.
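- Added an endpoint that lists the data sources for which a masking task can be created
- Adjusted an endpoint to support retrieving the data source types that support masking globally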
- Added a new endpoint to list creatable database services for sensitive data discovery tasks, allowing users to specify project UID and optional query parameters for pagination and filtering.
- Introduced a function support filter in the existing global DB services tips endpoint to return database types based on specified functionality.
- Updated Swagger documentation to reflect new API definitions and parameters, ensuring clarity and consistency across the API.
- Enhanced related Go structs and service methods to support the new functionality, improving the overall data masking capabilities.
- Added a stopped state for sensitive data scan tasks
- Changed the x-go-package path in Swagger files to the new structure for consistency.
- Added "STOPPED" status to the task status descriptions and enums in both Swagger JSON and YAML files.
- Updated related Go constants and methods to include the new "STOPPED" status for sensitive data discovery tasks, enhancing the API's functionality.
…urations

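- Optimized the sensitive data scan flow so that fields already confirmed by the user are not rescanned
- Excluded Oracle system tables
- Distinguished user updates from system updates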
- Updated the `ConfigureMaskingRules` method to accept a `currentUserUid` parameter, allowing for user-specific masking rule configurations.
- Modified the `BatchUpsertDiscoveryResults` method to handle user updates distinctly from system updates, ensuring better control over masking configurations.
- Introduced a new method `ConfigureMaskingRulesByUser` in the `MaskingRuleManagementUsecase` to facilitate user-specific rule management.
- Enhanced the `SensitiveDiscoveryUsecase` to exclude previously configured columns during sensitive data discovery, improving the accuracy of the discovery process.
- Updated the Oracle metadata collector to include additional default schemas for exclusion, refining the schema collection process.
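
The idea of skipping columns the user has already configured could be sketched as a simple filter applied before scanning. `ColumnRef` and `filterUnconfirmed` are hypothetical names for illustration; the actual data model in `SensitiveDiscoveryUsecase` is not shown here.

```go
package main

// ColumnRef identifies a column; a hypothetical type for this sketch.
type ColumnRef struct {
	Schema, Table, Column string
}

// filterUnconfirmed returns the candidates that are not in the confirmed set,
// preserving their original order, so confirmed columns are never rescanned.
func filterUnconfirmed(candidates []ColumnRef, confirmed map[ColumnRef]bool) []ColumnRef {
	out := make([]ColumnRef, 0, len(candidates))
	for _, c := range candidates {
		if !confirmed[c] {
			out = append(out, c)
		}
	}
	return out
}
```

Keying the set on the full schema, table, and column triple avoids treating same-named columns in different tables as already confirmed.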
…task management

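- Changed the endpoint for updating sensitive data scan tasks to an action-based model, simplifying state transitions; removed the state machine
- Fixed Swagger definition errors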
- Corrected enum values in Swagger files for consistency, changing "[data_masking]" to "data_masking".
- Updated x-go-package paths to reflect the new structure across multiple definitions.
- Enhanced the UpdateSensitiveDataDiscoveryTaskReq to use an action-based approach, replacing the task requirement with an action field for better clarity in task management.
- Added new properties to ListMemberRoleWithOpRange for member group and operation permissions, improving the API's functionality.
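
An action-based update, as opposed to a full state machine, can be sketched as a direct mapping from a requested action to a resulting status. Only the "STOPPED" status is named in this PR; the action names `START` and `STOP` and the `RUNNING` status are assumptions made for illustration.

```go
package main

import "fmt"

// TaskStatus is the resulting state of a discovery task.
type TaskStatus string

const (
	StatusRunning TaskStatus = "RUNNING" // assumed name, not confirmed by the PR
	StatusStopped TaskStatus = "STOPPED" // status named in the PR
)

// TaskAction is what the client asks for in the update request.
type TaskAction string

const (
	ActionStart TaskAction = "START" // assumed name
	ActionStop  TaskAction = "STOP"  // assumed name
)

// applyAction maps an action straight to the status it produces and rejects
// anything it does not recognise, with no transition table involved.
func applyAction(a TaskAction) (TaskStatus, error) {
	switch a {
	case ActionStart:
		return StatusRunning, nil
	case ActionStop:
		return StatusStopped, nil
	default:
		return "", fmt.Errorf("unsupported action %q", a)
	}
}
```

With this shape the client states intent and the server decides the resulting status, which is one way to read "simplifying state transitions".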
@github-actions

github-actions bot commented Mar 30, 2026

PR Reviewer Guide 🔍

(Review updated until commit 978b650)

⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

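API definition consistency

Many new data structures for masking templates and rules were added. Please confirm that the comments, examples, and validation rules for each field (for example, "masking_type" and "effect_example_after") are consistent with the system's other APIs, and make sure the Swagger documentation accurately reflects these changes.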

// swagger:model ListMaskingRulesData
type ListMaskingRulesData struct {
	// masking type
	// Example: "MASK_DIGIT"
	MaskingType string `json:"masking_type"`
	// description
	// Example: "mask digits"
	Description string `json:"description"`
	// effect description for users
	// Example: "保留开头2位和结尾2位,中间字符替换为*"
	Effect string `json:"effect"`
	// effect example before masking
	// Example: "13812345678"
	EffectExampleBefore string `json:"effect_example_before"`
	// effect example after masking
	// Example: "138******78"
	EffectExampleAfter string `json:"effect_example_after"`
	// masking rule id
	// Example: 1
	Id int `json:"id"`
}

// swagger:model ListMaskingRulesReply
type ListMaskingRulesReply struct {
	// list masking rule reply
	Data []ListMaskingRulesData `json:"data"`

	base.GenericResp
}
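
To make the before/after example pair in `ListMaskingRulesData` concrete, here is a minimal sketch of a keep-head, keep-tail masking function. `maskMiddle` is a hypothetical helper, not the PR's implementation, and the parameters 3 and 2 are chosen only because they reproduce the "13812345678" to "138******78" pair shown in the struct comments.

```go
package main

import "strings"

// maskMiddle keeps the first `head` and last `tail` runes of s and replaces
// everything in between with '*'. Inputs too short to leave a masked middle
// are masked entirely. Hypothetical helper for illustration only.
func maskMiddle(s string, head, tail int) string {
	r := []rune(s)
	if head < 0 || tail < 0 || head+tail >= len(r) {
		return strings.Repeat("*", len(r))
	}
	masked := strings.Repeat("*", len(r)-head-tail)
	return string(r[:head]) + masked + string(r[len(r)-tail:])
}
```

Working on runes rather than bytes keeps multi-byte characters intact, which matters if masked values can contain non-ASCII text.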
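Error handling for task association

When building the association between task IDs and data masking (for example, in buildTaskIdAssocDataMasking), the newly introduced taskMaskingContext type needs thorough type assertion and error handling logic, to make later troubleshooting and debugging easier.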

type taskMaskingContext struct {
	Enabled      bool
	DBServiceUID string
	SchemaName   string
}

var (
	taskIDAssocUid     sync.Map
	taskIdAssocMasking sync.Map
)

func (cu *CloudbeaverUsecase) buildTaskIdAssocDataMasking(raw []byte, maskingCtx taskMaskingContext) error {
	var taskInfo TaskInfo

	if err := UnmarshalGraphQLResponse(raw, &taskInfo); err != nil {
		cu.log.Errorf("extract task id err: %v", err)

		return fmt.Errorf("extract task id err: %v", err)
	}

	taskIdAssocMasking.Store(taskInfo.Data.TaskInfo.ID, maskingCtx)

	return nil
}
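
On the reading side, the type-assertion concern raised above could be addressed along these lines. `loadTaskMaskingContext` is a hypothetical helper, and the task ID is assumed to be a string, which the snippet above does not confirm.

```go
package main

import (
	"fmt"
	"sync"
)

type taskMaskingContext struct {
	Enabled      bool
	DBServiceUID string
	SchemaName   string
}

var taskIdAssocMasking sync.Map

// loadTaskMaskingContext looks up the context stored for a task and uses the
// two-value type assertion so an unexpected value yields an error, not a panic.
// The string task ID is an assumption of this sketch.
func loadTaskMaskingContext(taskID string) (taskMaskingContext, error) {
	v, ok := taskIdAssocMasking.Load(taskID)
	if !ok {
		return taskMaskingContext{}, fmt.Errorf("no masking context for task %q", taskID)
	}
	ctx, ok := v.(taskMaskingContext)
	if !ok {
		return taskMaskingContext{}, fmt.Errorf("unexpected masking context type %T for task %q", v, taskID)
	}
	return ctx, nil
}
```

Including the task ID and the offending type in the error message gives later debugging something concrete to search for.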
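Permission wording updates

Several permission descriptions related to masking and audit were added. Please confirm that all descriptions are stylistically consistent with the system's other permission text, so that users can identify them easily and confusion is reduced.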

NameOpPermissionMangeAuditSQLWhiteList  = &i18n.Message{ID: "NameOpPermissionMangeAuditSQLWhiteList", Other: "审核SQL例外"}
NameOpPermissionManageSQLMangeWhiteList = &i18n.Message{ID: "NameOpPermissionManageSQLMangeWhiteList", Other: "管控SQL例外"}
NameOpPermissionManageRoleMange         = &i18n.Message{ID: "NameOpPermissionManageRoleMange", Other: "角色管理权限"}
NameOpPermissionDesensitization         = &i18n.Message{ID: "NameOpPermissionDesensitization", Other: "配置脱敏任务"}
NameOpPermissionMaskingAudit            = &i18n.Message{ID: "NameOpPermissionMaskingAudit", Other: "脱敏审核"}

DescOpPermissionGlobalManagement        = &i18n.Message{ID: "DescOpPermissionGlobalManagement", Other: "具备系统最高权限,可进行系统配置、用户管理等操作"}
DescOpPermissionGlobalView              = &i18n.Message{ID: "DescOpPermissionGlobalView", Other: "负责系统操作审计、数据合规检查等工作"}

@github-actions

PR Code Suggestions ✨

Explore these optional code suggestions:

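Category: Possible issue

Correct the array validation keyword

Replace the constraint keyword on this array property from minLength to minItems, because JSON Schema uses minItems for the minimum number of items in an array. This change prevents clients or servers from validating the array data incorrectly and is a critical issue.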

api/swagger.yaml [168-179]

 rule_ids:
     description: masking rule id list
     example:
         - 1
         - 2
         - 3
     items:
         format: int64
         type: integer
-    minLength: 1
+    minItems: 1
     type: array
     x-go-name: RuleIDs
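Suggestion importance [1-10]: 9

Why: The suggestion correctly detects that the JSON Schema validation for arrays should use minItems instead of minLength, preventing potential validation errors.

Impact: High

Replace the array validation keyword

Likewise, change minLength to minItems on this array property to meet JSON Schema's requirement for the minimum number of array elements. This change prevents unexpected validation errors and ensures data is passed correctly.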

api/swagger.yaml [1085-1090]

 masking_rule_configs:
     description: masking rule configurations for batch create or update
     items:
         $ref: '#/definitions/MaskingRuleConfig'
-    minLength: 1
+    minItems: 1
     type: array
     x-go-name: MaskingRuleConfigs
Suggestion importance [1-10]: 9

Why: The improvement accurately replaces minLength with minItems for the array in masking_rule_configs, aligning with proper JSON Schema specifications and ensuring correct data validation.

Impact: High
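
In JSON Schema, `minLength` constrains strings while `minItems` constrains arrays, so the server-side equivalent of `minItems: 1` is a length check on the slice. A minimal sketch, with `validateRuleIDs` and `errEmptyRuleIDs` as hypothetical names not taken from the PR:

```go
package main

import "errors"

// errEmptyRuleIDs is a hypothetical sentinel error for this sketch.
var errEmptyRuleIDs = errors.New("rule_ids must contain at least one item")

// validateRuleIDs enforces the same constraint as `minItems: 1` on the
// rule_ids array: a nil or empty slice is rejected.
func validateRuleIDs(ruleIDs []int64) error {
	if len(ruleIDs) < 1 {
		return errEmptyRuleIDs
	}
	return nil
}
```

A check like this keeps the handler's behaviour aligned with the schema even if a client skips schema validation.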
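Add error handling logic

Handle the error returned by CheckMaskingTaskExist instead of ignoring it. This prevents a failed query from causing the data masking state to be misjudged, which could lead to exceptions or logic errors.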

internal/dms/biz/cloudbeaver.go [466]

-isMaskingEnabled, _ := cu.maskingTaskRepo.CheckMaskingTaskExist(c.Request().Context(), dbService.UID)
+isMaskingEnabled, err := cu.maskingTaskRepo.CheckMaskingTaskExist(c.Request().Context(), dbService.UID)
+if err != nil {
+    cu.log.Error("CheckMaskingTaskExist error", err)
+    // Decide based on the business scenario whether to return the error or use a default value
+}
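Suggestion importance [1-10]: 8

Why: This suggestion adds error handling so that the error returned by CheckMaskingTaskExist is no longer ignored, which makes the data masking state check more reliable and the code more robust.

Impact: Medium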

- Modified the pattern in the check-pr-files.yml to exclude _ee.md files from changes, enhancing the file change prevention mechanism.
@github-actions

Persistent review updated to latest commit 978b650

@github-actions

Failed to generate code suggestions for PR

@LordofAvernus LordofAvernus merged commit f5265e6 into main Apr 2, 2026
1 check passed