Skip to content

fix: support MySQL user-level lock functions#24909

Open
jiangxinmeng1 wants to merge 4 commits into
matrixorigin:4.0-devfrom
jiangxinmeng1:fix-user-level-lock-functions
Open

fix: support MySQL user-level lock functions#24909
jiangxinmeng1 wants to merge 4 commits into
matrixorigin:4.0-devfrom
jiangxinmeng1:fix-user-level-lock-functions

Conversation

@jiangxinmeng1

@jiangxinmeng1 jiangxinmeng1 commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) does this PR fix or relate to?

Fixes #24728

What this PR does / why we need it:

What

  • add GET_LOCK, RELEASE_LOCK, and IS_FREE_LOCK built-in functions
  • implement lightweight connection-owned named lock semantics for MySQL compatibility
  • cover lock acquire/release/free/null behavior in function tests

Why

Fixes #24728. Prisma Migrate calls GET_LOCK before applying migrations.

Tests

  • CGO_CFLAGS="-I/home/jiangxinmeng/workspace/matrixone-24728-lock-funcs/thirdparties/install/include" go test ./pkg/sql/plan/function

@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@XuPeng-SH XuPeng-SH left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found multiple blocking issues in the current implementation.

  1. The named-lock state is only a package-global in-memory map inside one CN process. In MatrixOne this means two sessions routed to different CNs can both acquire the same lock name at the same time, so the core exclusivity guarantee is broken for real cross-session coordination.

  2. Locks are never released when the owning session disconnects. The only mutations of the new lock map are explicit acquire/release calls, and this PR does not add any session-teardown cleanup path. So a client that exits without RELEASE_LOCK() can leave the lock stuck forever until the CN restarts, which is not compatible with MySQL named-lock semantics.

  3. GET_LOCK(name, timeout) handles timeout semantics incorrectly: timeout <= 0 returns 0 immediately, but MySQL compatibility requires timeout = 0 to mean no-wait and negative timeout to mean wait indefinitely (subject to session cancellation).

  4. The unhappy-path coverage is still thin around the new blocking behavior. There are no tests for disconnect cleanup, successful wait-then-acquire, timeout expiry, context cancellation while waiting, or the reentrant refcount path.

Because the first two are real behavior bugs and the third is a compatibility bug in the exposed SQL contract, I am requesting changes.

@jiangxinmeng1 jiangxinmeng1 force-pushed the fix-user-level-lock-functions branch from c6ab3b7 to 5ca0b17 Compare June 10, 2026 03:01
@jiangxinmeng1

Copy link
Copy Markdown
Contributor Author

Updated the PR to address the review request:

  • GET_LOCK now uses the CN lockservice instead of a package-local map, so different CNs coordinate through the existing distributed lock path.
  • Lock ownership is keyed by session id and lock name, with local refcounts only used for reentrant RELEASE_LOCK semantics.
  • Session.Close now calls ReleaseUserLevelLocks to clean up held named locks on disconnect.
  • Timeout behavior is updated: 0 uses FastFail/no-wait, positive values use context timeout, negative values wait indefinitely until session/query cancellation.
  • Added tests for cross-service contention via a shared LockService fake, wait-then-acquire, timeout expiry, context cancellation, reentrant refcount, and cleanup.

Tests run:

  • CGO_CFLAGS="-I/home/jiangxinmeng/workspace/matrixone-24728-lock-funcs/thirdparties/install/include" go test ./pkg/sql/plan/function
  • CGO_CFLAGS="-I/home/jiangxinmeng/workspace/matrixone-24728-lock-funcs/thirdparties/install/include" go test ./pkg/frontend -run TestNonExistent

@XuPeng-SH XuPeng-SH left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-reviewed the latest head 9f3f04ff0baa3047317559d27e95ed83388ac777. The previous distributed-lock/session-cleanup issues are mostly addressed, but I still found blocking MySQL-compatibility issues in the exposed SQL contract.

  1. RELEASE_LOCK() cannot distinguish “not owned by this session” from “lock does not exist”. In releaseUserLevelLock, the local-refcount miss path returns 0, nil immediately (pkg/sql/plan/function/func_unary.go:6960-6963). MySQL requires different results here: 0 only when the named lock exists but was not established by this thread, and NULL when the named lock does not exist / was already released (see the official MySQL locking-functions doc: https://dev.mysql.com/doc/refman/8.4/en/locking-functions.html). I verified this locally with a temporary test expecting RELEASE_LOCK('missing_lock') to return NULL; current code returns non-NULL 0 (the 1th row expected NULL, but get not null). This will make compatibility checks and application logic unable to tell “someone else owns it” from “there is no lock”. Please make release return nullable state, e.g. probe the lockservice with FastFail when the local refcount is absent so free/nonexistent returns NULL while conflict returns 0, and add tests for never-created and already-released locks.

  2. Lock names are not validated against MySQL's 64-character limit. The current paths (getUserLevelLock, releaseUserLevelLock, and isUserLevelLockFree) pass name directly into the row/txn keys without any length check (pkg/sql/plan/function/func_unary.go:6924, 6954, 6975). MySQL enforces a maximum lock-name length of 64 characters. With this PR, overlong names are accepted and can acquire/release real locks, which is a visible behavior divergence from the stated MySQL compatibility goal. Please enforce the same limit, define the intended NULL/error behavior for incorrect arguments, and cover GET_LOCK/RELEASE_LOCK/IS_FREE_LOCK overlong-name cases.

Tests run locally with the PR worktree and main-worktree thirdparty CGO flags:

  • go test ./pkg/sql/plan/function -run 'TestUserLevelLock|TestReleaseUserLevelLocksCleanup|TestUserLevelLockBuiltinRegistration|Test_funids' -count=1
  • go test ./pkg/frontend -run TestNonExistent -count=1
  • git diff --check upstream/4.0-dev...HEAD

I also noted the latest GitHub CI still has failures in Ubuntu UT and target upgrade; I could not get the failed log segment through gh run view --log-failed, so I am not using that as the primary blocker here.

1. RELEASE_LOCK now returns NULL (via isNull flag) when the named lock
   does not exist or was already released, matching MySQL behavior.
   When local refcount is 0, the code probes the lockservice with
   FastFail to distinguish 'lock does not exist' (NULL) from 'lock
   exists but held by another session' (0).

2. Enforce MySQL-compatible 64-character limit on lock names in
   getUserLevelLock, releaseUserLevelLock, and isUserLevelLockFree.

3. Add tests:
   - RELEASE_LOCK on never-created lock returns NULL
   - RELEASE_LOCK on already-released lock returns NULL
   - RELEASE_LOCK on lock held by another session returns 0
   - Overlong lock names are rejected
   - Exactly-max-length lock names work

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

@XuPeng-SH XuPeng-SH left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed latest head 5b3f328768fdf241787f6ca7f21b4030bbba4b8f. The previous RELEASE_LOCK() NULL semantics issue is fixed, but there are still correctness gaps in lock-name handling that can break MySQL compatibility and mutual exclusion.

Blocking issues:

  1. pkg/sql/plan/function/func_unary.go:6824: validateUserLevelLockName uses len(name), so the 64 limit is enforced in bytes, not characters. MySQL documents the limit as 64 characters, and the MySQL WL for this feature says lock names are converted to UTF-8 before applying the new 64-character semantics: https://dev.mysql.com/doc/refman/8.4/en/locking-functions.html and https://dev.mysql.com/worklog/task/?id=1159. A name of 64 Chinese characters currently returns internal error: user-level lock name exceeds maximum length of 64 characters, even though it is exactly 64 characters. MatrixOne already has UTF-8 length helpers in this package; this should validate character/code-point length, and add a multibyte boundary test.

  2. pkg/sql/plan/function/func_unary.go:6838 / pkg/sql/plan/function/func_unary.go:6938: lock names are used as raw strings for the distributed row key and the local refcount key, making the implementation case-sensitive. MySQL's user-lock implementation performs case-insensitive comparison and uses a lowercased lock name as the MDL key (see the same WL above). Today two sessions can both acquire what MySQL treats as the same lock: session A GET_LOCK('case_lock', 0) returns 1, then session B GET_LOCK('CASE_LOCK', 0) also returns 1. That breaks the advisory-lock mutual exclusion guarantee. Please normalize the lock name consistently before validating/tracking/locking and add contention tests across case variants.

  3. pkg/sql/plan/function/func_unary.go:6824: empty lock names are accepted because validation only checks > 64. MySQL's WL calls NULL, empty string, and longer-than-64 names wrong-name inputs. Current GET_LOCK('', 0) creates a real distributed lock, which leaves MatrixOne with non-MySQL state and release semantics for an invalid name. Please reject '' consistently in GET_LOCK, RELEASE_LOCK, and IS_FREE_LOCK, with tests.

Local verification:

  • PR tests pass: go test ./pkg/sql/plan/function -run 'TestUserLevelLock|TestReleaseUserLevelLocksCleanup|TestUserLevelLockBuiltinRegistration|Test_funids' -count=1
  • Frontend compile check passes: go test ./pkg/frontend -run TestNonExistent -count=1
  • git diff --check upstream/4.0-dev...HEAD passes
  • Added temporary review-only tests for the three cases above; all failed as described, then removed them before submitting this review.

- Replace validateUserLevelLockName with normalizeUserLevelLockName that
  lowercases names for case-insensitive matching (MySQL requirement).
- Enforce 64-character limit using utf8.RuneCountInString (character
  count), not len() (byte count), matching MySQL's documented behavior.
- Reject empty lock names in GET_LOCK, RELEASE_LOCK, and IS_FREE_LOCK.
- Add tests for empty-name rejection, case-insensitive contention, and
  multibyte character boundary (64/65 Chinese characters).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/bug Something isn't working size/L Denotes a PR that changes [500,999] lines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants