Skip to content

MDEV-38814 Reduce pessimistic update fallbacks#4909

Draft
iMineLink wants to merge 3 commits intoMariaDB:11.8from
iMineLink:MDEV-38814-alt
Draft

MDEV-38814 Reduce pessimistic update fallbacks#4909
iMineLink wants to merge 3 commits intoMariaDB:11.8from
iMineLink:MDEV-38814-alt

Conversation

@iMineLink
Copy link
Copy Markdown
Contributor

Alternative PR to address MDEV-38814.
This attempt, compared to #4887, tries to address the issue at a higher level, changing the optimistic rejection and pessimistic fallback handling in InnoDB btr updates, without changing locking rules.

This PR is composed of 3 commits:

  1. The first commit adds debug-only counters to monitor the pessimistic update fallbacks and respective index lock upgrades in the current code, as verified in the innodb.index_lock_upgrade test.
  2. The second commit avoids in btr_cur_optimistic_update() that a freshly split page, interested by an update that is growing a record's size, becomes victim of underflow due to its post-update size being less than BTR_CUR_PAGE_COMPRESS_LIMIT(index) (to trigger potential page merge in the pessimistic path); Do trigger underflow in such cases only for updates that shrink the record size; This reduces to 0 the underflows in the test (which only issues NULL to not-NULL updates) and respective (failed) page merge attempts.
  3. The last commit enforces the reorganization limit BTR_CUR_PAGE_REORGANIZE_LIMIT in btr_cur_pessimistic_update(), avoiding btr_cur_insert_if_possible() calls when the page's post-reorganize free space is below BTR_CUR_PAGE_REORGANIZE_LIMIT for uncompressed pages, letting the code fall through the pessimistic insert path; This reduces the overflows and reorganizations in the test, at the cost of one extra page split.

This is opened as a draft to get some CI coverage and allow initial discussion.
I believe this PR addresses the MDEV's issue in a better way than the previous attempt, trying to optimize index maintenance operations.

Discussion points:

  1. Is it OK to make only record shrinking UPDATE(s) pay the price of potential page merge through underflow? I think so, it seems in any case that all the previous merge attempts failed in the test. Is there a mismatch between the btr_cur_optimistic_update() test and the effective page merge logic? I have still to verify current behavior on record shrinking/same-size updates.
  2. Both changes in optimistic and pessimistic update paths tend to create sparser trees, reducing merge attempts (even if they all failed in the test) and increasing split attempts. Can this be detrimental in some cases?

I will continue testing the behavior of this branch and possibly add other UPDATE patterns in the mix (same record size, shrinking record, compressed pages, varying page size)

Add 4 debug-only counters:

1. Innodb_btr_cur_n_index_lock_upgrades
2. Innodb_btr_cur_pessimistic_update_calls
3. Innodb_btr_cur_pessimistic_update_optim_err_underflows
4. Innodb_btr_cur_pessimistic_update_optim_err_overflows

to track pessimistic update fallbacks and monitor how often the
exclusive index lock is acquired, exposed via SHOW GLOBAL
STATUS in debug builds.

Add a test that inserts and updates 1000 rows in a 4K-page table,
verifying the current status of the counters.
Issue:
In btr_cur_optimistic_update(), freshly split pages are subject
to DB_UNDERFLOW due to the new size (after delete+insert) being
less than BTR_CUR_PAGE_COMPRESS_LIMIT(index) target, even if the record
itself is growing.

Fix:
In btr_cur_optimistic_update(), avoid this behavior by gating the
DB_UNDERFLOW error condition behind record shrinkage check.

Nothing is changed if the record is not shrinking.

The counters in the index_lock_upgrade.result file are updated
accordingly.
The new count of DB_UNDERFLOW optimistic update errors during the test
is reduced to 0.
The index lock upgrades are reduced accordingly.
Issue:
In btr_cur_optimistic_update(), DB_OVERFLOW error condition is returned
if the BTR_CUR_PAGE_REORGANIZE_LIMIT is not met, to prevent CPU trashing
due to excessive reorganization attempts in an almost full page.
In btr_cur_pessimistic_update() though, many of these errors were not
properly followed by a page split attempt, and therefore many
pessimistic fallbacks occur for the same page, which has less
than the BTR_CUR_PAGE_REORGANIZE_LIMIT free space.

Fix:
In btr_cur_pessimistic_update(), if the optimistic update error
is DB_OVERFLOW, the page is uncompressed and the
BTR_CUR_PAGE_REORGANIZE_LIMIT is not met, avoid attempting
btr_cur_insert_if_possible() and fallthrough to attempt a page split.

In the index_lock_upgrade.result file, the counters are updated
accordingly: one extra page split is recorded, and both the numbers of
reorganization attempts and index lock upgrades is reduced.
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants