@ngan ngan commented Feb 5, 2026

Summary

  • Fix race condition when multiple bundle install processes run concurrently
  • Clear Gem::Specification and source caches after acquiring the process lock
  • Add clear_cache method to Source::Rubygems and SourceList

Problem

When two bundle install processes run concurrently:

  1. Process A acquires the lock and starts installing gems
  2. Process B populates its Gem::Specification.stubs and @installed_specs caches (showing no gems installed)
  3. Process B tries to acquire the lock and waits
  4. Process A finishes installing and releases the lock
  5. Process B acquires the lock but uses its stale cache from step 2
  6. Process B doesn't see the gems installed by Process A

As a result, Process B treats gems that Process A has already installed as missing and tries to reinstall them.
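The stale read in steps 2 through 6 can be reproduced in a single process with a memoized directory listing. This is only a sketch: `InstalledIndex` and `simulate_stale_read` are hypothetical names, and a temporary directory stands in for the gem install location. It does not touch Bundler or RubyGems internals.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for a process that memoizes what is installed.
class InstalledIndex
  def initialize(dir)
    @dir = dir
  end

  # Memoized: the directory is only scanned on the first call.
  def names
    @names ||= Dir.children(@dir).sort
  end
end

def simulate_stale_read
  Dir.mktmpdir do |dir|
    process_b = InstalledIndex.new(dir)
    before = process_b.names.dup                # step 2: B caches "nothing installed"
    FileUtils.touch(File.join(dir, "myrack"))   # steps 1 and 4: A installs a gem
    after = process_b.names                     # step 5: B reuses its cache
    { before: before, after: after, on_disk: Dir.children(dir).sort }
  end
end
```

In this sketch `after` stays empty even though `on_disk` lists `myrack`, which mirrors what Process B observes once it finally holds the lock.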

Solution

After acquiring the process lock, clear the caches before proceeding with installation:

ProcessLock.lock do
  Gem::Specification.reset
  @definition.sources.clear_cache
  # ... rest of installation
end

This ensures that any gems installed by another process while waiting for the lock are properly detected.
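For reviewers who want the shape of the change without reading the diff, here is a minimal sketch of a memoized source with a `clear_cache` method and a list that delegates to it. `SketchSource` and `SketchSourceList` are hypothetical stand-ins, not Bundler's `Source::Rubygems` or `SourceList`, and the loader block is an assumption used to simulate reading installed specs.

```ruby
# Hypothetical stand-in for a source that memoizes its installed specs.
class SketchSource
  def initialize(&loader)
    @loader = loader
  end

  # Memoized lookup: the loader runs only when the cache is empty.
  def installed_specs
    @installed_specs ||= @loader.call
  end

  # Drop the memoized value so the next lookup reloads it.
  def clear_cache
    @installed_specs = nil
  end
end

# Hypothetical stand-in for a list that forwards clear_cache to each source.
class SketchSourceList
  def initialize(sources)
    @sources = sources
  end

  def clear_cache
    @sources.each { |source| source.clear_cache if source.respond_to?(:clear_cache) }
  end
end
```

The `respond_to?` guard is a design choice in this sketch only, so that source types without a cache can sit in the same list.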

Related

Similar to #8539, which addressed a related cache invalidation issue for the bundle update command.

Diagram

⏺ bundle install
      │
      ├─ 1. Bundler.definition()                    ← BEFORE LOCK
      │      │
      │      └─ Definition.initialize()
      │           ├─ converge_sources()
      │           ├─ converge_paths()
      │           │    └─ specs_changed?(source)
      │           │         └─ source.specs         ← CACHE POPULATED HERE
      │           │              └─ installed_specs
      │           │                   └─ Gem::Specification.stubs  ⚠️
      │           └─ converge_locals()
      │                └─ specs_changed?(source)
      │                     └─ source.specs         ← CACHE POPULATED HERE
      │
      ├─ 2. definition.validate_runtime!            ← BEFORE LOCK
      │
      ├─ 3. Installer.install()
      │      │
      │      └─ run()
      │           │
      │           └─ ProcessLock.lock do            ← LOCK ACQUIRED
      │                 │
      │                 ├─ Gem::Specification.reset        ← FIX: Clear stale cache
      │                 ├─ sources.clear_cache             ← FIX: Clear @installed_specs
      │                 │
      │                 ├─ setup_domain!()
      │                 │    └─ install_needed?()
      │                 │         └─ resolve()      ← Uses fresh cache now
      │                 │
      │                 └─ install()                ← ACTUAL GEM INSTALLATION

  The race condition:
  Process A                              Process B
  ─────────────────────────────────────────────────────────────────
  definition()
    └─ cache: "myrack not installed"
                                         definition()
                                           └─ cache: "myrack not installed"
  ProcessLock.lock ✓ (acquired)
                                         ProcessLock.lock ⏳ (waiting...)
  install myrack
  ProcessLock.unlock
                                         ProcessLock.lock ✓ (acquired)
                                         ❌ Still sees stale cache: "myrack not installed"
                                         ❌ Tries to reinstall myrack

  With the fix:
  Process A                              Process B
  ─────────────────────────────────────────────────────────────────
  definition()
    └─ cache: "myrack not installed"
                                         definition()
                                           └─ cache: "myrack not installed"
  ProcessLock.lock ✓ (acquired)
                                         ProcessLock.lock ⏳ (waiting...)
  install myrack
  ProcessLock.unlock
                                         ProcessLock.lock ✓ (acquired)
                                         ✅ clear_cache + Gem::Specification.reset
                                         ✅ Fresh cache: "myrack IS installed"
                                         ✅ "Using myrack" (no reinstall)

Test plan

  • Added unit tests for Source::Rubygems#clear_cache
  • Added unit test for SourceList#clear_cache delegation
  • Added integration test verifying cache refresh after waiting for lock

🤖 Generated with Claude Code

When multiple `bundle install` processes run concurrently, a race
condition can cause issues. The second process populates its
`Gem::Specification.stubs` and `@installed_specs` caches before
acquiring the ProcessLock. While waiting for the lock, the first
process installs gems. After acquiring the lock, the second process
uses its stale cache and may not see the newly installed gems.

This fix clears the caches immediately after acquiring the process
lock, ensuring that any gems installed by another process while
waiting for the lock are properly detected.

Similar to ruby#8539 which addressed a related cache invalidation issue
for the `bundle update` command.

Fixes ruby#8473

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
ngan commented Feb 7, 2026

One inefficiency here is that a reset happens after every lock acquisition, even when no other process held the lock in the first place. I was thinking ProcessLock could yield whether it acquired the lock immediately or had to wait, so the reset only happens when it had to wait. Thoughts?
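A rough sketch of that idea, assuming a plain `File#flock` lock: try a non-blocking acquire first, fall back to a blocking one, and yield whether the fallback was needed. `lock_reporting_wait` is a hypothetical helper, not the current ProcessLock API, and the lock file path is whatever the caller passes in.

```ruby
# Hypothetical helper: yields true if the lock was contended, false otherwise.
def lock_reporting_wait(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |file|
    # With LOCK_NB, flock returns false instead of blocking when the lock is held.
    waited = !file.flock(File::LOCK_EX | File::LOCK_NB)
    # Only block when the non-blocking attempt failed.
    file.flock(File::LOCK_EX) if waited
    begin
      yield waited
    ensure
      file.flock(File::LOCK_UN)
    end
  end
end
```

The caller could then clear caches only when the yielded value is true. If the other process releases the lock between the two `flock` calls, the value is still true and the only cost is one unnecessary reset.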
