Upstream VssStoreBuilder and VssStore to lightning-persister #4323

tnull wants to merge 13 commits into lightningdevkit:main

Conversation
🎉 This PR is now ready for review!
Force-pushed from 0817a96 to 2e8b7cd
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main    #4323      +/-   ##
==========================================
- Coverage   86.18%   86.17%   -0.01%
==========================================
  Files         160      160
  Lines      107536   107536
  Branches   107536   107536
==========================================
- Hits        92680    92670      -10
- Misses      12231    12237       +6
- Partials     2625     2629       +4
The former is done here: lightningdevkit/vss-client#56.
Force-pushed from 0b39d8d to c633078

Force-pushed from c633078 to a343406
Review Summary

Two new issues found beyond prior review comments; inline comments posted. Previously flagged issues still open (not re-posted).
Now rebased / updated to incorporate latest VSS changes from LDK Node. Should be good for review. LDK Node draft PR: lightningdevkit/ldk-node#837
Force-pushed from d72fb40 to 124e08d
Since the beginning of VSS we intended to upstream the corresponding `KVStore` implementation to `lightning-persister`. Here, we finally do it. Signed-off-by: Elias Rohrer <dev@tnull.de>
Bump `vss-client-ng` 0.4 -> 0.5, add `build_with_sigs_auth()` method, rename `build()` to `build_with_lnurl()`, replace `RandEntropySource` with `VssEntropySource` backed by LDK's `RandomBytes`, drop `rand` from the `vss` feature in favor of `getrandom`, re-export `vss_client` from the crate root, remove `AuthProviderSetupFailed` error variant, and update tests to use `VssStoreBuilder`. Co-Authored-By: HAL 9000 Signed-off-by: Elias Rohrer <dev@tnull.de>
Fix test import path: `crate::io::test_utils` -> `crate::test_utils`. Co-Authored-By: HAL 9000
Stop `list_all_keys` on empty page tokens Handle both `None` and empty `next_page_token` values so VSS pagination terminates regardless of which sentinel the server uses. Co-Authored-By: HAL 9000
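The pagination fix above can be sketched as follows (the `Page` type and `fetch` callback are hypothetical stand-ins for the VSS list-keys response and RPC; only the sentinel handling is the point):

```rust
// Hypothetical stand-in for the VSS list-keys response page.
#[derive(Clone)]
struct Page {
	keys: Vec<String>,
	next_page_token: Option<String>,
}

// Accumulate keys across pages, treating BOTH `None` and an empty string as
// end-of-listing sentinels so the loop terminates regardless of which one
// the server uses.
fn list_all_keys(mut fetch: impl FnMut(Option<String>) -> Page) -> Vec<String> {
	let mut keys = Vec::new();
	let mut page_token: Option<String> = None;
	loop {
		let page = fetch(page_token.take());
		keys.extend(page.keys);
		match page.next_page_token {
			// A non-empty token means another page follows.
			Some(token) if !token.is_empty() => page_token = Some(token),
			// `None` or `Some("")`: no more pages.
			_ => break,
		}
	}
	keys
}
```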
Reject missing VSS object values explicitly Convert malformed responses into `io::Error`s instead of panicking when `value` is missing from `get_object` replies. Co-Authored-By: HAL 9000
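A minimal sketch of that conversion, with a simplified stand-in for the `get_object` reply type (the real one comes from the VSS protos):

```rust
use std::io;

// Hypothetical stand-in: on the wire, `value` is optional in the reply.
struct GetObjectResponse {
	value: Option<Vec<u8>>,
}

// Map a missing `value` to an `io::Error` so a malformed server reply
// surfaces as a read failure instead of a panic via `unwrap()`.
fn extract_value(response: GetObjectResponse) -> io::Result<Vec<u8>> {
	response.value.ok_or_else(|| {
		io::Error::new(io::ErrorKind::InvalidData, "VSS response missing object value")
	})
}
```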
124e08d to
dd583b0
Compare
Use an MSRV-compatible closure in `write_internal` Replace the async closure syntax in `write_internal` with the older `move || async move` pattern supported on Rust 1.75. Co-Authored-By: HAL 9000
Move owned buffers into `StorableBuilder` Pass the existing `Vec<u8>` to `StorableBuilder::build` instead of cloning it before every write. Co-Authored-By: HAL 9000
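Illustrative sketch of the ownership change (types heavily simplified, not the real `StorableBuilder` signature): taking the buffer by value lets the builder consume it, so the caller no longer clones on every write.

```rust
// Simplified stand-in for the VSS storable builder.
struct StorableBuilder;

impl StorableBuilder {
	// Takes `plaintext` by value: the builder consumes the caller's buffer.
	fn build(&self, plaintext: Vec<u8>, version: i64) -> (Vec<u8>, i64) {
		// The real builder encrypts and wraps the data; here we only
		// demonstrate that ownership moves in without an extra copy.
		(plaintext, version)
	}
}

fn write(builder: &StorableBuilder, buf: Vec<u8>) -> usize {
	// Before the change this would have been `builder.build(buf.clone(), 0)`.
	let (stored, _version) = builder.build(buf, 0);
	stored.len()
}
```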
Use an MSRV-compatible closure in `remove_internal` Replace the async closure syntax in `remove_internal` with the older `move || async move` pattern supported on Rust 1.75. Co-Authored-By: HAL 9000
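The pattern used in both commits can be sketched like this (`with_retries` and `write_internal` here are illustrative helpers, not the actual `lightning-persister` API): on Rust 1.75, `async || { .. }` closures are unstable, so a plain closure returning an `async move` block is used instead.

```rust
// A generic retry-style combinator taking a closure that produces a future.
async fn with_retries<F, Fut, T>(mut op: F) -> T
where
	F: FnMut() -> Fut,
	Fut: std::future::Future<Output = T>,
{
	// (A real helper would loop with backoff; one attempt suffices here.)
	op().await
}

async fn write_internal(buf: Vec<u8>) -> usize {
	// MSRV-compatible on 1.75: a plain `move ||` closure whose body is an
	// `async move` block, rather than the newer `async move || ..` syntax.
	with_retries(move || {
		let buf = buf.clone();
		async move { buf.len() }
	})
	.await
}
```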
Signed-off-by: Elias Rohrer <dev@tnull.de>
Wait for the VSS server and use `checkout@v4` Poll the server before running tests so the workflow does not race the startup path, and align the new job with the repository's current `actions/checkout` version. Co-Authored-By: HAL 9000
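Such a wait step could look roughly like the following (the endpoint URL, attempt count, and interval are illustrative, not the actual workflow contents):

```shell
# Poll the VSS server until it answers, instead of racing its startup.
wait_for_vss() {
	url="$1"
	attempts="${2:-30}"
	i=1
	while [ "$i" -le "$attempts" ]; do
		# -s: quiet, -f: treat HTTP errors as failure
		if curl -sf -o /dev/null "$url"; then
			echo "VSS server is up after $i attempt(s)"
			return 0
		fi
		i=$((i + 1))
		sleep "${WAIT_INTERVAL:-1}"
	done
	echo "VSS server did not come up after $attempts attempts" >&2
	return 1
}
```

In the job this would run before the test invocation, e.g. `wait_for_vss "$TEST_VSS_BASE_URL" 30`, so `cargo test` only starts once the server responds.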
Drop the leftover workflow debug listing Remove the stray `ls -la` output so the VSS test job only emits the signal needed for debugging failures. Co-Authored-By: HAL 9000
Keep the V0 prefix and extraction logic aligned with stores that write empty-namespace keys without any `#` separator. Co-Authored-By: HAL 9000
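The shape of that rule can be sketched as follows (function name and key layout are illustrative of the V0 scheme, not the exact store code): namespaced V0 keys carry a plaintext prefix joined with `#`, while empty-namespace keys were written bare, with no separator at all, and extraction must accept both.

```rust
// Split a stored V0 key into (namespace, key). Keys written under a
// namespace look like "bdk_wallet#descriptor"; keys written under the
// empty namespace have no `#` separator and are stored as the bare key.
fn split_v0_key(stored: &str) -> (&str, &str) {
	match stored.split_once('#') {
		Some((namespace, key)) => (namespace, key),
		// No separator: an empty-namespace key, stored without a prefix.
		None => ("", stored),
	}
}
```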
Force-pushed from dd583b0 to 9686cb9
bitcoin = "0.32.2"
lightning = { version = "0.3.0", path = "../lightning" }
lightning-macros = { version = "0.2.0", path = "../lightning-macros", optional = true }
tokio = { version = "1.35", optional = true, default-features = false, features = ["rt-multi-thread"] }
Bug: The code uses tokio::sync::Mutex extensively (7+ call sites in vss_store.rs), but the tokio dependency only declares features = ["rt-multi-thread"]. The sync feature is required for tokio::sync::Mutex to be available.
This likely compiles today only because vss-client-ng transitively enables tokio/sync. If that crate ever changes its tokio feature requirements, this will break.
Suggested change:
-tokio = { version = "1.35", optional = true, default-features = false, features = ["rt-multi-thread"] }
+tokio = { version = "1.35", optional = true, default-features = false, features = ["rt-multi-thread", "sync"] }
	.worker_threads(INTERNAL_RUNTIME_WORKERS)
	.max_blocking_threads(INTERNAL_RUNTIME_WORKERS)
	.build()
	.unwrap();
Bug: .unwrap() will panic if the OS fails to create the runtime (e.g., resource limits, thread creation failure). Since this function returns io::Result<Self>, the error should be propagated instead:
Suggested change:
-	.unwrap();
+	.build()?;
#[test]
fn vss_read_write_remove_list_persist() {
	let vss_base_url = std::env::var("TEST_VSS_BASE_URL").unwrap();
	let mut rng = rng();
	let rand_store_id: String = (0..7).map(|_| rng.sample(Alphanumeric) as char).collect();
	let mut vss_seed = [0u8; 32];
	rng.fill_bytes(&mut vss_seed);
	let vss_xprv = Xpriv::new_master(bitcoin::Network::Testnet, &vss_seed).unwrap();
	let vss_store = VssStoreBuilder::new(vss_xprv, vss_base_url, rand_store_id)
		.build_with_fixed_headers(HashMap::new())
		.unwrap();
	do_read_write_remove_list_persist(&vss_store);
Bug: This test will panic at runtime. VssStore::new (line 128) and all KVStoreSync methods (lines 207, 238, 272, 290) use tokio::task::block_in_place, which panics if called from outside a multi-threaded tokio runtime. This #[test] runs on a plain OS thread with no tokio runtime context.
The second test (vss_read_write_remove_list_persist_in_runtime_context) correctly uses #[tokio::test(flavor = "multi_thread", worker_threads = 1)].
Either make this test #[tokio::test(flavor = "multi_thread")] as well, or handle the non-runtime case in the constructor/sync methods:
fn block_on_internal<F: Future>(runtime: &tokio::runtime::Runtime, fut: F) -> F::Output {
match tokio::runtime::Handle::try_current() {
Ok(_) => tokio::task::block_in_place(|| runtime.handle().block_on(fut)),
Err(_) => runtime.handle().block_on(fut),
}
}

If the intent is that VssStore always requires a tokio runtime context (even for sync usage), this needs to be documented on the struct/builder, and this test should be updated.
let wallet_data_present = !response.key_versions.is_empty();
if wallet_data_present {
	// If the wallet data is present, it means we're not running for the first time.
	Ok(VssSchemaVersion::V0)
Potential data loss: When V0 is detected, the schema version is NOT written to storage. This means every future startup re-runs the detection heuristic (checking for bdk_wallet# prefix). If a V0 store ever loses all its bdk_wallet data (e.g., wallet reset, selective key deletion) while retaining other data (channel monitors, etc.), the next startup will misclassify it as a fresh V1 store. This causes:
- All existing V0 keys become inaccessible (V1 uses obfuscated namespace prefixes, V0 uses plaintext)
- Decryption fails (V1 uses the store key as AAD, V0 uses empty AAD)
Consider persisting VssSchemaVersion::V0 to storage (like V1 does) to make the detection durable:
if wallet_data_present {
	// Persist V0 schema version so future startups don't rely on the heuristic
	let schema_version = VssSchemaVersion::V0;
	let encoded_version = schema_version.encode();
	let storable_builder = StorableBuilder::new(VssEntropySource(entropy_source));
	let vss_version = -1;
	let aad = store_key.as_bytes();
	let storable = storable_builder.build(encoded_version, vss_version, &data_encryption_key, aad);
	let request = PutObjectRequest {
		store_id: store_id.clone(),
		global_version: None,
		transaction_items: vec![KeyValue { key: store_key, version: vss_version, value: storable.encode_to_vec() }],
		delete_items: vec![],
	};
	// Best-effort write; if it fails, we'll re-detect next time
	let _ = client.put_object(&request).await;
	Ok(schema_version)
} else {

// `PeerManager::process_pending_events` -> `ChannelManager::get_and_clear_pending_msg_events`
// would deadlock when trying to acquire sync `Mutex` locks that are held by the thread
// currently being blocked waiting on the VSS operation to finish.
internal_runtime: Option<tokio::runtime::Runtime>,
Hmm, does it make sense to upstream this hack or should we go ahead and split the store struct into an async and sync version? I assume pretty soon LDK Node will only need/want the async version anyway?
	Ok(keys)
}

async fn execute_locked_write<
Can we DRY this up with the filesystem store at all?
// Check if any `bdk_wallet` data was written by listing keys under the respective
// (unobfuscated) prefix.
const V0_BDK_WALLET_PREFIX: &str = "bdk_wallet#";
let request = ListKeyVersionsRequest {
Hmm, it's a bit weird to upstream LDK Node-specific migration logic.
Since the beginning of VSS we intended to upstream the corresponding `KVStore` implementation to `lightning-persister`. Here, we finally do it.

I'll put this in draft until lightningdevkit/ldk-node#755 lands so we can directly upstream the version already upgraded for pubkey-auth.