Releases: simplyblock/sbcli
26.2-PRE
Release Notes:
- Full FTT2 support: We now support three instead of two paths from the initiator (NVMe-oF) volume to the targets. There is a primary, a secondary, and now also a tertiary target with identical subsystems able to process I/O. This allows the loss or maintenance of any two nodes in the cluster at a point in time, regardless of the combination.
- Support for NVMe-oF dhchap: dhchap can now be configured at pool level; allowed host NQNs need to be managed on the pool; the authentication used is bi-directional. Encryption is based on a FIPS 140-3 certified library.
- Support for TLS in OpenShift: All management communication within the simplyblock cluster and to the cluster endpoints uses encryption with OpenShift-managed certificates.
- Image upgrade and CVE clean-up: Images have been upgraded to Python 3.12 and RHEL 10 and cleared of any CVEs.
- Volume Backup/Restore: Mechanisms to back up snapshots and snapshot chains to S3-compatible storage. We support backup retention and merge policies as well as auto-backup schedules on the operator. Snapshots can be restored into volumes (PVCs) on the same cluster or on a different cluster. Backup is storage-efficient (delta-only, compressed).
- Remote Snapshot Replication: Snapshots can be asynchronously replicated between nodes and sites in a network-efficient manner.
- Asynchronous Replication: Support for automatic failover and failback of selected volumes across sites with a certain time gap/backlog (e.g. 5 minutes). This is useful for slow links and provides significantly better RTO (zero) and RPO (minimum: 1 minute) than traditional backups.
- Namespace Volumes: In k8s, we auto-create subsystems and namespaces within those subsystems in a pre-defined ratio (e.g. 8 or 16 namespaces per subsystem). The user only has to set the ratio.
- Multipathing Support: We now support multipathing within the cluster and from initiators (clients) to the same node in the cluster. This means that clients can have four (FTT=1) or six (FTT=2) connections, two to each node, via separate VLANs using separate networking paths. This feature is also used for all cluster-internal communication. It has advantages over bonding and MLAG.
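The arithmetic behind two of the items above (connection counts with multipathing, and subsystem counts for namespace volumes) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical and not part of sbcli, the path count assumes one target node per tolerated failure plus the primary, and the subsystem count assumes each subsystem is filled up to the ratio before a new one is created.

```python
import math


def expected_paths(ftt: int, paths_per_node: int = 2) -> int:
    """Connections an initiator holds for one volume.

    Assumption: a volume has ftt + 1 target nodes (primary, secondary,
    and for FTT=2 a tertiary), with paths_per_node connections to each.
    """
    if ftt not in (1, 2):
        raise ValueError("these notes only describe FTT=1 and FTT=2")
    if paths_per_node < 1:
        raise ValueError("paths_per_node must be at least 1")
    return (ftt + 1) * paths_per_node


def subsystems_needed(volumes: int, namespaces_per_subsystem: int) -> int:
    """Subsystems required to hold `volumes` namespace volumes.

    Assumption: subsystems are filled up to the configured ratio
    before another subsystem is created.
    """
    if volumes < 0 or namespaces_per_subsystem < 1:
        raise ValueError("invalid volume count or ratio")
    return math.ceil(volumes / namespaces_per_subsystem)


print(expected_paths(1))         # 4
print(expected_paths(2))         # 6
print(subsystems_needed(20, 8))  # 3
```

With the default of two paths per node, the results match the four (FTT=1) and six (FTT=2) connections stated above.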
26.1.2
What's Changed
- R25.10 hotfix fix cpu topology by @wmousa in #939
- R25.10 hotfix lvol clone by @Hamdy-khader in #934
Full Changelog: 26.1.1...26.1.2
26.1.1
R25.10-Hotfix: pin 26.1.1 and fix clone path regression.
25.10.5
What's Changed
- Remove remote object from node when receiving distrib events by @Hamdy-khader in #743
- Add req id to RPC logs for spdk_proxy by @Hamdy-khader in #752
- Main lvol sync delete (#734) by @Hamdy-khader in #757
- R25.10 hotfix sfam 2471 by @Hamdy-khader in #758
- fix linter and type checker issues by @Hamdy-khader in #759
- Prometheus hostpath by @geoffrey1330 in #761
- Format nvme devices when run sbcli sn configure with --force by @wmousa in #760
- Nsocket env by @geoffrey1330 in #764
- Fix fw connection error handling not to set node status down by @Hamdy-khader in #765
- Fix sfam-2473 by @Hamdy-khader in #766
- Do not install pip package on cluster update by @Hamdy-khader in #749
- Fix calculate total_mem for multi sn nodes on same numa by @wmousa in #767
- use max_size instead as hugepage memory when set by @geoffrey1330 in #754
- refactor node add task runner by @Hamdy-khader in #768
- Fix sfam-2483 by @Hamdy-khader in #773
- fix logger by @Hamdy-khader in #776
- fix SFAM-2476 by @Hamdy-khader in #775
- Fix SFAM-2482 by @Hamdy-khader in #769
- use emptyDir memory medium as socket directory by @geoffrey1330 in #783
- add endpoint bind_device_to_nvme in kubernetes by @geoffrey1330 in #784
- Create partitions and alcemls on node add in parallel by @Hamdy-khader in #763
- Remove stats from fdb and get it from Prometheus by @Hamdy-khader in #762
- R25.10 hotfix isolate by @wmousa in #792
- Fix sfam-2515 by @Hamdy-khader in #810
- R25.10 hotfix spdk proxy stats by @Hamdy-khader in #812
- Add --cores-percentage to sbctl sn configure and support oracle OS fo… by @wmousa in #813
- Use diff fw port per node by @Hamdy-khader in #817
- Return back max number of distribs to be 12 distribs by @wmousa in #827
- Fix sfam-2555 by @Hamdy-khader in #831
- Fix sfam-2556 by @Hamdy-khader in #835
- Fix node auto restart by @Hamdy-khader in #836
- Adds failed device back to cluster as a new one by @Hamdy-khader in #841
- Fix remove device function by @Hamdy-khader in #840
- fix sfam-2557 by @Hamdy-khader in #843
- Fix sfam-2564 by @Hamdy-khader in #838
- Fix node restart jm raid config by @Hamdy-khader in #851
- Fix sfam-2578 by @Hamdy-khader in #852
- Fix sfam-2579 by @Hamdy-khader in #850
- Adds sbctl -v/--version by @Hamdy-khader in #853
- R25.10 hotfix cluster id from main by @Hamdy-khader in #854
- Add --nvme-names to sn configure and --format-4k to sn add-node by @wmousa in #859
- Improve sync delete handling for lvol bdevs in lvol_monitor.py by @Hamdy-khader in #863
- R25.10 hotfix client data nic by @Hamdy-khader in #874
- Enhance graceful-startup and graceful-shutdown by @wmousa in #876
- R25.10 hotfix dump tree by @Hamdy-khader in #879
- fix-description of max-size by @wmousa in #881
- Add --calculate-hp-only to calculate the minimum required huge pages by @wmousa in #884
- Fix sfam-2620 by @Hamdy-khader in #886
- Adds link detection when pinging data nic by @Hamdy-khader in #897
Full Changelog: 25.10.3...25.10.5
25.10.4.2
Full Changelog: 25.10.4.1...25.10.4.2
25.10.4.1
What's Changed
- Remove remote object from node when receiving distrib events by @Hamdy-khader in #743
- Add req id to RPC logs for spdk_proxy by @Hamdy-khader in #752
- Main lvol sync delete (#734) by @Hamdy-khader in #757
- R25.10 hotfix sfam 2471 by @Hamdy-khader in #758
- fix linter and type checker issues by @Hamdy-khader in #759
- Prometheus hostpath by @geoffrey1330 in #761
- Format nvme devices when run sbcli sn configure with --force by @wmousa in #760
- Nsocket env by @geoffrey1330 in #764
- Fix fw connection error handling not to set node status down by @Hamdy-khader in #765
- Fix sfam-2473 by @Hamdy-khader in #766
- Do not install pip package on cluster update by @Hamdy-khader in #749
- Fix calculate total_mem for multi sn nodes on same numa by @wmousa in #767
- use max_size instead as hugepage memory when set by @geoffrey1330 in #754
- refactor node add task runner by @Hamdy-khader in #768
- Fix sfam-2483 by @Hamdy-khader in #773
- fix logger by @Hamdy-khader in #776
- fix SFAM-2476 by @Hamdy-khader in #775
- Fix SFAM-2482 by @Hamdy-khader in #769
- use emptyDir memory medium as socket directory by @geoffrey1330 in #783
- add endpoint bind_device_to_nvme in kubernetes by @geoffrey1330 in #784
- Create partitions and alcemls on node add in parallel by @Hamdy-khader in #763
- Remove stats from fdb and get it from Prometheus by @Hamdy-khader in #762
- R25.10 hotfix isolate by @wmousa in #792
- Fix sfam-2515 by @Hamdy-khader in #810
- R25.10 hotfix spdk proxy stats by @Hamdy-khader in #812
- Add --cores-percentage to sbctl sn configure and support oracle OS fo… by @wmousa in #813
- Use diff fw port per node by @Hamdy-khader in #817
- Return back max number of distribs to be 12 distribs by @wmousa in #827
- Fix sfam-2555 by @Hamdy-khader in #831
- Fix sfam-2556 by @Hamdy-khader in #835
- Fix node auto restart by @Hamdy-khader in #836
- Adds failed device back to cluster as a new one by @Hamdy-khader in #841
- Fix remove device function by @Hamdy-khader in #840
- fix sfam-2557 by @Hamdy-khader in #843
Full Changelog: 25.10.3...25.10.4.1
25.10.4
What's Changed
- Remove remote object from node when receiving distrib events by @Hamdy-khader in #743
- Add req id to RPC logs for spdk_proxy by @Hamdy-khader in #752
- Main lvol sync delete (#734) by @Hamdy-khader in #757
- R25.10 hotfix sfam 2471 by @Hamdy-khader in #758
- fix linter and type checker issues by @Hamdy-khader in #759
- Prometheus hostpath by @geoffrey1330 in #761
- Format nvme devices when run sbcli sn configure with --force by @wmousa in #760
- Nsocket env by @geoffrey1330 in #764
- Fix fw connection error handling not to set node status down by @Hamdy-khader in #765
- Fix sfam-2473 by @Hamdy-khader in #766
- Do not install pip package on cluster update by @Hamdy-khader in #749
- Fix calculate total_mem for multi sn nodes on same numa by @wmousa in #767
- use max_size instead as hugepage memory when set by @geoffrey1330 in #754
- refactor node add task runner by @Hamdy-khader in #768
- Fix sfam-2483 by @Hamdy-khader in #773
- fix logger by @Hamdy-khader in #776
- fix SFAM-2476 by @Hamdy-khader in #775
- Fix SFAM-2482 by @Hamdy-khader in #769
Full Changelog: 25.10.3...25.10.4
25.10.3
What's Changed
- 🐛 Optimise Storage node monitor
- 🐛 Fix fdb value exceed limit
- 🐛 Other minor bug fixes, see more in the Full Changelog
Full Changelog: 25.10.2...25.10.3
25.10.2
New Features
- Control Plane: Can alternatively deploy into existing Kubernetes clusters and co-locate on workers with storage nodes.
- Kubernetes Support Matrix: Added OpenShift starting from version XX.XX.
- OpenStack Driver: Now available. Supports most optional features and is tested from OpenStack 2025.1 (Epoxy) onward. (Older OpenStack versions may be supported on request.)
- Lower Memory Footprint: Required memory on storage nodes reduced from 0.2% of storage capacity to 0.05%.
- QoS (Pool-level): Added pool-level QoS controls.
- QoS Service Classes: Assign a service class to a volume; service classes provide full performance isolation within the cluster.
- Flexible Erasure Coding: Support for flexible erasure-coding schemas within a cluster.
- Fabrics: Support for RDMA fabric and mixed fabrics (RDMA, TCP).
- Write Performance: Improvements during first write to volume and during node outage.
- Namespace Volumes: A single NVMe-oF subsystem can now expose up to 32 namespace volumes.
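To get a feel for the lower memory footprint listed above, the following sketch compares the two ratios for a given storage capacity. It is a rough estimate only: the helper name is hypothetical, the capacity is an example value, and any fixed base memory a storage node needs regardless of capacity is ignored.

```python
def capacity_scaled_memory_gib(capacity_tib: float, percent_of_capacity: float) -> float:
    """Memory that scales with storage capacity, in GiB.

    Assumption: memory is a plain percentage of capacity; fixed
    per-node overhead is not modelled here.
    """
    if capacity_tib < 0 or percent_of_capacity < 0:
        raise ValueError("capacity and percentage must be non-negative")
    capacity_gib = capacity_tib * 1024
    return capacity_gib * percent_of_capacity / 100


capacity_tib = 100  # example value, not taken from the release notes
before = capacity_scaled_memory_gib(capacity_tib, 0.2)   # previous ratio
after = capacity_scaled_memory_gib(capacity_tib, 0.05)   # ratio in 25.10.2

print(round(before, 1))  # 204.8
print(round(after, 1))   # 51.2
```

Under these assumptions, the capacity-dependent memory requirement drops to a quarter of its previous value.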
Fixes
- Control Plane: Fixed an issue that could lead to stuck deletes.
Upgrade Considerations
- Upgrades are supported from 25.7.6 and 25.7.7.
- It’s possible to add RDMA support to the fabric during an online upgrade.
Known Issues
- Using different erasure-coding schemas per cluster is available but experimental (not GA) and, in some tests, can cause I/O interruptions.
25.7.7
What's changed:
- 🐛 Bug fix: QoS settings between lvol and pool must be consistent and must not accept negative values
- 🐛 Bug fix: On bare metal, node auto-restart was not triggered after a container crash, but the node was set online
- 🐛 Bug fix: Crypto LVOL delete: first delete crypto, then Lvol
Full Changelog: 25.7.6...25.7.7