RDSC-4633 How to perform HA failover #2818
ilianiliev-redis wants to merge 1 commit into redis:main from
dcdc014 to 7cba538
andy-stark-redis left a comment
Just a few style suggestions, but otherwise LGTM.
| rdi-reloader-77df5f7854-lwmvz 1/1 Running 0 71m | ||
| ``` | ||
|
|
||
| 2. Identify the leader node - this is the one that has a running `collector-source` pod |
| 2. Identify the leader node - this is the one that has a running `collector-source` pod | |
| 2. Identify the leader node - this is the one that has a running `collector-source` pod. |
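If it would help readers, this step could come with a small filter along the following lines. This is only a sketch, not something taken from the PR: it assumes the pod name starts with `collector-source`, that the pod listing uses the usual `NAME READY STATUS RESTARTS AGE` columns, and that the pods live in an `rdi` namespace. The helper name is hypothetical.

```shell
# Hypothetical helper (not from the PR): print the names of running
# collector-source pods. Reads `kubectl get pods --no-headers` output on
# stdin, where column 1 is the pod name and column 3 is the status.
running_collector_pods() {
  awk '$1 ~ /^collector-source/ && $3 == "Running" { print $1 }'
}

# Example, run on each node (the `rdi` namespace is an assumption):
#   kubectl get pods -n rdi --no-headers | running_collector_pods
```

A node where this prints a pod name would be the leader. An empty result would mean the node is not currently the leader.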
|
|
||
| To perform HA, you can simulate a connection failure between the leader and the RDI database by blocking the network traffic. You can do this by running the following command on the leader node: | ||
|
|
||
| 1. Identify the database IP (if you are using it with hostname): |
| 1. Identify the database IP (if you are using it with hostname): | |
| 1. Identify the database IP (replace `<hostname>` with your own hostname): |
|
|
||
| ## Performing the HA Failover Testing | ||
|
|
||
| To perform HA, you can simulate a connection failure between the leader and the RDI database by blocking the network traffic. You can do this by running the following command on the leader node: |
| To perform HA, you can simulate a connection failure between the leader and the RDI database by blocking the network traffic. You can do this by running the following command on the leader node: | |
| To perform HA, you can simulate a connection failure between the leader and the RDI database by blocking the network traffic. You can do this by running the following commands on the leader node: |
| 54.78.220.161 | ||
| ``` | ||
|
|
||
| 2. For each of the IPs returned by the above command, run the following command to block the traffic: |
Does this mean that you expect dig to return more than one IP address for a hostname? (Presumably you only need to run the command once on the leader node.) If so, maybe say that explicitly in step 1, because it currently says "Identify the database IP", which makes it sound like there is only one address, but "IP" might potentially be plural here.
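If several addresses are expected, the step could be illustrated with something like the sketch below, which prints one `iptables` DROP command per address instead of running anything. It is not the command from the PR: the helper name is hypothetical, and the chain is passed as an argument because the right one depends on the setup (traffic from the host itself goes through `OUTPUT`, while traffic routed from pods typically traverses `FORWARD`).

```shell
# Hypothetical helper (not from the PR): read `dig +short` output on stdin
# and print one iptables DROP command per IPv4 address. Nothing is executed.
# $1 is the chain to use; which chain is right depends on the setup.
print_block_rules() {
  chain="$1"
  # dig +short can also print CNAME targets, so keep only IPv4-looking lines.
  grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' |
  while IFS= read -r ip; do
    printf 'iptables -A %s -d %s -j DROP\n' "$chain" "$ip"
  done
}

# Example (prints the commands for review instead of running them):
#   dig +short <hostname> | print_block_rules OUTPUT
```

Printing the commands first lets the reader check the address list before blocking anything on the leader node.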
| In about 10 second you will start seeing logs from the leader that it could not acquire the leadership. | ||
| Once the leader lock expires, the second node will acquire the leadership and you will see logs from the second node indicating that it has become the leader. |
| In about 10 second you will start seeing logs from the leader that it could not acquire the leadership. | |
| Once the leader lock expires, the second node will acquire the leadership and you will see logs from the second node indicating that it has become the leader. | |
| In about 10 seconds you will start seeing log entries from the leader saying that it could not acquire the leadership. | |
| When the leader lock expires, the second node will acquire the leadership and you will see log entries from the second node indicating that it has become the leader. |
|
|
||
| ## Cleanup | ||
|
|
||
| To clean up after the test, you can remove the iptables rule that you added to block the traffic: |
| To clean up after the test, you can remove the iptables rule that you added to block the traffic: | |
| To clean up after the test, remove the `iptables` rule that you added to block the traffic: |
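One possible way to make the cleanup harder to get wrong is to derive the delete commands from the rules that are actually installed. The sketch below is an assumption-laden illustration rather than the PR's method: it assumes the blocking rules were added as `-d <ip> -j DROP`, that `iptables -S` lists them in the usual `-A CHAIN -d <ip>/32 -j DROP` form, and the helper name is hypothetical.

```shell
# Hypothetical helper (not from the PR): read `iptables -S` output on stdin
# and print a matching delete command for every DROP rule targeting $1.
# Nothing is executed; the commands are only printed for review.
print_cleanup_rules() {
  ip="$1"
  grep -F -- "-d $ip/32 " | grep -F -- '-j DROP' | sed 's/^-A /iptables -D /'
}

# Example (the address is the one shown in the dig output in this page):
#   sudo iptables -S | print_cleanup_rules 54.78.220.161
```

Running the helper again after the cleanup should print nothing if no blocking rule for that address remains.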
| @@ -0,0 +1,82 @@ | |||
| --- | |||
| Title: How to perform HA failover testing | |||
Your original title was fine, but this fits our usual style a bit more closely.
| Title: How to perform HA failover testing | |
| Title: Test HA failover |
| description: Learn how to perform HA failover testing for Redis Data Integration (RDI) to ensure high availability and reliability of your data integration setup. | ||
| group: di | ||
| hideListLinks: false | ||
| linkTitle: Testing HA failover |
| linkTitle: Testing HA failover | |
| linkTitle: Test HA failover |
| will take place. After the failover, the secondary instance will become the primary one, | ||
| and the RDI pipeline will be active on that VM. | ||
|
|
||
| You can see how to test HA failover in the [HA failover testing page]({{< relref "/integrate/redis-data-integration/installation/ha-test" >}}). |
| You can see how to test HA failover in the [HA failover testing page]({{< relref "/integrate/redis-data-integration/installation/ha-test" >}}). | |
| You may find it useful to trigger a failover deliberately to check that RDI is correctly configured to handle it. See [Test HA failover]({{< relref "/integrate/redis-data-integration/installation/ha-test" >}}) to learn how to do this. |
Ticket: https://redislabs.atlassian.net/browse/RDSC-4633
Document how to perform an HA failover test.