SLIDE 1

Untangling and Restructuring CTDB

Martin Schwenke <martin@meltin.net>

Samba Team IBM (Australia Development Laboratory, Linux Technology Center)

Martin Schwenke Untangling and Restructuring CTDB

SLIDE 3

What are we talking about?

What is CTDB?

SLIDE 9

What are we talking about?

What does CTDB do?
- Cluster membership and leadership
- Cluster database and database recovery
- Cluster-wide messaging transport for Samba
- Service management and monitoring
- IP address management, failover and consistency checking

SLIDE 17

The plan?

What is the goal?
- CTDB scalability and performance
- Reduce barrier to entry for new CTDB developers
- Encourage wider use
- Parallelise CTDB database daemon?
- Remove non-database functions from database daemon
- Cleanly split out cluster, service, IP management

SLIDE 22

The plan?

How do we get there?
- I told you last year!
- So far it has looked very little like I described...
- Slow progress... one bite at a time...

SLIDE 24

Twelve months of untangling

What has been happening?
- Recovery helper
- NFS support factoring
- IP allocation
- NAT gateway
- LVS support
- TCP connection killing
- Recovery lock
- Monitoring in recovery daemon

SLIDE 30

Twelve months of untangling

Recovery helper
- Actually a bug fix to avoid recovery deadlock... more from Amitay later
- New protocol and client code to support
- New helper ctdb_recovery_helper
- All new code: no nested event loops!
- Drop-in replacement for existing recovery code
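The "no nested event loops" point can be illustrated with a small sketch. It is not CTDB code (CTDB is written in C); it only shows, under that assumption, how a caller with a single event loop can start a helper process and await its result instead of spinning an inner loop. The name run_helper and the stand-in child command are hypothetical.

```python
import asyncio
import sys


async def run_helper(argv, timeout=30.0):
    """Run a helper program without blocking the caller's event loop.

    Returns (exit_status, stdout_text). Hypothetical name, for illustration.
    """
    proc = await asyncio.create_subprocess_exec(
        *argv, stdout=asyncio.subprocess.PIPE
    )
    try:
        out, _ = await asyncio.wait_for(proc.communicate(), timeout)
    except asyncio.TimeoutError:
        # The helper took too long: kill it and reap it before re-raising.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, out.decode()


async def main():
    # Stand-in for a real helper: a child process that prints a result.
    status, out = await run_helper(
        [sys.executable, "-c", "print('recovered')"]
    )
    print(status, out.strip())


if __name__ == "__main__":
    asyncio.run(main())
```

The caller's loop stays free to service other events while the helper runs, which is the property the slide attributes to the new recovery code.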

SLIDE 38

Twelve months of untangling

NFS support
- This change is confined to scripts...
- We had 60.nfs and 60.ganesha
- We had a request for 60.glusternfs
- Refactored into single 60.nfs
- Now have CTDB_NFS_CALLOUT configuration variable
- Default is nfs-linux-kernel-callout
- Sample nfs-ganesha-callout
- José has been working on nfs-ganesha-callout recently
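The call-out idea can be sketched as follows. The real 60.nfs is a shell event script; this Python version only illustrates the dispatch pattern. The variable name CTDB_NFS_CALLOUT and the default nfs-linux-kernel-callout come from the slide, while the function name nfs_callout and the action names passed to it are assumptions made for illustration.

```python
import os
import subprocess

# Default call-out named on the slide.
DEFAULT_CALLOUT = "nfs-linux-kernel-callout"


def nfs_callout(action, *args, env=None):
    """Run the configured call-out with an action; return its exit status.

    The action vocabulary is hypothetical; the slides do not list it.
    """
    env = os.environ if env is None else env
    callout = env.get("CTDB_NFS_CALLOUT", DEFAULT_CALLOUT)
    # One generic script, many NFS implementations: only the program changes.
    return subprocess.run([callout, action, *args]).returncode
```

Supporting another NFS server then means supplying a different call-out program rather than another copy of the event script.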

SLIDE 44

Twelve months of untangling

IP allocation
- IP allocation algorithm depends on IP addresses and node states
- CTDB data structures were deep in the code
- Several interface points between IP allocation algorithm and surrounding code
- Introduced more abstract data structures
- IP allocation is now separate “module”
- Next step: IP allocation daemon?
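As a rough illustration of what a separate module buys, the sketch below treats allocation as a pure function of exactly the two inputs the slide names: IP addresses and node states. The round-robin policy is a placeholder chosen for brevity, not CTDB's algorithm, and allocate_ips is a hypothetical name.

```python
from typing import Dict, List, Optional


def allocate_ips(
    ips: List[str], healthy: Dict[int, bool]
) -> Dict[str, Optional[int]]:
    """Assign each IP to a healthy node number, or None if no node is healthy.

    Placeholder policy: round-robin over healthy nodes in node-number order.
    """
    nodes = sorted(n for n, ok in healthy.items() if ok)
    if not nodes:
        return {ip: None for ip in ips}
    return {ip: nodes[i % len(nodes)] for i, ip in enumerate(sorted(ips))}
```

Because the function touches no daemon state, it can be unit-tested in isolation, and moving it into a helper or its own daemon later becomes a packaging question.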

SLIDE 53

Twelve months of untangling

NAT gateway
- Had daemon support: NAT gateway master capability
- “ctdb natgwlist” calculated NAT gateway master node
- Capability unset on a node indicated “slave-only”
- Observed that NAT gateway nodes file lines could be augmented with “slave-only” keyword
- No capability needed, so no daemon support!
- New helper script: “ctdb_natgw master|list|status”
- “ctdb natgw master|list|status” runs helper
- NAT gateway event script also calls out to helper
- NAT gateway support now reduced to 2 non-core scripts
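A minimal sketch of why the keyword removes the need for a daemon capability: once "slave-only" is recorded in the nodes file, a helper can work out the master from the file plus the set of healthy nodes. The file layout (one address per line with an optional keyword), the comment handling, and the "first eligible healthy node" rule are all assumptions; the slides do not spell them out.

```python
def parse_natgw_nodes(text):
    """Return a list of (ip, slave_only) pairs from nodes-file text."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        nodes.append((fields[0], "slave-only" in fields[1:]))
    return nodes


def pick_master(nodes, healthy):
    """Pick the first healthy node that is not slave-only, else None."""
    for ip, slave_only in nodes:
        if not slave_only and ip in healthy:
            return ip
    return None
```

For example, with nodes 192.168.1.1 (slave-only), 192.168.1.2 and 192.168.1.3, and only the first and third healthy, pick_master returns 192.168.1.3.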

SLIDE 59

Twelve months of untangling

LVS
- Had daemon support: LVS capability, single public IP
- “ctdb lvsmaster” calculated LVS master node
- Re-implemented using same model as NAT gateway
- New helper script: “ctdb_lvs master|list|status”
- LVS support reduced to 2 non-core scripts
- Simplified IP takeover code due to absence of single public IP

SLIDE 69

Twelve months of untangling

TCP connection killing
- Was combination of “ctdb killtcp” and daemon support
- Daemon side validated server-side IP address
- Daemon also sent “tickle ACKs”, listened for responses and sent RSTs
- No need to validate server-side IP address
- New helper ctdb_killtcp reads connections from stdin
- Much faster than talking to daemon
- SOCK_PACKET drops packets...
- Bidirectional killing, packets got mixed up!
- Some internal filtering and tuning needed
- Helper invoked directly from event script
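Reading the whole batch of connections from stdin, rather than issuing one daemon request per connection, can be sketched like this. The input format (two whitespace-separated "address:port" endpoints per line) is an assumption for illustration, as are the function names; the sketch stops at parsing and sends no packets.

```python
import io


def split_endpoint(text):
    """Split 'host:port' on the last colon so IPv6 hosts survive intact."""
    host, port = text.rsplit(":", 1)
    return host, int(port)


def read_connections(stream):
    """Yield ((src_host, src_port), (dst_host, dst_port)) per input line."""
    for line in stream:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        yield split_endpoint(fields[0]), split_endpoint(fields[1])


if __name__ == "__main__":
    sample = io.StringIO("10.0.0.5:445 10.0.0.9:51000\n")
    for src, dst in read_connections(sample):
        print(src, dst)
```

An event script can then pipe its full connection list into one helper invocation.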

SLIDE 78

Twelve months of untangling

Recovery lock
- fcntl(2) lock on cluster filesystem
- Lock is taken on first recovery... and released on election loss
- Combination of “cluster master lock” and “recovery lock”
- Want to split this... and allow other forms of cluster mutex than fcntl(2) lock
- New helper ctdb_mutex_fcntl_helper
- Or:
    CTDB_RECOVERY_LOCK=\
    "!/my/cluster/mutex/helper args ..."
- Recovery lock not split yet
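For readers unfamiliar with fcntl(2) locks, here is a minimal sketch of the primitive itself: a non-blocking exclusive lock on a file. It does not reproduce ctdb_mutex_fcntl_helper; the function names and the lock path are hypothetical, and whether such locks behave correctly across nodes depends on the cluster filesystem.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive fcntl lock on path without blocking.

    Returns an open file descriptor holding the lock, or None if another
    process already holds it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Release the lock; closing the descriptor drops the fcntl lock."""
    os.close(fd)
```

The lock belongs to the process, so it disappears if the holder exits. That makes it natural to let a long-running helper process hold the mutex.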

SLIDE 81

Twelve months of untangling

Monitoring in recovery daemon
- Recovery daemon runs main_loop at 1 second intervals
- Cluster leadership/elections, node states/flags, database recovery, IP failover & monitoring are all intertwined
- Continuously revisit and improve...

SLIDE 87

The pattern?

Helpers! Helpers! Helpers! Call-outs! Helpers!

SLIDE 93

The pattern?

Helpers for incremental re-write
- Helpers can be used for writing shiny new code... to replace self-contained parts of the code
- Works well for infrequently executed code
- Most of the code we want to move out is (relatively) infrequently executed...
- A lot of it needs to be made more self-contained first!

SLIDE 101

What’s next?

Split the recovery lock
- Drop support for “ctdb setreclock ...”
- What do you do when it fails?
- Split recovery lock into separate cluster & recovery locks
- Split out election code
- Drop recovery lock?
- Depends on handling of election during recovery

SLIDE 105

What’s next?

Split out election handling
- Given work so far, quite easy to factor out
- Should we then run as a separate daemon?
- Would this daemon do the recovery master validation?

SLIDE 111

What’s next?

Public IP management/takeover
- Improve API to IP allocation algorithm module?
- IP address reloading helper
- IP takeover run helper
- Move public IP state into a replicated database?
- Move TCP connection tracking (“tickles”) into a replicated database?

SLIDE 112

Legal Statement

This work represents the view of the authors and does not necessarily represent the view of IBM. IBM is a registered trademark of International Business Machines Corporation in the United States and/or other countries. Linux is a registered trademark of Linus Torvalds. Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both. Other company, product, and service names may be trademarks or service marks of others.

SLIDE 113

Questions?