Untangling the Strings: Scaling Puppet with inotify (Steven McDonald)

SLIDE 1

Untangling the Strings

Scaling Puppet with inotify

Steven McDonald, steven.mcdonald@anchor.net.au
Anchor
linux.conf.au 2015 Sysadmin miniconf

SLIDE 4

Background

  • Puppet is a centralised configuration management system.
  • Each Puppet-managed host (“node”) runs a client program (the “Puppet agent”).
  • The server (“Puppet master”) tells the Puppet agent what the node's configuration should be.

SLIDE 8

Background: Puppet rollouts

  • We use a single “production” environment with many (close to 1000) nodes.
  • We use global virtual resources for things like monitoring on unmanaged hosts.
  • We make very small changes (usually specific to one node) that we want to take effect immediately.
  • This is a very slow workflow with Puppet.
SLIDE 10

The goal

  • Have Puppet manifest changes apply immediately after rolling out to the Puppet master.
  • Historically, we have achieved this by restarting the Puppet master on every rollout.

SLIDE 12

The problem

  • The Puppet master takes a long time to parse manifests into types.
  • The time taken is negligible with up to a few dozen manifests, but quickly escalates from there.
  • Our tree has over 1300 manifests in our site directory (i.e., loaded on startup and not autoloaded). This takes just over a minute to parse.

SLIDE 14

What solutions does Puppet offer?

  • Puppet has very coarse internal caching; it is capable of expiring an entire environment at a time.
  • With all our nodes in one environment, this is as good (or as bad) as a Puppet master restart.

SLIDE 16

What solutions does Puppet offer?

  • Puppet can have distinct environments for different groups of nodes, each with their own (smaller) set of manifests.
  • While this is a good idea in theory, having all our nodes in the same environment is the best fit for our workflow.

SLIDE 20

What solutions could we implement?

  • Filesystem polling.
  • Extract changed file information from git.
  • Listen for changes to files using Linux's inotify subsystem.
  • All of these options require one piece of infrastructure we needed to implement ourselves: the ability to expire code on a per-file basis.

SLIDE 21

Internal relationships

SLIDE 23

Per-file code expiration

  • We implemented a general-purpose file expiration mechanism, to expire code in types, and to expire entire types in type collections.
  • Because of the generic nature of the expiration API, it can easily be adapted to any method of determining which files have changed.
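The idea of expiring types per file, rather than per environment, can be illustrated with a toy model. This is only a sketch: `TypeCollection`, `add`, `expire_file` and `known_types` are hypothetical names chosen for illustration and are not Puppet's internal API. The collection remembers which file each type came from, so a change to one file discards only that file's types.

```python
from collections import defaultdict


class TypeCollection:
    """Toy stand-in for a collection of parsed types, indexed by source file.

    Hypothetical names throughout; this is not Puppet's internal API.
    """

    def __init__(self):
        self._types = {}                  # type name -> parsed definition
        self._by_file = defaultdict(set)  # source file -> names it defined

    def add(self, name, definition, source_file):
        """Record a parsed type and remember which file defined it."""
        self._types[name] = definition
        self._by_file[source_file].add(name)

    def expire_file(self, source_file):
        """Drop only the types that came from one file; return their names."""
        expired = self._by_file.pop(source_file, set())
        for name in expired:
            self._types.pop(name, None)
        return expired

    def known_types(self):
        return set(self._types)
```

Because `expire_file` takes nothing but a path, the caller is free to learn about changed paths from polling, from git, or from inotify.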

SLIDE 26

Option #1: Filesystem polling

  • Most portable.
  • Too slow. Just about any other option is more efficient.
  • This could be implemented as a fallback mechanism that works anywhere, but we wanted to take advantage of the specific features of our environment.
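For comparison, a polling fallback might look roughly like the sketch below. The function names and the `.pp` suffix filter are assumptions made for illustration. It shows why polling is portable (only `os.walk` and `os.stat` are needed) and why it is slow: every poll has to stat every manifest in the tree, even when nothing has changed.

```python
import os


def snapshot(root, suffix=".pp"):
    """Map every manifest path under root to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(suffix):
                path = os.path.join(dirpath, filename)
                st = os.stat(path)
                state[path] = (st.st_mtime_ns, st.st_size)
    return state


def changed_files(before, after):
    """Paths added, removed, or modified between two snapshots."""
    return {
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    }
```

A caller would keep the previous snapshot, take a new one on each poll, and pass the difference to whatever expires code per file.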

SLIDE 30

Option #2: Asking git for changes

  • This is a very clean and efficient idea.
  • It ties us to a git-based deployment model.
  • It requires us to queue changes ourselves, in code that's unlikely to see widespread testing.
  • Very easy to introduce bugs that miss changes.
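A minimal sketch of the git approach, assuming the rollout records the previously deployed revision somewhere: the helper names and the `.pp` filter are illustrative, and rename handling is glossed over. It relies only on `git diff --name-only`, which prints one changed path per line.

```python
import subprocess


def parse_name_only(output):
    """Turn `git diff --name-only` output into a set of manifest paths."""
    return {line for line in output.splitlines() if line.endswith(".pp")}


def changed_manifests(repo_dir, old_rev, new_rev):
    """Ask git which manifests differ between two revisions."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "diff", "--name-only", old_rev, new_rev],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)
```

The fragile part is everything around this call: the caller has to track `old_rev` correctly across rollouts and queue the resulting paths until the master is ready for them, and a mistake there silently misses changes.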

SLIDE 34

Option #3: inotify

  • It ties us to Linux Puppet masters.
  • It does not tie us to any specific deployment method.
  • It does not require us to do any dirty work ourselves; the inotify code in the kernel is very well tested.
  • Least risk of introducing bugs.
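To make the inotify option concrete, here is a Linux-only sketch. The talk's implementation uses the rb-inotify Ruby bindings; this sketch instead calls the kernel interface directly from Python through `ctypes`, so the function names (`parse_events`, `open_watch`, `drain`) are illustrative assumptions. The mask values and the event layout come from `<sys/inotify.h>`. The kernel queues events on the file descriptor, so the program only has to read them when it is ready.

```python
import ctypes
import os
import struct

# Event mask bits from <sys/inotify.h>.
IN_CLOSE_WRITE = 0x00000008
IN_MOVED_FROM = 0x00000040
IN_MOVED_TO = 0x00000080
IN_DELETE = 0x00000200
WATCH_MASK = IN_CLOSE_WRITE | IN_MOVED_FROM | IN_MOVED_TO | IN_DELETE

# struct inotify_event { int wd; uint32_t mask, cookie, len; char name[]; }
_HEADER = struct.Struct("iIII")


def parse_events(buf):
    """Decode a buffer read from an inotify fd into (wd, mask, name) tuples."""
    events, offset = [], 0
    while offset + _HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = _HEADER.unpack_from(buf, offset)
        offset += _HEADER.size
        raw_name = buf[offset:offset + name_len].rstrip(b"\0")
        offset += name_len
        events.append((wd, mask, os.fsdecode(raw_name)))
    return events


def open_watch(directory):
    """Start watching one directory (Linux only); returns the inotify fd."""
    libc = ctypes.CDLL(None, use_errno=True)
    fd = libc.inotify_init1(os.O_NONBLOCK)  # IN_NONBLOCK equals O_NONBLOCK
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init1 failed")
    wd = libc.inotify_add_watch(fd, os.fsencode(directory), WATCH_MASK)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def drain(fd):
    """Names of files changed since the last call; never blocks."""
    try:
        buf = os.read(fd, 65536)
    except BlockingIOError:
        return set()  # nothing queued
    return {name for _wd, _mask, name in parse_events(buf)}
```

A watch covers a single directory, not its subdirectories, so a real watcher would add one watch per directory in the manifest tree.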
SLIDE 36

Triggering expiry with inotify

  • The autoloader requests that files simply be expired when they change; they will be re-autoloaded as necessary.
  • The initial importer (which parses the environment's manifests directory) requests that files be reparsed when they change, since these are always supposed to be loaded.
  • This all happens at the start of a catalog compilation.
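The two policies above can be sketched as a small dispatcher. This is a hypothetical structure for illustration, not the code from the talk: `ChangeDispatcher` and its method names are invented here, and the `expire` and `reparse` callables stand in for whatever the master actually does. Autoloaded files are only expired, files from the initial import are expired and reparsed, and both happen when a compilation starts.

```python
class ChangeDispatcher:
    """Maps watched files to what should happen when they change.

    Hypothetical structure for illustration; not the code from the talk.
    """

    def __init__(self, expire, reparse):
        self._expire = expire    # callable(path): forget the file's code
        self._reparse = reparse  # callable(path): parse the file again
        self._eager = set()      # files loaded by the initial import
        self._lazy = set()       # files loaded by the autoloader

    def watch_autoloaded(self, path):
        self._lazy.add(path)

    def watch_initial_import(self, path):
        self._eager.add(path)

    def before_compile(self, changed_paths):
        """Run at the start of each catalog compilation."""
        for path in sorted(changed_paths):
            if path in self._eager:
                self._expire(path)
                self._reparse(path)  # these files must always be loaded
            elif path in self._lazy:
                self._expire(path)   # the autoloader reloads on demand
```

Paths that nobody registered are ignored, which is one reason an untracked mechanism such as “import” is hard to support.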

SLIDE 38

Triggering expiry with inotify

  • The deprecated “import” function makes it extremely difficult to track which files need to be reparsed when they change.
  • We elected not to support “import”, and took the time to remove it from our manifests.

SLIDE 40

Results

  • Based on initial testing on a lightly-loaded staging environment, we expected around 70 seconds shaved off the average agent run immediately after rollout.
  • After deploying to production, we had anecdotal speed improvements of up to 5 minutes on nodes with complex catalogs.

SLIDE 44

Limitations

  • Our initial implementation does not correctly support the future parser.
  • Reopening a class in a different file is not supported.
  • The use of “import” is not supported.
  • Native Ruby code (used for custom functions) does not get reloaded.

SLIDE 45

Get the source

  • https://github.com/AnchorCat/puppet/tree/anchor/3.7.3/in
    – Depends on a yet-unreleased version of rb-inotify, but will work with the current master branch.
  • OpenDocument sources for this presentation are available under the WTFPL:
    – http://steven.beta.anchortrove.com/lca2015/untangling_the_s
  • Ruby inotify bindings:
    – https://github.com/nex3/rb-inotify