Evolving the architecture of guardian.co.uk Mat Wall Lead - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Evolving the architecture of guardian.co.uk

Mat Wall Lead software architect

slide-2
SLIDE 2

History

slide-3
SLIDE 3

1821 - Manchester Guardian released

slide-4
SLIDE 4

1936 - Scott Trust formed: no proprietor
Unique among British newspapers to this day

slide-5
SLIDE 5

1959 - The Guardian goes national

slide-6
SLIDE 6

1959 - The Guardian goes national
2004 - “Berliner” redesign

slide-7
SLIDE 7

Digital History

slide-8
SLIDE 8

1995 - Web site launch
Simple portal
Experimental project

slide-9
SLIDE 9

2006 - Europe’s largest online newspaper site
Reach of web far greater than national paper
18M unique users, many international

slide-10
SLIDE 10

“The international audience for guardian.co.uk has brought a new goal within reach: for The Guardian to become the world’s leading liberal voice”
Source: GMG Scott Trust website

slide-11
SLIDE 11

“The Guardian to become the world’s leading liberal voice”
Outgrown our web platform
New platform required

slide-12
SLIDE 12

18 month build time
30M unique users
250M page impressions per month

slide-13
SLIDE 13

Beginning the R2 project

Intense 18 month agile build
4 development teams to manage
>2M pages to migrate
Lots of new functionality to develop
What are we getting into?

slide-14
SLIDE 14

R2 project approach

Develop new system in parallel
Zero downtime
Migrate section by section to new system
Architect system as we go along

slide-15
SLIDE 15

Diagram: user requests pass through the Apache layer, which routes each site section (travel, environment, science, technology, business, money, sport, news, community, etc.) to R1 or R2 during migration.

Custom Apache module allows per-URL backend selection
Provides manageable migration
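
The per-URL backend selection described on this slide can be sketched roughly as follows. This is not the Guardian's Apache module, which the slides do not show; it is a minimal Java illustration of the routing idea, and the class name `BackendSelector`, its methods, and the longest-prefix rule are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: pick a backend for each request path by longest matching prefix. */
final class BackendSelector {
    enum Backend { R1, R2 }

    // Path prefix -> backend. Paths with no matching prefix stay on the legacy R1 platform.
    private final Map<String, Backend> routes = new HashMap<>();

    /** Routes every URL under the given prefix to the given backend. */
    void route(String pathPrefix, Backend backend) {
        routes.put(pathPrefix, backend);
    }

    /** Returns the backend for a request path, preferring the longest matching prefix. */
    Backend select(String path) {
        String best = null;
        for (String prefix : routes.keySet()) {
            // Match whole path segments only: "/travel" must not match "/travelling".
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && (best == null || prefix.length() > best.length())) {
                best = prefix;
            }
        }
        return best == null ? Backend.R1 : routes.get(best);
    }
}
```

Migrating a section then amounts to adding one route, for example `route("/travel", Backend.R2)`, and rolling back amounts to removing it, which is what makes a section-by-section migration manageable.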

slide-23
SLIDE 23

R2 architecture

Start simple
Impossible to predict final architecture
Take an agile “Just in time” approach
Learn from each release

slide-24
SLIDE 24

Travel site build

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

Why Travel?
Only 14K articles to migrate
Relatively low traffic
Manageable performance
Test our information architecture

slide-25
SLIDE 25

Application architecture

Simple stateless app
Only needs to scale to 14K articles
Stack: Java 6 build, Spring, Controller (Spring MVC), Velocity 1.5, Domain model, Repositories, Hibernate, EHCache, Caucho Resin

slide-26
SLIDE 26

System architecture

Diagram components: Oracle, Search, R2 frontend, R2 feeds, R2 CMS, Apache (x3)

slide-27
SLIDE 27

Co-location

Diagram: servers co-located across two sites, LONDON and MANCHESTER. Components shown: Oracle, feeds, CMS, R2 frontend (x4), Apache (x4), standby (x3).

slide-28
SLIDE 28

Co-location

Diagram: the two-site LONDON and MANCHESTER deployment with Search added. Components shown: Oracle, Search (x4), feeds, CMS, R2 frontend (x4), Apache (x4), standby (x3).

Unreliable database
But: only 14K articles. Cache fits in RAM!

slide-29
SLIDE 29

Content: Article, Video, Audio, Gallery, Cartoon
Tags: Keyword, Contributor, Series, Publication, Tone

slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32
slide-33
SLIDE 33

“Simple sites”

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

What are “simple sites”?
Sites with similar functionality to travel site
Content migration: 100K+ articles
Front page of site

slide-34
SLIDE 34

“Simple sites”

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

Performance tests indicate we should scale out application layer
2 x app servers

slide-35
SLIDE 35

“Simple sites”

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

Cache will no longer fit in RAM: site stability at risk
We are in a WAN! ££££ to fix
Site front page included in this release
I want to sleep at night

slide-36
SLIDE 36

Emergency mode

Diagram components: Oracle, R2 frontend, R2 feeds, R2 CMS, Apache (x4), NFS

Gracefully degrade in the event of an outage
Handle clean releases
Fall back to flat files for a short time
Graceful (and cheap)

slide-37
SLIDE 37

Emergency mode

Diagram components: Oracle, R2 frontend, R2 feeds, R2 CMS, Apache (x4), NFS

Flow labels: Publish content, Content available on site, Poll queue, Store on NFS, Get HTML

slide-38
SLIDE 38

Emergency mode

Diagram components: Oracle, R2 frontend, R2 feeds, R2 CMS, Apache (x4), NFS

Flow labels: Publish content, Content available on site, Poll queue, Store on NFS, Get HTML

Store HTML on NFS disc
Schedule refresh in queue:
Modified pages pressed in <2 minutes
Unedited pages should be no more than 2 weeks old
When database down, serve from NFS
Graceful degradation in user experience
Fixed issue “just in time”, i.e. before seen in production
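
The emergency-mode idea (press rendered HTML to flat files, serve those files when the database is unavailable) can be sketched as below. This is a simplified, hypothetical illustration rather than the R2 implementation: the class `PressedPageStore`, its method names, and the use of a local temporary directory in place of the NFS mount are all assumptions, and the real system serves the pressed files through Apache rather than from application code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

/** Hypothetical sketch of "emergency mode": press HTML to disc, fall back to it when rendering fails. */
final class PressedPageStore {
    private final Path root;                       // stands in for the NFS mount
    private final Function<String, String> render; // stands in for the database-backed page renderer

    PressedPageStore(Path root, Function<String, String> render) {
        this.root = root;
        this.render = render;
    }

    /** Convenience factory for the example: stores pressed pages in a fresh temporary directory. */
    static PressedPageStore inTempDirectory(Function<String, String> render) {
        try {
            return new PressedPageStore(Files.createTempDirectory("pressed-pages"), render);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Renders a page and stores the HTML as a flat file (the step a queue worker would run). */
    void press(String pageId) {
        try {
            Files.write(fileFor(pageId), render.apply(pageId).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Serves live HTML when possible and the last pressed copy when rendering fails. */
    String serve(String pageId) {
        try {
            return render.apply(pageId);
        } catch (RuntimeException renderFailure) {
            return readPressed(pageId);
        }
    }

    private String readPressed(String pageId) {
        try {
            return new String(Files.readAllBytes(fileFor(pageId)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private Path fileFor(String pageId) {
        return root.resolve(pageId + ".html");
    }
}
```

The refresh targets on the slide (modified pages within 2 minutes, unedited pages at most 2 weeks old) would be enforced by whatever schedules calls to `press`; that scheduler is left out of the sketch.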

slide-39
SLIDE 39

“Complex sites”

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

What are “Complex Sites”?
Sites with third party interactions
Complex feeds
More traffic
200K+ articles to migrate

slide-40
SLIDE 40

“Complex sites”

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

200K+ articles
Performance tests indicate platform will be able to cope
Some Oracle queries need optimising
No scale increase required on app server

slide-41
SLIDE 41

“Complex sites”

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

slide-42
SLIDE 42

External information

slide-43
SLIDE 43

External information

Diagram components: Database, App server, Web server, Proxy, net, External system

Stop using database as integration point
Simple change: REST integration with third party, server side
Use proxy server to ensure performance / stability
Third party controls caching
Domain model
Used on our Sport site for football / cricket scores
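
One way to read "third party controls caching" is that the proxy honours the HTTP `Cache-Control: max-age` value sent by the external system. The slides do not say how this was done, so the following is only a sketch under that assumption; `ThirdPartyCache`, `Response`, the injected `fetch` function, and the injected clock are hypothetical names, and a production setup would normally use an off-the-shelf caching proxy instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: cache third-party responses for as long as their max-age allows. */
final class ThirdPartyCache {
    /** A fetched response: the body plus the Cache-Control header value sent by the third party. */
    static final class Response {
        final String body;
        final String cacheControl;

        Response(String body, String cacheControl) {
            this.body = body;
            this.cacheControl = cacheControl;
        }
    }

    private static final class Entry {
        final String body;
        final long expiresAtMillis;

        Entry(String body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private static final Pattern MAX_AGE = Pattern.compile("max-age=(\\d{1,9})");

    private final Function<String, Response> fetch; // stands in for the REST call to the third party
    private final LongSupplier clockMillis;         // injected so expiry can be tested
    private final Map<String, Entry> entries = new HashMap<>();

    ThirdPartyCache(Function<String, Response> fetch, LongSupplier clockMillis) {
        this.fetch = fetch;
        this.clockMillis = clockMillis;
    }

    /** Returns the cached body while it is fresh, otherwise refetches and recaches it. */
    String get(String url) {
        long now = clockMillis.getAsLong();
        Entry cached = entries.get(url);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.body;
        }
        Response fresh = fetch.apply(url);
        entries.put(url, new Entry(fresh.body, now + maxAgeSeconds(fresh.cacheControl) * 1000L));
        return fresh.body;
    }

    /** Extracts max-age in seconds; a missing header or directive means "do not cache" (0). */
    static long maxAgeSeconds(String cacheControl) {
        if (cacheControl == null) {
            return 0;
        }
        Matcher m = MAX_AGE.matcher(cacheControl);
        return m.find() ? Long.parseLong(m.group(1)) : 0;
    }
}
```

With this shape the provider of live football or cricket scores can shorten the lifetime during a match and lengthen it afterwards without any change on the consuming side.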

slide-44
SLIDE 44

External information

Diagram components: Database, App server, Web server, Proxy, net, External system

slide-45
SLIDE 45
slide-46
SLIDE 46
slide-47
SLIDE 47
slide-48
SLIDE 48

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

News site launch
The big one!
Will end up with nearly 1M content pages!
Much traffic

slide-49
SLIDE 49

Diagram: R1 to R2 migration by site section (travel, environment, science, technology, business, money, sport, news, community, etc.), with user requests routed through the Apache layer.

slide-50
SLIDE 50

Scalability predictions
Platform team formed. They predict problems with:
related content
tag pages
Both will max out our database
How radical will we have to be?

slide-51
SLIDE 51

Related content
40% of Oracle load

slide-52
SLIDE 52

Related content
Difficult to decache

slide-53
SLIDE 53

Related content
High editorial value component

slide-54
SLIDE 54

Related content
Get it off the database!!

slide-55
SLIDE 55

Solution
Use Endeca search engine
Index page ID > [tag IDs]
Group tag IDs into buckets
Bucket size determined by content volume for tag

slide-56
SLIDE 56

Solution

Page ID   Tag IDs, spread across buckets B1 to B8
123       34,575  632  45  645
124       15  551  389
125       45  4,676  34

B1: tags with most content. B8: tags with least content.

slide-57
SLIDE 57

Solution
When user requests page:
Free text search for tag IDs
Search engine relevance ranks results
Tags with least content get higher relevance
Returns page IDs
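
The ranking idea on these slides (rarer tags count for more when finding related pages) can be illustrated without a search engine. The sketch below does not use Endeca or reproduce its relevance model; `RelatedContentIndex`, the rule of one bucket per order of magnitude, and the use of the bucket number as a weight are all assumptions chosen to keep the example small.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: rank related pages by shared tags, weighting rare tags more heavily. */
final class RelatedContentIndex {
    static final int BUCKETS = 8; // B1 = tags with most content, B8 = tags with least

    private final Map<Integer, Set<Integer>> tagsByPage = new HashMap<>();
    private final Map<Integer, Integer> pagesPerTag = new HashMap<>();

    /** Indexes page ID -> tag IDs. Assumes each page is indexed exactly once. */
    void index(int pageId, Set<Integer> tagIds) {
        tagsByPage.put(pageId, tagIds);
        for (int tagId : tagIds) {
            pagesPerTag.merge(tagId, 1, Integer::sum);
        }
    }

    /** Assumed bucketing rule: one bucket per order of magnitude of content volume. */
    static int bucketFor(int contentCount) {
        int magnitude = String.valueOf(Math.max(contentCount, 1)).length() - 1;
        return Math.max(1, BUCKETS - magnitude);
    }

    /** Returns up to `limit` page IDs sharing tags with the given page, best match first. */
    List<Integer> related(int pageId, int limit) {
        Set<Integer> queryTags = tagsByPage.getOrDefault(pageId, Collections.<Integer>emptySet());
        Map<Integer, Integer> scores = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> candidate : tagsByPage.entrySet()) {
            if (candidate.getKey() == pageId) {
                continue; // a page is not related to itself
            }
            int score = 0;
            for (int tagId : candidate.getValue()) {
                if (queryTags.contains(tagId)) {
                    // Higher bucket number = rarer tag = larger contribution.
                    score += bucketFor(pagesPerTag.get(tagId));
                }
            }
            if (score > 0) {
                scores.put(candidate.getKey(), score);
            }
        }
        List<Integer> ranked = new ArrayList<>(scores.keySet());
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b); // stable tie-break on page ID
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }
}
```

The point of the weighting is that a page sharing one narrow tag with the current article is usually a better related link than one sharing only a very broad tag.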

slide-58
SLIDE 58

Problem 2: Tag queries

slide-59
SLIDE 59

Tag queries

Platform team predict problems
Queries becoming more expensive as content volume increases
Not scalable

slide-60
SLIDE 60

Tag queries
Team have 2 ideas:
1: Cache page fragments on disc. Use Apache SSI.
2: SQL queries can be sufficiently optimised
I am tempted to just pick option 1.

slide-61
SLIDE 61

but....

slide-62
SLIDE 62

Developers are better than architects

slide-63
SLIDE 63

Developers have greater understanding of the real detail of the system innards
Don’t dictate to developers
Let them innovate
Allow them to try both options
(Privately I bet on option 1, cache page fragments on disc)

slide-64
SLIDE 64

Actual solution
Independent Oracle consultant (with beard) optimised problem queries in 1 day
Performance tests say we’re good to go for News
Not what I expected!

slide-65
SLIDE 65

Beyond News
Platform team predicting scalability horizon ahead
Caches overflowing
Database load increasing
Can no longer add app tier servers to scale

slide-66
SLIDE 66

Beyond News

slide-67
SLIDE 67

Beyond News

slide-68
SLIDE 68

Reduce database load by 50%
Keep it there

slide-69
SLIDE 69

JBoss Cache & memcached
6GB distributed cache in development now
Can scale app tier without killing database
Akamai reverse proxy to reduce frontend load
Required much later than I thought!
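
The reason a cache shared across the app tier lets that tier grow without adding database load is the cache-aside pattern: look in the cache first and read the database only on a miss. The sketch below shows that pattern with a bounded in-process LRU map standing in for memcached or JBoss Cache. It is not the Guardian's code; `CacheAsideRepository` and its members are hypothetical, and a real distributed cache would sit in a separate process shared by all app servers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of cache-aside reads with a bounded LRU cache in front of the database. */
final class CacheAsideRepository {
    private final Map<String, String> cache;                 // stands in for memcached / JBoss Cache
    private final Function<String, String> loadFromDatabase; // stands in for the Oracle query
    private int databaseReads;

    CacheAsideRepository(final int maxEntries, Function<String, String> loadFromDatabase) {
        this.loadFromDatabase = loadFromDatabase;
        // Access-ordered LinkedHashMap that evicts the least recently used entry when full.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value when present; otherwise loads it once and caches it. */
    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String loaded = loadFromDatabase.apply(key);
        cache.put(key, loaded);
        return loaded;
    }

    /** Number of reads that reached the database, i.e. cache misses. */
    int databaseReads() {
        return databaseReads;
    }
}
```

Because every app server would consult the same shared cache, adding servers raises request capacity while database reads grow only with the miss rate, which is the property the slide relies on.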

slide-70
SLIDE 70

Beyond News

slide-71
SLIDE 71

Agile architecture is possible
Evolutions often smaller than expected
Reserve technical time in project budget
Try things out. Make small mistakes.
Let developers innovate

slide-72
SLIDE 72

One more thing...

slide-73
SLIDE 73
slide-74
SLIDE 74

Content API
Backed by our search engine
Provides full access to all Guardian content
Supports XML, JSON, Atom

slide-75
SLIDE 75

Free for apps that don’t publish full content
Advertising network for those that do
Currently in limited private beta

slide-76
SLIDE 76

slide-77
SLIDE 77
slide-78
SLIDE 78
slide-79
SLIDE 79
slide-80
SLIDE 80

slide-81
SLIDE 81

http://www.guardian.co.uk/open-platform
http://blogs.guardian.co.uk/inside
Thank you