Effective Localization Crowdsourcing - Ratnadeep Debnath


slide-1
SLIDE 1

Effective Localization Crowdsourcing

Ratnadeep Debnath rtnpro@indifex.com

slide-2
SLIDE 2

@rtnpro

Who?

slide-3
SLIDE 3

Languages Languages Languages

......... .....

slide-4
SLIDE 4

“I personally believe we developed language because of our deep inner need to complain.” —Jane Wagner

slide-5
SLIDE 5

95%

People on Earth with a native language other than English

slide-6
SLIDE 6

50%

Internet users speaking no English at all

slide-7
SLIDE 7

120

Published languages on WordPress.com
slide-8
SLIDE 8

150

Wikipedia languages with > 1500 articles

slide-9
SLIDE 9

4x

More likely to buy something from websites in one's native language

slide-10
SLIDE 10
slide-11
SLIDE 11

$25

Additional revenue per $1 spent on localization

slide-12
SLIDE 12

Develop for an international audience

slide-13
SLIDE 13

Usability Accessibility Developers

slide-14
SLIDE 14

13%

American websites available in multiple languages

slide-15
SLIDE 15

What's the solution?

slide-16
SLIDE 16

Internationalization, i18n

&

Localization, L10n

slide-17
SLIDE 17

I18n & L10n

slide-18
SLIDE 18

Scope

  • Language
  • Culture
  • Writing Conventions
  • Subject to regulatory compliance
slide-19
SLIDE 19

Why localize?

  • Reach a larger number of users
  • Users are more comfortable in their native language

  • Language is a key factor to develop relations
  • Localization is essential for global operations
  • Achieve higher company revenues
slide-20
SLIDE 20

How does L10n work

(from the gettext manual)

Original C Sources ───> Preparation ───> Marked C Sources ───╮
                                                             │
              ╭─────────<─── GNU gettext Library             │
╭─── make <───┤                                              │
│             ╰─────────<────────────────────┬───────────────╯
│                                            │
│   ╭─────<─── PACKAGE.pot <─── xgettext <───╯   ╭───<─── PO Compendium
│   │                                            │              ↑
│   │                                            ╰───╮          │
│   ╰───╮                                            ├───> PO editor ───╮
│       ├────> msgmerge ──────> LANG.po ────>────────╯                  │
│   ╭───╯                                                               │
│   │                                                                   │
│   ╰─────────────<───────────────╮                                     │
│                                 ├─── New LANG.po <────────────────────╯
│   ╭─── LANG.gmo <─── msgfmt <───╯
│   │
│   ╰───> install ───> /.../LANG/PACKAGE.mo ───╮
│                                              ├───> "Hello world!"
╰───────> install ───> /.../bin/PROGRAM ───────╯
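The gettext manual's diagram is drawn for C sources, but the runtime end of the pipeline (installed catalog in, "Hello world!" out) looks much the same in other languages. A minimal sketch in Python, using the standard library `gettext` module; the domain name "PACKAGE", the "locale" directory, and the target language "el" are placeholders chosen here for illustration.

```python
import gettext

# Look for locale/el/LC_MESSAGES/PACKAGE.mo (placeholder names).
# With fallback=True, a missing catalog yields a NullTranslations
# object instead of raising an error.
translations = gettext.translation(
    "PACKAGE", localedir="locale", languages=["el"], fallback=True
)

# The conventional alias that marks a string for translation.
_ = translations.gettext

# Returns the translated text if a catalog is installed,
# otherwise the source string unchanged.
greeting = _("Hello world!")
print(greeting)
```

Wrapping strings in `_()` is also what lets an extraction tool such as xgettext find them when it builds PACKAGE.pot.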

slide-21
SLIDE 21

Localization is a pain

slide-22
SLIDE 22
slide-23
SLIDE 23

A sample L10n use case

slide-24
SLIDE 24

Workflow

  • Mark, export strings (PO format)
  • Release string freeze
  • Translator: VCS checkout
  • Translate w/ specialized tools
  • Get 'em files back: SSH, Email, Tickets ☠ ☢ ☹

  • For every friggin release
slide-25
SLIDE 25

Challenges

  • Too darn hard
  • Community isolation
  • Quality
  • Scalability
  • Always more languages, more users
slide-26
SLIDE 26
slide-27
SLIDE 27
slide-28
SLIDE 28

Why Transifex?

  • Abstracts various VCS systems
  • Easy to use
  • Better coordination
  • Upstream friendly
  • Prevents duplication of work
  • Automate L10n workflow
  • Open source
slide-29
SLIDE 29

History

  • 2007: Initial development of Transifex sponsored by Google under its "Google Summer of Code" program
  • 2008: Fedora adopts Transifex as its official Localization platform
  • 2010: Indifex is chosen by Intel and Nokia to manage the localization of MeeGo
  • 2011: Transifex is used by 2,000 open source projects and 10,000 users. More than 5 million words have been translated, reaching an audience of more than 30 million people!
  • Late 2011: Indifex offers support for the localization of proprietary projects too. Release of an Enterprise-level product, both in self-hosted and managed solutions.

slide-30
SLIDE 30

10K foot view of Transifex's features

www.transifex.net/tour/features/overview/

slide-31
SLIDE 31
slide-32
SLIDE 32

Current selected features of Transifex

  • L10n workflow automation
  • Workflow control / Project management
  • Team management / Communication
  • Effective crowdsourcing
  • Rich User Interface, web based application
  • Translator tools: TM, suggestions, etc
  • Quality control / assurance
  • Scalable, accessible, open-source
  • SaaS, freemium plans
  • Social features
slide-33
SLIDE 33

Upcoming selected features of Transifex

  • Translator marketplace
  • Data-mining, automatic rating of translators
  • Integrate L10n workflow with any type of content
slide-34
SLIDE 34

Transifex Versions

  • Transifex.net
  • SaaS, plug-n-play, batteries-included
  • 1.8K projects, 10.7K users, 50M words
  • Transifex Enterprise Edition
  • Robust, high-performance, intranet
  • Enterprise modules (TM, mgmt, QA)
  • Transifex Community Edition (open-source)
slide-35
SLIDE 35

Transifex Versions

slide-36
SLIDE 36

Overview of Transifex Team

  • 7-strong global team
  • Selected clients: Intel, Nokia, Mozilla, Red Hat
  • Selected open-source projects/partners: Fedora, MeeGo, Firefox, Django, Creative Commons, Joomla

  • 2 years of specialized L10n services
  • Profitable, 100% yearly growth
  • Long open source involvement
slide-37
SLIDE 37

Overview of Transifex Team

slide-38
SLIDE 38

How does L10n work

with Transifex

slide-39
SLIDE 39

Under the Hood

slide-40
SLIDE 40

Django

slide-41
SLIDE 41

Our technologies

  • Lotte – Rich user interface
  • Many file formats / types of content
  • Teams – Permissions management
  • Translation history/memory
  • Quality control & assurance
  • Project/Content management
  • 3rd party web services
  • Social authentication & features
  • “Middle level” Transifex apps
  • Community features
  • Time – Release management
  • Data-mining/translator auto-rating
  • Python, Django (MVC)
  • Advanced web design (Django templates, JavaScript, AJAX)
  • Postgres (RDBMS)
  • MongoDB, memcached (NonRel)
  • Full-text indexing / search
  • NginX – Scalable deployment
  • Message queues/server-side asynchronous workers
  • Unit testing
  • Web API – CLI client
  • Django addons
  • Logging – time based statistics

slide-42
SLIDE 42

Project / Content management

Meet any project's needs

  • Project
  • Resources
  • Categories
  • Releases
  • Hubs
  • Outsourced access
slide-43
SLIDE 43

3rd Party Web services

  • GoogleChart
  • Recaptcha
  • Gravatar
  • Getsatisfaction
  • Yahoo pipes
slide-44
SLIDE 44

Social authentication & features

(includes upcoming features too)

  • Google, Facebook, Twitter, etc
  • Tweet my week's translation work
  • Tweet my project's progress
  • Show tweets that have the '#tx' hashtag in my Transifex public profile

  • Show me translation projects my friends like
slide-45
SLIDE 45

“Middle level” Transifex apps

slide-46
SLIDE 46

Python – Django (MVC)

  • MVC
  • Reusability, pluggability
  • Rapid development
  • DRY
  • Admin interface
  • ORM
  • Regular expression URL dispatcher
  • Template system
  • Forms
  • Caching framework
  • Extension to Python's testing framework
  • Ready-to-go authentication
  • Cross site request forgery protection
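To make the "regular expression URL dispatcher" item concrete, here is a toy dispatcher in plain Python. It is not Django's API and not Transifex's URL scheme; the route patterns and view names are invented for illustration. It only shows the idea: try each pattern in order and pass the named groups to the matching view.

```python
import re

# Hypothetical routes: (compiled pattern, view name).
ROUTES = [
    (re.compile(r"^/projects/$"), "project_list"),
    (re.compile(r"^/projects/(?P<slug>[\w-]+)/$"), "project_detail"),
    (
        re.compile(r"^/projects/(?P<slug>[\w-]+)/resource/(?P<resource>[\w-]+)/$"),
        "resource_detail",
    ),
]


def resolve(path):
    """Return (view_name, captured_arguments) for the first matching route."""
    for pattern, view_name in ROUTES:
        match = pattern.match(path)
        if match:
            return view_name, match.groupdict()
    return None  # a real framework would answer with a 404 here


print(resolve("/projects/myproject/"))
print(resolve("/projects/myproject/resource/myres/"))
```

Route order matters in this sketch: the first pattern that matches wins.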

slide-47
SLIDE 47

Advanced web-design

  • Django templating system
  • Javascript
  • AJAX
  • HTML5
  • CSS3
slide-48
SLIDE 48

PostgreSQL (Relational database)

  • Procedural languages
  • Indexes
  • Triggers
  • Multi-version Concurrency Control
  • Rules
  • Data types
  • User-defined objects
  • Inheritance
  • Replication
  • Add-ons
slide-49
SLIDE 49

Non-relational Databases

  • MongoDB
  • memcached
  • Redis
slide-50
SLIDE 50

Full-text indexing / search

  • Levenshtein algorithm
  • Haystack
  • Solr
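The Levenshtein algorithm counts the single-character insertions, deletions and substitutions needed to turn one string into another. The slide does not say how Transifex applies it, so the following is only a textbook dynamic-programming sketch in Python, not the project's own code.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution (or match)
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

A small distance relative to string length suggests two source strings are near-duplicates, which is the kind of signal a fuzzy search can rank on.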
slide-51
SLIDE 51

Scalable deployment

  • NginX
  • Load balancer
  • Webserver
  • Database server
  • Static page server
slide-52
SLIDE 52

Scalable deployment


slide-53
SLIDE 53

Message queues

  • Asynchronous workers
  • Don't interrupt the user experience
  • Run in the background
  • Can run distributed
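The idea behind the bullets above can be sketched with Python's standard library alone. This is an in-process stand-in for a real message broker, not a description of Transifex's deployment; the job payloads and the word-counting handler are invented for illustration.

```python
import queue
import threading


def run_workers(jobs, handler, num_workers=2):
    """Process jobs on background threads and collect the results."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = tasks.get()
            if job is None:  # sentinel: no more work for this worker
                tasks.task_done()
                break
            outcome = handler(job)
            with lock:
                results.append(outcome)
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for job in jobs:
        tasks.put(job)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for thread in threads:
        thread.join()
    return results


# Hypothetical background job: count the words in uploaded strings.
word_counts = run_workers(
    ["Hello world!", "Save file", "Quit"], lambda text: len(text.split())
)
print(sorted(word_counts))
```

Results arrive in completion order, not submission order, which is why the example sorts them before printing.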

slide-54
SLIDE 54

Unit-testing

  • Test-suite that simulates all use-cases
  • Ensure new changes don't “break” old functionality

  • Continuous testing
  • Browser test-suite: Selenium
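As a small illustration of a regression test, here is a sketch using Python's `unittest`, the framework that Django's test tools extend. The `completion_percent` helper is hypothetical and only stands in for real project code.

```python
import unittest


def completion_percent(translated, total):
    """Share of translated entities as a whole percentage (hypothetical helper)."""
    if total == 0:
        return 0
    return 100 * translated // total


class CompletionPercentTests(unittest.TestCase):
    def test_fully_translated(self):
        self.assertEqual(completion_percent(3, 3), 100)

    def test_partially_translated(self):
        self.assertEqual(completion_percent(1, 3), 33)

    def test_empty_resource(self):
        # Guards against a division-by-zero regression.
        self.assertEqual(completion_percent(0, 0), 0)


# Run the suite programmatically so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CompletionPercentTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Running a suite like this on every change is what turns "don't break old functionality" into something checkable.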
slide-55
SLIDE 55

Django-addons

  • Open-source project, contributed by us to the community
  • Take Django's reusable applications to the next level
  • Activate/deactivate addons with a simple command
  • Easily “assemble” enterprise Transifex edition

slide-56
SLIDE 56

A ride through Transifex

slide-57
SLIDE 57

Get started

slide-58
SLIDE 58

Plans

slide-59
SLIDE 59

Signup

slide-60
SLIDE 60

Signin

slide-61
SLIDE 61

Dashboard

slide-62
SLIDE 62

Add new project - Basic

slide-63
SLIDE 63

Add new project - Advanced

slide-64
SLIDE 64

Project details

slide-65
SLIDE 65

Project details – Upload resource

slide-66
SLIDE 66

Supported File Formats

  • Gettext (.po, .pot)
  • Qt Linguist (.ts)
  • Java properties (.properties)
  • Android resources (.xml)
  • PHP array/define (.php)
  • Joomla lang packs (.ini)
  • XHTML (.xhtml)
  • XLIFF
  • Ruby .yml
  • Apple .strings
  • Any strings (over API)
  • Microsoft (.resx, .aspx)
  • Mozilla .dtd
  • Most popular subtitle formats (.srt, .sbv, .sub)

http://help.transifex.net/user-guide/formats.html

In pipeline:

  • MS .aspx
  • Libreoffice
slide-67
SLIDE 67

Translation details

slide-68
SLIDE 68

Projects – Access Control

slide-69
SLIDE 69

Teams – Permissions Management

slide-70
SLIDE 70

Resource details

slide-71
SLIDE 71

A standard .POT file
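The slide itself is a screenshot, so the template below is not the one shown there. It is a hand-written, hypothetical POT fragment, followed by a deliberately minimal Python reader that pulls out the source strings. The reader handles single-line `msgid` entries only and ignores escapes, plurals and multi-line strings.

```python
SAMPLE_POT = '''\
# Hypothetical template, written for illustration only.
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#: src/main.c:12
msgid "Hello world!"
msgstr ""

#: src/main.c:20
msgid "Save file"
msgstr ""
'''


def source_strings(pot_text):
    """Return the non-empty, single-line msgid values found in POT text."""
    prefix = 'msgid "'
    strings = []
    for line in pot_text.splitlines():
        if line.startswith(prefix):
            value = line[len(prefix):-1]  # drop the prefix and closing quote
            if value:  # the empty msgid belongs to the header entry
                strings.append(value)
    return strings


print(source_strings(SAMPLE_POT))
```

In a template every `msgstr` is empty; a translator's LANG.po is the same file with those slots filled in.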

slide-72
SLIDE 72

Lotte

slide-73
SLIDE 73

Translation History / Translation Memory

  • Revert to older version of translation
  • Find translations from similar source strings
  • Share TM through different projects
  • Import TM instances from other programs
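"Find translations from similar source strings" can be approximated with Python's `difflib`. This is a sketch of the idea rather than Transifex's matching logic; the memory contents and the 0.6 similarity cutoff are assumptions made for the example.

```python
import difflib

# Hypothetical translation memory: English source -> Greek translation.
TRANSLATION_MEMORY = {
    "Save file": "Αποθήκευση αρχείου",
    "Open file": "Άνοιγμα αρχείου",
    "Quit": "Έξοδος",
}


def suggest(source, memory, cutoff=0.6):
    """Return (closest known source, its translation), or None if nothing is close."""
    matches = difflib.get_close_matches(source, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None
    best = matches[0]
    return best, memory[best]


print(suggest("Save files", TRANSLATION_MEMORY))
print(suggest("Print", TRANSLATION_MEMORY))
```

A suggestion is a starting point for the translator, not a finished translation, since the matched source string may differ in meaning.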
slide-74
SLIDE 74

Translation History / Translation Memory

slide-75
SLIDE 75

Quality control & assurance

  • Translation memory
  • Translation history (undo function)
  • Comments
  • Suggestions
  • Glossary
  • Machine translation
  • Review translations
slide-76
SLIDE 76

Time – Project release management

slide-77
SLIDE 77

Web API

http://help.transifex.net/features/api/index.html

$ curl -u foo:bar http://www.transifex.net/api/2\
/project/myproject/resource/myres/stats/el/

"completed": "100%",
"untranslated_words": 0,
"last_commiter": "tylerdurden",
"last_update": "2011-06-07 18:34:40",
"translated_entities": 3,
"translated_words": 17,
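A script can consume the same statistics. The sketch below builds the URL from the curl example and reads a response in Python, without touching the network. The sample payload copies the fields shown on the slide; wrapping them in a JSON object is an assumption, since the slide shows only the field list, and both helper functions are hypothetical.

```python
import json

API_ROOT = "http://www.transifex.net/api/2"


def stats_url(project, resource, lang):
    """Build the statistics URL used in the curl example."""
    return f"{API_ROOT}/project/{project}/resource/{resource}/stats/{lang}/"


# Fields copied from the slide; the enclosing braces are assumed.
SAMPLE_RESPONSE = """{
  "completed": "100%",
  "untranslated_words": 0,
  "last_commiter": "tylerdurden",
  "last_update": "2011-06-07 18:34:40",
  "translated_entities": 3,
  "translated_words": 17
}"""


def is_complete(stats):
    """True when the resource is fully translated for this language."""
    return stats["completed"] == "100%" and stats["untranslated_words"] == 0


stats = json.loads(SAMPLE_RESPONSE)
print(stats_url("myproject", "myres", "el"))
print(is_complete(stats))
```

A release script could use a check like `is_complete` to decide whether a language ships in the next build.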

slide-78
SLIDE 78

Command-line Client

# ls
source_file_en.po
# tx init
# tx set -r p_name.r_name --source -l en source_file_en.po
# tx push -s
<!-- days pass -->
# tx pull -l el
# find .
.
./source_file_en.po
./.tx
./.tx/config
./.tx/p_name.r_name
./.tx/p_name.r_name/el_translation

slide-79
SLIDE 79

Community features

  • Recognize top contributors
  • Promote open-source projects
  • Get project managers in contact with translators

  • Show recent activity
slide-80
SLIDE 80

Community features – Projects

slide-81
SLIDE 81

Community features – People

slide-82
SLIDE 82

Data-mining, translator auto-rating

  • Use collected translation data and statistics in order to:
  • Promote efficient translators
  • Match translators with projects that will interest them

slide-83
SLIDE 83
slide-84
SLIDE 84

Contribute

  • Use Transifex
  • Feedback
  • Localize
  • Feature Requests
  • Bug Reports
  • Develop

Useful tools

  • Trac, IRC, Mailing list
slide-85
SLIDE 85

Still wondering how to start?

  • Let's run Transifex
  • Use it
  • Find a few bugs
  • Fix them
  • Commit patches
slide-86
SLIDE 86

Me & Transifex

  • User
  • Contributor
  • Intern
  • Employee
slide-87
SLIDE 87

Useful links

  • Indifex: www.indifex.com
  • Transifex: www.transifex.net
  • Docs: help.transifex.net
  • API: help.transifex.net/technical/api/
  • Client: help.transifex.net/user-guide/client/
  • Support: support.indifex.com
  • Code: code.indifex.com/transifex
  • Trac: trac.transifex.org
  • Mailing list: groups.google.com/group/transifex-devel
slide-88
SLIDE 88

Licenced under Creative Commons CC-BY 3.0 licence

Questions?

slide-89
SLIDE 89

Multilingual Apps with a single click

Ratnadeep Debnath rtnpro@indifex.com www.indifex.com