clang-format: Automatic formatting for C++ (Daniel Jasper)

SLIDE 1

clang-format

Automatic formatting for C++

(Daniel Jasper - djasper@google.com)

SLIDE 2

Why?

  • A consistent coding style is important
  • Formatting is tedious

Clang's source files contain ~25% whitespace characters

    Sema::NameClassification Sema::ClassifyName(Scope *S, CXXScopeSpec &SS,
                                                IdentifierInfo *&Name,
                                                SourceLocation NameLoc,
                                                const Token &NextToken,
                                                bool IsAddressOfOperand,
                                                CorrectionCandidateCallback *CCC) {
    }

SLIDE 3

Why?

  • A consistent coding style is important
  • Formatting is tedious

Clang's source files contain ~25% whitespace characters

    Sema::NameClassification Sema::ClassifySomeName(Scope *S, CXXScopeSpec &SS,
                                                    IdentifierInfo *&Name,
                                                    SourceLocation NameLoc,
                                                    const Token &NextToken,
                                                    bool IsAddressOfOperand,
                                                    CorrectionCandidateCallback *CCC) {
    }

SLIDE 4

Why?

  • Time wasted on style discussions, e.g. in code reviews
  • From cfe-commits@:

> ...
> ...
> +  while( TemplateParameterDepth <= MemberTemplateDepth )

Space after "while", no spaces immediately inside parens.

...
...

SLIDE 5

Why?

  • Source code becomes machine editable

Fully automated refactoring tools!

Example: tools/extra/cpp11-migrate

    for (int i = 0; i < N; ++i) {
      sum += arr[i];
    }

    for (auto & elem : arr) {
      sum += elem;
    }
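As a rough illustration of the transformation above, the two loop forms can be wrapped in functions and compared. The names sumIndexed and sumRanged are made up for this sketch, and the range-based loop is written the way it might look once the tool's output has been formatted, not as the tool emits it.

```cpp
#include <cstddef>

// Index-based loop, as it might look before migration.
template <std::size_t N>
int sumIndexed(const int (&arr)[N]) {
  int sum = 0;
  for (std::size_t i = 0; i < N; ++i) {
    sum += arr[i];
  }
  return sum;
}

// Range-based loop, as it might look after migration and formatting.
template <std::size_t N>
int sumRanged(const int (&arr)[N]) {
  int sum = 0;
  for (const auto &elem : arr) {
    sum += elem;
  }
  return sum;
}
```

Both functions visit the same elements in the same order, so a refactoring tool can swap one form for the other and leave the whitespace cleanup to a formatter.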

SLIDE 7

Process

  • Design document
  • Feedback on cfe-dev@
  • Key ideas / questions:
    ○ Indentation as well as line breaking
    ○ Editor integration and library for other tools
    ○ Only changing whitespace
    ○ Parser vs. lexer
    ○ Style deduction
  • Actual solutions might differ :-)

SLIDE 8

How?

  • Build upon Clang components
    ○ Lexer: C++ token stream
    ○ Parser: Syntax tree

    #define TYPE(Class, Parent)                                \
      case Type::Class: {                                      \
        const Class##Type *ty = cast<Class##Type>(split.Ty);   \
        if (!ty->isSugared())                                  \
          goto done;                                           \
        next = ty->desugar();                                  \
        break;                                                 \
      }

SLIDE 9

Architecture

  • Structural parser: Unwrapped lines
  • Layouter: Arrange tokens

    void f(int a){ int * i; f ( ); }

        → structural parser
        → token annotator ("*": pointer)
        → layouter

    void f(
        int a) {
      int *i;
      f();
    }

SLIDE 10

Unwrapped lines

  • Everything we'd like to put on a single line
  • One unwrapped line does not influence other unwrapped lines

    void f() {                        line 1
      someFunction(Parameter1,        line 2
    #define A Parameter2              line 3
                   A);                line 2
    }                                 line 4
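A minimal sketch of the idea, assuming the input has already been split into tokens: an unwrapped line ends at a ";", "{" or "}" outside parentheses. The function name splitIntoUnwrappedLines and the TokenLine type are invented for this illustration. Unlike the slide's example, the sketch ignores preprocessor directives, and it is not how clang-format's structural parser is implemented.

```cpp
#include <string>
#include <vector>

typedef std::vector<std::string> TokenLine;

// Groups tokens into unwrapped lines. A line ends after ";", "{" or "}"
// when no parenthesis is open; a "}" always starts a line of its own.
std::vector<TokenLine>
splitIntoUnwrappedLines(const std::vector<std::string> &Tokens) {
  std::vector<TokenLine> Lines;
  TokenLine Current;
  int ParenDepth = 0;
  for (const std::string &Tok : Tokens) {
    if (Tok == "(")
      ++ParenDepth;
    if (Tok == ")")
      --ParenDepth;
    if (Tok == "}" && !Current.empty()) {
      Lines.push_back(Current);
      Current.clear();
    }
    Current.push_back(Tok);
    bool EndsLine =
        ParenDepth == 0 && (Tok == ";" || Tok == "{" || Tok == "}");
    if (EndsLine) {
      Lines.push_back(Current);
      Current.clear();
    }
  }
  if (!Current.empty())
    Lines.push_back(Current);
  return Lines;
}
```

Because each resulting line can be laid out on its own, the layouter never has to consider more than one unwrapped line at a time.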

SLIDE 11

Layouter

  • Every line break has a certain penalty

    aaaaaaaa(aaaaaaaaaaaaa, aaaaaaaaaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaa(    Penalty: 100
        aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)),                            Penalty: 41
        aaaaaaaa(aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa(                            Penalty: 100
            aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)));
                                                                              Total: 241

  • Factors
    ○ Nesting level
    ○ Token types
    ○ Operator precedence
    ○ ...

  • Best formatting: Formatting with lowest penalty
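The scoring rule can be sketched as follows, assuming each candidate formatting is represented only by the list of penalties charged for its line breaks. The Candidate type and both function names are hypothetical. How the individual penalties are derived from nesting level, token types and operator precedence is not modelled here.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical model: one penalty value per line break in a candidate.
typedef std::vector<unsigned> Candidate;

// Sum of all line-break penalties of one candidate formatting.
unsigned totalPenalty(const Candidate &Breaks) {
  return std::accumulate(Breaks.begin(), Breaks.end(), 0u);
}

// Index of the candidate with the lowest total penalty. The first one
// wins on ties. Assumes Candidates is not empty.
std::size_t bestCandidate(const std::vector<Candidate> &Candidates) {
  std::size_t Best = 0;
  for (std::size_t I = 1; I < Candidates.size(); ++I) {
    if (totalPenalty(Candidates[I]) < totalPenalty(Candidates[Best]))
      Best = I;
  }
  return Best;
}
```

For the three breaks in the example above, totalPenalty yields 100 + 41 + 100 = 241.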

SLIDE 12

Layouter

  • Try "all" the combinations
  • Clang-format can split or not split at each token

    int x = a + b + c + d + e + f + g;
    ^ ^ ^ ^ ^ ^ ^ ^

  • 2^8 = 256 combinations
  • Memoization using an "indent state"
    ○ Consumed n tokens
    ○ Currently in column m
    ○ ...

  • Find cheapest state-path with Dijkstra's algorithm
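The search can be sketched on a much simplified model. This assumes that tokens on the same line are separated by exactly one space, that every continuation line starts at a fixed indent, and that a break before token i costs BreakPenalty[i]. The LayoutProblem struct and cheapestLayout are invented for this illustration and do not mirror clang-format's internal data structures. A state is the pair (tokens consumed, current column), and Dijkstra's algorithm expands states in order of accumulated penalty.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified layout problem.
struct LayoutProblem {
  std::vector<std::string> Tokens;
  std::vector<unsigned> BreakPenalty; // same size as Tokens
  unsigned ColumnLimit;
  unsigned ContinuationIndent;
};

// Returns the lowest total penalty of any layout that respects the column
// limit, or -1 if no such layout exists.
long cheapestLayout(const LayoutProblem &P) {
  if (P.Tokens.empty())
    return 0;

  typedef std::pair<std::size_t, unsigned> State; // (tokens consumed, column)
  typedef std::pair<unsigned, State> Entry;       // (penalty so far, state)
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry> > Queue;
  std::set<State> Done; // states whose cheapest penalty is already known

  unsigned FirstLen = static_cast<unsigned>(P.Tokens[0].size());
  if (FirstLen > P.ColumnLimit)
    return -1;
  Queue.push(Entry(0, State(1, FirstLen)));

  while (!Queue.empty()) {
    Entry Top = Queue.top();
    Queue.pop();
    unsigned Penalty = Top.first;
    State S = Top.second;
    if (!Done.insert(S).second)
      continue; // this state was already reached at no greater cost
    if (S.first == P.Tokens.size())
      return Penalty;

    unsigned Len = static_cast<unsigned>(P.Tokens[S.first].size());
    // Option 1: keep the next token on the current line.
    unsigned SameLine = S.second + 1 + Len;
    if (SameLine <= P.ColumnLimit)
      Queue.push(Entry(Penalty, State(S.first + 1, SameLine)));
    // Option 2: break before the next token and pay its penalty.
    unsigned NextLine = P.ContinuationIndent + Len;
    if (NextLine <= P.ColumnLimit)
      Queue.push(Entry(Penalty + P.BreakPenalty[S.first],
                       State(S.first + 1, NextLine)));
  }
  return -1;
}
```

The Done set plays the role of the memoization: many of the 2^n split combinations lead to the same (tokens consumed, column) pair, and each such pair is expanded only once.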

SLIDE 13

More important problems

int *a; or int* a;

  • Clang-format has an adaptive mode:

    ○ Count cases in input
    ○ Take majority vote
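A minimal sketch of such a majority vote, assuming it is enough to look at the whitespace directly around each "*" in the raw text. The StarBinding enum and the function deduceStarBinding are hypothetical names. The real tool works on annotated tokens, so it can tell a pointer "*" from a multiplication, which this text scan cannot.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: does "*" attach to the type or to the name?
enum class StarBinding { ToType, ToName };

// Counts "T* name" against "T *name" occurrences and returns the majority.
// Fallback is returned on a tie or when no case is found.
StarBinding deduceStarBinding(const std::string &Code, StarBinding Fallback) {
  std::size_t ToType = 0, ToName = 0;
  for (std::size_t I = 1; I + 1 < Code.size(); ++I) {
    if (Code[I] != '*')
      continue;
    bool SpaceBefore = Code[I - 1] == ' ';
    bool SpaceAfter = Code[I + 1] == ' ';
    if (SpaceBefore && !SpaceAfter)
      ++ToName; // "int *a"
    else if (!SpaceBefore && SpaceAfter)
      ++ToType; // "int* a"
  }
  if (ToType > ToName)
    return StarBinding::ToType;
  if (ToName > ToType)
    return StarBinding::ToName;
  return Fallback;
}
```

A "*" with spaces on both sides or on neither side is ignored, so expressions such as a * b do not take part in the vote.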

SLIDE 14

Example: for-loops (Sema.cpp)

    for (OverloadExpr::decls_iterator It = Overloads.begin(),
                                      DeclsEnd = Overloads.end();
         It != DeclsEnd; ++It) {}

    for (SmallVectorImpl<sema::PossiblyUnreachableDiag>::iterator
             i = Scope->PossiblyUnreachableDiags.begin(),
             e = Scope->PossiblyUnreachableDiags.end();
         i != e; ++i) {}

    for (TentativeDefinitionsType::iterator
             T = TentativeDefinitions.begin(ExternalSource),
             TEnd = TentativeDefinitions.end();
         T != TEnd; ++T) {}

    for (Module::submodule_iterator Sub = Mod->submodule_begin(),
                                    SubEnd = Mod->submodule_end();
         Sub != SubEnd; ++Sub) {}

SLIDE 15

Example: Expression indentation

    bool value = ((aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
                   bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
                   ccccccccccccccccccccccccccccccccccccc) ==
                  ((ddddddddddddddddddddddddddddddddddddddddd *
                    eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee) +
                   fffffffffffffffffffffffffffffffffffff)) &&
                 ((ggggggggggggggggggggggggggggggggggggggggggggg *
                   hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh) >
                  iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii);

SLIDE 16

Example: Expression indentation

    bool value = aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
                 bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb +
                 ccccccccccccccccccccccccccccccccccccc ==
                 ddddddddddddddddddddddddddddddddddddddddd *
                 eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee +
                 fffffffffffffffffffffffffffffffffffff &&
                 ggggggggggggggggggggggggggggggggggggggggggggg *
                 hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh >
                 iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii;

SLIDE 19

Demo time

SLIDE 20

How can you use clang-format?

Integration into editors / workflows available:

  • vim: clang-format.py
  • emacs: clang-format.el
  • diff: clang-format-diff.py

All in: clang/tools/clang-format/

More to come: Eclipse, TextMate, ...

SLIDE 21

How can you use clang-format?

As a library (include/clang/Format/Format.h):

    tooling::Replacements reformat(const FormatStyle &Style, Lexer &Lex,
                                   SourceManager &SourceMgr,
                                   std::vector<CharSourceRange> Ranges,
                                   DiagnosticConsumer *DiagClient = 0);

  • E.g. as postprocessing for refactoring tools
  • Interface can be extended

SLIDE 22

Where are we now?

  • Clang-format understands most C++ / ObjC constructs
  • Three style guides supported

    ○ LLVM / Clang
    ○ Google
    ○ Chromium
  • Clang-format can format its own source code

SLIDE 23

What next?

  • Bugs and formatting improvements
  • Configuration (files, command-line, ...)
  • More coding styles

    ○ Coding styles using tabs?
    ○ Coding styles without column limit?
  • C++11 features (lambdas, trailing return types, ...)
  • clang-tidy
    ○ Based on Clang's AST
    ○ Find and fix stuff like: "Don’t evaluate end() every time through a loop"
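The quoted guideline can be illustrated with two versions of the same loop. This is only a sketch of the pattern such a check might flag and the rewrite it might suggest. The function names are invented, and nothing here reflects an actual clang-tidy check.

```cpp
#include <vector>

// Calls V.end() again on every iteration.
int sumRecomputingEnd(const std::vector<int> &V) {
  int Sum = 0;
  for (std::vector<int>::const_iterator I = V.begin(); I != V.end(); ++I)
    Sum += *I;
  return Sum;
}

// Evaluates V.end() once and reuses the result in E.
int sumCachingEnd(const std::vector<int> &V) {
  int Sum = 0;
  for (std::vector<int>::const_iterator I = V.begin(), E = V.end(); I != E;
       ++I)
    Sum += *I;
  return Sum;
}
```

The second form is the one used by the for-loops from Sema.cpp shown earlier, which is why formatting that two-declaration loop header well matters for LLVM-style code.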

SLIDE 24

Thank you!

clang.llvm.org/docs/ClangFormat.html