Range-Based Text Formatting For a Future Range-Based Standard - - PowerPoint PPT Presentation

slide-1
SLIDE 1

Range-Based Text Formatting

For a Future Range-Based Standard Library

1 / 90

slide-2
SLIDE 2

Text Formatting

Text formatting everywhere Many different libraries/approaches Here is yet another one

2 / 90

slide-4
SLIDE 4

Text Formatting

Text formatting everywhere Many different libraries/approaches Here is yet another one Input components of text

order of these components

per component, conversion parameters e.g., number of decimal places

4 / 90

slide-5
SLIDE 5

Text Formatting

Text formatting everywhere Many different libraries/approaches Here is yet another one Input components of text

order of these components

per component, conversion parameters e.g., number of decimal places Output a string

5 / 90

slide-6
SLIDE 6

Syntax Options

Order of components and parameters described by

6 / 90

slide-7
SLIDE 7

Syntax Options

Order of components and parameters described by format string

printf("Hello %f", 3.14); absl::StrFormat (printf syntax)

{fmt} / std::format / LEWG P0645 (Python syntax)

7 / 90

slide-8
SLIDE 8

Syntax Options

Order of components and parameters described by format string

printf("Hello %f", 3.14); absl::StrFormat (printf syntax)

{fmt} / std::format / LEWG P0645 (Python syntax) 'Just C++': functions, parameters, operators

std::stringstream() << std::setprecision(2) << 3.14; "Hello " + std::to_string(3.14)

8 / 90

slide-9
SLIDE 9

Format Strings: Pros

format string

printf("Hello %f", 3.14); absl::StrFormat (printf syntax)

{fmt} / std::format / LEWG P0645 (Python syntax) Pros syntax closer to resulting string can decide format string at runtime forgoes compile-time format check

9 / 90

slide-10
SLIDE 10

Format Strings: Cons

format string

printf("Hello %f", 3.14); absl::StrFormat (printf syntax)

{fmt} / std::format / LEWG P0645 (Python syntax) Cons must escape format string (and remember it!)

10 / 90

slide-11
SLIDE 11

Format Strings: Cons

format string

printf("Hello %f", 3.14); absl::StrFormat (printf syntax)

{fmt} / std::format / LEWG P0645 (Python syntax) Cons must escape format string (and remember it!) extra language for parameters why not use C++? user-defined types need parser for parameters

11 / 90

slide-12
SLIDE 12

Format Strings: Cons

format string

printf("Hello %f", 3.14); absl::StrFormat (printf syntax)

{fmt} / std::format / LEWG P0645 (Python syntax) Cons must escape format string (and remember it!) extra language for parameters why not use C++? user-defined types need parser for parameters no gradual change in syntax from string concatenation to formatting

str1 + str2 vs. format("%s%s", str1, str2) vs. format("%s%d%s", str1, n, str2)

12 / 90

slide-13
SLIDE 13

Format Strings: I18N

translation outsourced to agencies set up to deal with text, not code XLIFF (XML Localization Interchange File Format)

13 / 90

slide-14
SLIDE 14

Format Strings: I18N

translation outsourced to agencies set up to deal with text, not code XLIFF (XML Localization Interchange File Format) text may contain placeholders

Dear {0}, thank you for your interest in our product.

So must use format strings!

14 / 90

slide-15
SLIDE 15

Format Strings: I18N

translation outsourced to agencies set up to deal with text, not code XLIFF (XML Localization Interchange File Format) text may contain placeholders

Dear {0}, thank you for your interest in our product.

So must use format strings!

BUT: give agency only as much control as necessary
  insertion position
formatting parameters provided by OS setting/culture database
  decimal separator (comma vs. period)
  number of decimal places
  date format

15 / 90

slide-16
SLIDE 16

Just C++: iostream

std::stringstream() << std::setprecision(2) << 3.14;

abuse of operator overloading

only because no variadic templates?

stateful ("manipulators")

std::setprecision applies to all following items, std::setw only to the next item, ARGH!

slow due to virtual calls extra copy std::stringstream -> std::string

(std::stringstream() << std::setprecision(2) << 3.14).str()
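The manipulator asymmetry can be shown in a few lines. This is a minimal sketch using only standard `<iomanip>` facilities; the function name `sticky_demo` is made up for illustration.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Both manipulators are set once, before the first number.
// std::setprecision stays in effect; std::setw is reset after one item.
inline std::string sticky_demo(double a, double b) {
    std::ostringstream os;
    os << std::fixed << std::setprecision(2) << std::setw(8) << a  // padded to width 8
       << '|' << b;                                               // precision 2, no padding
    assert(!os.fail());
    return os.str();
}
```

Calling `sticky_demo(3.14159, 2.71828)` pads only the first number, while both numbers are rounded to two decimal places.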

16 / 90

slide-17
SLIDE 17

Just C++: string concatenation

"Hello " + std::to_string(3.14)

different abuse of operator overloading
no formatting options
slow
  many temporaries

17 / 90

slide-18
SLIDE 18

Just C++: string concatenation

"Hello " + std::to_string(3.14)

different abuse of operator overloading
no formatting options
slow
  many temporaries
BUT: conceptually the essence of formatting
  turn data into string snippets
  concatenate the snippets into whole text
  naturally extends to user-defined types/parameters: call a function
Overcome weaknesses with Ranges!

18 / 90

slide-19
SLIDE 19

Introduction to Ranges

Who knows the range-based for -loop?

19 / 90

slide-20
SLIDE 20

Introduction to Ranges

Who knows the range-based for -loop? Who knows Ranges TS?

20 / 90

slide-21
SLIDE 21

Introduction to Ranges

Who knows the range-based for -loop? Who knows Ranges TS? Who knows Eric Niebler's Range-v3 library?

21 / 90

slide-22
SLIDE 22

Introduction to Ranges

Who knows the range-based for -loop? Who knows Ranges TS? Who knows Eric Niebler's Range-v3 library? Who uses ranges every day? Range-v3?

22 / 90

slide-23
SLIDE 23

Introduction to Ranges

Who knows the range-based for -loop? Who knows Ranges TS? Who knows Eric Niebler's Range-v3 library? Who uses ranges every day? Range-v3? Boost.Range?

23 / 90

slide-24
SLIDE 24

Introduction to Ranges

Who knows the range-based for -loop? Who knows Ranges TS? Who knows Eric Niebler's Range-v3 library? Who uses ranges every day? Range-v3? Boost.Range? think-cell library?

24 / 90

slide-25
SLIDE 25

Introduction to Ranges

Who knows the range-based for -loop? Who knows Ranges TS? Who knows Eric Niebler's Range-v3 library? Who uses ranges every day? Range-v3? Boost.Range? think-cell library? home-grown?

25 / 90

slide-26
SLIDE 26

Essence of Ranges

std::find(itBegin, itEnd, x)

->

std::find(rng, x) // Ranges TS rng anything with begin , end

26 / 90

slide-27
SLIDE 27

Essence of Ranges

std::find(itBegin, itEnd, x)

->

std::find(rng, x) // Ranges TS rng anything with begin , end

containers that own elements ( vector , basic_string , etc.)

27 / 90

slide-28
SLIDE 28

Essence of Ranges

std::find(itBegin, itEnd, x)

->

std::find(rng, x) // Ranges TS rng anything with begin , end

containers that own elements ( vector , basic_string , etc.) views that reference elements (= iterator pairs wrapped into single object)

28 / 90

slide-29
SLIDE 29

Essence of Ranges

std::find(itBegin, itEnd, x)

->

std::find(rng, x) // Ranges TS rng anything with begin , end

containers that own elements ( vector , basic_string , etc.) views that reference elements (= iterator pairs wrapped into single object) ranges may do lazy calculations

tc::filter(rng,pred) only captures rng and pred , performs no work

skips elements while iterating
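To make "performs no work" concrete, here is a minimal sketch of a lazy filter view. It is not the think-cell implementation: `filter_view` and `filter` are hypothetical names, the sketch only supports lvalue ranges (it stores a reference), and it assumes the predicate is callable on a const object.

```cpp
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical minimal lazy filter view (lvalue ranges only).
template<typename Rng, typename Pred>
struct filter_view {
    Rng const& m_rng;  // referenced, not copied
    Pred m_pred;

    struct iterator {
        using base_iterator = decltype(std::begin(std::declval<Rng const&>()));
        base_iterator m_it, m_end;
        Pred const* m_pred;

        // Advance past elements the predicate rejects.
        void skip() { while (m_it != m_end && !(*m_pred)(*m_it)) ++m_it; }
        decltype(auto) operator*() const { assert(m_it != m_end); return *m_it; }
        iterator& operator++() { ++m_it; skip(); return *this; }
        bool operator!=(iterator const& rhs) const { return m_it != rhs.m_it; }
    };

    iterator begin() const {
        iterator it{std::begin(m_rng), std::end(m_rng), &m_pred};
        it.skip();  // the first predicate calls happen here, not in filter()
        return it;
    }
    iterator end() const { return iterator{std::end(m_rng), std::end(m_rng), &m_pred}; }
};

// Only captures rng and pred; the predicate is not called here.
template<typename Rng, typename Pred>
filter_view<Rng, Pred> filter(Rng const& rng, Pred pred) {
    return {rng, std::move(pred)};
}
```

Constructing the view never calls the predicate. It runs only while a consumer iterates.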

29 / 90

slide-30
SLIDE 30

Why do I think I know something about ranges?

think-cell has range library
  evolved from Boost.Range
  1 million lines of production code use it
chicken-and-egg problem of library design
  can only learn good design by lots of use
  with lots of use, cannot change design
avoid by
  all code in-house
  extra resources dedicated to refactoring

30 / 90

slide-31
SLIDE 31

Ranges for Text 101: Replace basic_string Member Functions by Range Algorithms

index = str.find(...);

->

iterator = std::find(str,...); // Ranges TS or tc::find_*

same generic algorithms for character and other sequences flexible with string types wrap OS-/library-specific string types in range interface treat uniformly in syntax/algorithms

31 / 90

slide-32
SLIDE 32

Ranges for Text 101: Replace basic_string Member Functions by Range Algorithms

index = str.find(...);

->

iterator = std::find(str,...); // Ranges TS or tc::find_*

same generic algorithms for character and other sequences flexible with string types wrap OS-/library-specific string types in range interface treat uniformly in syntax/algorithms C++17 basic_string_view perpetuates basic_string member interface:-(
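A sketch of the idea in plain C++17, without any range library: one generic algorithm serves every string-like type. `sketch::find_in` is a hypothetical wrapper around the iterator-pair `std::find`, not a Ranges TS or think-cell function.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <string_view>

namespace sketch {
// Hypothetical range-style wrapper around the iterator-pair std::find.
template<typename Rng, typename T>
auto find_in(Rng& rng, T const& x) {
    return std::find(std::begin(rng), std::end(rng), x);
}
}

// Usage: the same call for an owning string, a view, and a raw array.
inline void find_usage() {
    std::string owned = "hello";
    std::string_view view = "hello";
    char const raw[] = "hello";
    assert(sketch::find_in(owned, 'l') - owned.begin() == 2);
    assert(sketch::find_in(view, 'l') - view.begin() == 2);
    assert(sketch::find_in(raw, 'z') == std::end(raw));  // not found
}
```

The result is an iterator rather than an index, so it can be passed to further algorithms regardless of the string type.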

32 / 90

slide-33
SLIDE 33

Ranges for Text Formatting (1)

All Range libraries already have concatenation

tc::concat("Hello ", strName) // similar syntax in Range-v3

33 / 90

slide-34
SLIDE 34

Ranges for Text Formatting (1)

All Range libraries already have concatenation

tc::concat("Hello ", strName) // similar syntax in Range-v3

To format data, add formatting functions like tc::as_dec

double f=3.14; tc::concat("You won ", tc::as_dec(f,2), " dollars.")

34 / 90

slide-35
SLIDE 35

Ranges for Text Formatting (1)

All Range libraries already have concatenation

tc::concat("Hello ", strName) // similar syntax in Range-v3

To format data, add formatting functions like tc::as_dec

double f=3.14; tc::concat("You won ", tc::as_dec(f,2), " dollars.")

not like <iostream> : double itself is not a character range:

tc::concat("You won ", f, " dollars.") // DOES NOT COMPILE

35 / 90

slide-36
SLIDE 36

Ranges for Text Formatting (1)

All Range libraries already have concatenation

tc::concat("Hello ", strName) // similar syntax in Range-v3

To format data, add formatting functions like tc::as_dec

double f=3.14; tc::concat("You won ", tc::as_dec(f,2), " dollars.")

not like <iostream> : double itself is not a character range:

tc::concat("You won ", f, " dollars.") // DOES NOT COMPILE

No need for special format function

36 / 90

slide-37
SLIDE 37

Ranges for Text Formatting (2)

Extensible by functions returning ranges

auto dollars(double f) { return tc::concat(tc::as_dec(f,2), " dollars"); } double f=3.14; tc::concat("You won ", dollars(f), ".");

37 / 90

slide-38
SLIDE 38

Ranges for Text Formatting (2)

Extensible by functions returning ranges

auto dollars(double f) { return tc::concat(tc::as_dec(f,2), " dollars"); } double f=3.14; tc::concat("You won ", dollars(f), ".");

Range algorithms work

tc::for_each( tc::as_dec(f,2), [](char c){...} ); if( tc::all_of/tc::any_of( tc::concat("You won ", tc::as_dec(f,2), " dollars."), [](char c){ return c!='1'; } ) ) {...}

38 / 90

slide-39
SLIDE 39

Formatting Into Containers (1)

std::string gives us

Empty Construction

std::string s; // compiles

Construction from literal, another string

std::string s1("Hello"); // compiles std::string s2(s1); // compiles

39 / 90

slide-40
SLIDE 40

Formatting Into Containers (1)

std::string gives us

Empty Construction

std::string s; // compiles

Construction from literal, another string

std::string s1("Hello"); // compiles std::string s2(s1); // compiles

Add construction from 1 Range

std::string s3(tc::as_dec(3.14,2)); // suggested std::string s3(tc::concat("You won ", tc::as_dec(3.14,2), " dollars.")); // suggested

40 / 90

slide-41
SLIDE 41

Formatting Into Containers (1)

std::string gives us

Empty Construction

std::string s; // compiles

Construction from literal, another string

std::string s1("Hello"); // compiles std::string s2(s1); // compiles

Add construction from 1 Range

std::string s3(tc::as_dec(3.14,2)); // suggested std::string s3(tc::concat("You won ", tc::as_dec(3.14,2), " dollars.")); // suggested

Add construction from N Ranges

std::string s4("Hello", " World"); // suggested std::string s5("You won ", tc::as_dec(3.14,2), " dollars."); // suggested

41 / 90

slide-42
SLIDE 42

Formatting Into Containers (2)

What about existing constructors?

std::string s1("A", 3 ); std::string s2('A', 3 ); std::string s3( 3 , 'A');

42 / 90

slide-43
SLIDE 43

Formatting Into Containers (2)

What about existing constructors?

std::string s1("A", 3 ); // UB, buffer "A" overrun std::string s2('A', 3 ); std::string s3( 3 , 'A');

43 / 90

slide-44
SLIDE 44

Formatting Into Containers (2)

What about existing constructors?

std::string s1("A", 3 ); // UB, buffer "A" overrun std::string s2('A', 3 ); // Adds 65x Ctrl-C std::string s3( 3 , 'A');

44 / 90

slide-45
SLIDE 45

Formatting Into Containers (2)

What about existing constructors?

std::string s1("A", 3 ); // UB, buffer "A" overrun std::string s2('A', 3 ); // Adds 65x Ctrl-C std::string s3( 3 , 'A'); // Adds 3x 'A'

45 / 90

slide-46
SLIDE 46

Formatting Into Containers (2)

What about existing constructors?

std::string s1("A", 3 ); // UB, buffer "A" overrun std::string s2('A', 3 ); // Adds 65x Ctrl-C std::string s3( 3 , 'A'); // Adds 3x 'A'

Deprecate them!

std::string s(tc::repeat_n('A', 3)); //suggested, repeat_n as in Range-v3
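The two look-alike calls can be checked directly against the standard `std::string(count, ch)` constructor. The helper names below are made up, and the value 65 assumes an ASCII execution character set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Intended call: count first, character second.
inline std::string count_then_char() { return std::string(3, 'A'); }

// Swapped arguments still compile: 'A' converts to the count,
// and 3 converts to the character '\x03' (Ctrl-C).
inline std::string char_then_count() { return std::string('A', 3); }

inline void constructor_usage() {
    assert(count_then_char() == "AAA");
    assert(char_then_count().size() == static_cast<std::size_t>('A'));
    assert(char_then_count().front() == '\x03');
}
```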

46 / 90

slide-47
SLIDE 47

Formatting Into Containers (3)

think-cell library uses tc::explicit_cast to simulate adding/removing explicit constructors:

auto s4=tc::explicit_cast<std::string>("Hello", " World"); auto s5=tc::explicit_cast<std::string>("You won ", tc::as_dec(f,2), " dollars.");

47 / 90

slide-48
SLIDE 48

Formatting Into Containers (3)

think-cell library uses tc::explicit_cast to simulate adding/removing explicit constructors:

auto s4=tc::explicit_cast<std::string>("Hello", " World"); auto s5=tc::explicit_cast<std::string>("You won ", tc::as_dec(f,2), " dollars."); tc::cont_emplace_back wraps .emplace_back / .push_back , uses tc::explicit_cast as needed: std::vector<std::string> vec; tc::cont_emplace_back( vec, tc::as_dec(3.14,2) );

48 / 90

slide-49
SLIDE 49

Formatting Into Containers (3)

think-cell library uses tc::explicit_cast to simulate adding/removing explicit constructors:

auto s4=tc::explicit_cast<std::string>("Hello", " World"); auto s5=tc::explicit_cast<std::string>("You won ", tc::as_dec(f,2), " dollars."); tc::cont_emplace_back wraps .emplace_back / .push_back , uses tc::explicit_cast as needed: std::vector<std::string> vec; tc::cont_emplace_back( vec, tc::as_dec(3.14,2) );

Can tc::append :

std::string s; tc::append( s, tc::concat("You won ", tc::as_dec(f,2), " dollars.") ); tc::append( s, "You won ", tc::as_dec(f,2), " dollars." );

49 / 90

slide-50
SLIDE 50

Format Strings

tc::concat( "<body>", html_escape( tc::placeholders( "You won {0} dollars.", tc::as_dec(f,2) ) ), "</body>" )

50 / 90

slide-51
SLIDE 51

Format Strings

tc::concat( "<body>", html_escape( tc::placeholders( "You won {0} dollars.", tc::as_dec(f,2) ) ), "</body>" )

support for names

tc::concat( "<body>", html_escape( tc::placeholders( "You won {amount} dollars on {date}." , tc::named_arg("amount", tc::as_dec(f,2)) , tc::named_arg("date", tc::as_ISO8601( std::chrono::system_clock::now() )) ) ), "</body>" )
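How positional placeholders could be resolved can be sketched as follows. This is not `tc::placeholders`: `fill_placeholders` is a hypothetical, eager helper that works on already-formatted `std::string` arguments, and it copies anything that is not a well-formed, in-range `{N}` verbatim.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: replaces {N} in pattern with args[N].
inline std::string fill_placeholders(std::string const& pattern,
                                     std::vector<std::string> const& args) {
    std::string out;
    std::size_t i = 0;
    while (i < pattern.size()) {
        if (pattern[i] == '{') {
            std::size_t j = i + 1;
            std::size_t index = 0;
            // Parse the decimal index following the opening brace.
            while (j < pattern.size() &&
                   std::isdigit(static_cast<unsigned char>(pattern[j]))) {
                index = index * 10 + static_cast<std::size_t>(pattern[j] - '0');
                ++j;
            }
            assert(j <= pattern.size());
            bool const well_formed =
                j > i + 1 && j < pattern.size() && pattern[j] == '}';
            if (well_formed && index < args.size()) {
                out += args[index];
                i = j + 1;
                continue;
            }
        }
        out += pattern[i];  // literal character or malformed placeholder
        ++i;
    }
    return out;
}
```

Because the translator only controls the pattern, swapping `{0}` and `{1}` in a translation reorders the arguments without touching their formatting.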

51 / 90

slide-52
SLIDE 52

Naive Implementation

each formatter returns std::string

tc::concat returns std::string tc::append appends std::string s

52 / 90

slide-53
SLIDE 53

Naive Implementation

each formatter returns std::string

tc::concat returns std::string tc::append appends std::string s

Pro simple

53 / 90

slide-54
SLIDE 54

Naive Implementation

each formatter returns std::string

tc::concat returns std::string tc::append appends std::string s

Pro simple Con need to allocate and copy many strings talk would be over
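For reference, the naive variant might look like the sketch below. The `naive` namespace and both function bodies are assumptions for illustration; only the names `as_dec` and `concat` mirror the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

namespace naive {
// Every formatter returns an owning std::string ...
inline std::string as_dec(double value, int decimals) {
    assert(decimals >= 0);
    // First pass measures, second pass writes.
    int const length = std::snprintf(nullptr, 0, "%.*f", decimals, value);
    assert(length >= 0);
    std::string result(static_cast<std::size_t>(length), '\0');
    std::snprintf(&result[0], result.size() + 1, "%.*f", decimals, value);
    return result;
}

// ... and concat copies every component into yet another std::string.
template<typename... Strings>
std::string concat(Strings const&... strings) {
    std::string result;
    ((result += strings), ...);
    return result;
}
}
```

Each `as_dec` call allocates its own string, and `concat` may reallocate several times while appending, which is the cost the following slides set out to remove.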

54 / 90

slide-55
SLIDE 55

Avoid Heap Allocation For Components

Make formatter ranges lazy generate character sequence during iteration size of as_dec -like formatter objects known at compile-time, no heap allocation

55 / 90

slide-56
SLIDE 56

Avoid Heap Allocation For Components

Make formatter ranges lazy generate character sequence during iteration size of as_dec -like formatter objects known at compile-time, no heap allocation

auto dollars(double f) { return tc::concat(tc::as_dec(f,2), " dollars"); } double f=3.14; std::string s(tc::concat("You won ", dollars(f), ".")); tc::as_dec(f,2) stores {f,2} , tc::concat stores components

lvalues stored by reference rvalues stored by copy/move like expression templates

56 / 90

slide-57
SLIDE 57

Fast Formatting Into Containers

determine string length allocate memory for whole string at once fill in characters

57 / 90

slide-58
SLIDE 58

Fast Formatting Into Containers

determine string length allocate memory for whole string at once fill in characters

template<typename Cont, typename Rng> auto explicit_cast(Rng const& rng) { return Cont(std::begin(rng),std::end(rng)); } // note: there are more explicit_cast implementations for types other than containers

58 / 90

slide-59
SLIDE 59

Fast Formatting Into Containers

determine string length allocate memory for whole string at once fill in characters

template<typename Cont, typename Rng> auto explicit_cast(Rng const& rng) { return Cont(std::begin(rng),std::end(rng)); } // note: there are more explicit_cast implementations for types other than containers

formatters are not random-access

string ctor runs twice over rng :-(

first determine size then copy characters

59 / 90

slide-60
SLIDE 60

Fast Formatting Into Containers

avoid traversing rng twice character rng implements size() member explicit loop to take advantage of std::size

template<typename Cont, typename Rng, enable_if< Rng has size and is not random-access > > auto explicit_cast(Rng const& rng) { Cont cont; cont.reserve( std::size(rng) ); for(auto it=std::begin(rng); it!=std::end(rng); ++it) { tc::cont_emplace_back(cont, *it); } return cont; }

60 / 90

slide-61
SLIDE 61

Fast Formatting Into Containers

also have tc::append

template<typename Cont, typename Rng, enable_if< Rng has size and is not random-access > > void append(Cont& cont, Rng const& rng) { cont.reserve( cont.size() + std::size(rng) ); for(auto it=std::begin(rng); it!=std::end(rng); ++it) { tc::cont_emplace_back(cont, *it); } }

61 / 90

slide-62
SLIDE 62

Fast Formatting Into Containers

also have tc::append

template<typename Cont, typename Rng, enable_if< Rng has size and is not random-access > > void append(Cont& cont, Rng const& rng) { cont.reserve( cont.size() + std::size(rng) ); for(auto it=std::begin(rng); it!=std::end(rng); ++it) { tc::cont_emplace_back(cont, *it); } }

all good?

62 / 90

slide-63
SLIDE 63

Fast Formatting Into Containers

also have tc::append

template<typename Cont, typename Rng, enable_if< Rng has size and is not random-access > > void append(Cont& cont, Rng const& rng) { cont.reserve( cont.size() + std::size(rng) ); for(auto it=std::begin(rng); it!=std::end(rng); ++it) { tc::cont_emplace_back(cont, *it); } } .reserve is evil!!!

63 / 90

slide-64
SLIDE 64

Better reserve

when adding N elements, guarantee O(N) moves and O(log(N)) memory allocations!

template<typename Cont>
void cont_reserve(Cont& cont, typename Cont::size_type n) {
    if (cont.capacity() < n) {
        cont.reserve(std::max(n, cont.capacity() * 8 / 5));
    }
}

template<typename Cont, typename Rng, enable_if< Rng has size and is not random-access > >
void append(Cont& cont, Rng const& rng) {
    tc::cont_reserve(cont, cont.size() + std::size(rng));
    for (auto it = std::begin(rng); it != std::end(rng); ++it) {
        tc::cont_emplace_back(cont, *it);
    }
}
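The effect of the growth factor can be measured with a small experiment. This is a sketch under assumptions: `cont_reserve` follows the slide, while `reallocations_with_cont_reserve` is a hypothetical helper that treats every capacity change as one reallocation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

template<typename Cont>
void cont_reserve(Cont& cont, typename Cont::size_type n) {
    if (cont.capacity() < n) {
        // Grow by at least a factor of 8/5, even if less was asked for.
        cont.reserve(std::max(n, cont.capacity() * 8 / 5));
    }
    assert(n <= cont.capacity());
}

// Appends count elements one at a time, reserving before each one,
// and reports how often the capacity changed.
inline std::size_t reallocations_with_cont_reserve(std::size_t count) {
    std::vector<int> vec;
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < count; ++i) {
        auto const old_capacity = vec.capacity();
        cont_reserve(vec, vec.size() + 1);
        if (vec.capacity() != old_capacity) ++reallocations;
        vec.push_back(static_cast<int>(i));
    }
    return reallocations;
}
```

With a plain `vec.reserve(vec.size() + 1)` in the loop, an implementation that allocates exactly what is requested would reallocate on nearly every append. The geometric lower bound keeps the count logarithmic in the number of elements.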

64 / 90

slide-65
SLIDE 65

Fast Formatting Into Containers

template<typename Cont, typename Rng, enable_if< Rng has size and is not random-access > >
void append(Cont& cont, Rng const& rng) {
    tc::cont_reserve(cont, cont.size() + std::size(rng));
    for (auto it = std::begin(rng); it != std::end(rng); ++it) {
        tc::cont_emplace_back(cont, *it);
    }
}

Next bottleneck: iterators!

65 / 90

slide-66
SLIDE 66

Iterators Cost Performance

concat

iterator is std::variant of component iterators each operator* and operator++ branches on the variant

iterator::operator++() { std::visit( make_overload( [&](Iterator1& it1){ ++it1; if (it1==std::end(m_rng1)) { m_variant_of_its=std::begin(m_rng2); } }, [&](Iterator2& it2){ ++it2; } ), m_variant_of_its ); }

66 / 90

slide-67
SLIDE 67

Iterators Cost Performance

concat

iterator is std::variant of component iterators each operator* and operator++ branches on the variant

iterator::operator++() { std::visit( make_overload( [&](Iterator1& it1){ ++it1; if (it1==std::end(m_rng1)) { m_variant_of_its=std::begin(m_rng2); } }, [&](Iterator2& it2){ ++it2; } ), m_variant_of_its ); } tc::as_dec

iterator bookkeeping costs performance

67 / 90

slide-68
SLIDE 68

External Iteration

C++ iterators do external iteration Consumer calls producer to get new element

  ^
  |  Stack      Producer      Producer
  |             /      \      /      \
  |     Consumer      Consumer      Consumer

Consumer is at bottom of stack Producer is at top of stack

68 / 90

slide-69
SLIDE 69

External iteration (2)

Consumer is at bottom of stack
  contiguous code path for whole range
  easier to write
  better performance
    state encoded in instruction pointer
    no limit for stack memory
Producer is at top of stack
  contiguous code path for each item
  harder to write
  worse performance
    single entry point, must restore state
    fixed amount of memory or go to heap

69 / 90

slide-70
SLIDE 70

Internal Iteration

Formatting text is more efficient with internal iteration Producer calls consumer to offer new element

  ^
  |  Stack      Consumer      Consumer
  |             /      \      /      \
  |     Producer      Producer      Producer

Producer is at bottom of stack ... all the advantages of being bottom of stack ... Consumer is at top of stack ... all the disadvantages of being top of stack ...

70 / 90

slide-71
SLIDE 71

Many Range Algorithms OK with Internal Iteration

Algorithm        Internal Iteration?
binary_search    no (random access iterators)
find             no (single pass iterators); yes if only value

71 / 90

slide-72
SLIDE 72

Many Range Algorithms OK with Internal Iteration

Algorithm        Internal Iteration?
binary_search    no (random access iterators)
find             no (single pass iterators); yes if only value
for_each         yes
accumulate       yes
all_of           yes
any_of           yes
none_of          yes
...

72 / 90

slide-73
SLIDE 73

Many Range Algorithms OK with Internal Iteration

Algorithm        Internal Iteration?
binary_search    no (random access iterators)
find             no (single pass iterators); yes if only value
for_each         yes
accumulate       yes
all_of           yes
any_of           yes
none_of          yes
...

View             Internal Iteration?
filter           yes
transform        yes

73 / 90

slide-74
SLIDE 74

Many Range Algorithms OK with Internal Iteration

Algorithm        Internal Iteration?
binary_search    no (random access iterators)
find             no (single pass iterators); yes if only value
for_each         yes
accumulate       yes
all_of           yes
any_of           yes
none_of          yes
...

View             Internal Iteration?
filter           yes
transform        yes

Extend Range concept to internal iteration!

74 / 90

slide-75
SLIDE 75

Extend Range Concept to Internal Iteration

Range implements operator() that takes sink functor Con: C++20 std::span::operator() already used, must SFINAE it out Pro: can be written as lambda

tc::for_each( // the range [](auto sink) { sink(1); sink(2); }, // the visitor [](int n) { consume(n); } ); tc::for_each uses internal iteration if available (never slower than iterators)

otherwise uses iterators
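The dispatch can be sketched in C++17. It assumes that "supports internal iteration" simply means "callable with the sink"; `sketch::for_each` is a hypothetical stand-in, not the actual `tc::for_each`.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

namespace sketch {
// Hypothetical dispatch between internal and external iteration.
template<typename Rng, typename Sink>
void for_each(Rng const& rng, Sink sink) {
    if constexpr (std::is_invocable_v<Rng const&, Sink&>) {
        rng(sink);                         // internal: the producer drives
    } else {
        for (auto const& element : rng) {  // external: the consumer drives
            sink(element);
        }
    }
}
}

// Usage: a lambda "range" and a std::vector go through the same call.
inline int dispatch_usage() {
    int sum = 0;
    sketch::for_each([](auto sink) { sink(1); sink(2); },
                     [&sum](int n) { sum += n; });
    std::vector<int> const vec{3, 4};
    sketch::for_each(vec, [&sum](int n) { sum += n; });
    assert(sum == 10);
    return sum;
}
```

The generator lambda needs no iterator type at all. Its loop state lives in ordinary local variables and the instruction pointer.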

75 / 90

slide-76
SLIDE 76

concat with Internal Iteration

template<typename... Rngs> struct concat { std::tuple<Rngs...> m_rng; template<typename Sink> void operator()(Sink sink) const { // tc::for_each also works on tuples tc::for_each(m_rng, [&](auto const& rng) { tc::for_each(rng, sink); }); } };

no overhead
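A self-contained C++17 approximation of the same idea uses `std::apply` in place of a tuple-aware `for_each`. All names in `sketch` are hypothetical, and components are stored by value for simplicity, whereas the talk stores lvalues by reference.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {
// Use the range's operator() if it accepts the sink, otherwise iterate.
template<typename Rng, typename Sink>
void for_each(Rng const& rng, Sink& sink) {
    if constexpr (std::is_invocable_v<Rng const&, Sink&>) {
        rng(sink);
    } else {
        for (auto const& element : rng) sink(element);
    }
}

template<typename... Rngs>
struct concat_t {
    std::tuple<Rngs...> m_rngs;  // stored by value in this sketch

    template<typename Sink>
    void operator()(Sink& sink) const {
        // Visit the components in order; no variant, no per-element branch.
        std::apply([&](auto const&... rng) { (sketch::for_each(rng, sink), ...); },
                   m_rngs);
    }
};

template<typename... Rngs>
concat_t<std::decay_t<Rngs>...> concat(Rngs&&... rngs) {
    return {std::tuple<std::decay_t<Rngs>...>(std::forward<Rngs>(rngs)...)};
}

// Drains any character range into a std::string.
template<typename Rng>
std::string to_string(Rng const& rng) {
    std::string result;
    auto sink = [&result](char c) { result += c; };
    sketch::for_each(rng, sink);
    return result;
}
}

inline void concat_usage() {
    auto dollars = sketch::concat(std::string("3.14"), std::string(" dollars"));
    auto text = sketch::concat(std::string("You won "), dollars, std::string("."));
    assert(sketch::to_string(text) == "You won 3.14 dollars.");
}
```

Nested `concat` objects compose because the inner one is itself callable with the sink, so the outer one takes the internal-iteration branch for it.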

76 / 90

slide-77
SLIDE 77

Appender Customization Point

introduce appender sink for explicit_cast and append to use

template<typename Cont, typename Rng>
void append(Cont& cont, Rng&& rng) {
    tc::for_each(std::forward<Rng>(rng), tc::appender(cont));
}

77 / 90

slide-78
SLIDE 78

Appender Customization Point

introduce appender sink for explicit_cast and append to use

template<typename Cont, typename Rng>
void append(Cont& cont, Rng&& rng) {
    tc::for_each(std::forward<Rng>(rng), tc::appender(cont));
}

appender customization point

returned by container::appender() member function default for std:: containers

template<typename Cont> struct appender { Cont& m_cont; template<typename T> void operator()(T&& t) { tc::cont_emplace_back(m_cont, std::forward<T>(t)); } };

78 / 90

slide-79
SLIDE 79

Chunk Customization Point

What about reserve ? Sink needs whole range to call std::size before iteration

79 / 90

slide-80
SLIDE 80

Chunk Customization Point

What about reserve ? Sink needs whole range to call std::size before iteration new Sink customization point chunk if available, tc::for_each calls it with whole range

template<typename Cont, enable_if<Cont has reserve()> > struct reserving_appender : appender<Cont> { template<typename Rng, enable_if<Rng has size()> > void chunk(Rng&& rng) const { tc::cont_reserve( m_cont, m_cont.size()+std::size(rng) ); tc::for_each( std::forward<Rng>(rng), static_cast<appender<Cont> const&>(*this) ); } };
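The pseudocode conditions ("Cont has reserve()", "Rng has size()") can be expressed with expression SFINAE. The sketch below is an assumption-laden approximation: `has_chunk`, the `sketch` namespace and this `append` are hypothetical, and the sink is not split into a base `appender` and a derived class as on the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <forward_list>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {
template<typename Cont>
struct reserving_appender {
    Cont& m_cont;

    template<typename T>
    void operator()(T&& t) const { m_cont.push_back(std::forward<T>(t)); }

    // Only participates in overload resolution if rng.size() is valid.
    template<typename Rng>
    auto chunk(Rng const& rng) const -> decltype(void(rng.size())) {
        using size_type = typename Cont::size_type;
        size_type const needed = m_cont.size() + static_cast<size_type>(rng.size());
        if (m_cont.capacity() < needed) {
            m_cont.reserve(std::max(needed, m_cont.capacity() * 8 / 5));
        }
        assert(needed <= m_cont.capacity());
        for (auto const& element : rng) (*this)(element);
    }
};

// Detects whether Sink offers chunk(rng) for this range type.
template<typename Sink, typename Rng, typename = void>
struct has_chunk : std::false_type {};
template<typename Sink, typename Rng>
struct has_chunk<Sink, Rng,
    std::void_t<decltype(std::declval<Sink&>().chunk(std::declval<Rng const&>()))>>
    : std::true_type {};

static_assert(has_chunk<reserving_appender<std::string>, std::string>::value,
              "sized range: chunk is offered");
static_assert(!has_chunk<reserving_appender<std::string>, std::forward_list<char>>::value,
              "no size(): chunk is not offered");

template<typename Cont, typename Rng>
void append(Cont& cont, Rng const& rng) {
    reserving_appender<Cont> sink{cont};
    if constexpr (has_chunk<reserving_appender<Cont>, Rng>::value) {
        sink.chunk(rng);  // the sink sees the whole range and reserves first
    } else {
        for (auto const& element : rng) sink(element);
    }
}
}
```

A range without `size()`, such as `std::forward_list<char>`, silently falls back to element-by-element appending.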

80 / 90

slide-81
SLIDE 81

Chunk Customization Point: other uses

file sink advertises interest in contiguous memory chunks

struct file_appender {
    std::FILE* m_file;
    void chunk(std::span<unsigned char const> rng) const {
        std::fwrite(rng.data(), 1, rng.size(), m_file);
    }
    void operator()(unsigned char ch) const {
        chunk(tc::single(ch));
    }
};

81 / 90

slide-82
SLIDE 82

Performance: Appender vs Hand-Written

How much loss compared to hand-written code? trivial formatting task 10x 'A' + 10x 'B' + 10x 'C' best to expose overhead

struct Buffer { char achBuffer[1024]; char* pchEnd=&achBuffer[0]; } buffer; void repeat_handwritten(char chA, int cchA, char chB, int cchB, char chC, int cchC ) { for (auto i = cchA; 0 < i; --i) { *buffer.pchEnd=chA; ++buffer.pchEnd; } ... cchB ... chB ... ... cchC ... chC ... }

82 / 90

slide-83
SLIDE 83

Performance: Appender vs Hand-Written

struct Buffer { ... auto appender() & { struct appender_t { Buffer* m_buffer; void operator()(char ch) noexcept { *m_buffer->pchEnd=ch; ++m_buffer->pchEnd; } }; return appender_t{this}; } } buffer; void repeat_with_ranges(char chA, int cchA, char chB, int cchB, char chC, int cchC ) { tc::append(buffer, tc::repeat_n(chA,cchA), tc::repeat_n(chB,cchB), tc::repeat_n(chC,cchC)); }

83 / 90

slide-84
SLIDE 84

Performance: Appender vs Hand-Written

repeat_n iterator-based

~50% more time than hand-written (Visual C++ 15.8)

repeat_n supports internal iteration

~15% more time than hand-written (Visual C++ 15.8) Test is worst case: actual work is trivial smaller difference for, e.g., converting numbers to strings

84 / 90

slide-85
SLIDE 85

Performance: Custom vs Standard Appender

toy basic_string implementation

only heap: pointers begin , end , end_of_memory

Again trivial formatting task: 10x 'A' + 10x 'B' + 10x 'C'

void repeat_with_ranges( char chA, int cchA, char chB, int cchB, char chC, int cchC ) { tc::append(mystring, tc::repeat_n(chA,cchA), tc::repeat_n(chB,cchB), tc::repeat_n(chC,cchC)); }

85 / 90

slide-86
SLIDE 86

Performance: Custom vs Standard Appender

Standard Appender

template<typename Cont> struct appender { Cont& m_cont; template<typename T> void operator()(T&& t) { m_cont.emplace_back(std::forward<T>(t)); } }; template<typename Cont, enable_if<Cont has reserve()> > struct reserving_appender : appender<Cont> { template<typename Rng, enable_if<Rng has size()> > void chunk(Rng&& rng) const { tc::cont_reserve( m_cont, m_cont.size()+std::size(rng) ); tc::for_each( std::forward<Rng>(rng), static_cast<appender<Cont> const&>(*this) ); } };

86 / 90

slide-87
SLIDE 87

Performance: Custom vs Standard Appender

Custom Appender

template<typename Cont> struct mystring_appender : appender<Cont> { Cont& m_cont; template<typename T> void operator()(T&& t) { m_cont.emplace_back(std::forward<T>(t)); } template<typename Rng, enable_if<Rng has size()> > void chunk(Rng&& rng) const { tc::cont_reserve( m_cont, m_cont.size()+std::size(rng) ); tc::for_each( std::forward<Rng>(rng), [&](auto&& t) { *m_cont.m_ptEnd=std::forward<decltype(t)>(t); ++m_cont.m_ptEnd; } ); } };

87 / 90

slide-88
SLIDE 88

Performance: Custom vs. Standard Appender

String was only 30 characters Heap allocation Custom Appender ~20% less time (Visual C++ 15.8) Requires own basic_string implementation uninitialized buffer not exposed by std::basic_string / std::vector

88 / 90

slide-89
SLIDE 89

Performance: Future Work

if not all snippets implement size() : new customization point min_size() ?

concat::min_size() is sum of min_size() of components min_size() never wrong to return 0

custom file appender that fills fixed I/O buffer replace std::FILE buffer with own buffer

offer unchecked write as long as snippet size() still fits

new customization point max_size ?

89 / 90

slide-90
SLIDE 90

Conclusion

Use Range syntax and algorithms for text formatting
For performance, need new customization points: Range::operator() , appender , chunk
Then performance competitive with hand-written code
think-cell library is at https://github.com/think-cell/range [NEWS: now under Boost license]
Or if you want to help: www.think-cell.com/developers

90 / 90