SLIDE 1

Generating Handwritten Character Clones from an Incomplete Seed Character Set using Collaborative Filtering

Kazuaki Nakamura, Eiji Miyazaki, Naoko Nitta, and Noboru Babaguchi Osaka University, Japan

  • On 6th Aug. 2018, at ICFHR 2018
SLIDE 2

Research Background

  • Handwriting generation
  • The task of generating synthetic (clone) images of handwritten characters resembling a target user’s actual handwriting.

[Figure: a training dataset collected from the target user, consisting of handwritten character images or pen strokes (sequences of 2D pen-tip locations), is fed into a generator (e.g. auto-encoders, GANs), which outputs handwritten character clones (HCCs).]

  • Automatically generated HCCs
  • are applicable to communication tools (especially for hand-impaired people).
  • can provide a large-scale dataset for handwriting recognition, forged-signature detection, etc.
SLIDE 3

Requirements in Practice

  • Incomplete seed character set
  • Seed characters: characters whose image(s) are in the training dataset.
  • Seed characters are usually limited, because it is difficult to collect many images from the target user.
  • It is not rare that at most one image (or none) is available per character, especially in the case of Asian languages.
  • Within-person variety
  • Images of a person’s actual handwriting differ from each other even when the same writer writes the same character: they all share similar characteristics but are slightly different from one another.

[Figure: training dataset “this is a pen. i am japanese. i like baseball.”; seed characters: a, b, e, h, i, j, k, l, m, n, p, s, t; non-seed characters: c, d, f, g, o, q, r, u, v, w, x, y, z.]
SLIDE 4

Goal

  • Goal: To propose an HCC generation method
  • that achieves within-person variety
  • based on an incomplete seed character set.
  • Novelty:
  • How to estimate the HCC distribution of each character from at most one instance (possibly zero).

Not a single HCC but its distribution should be created, since at most one image (or none) is available per character as training data.
SLIDE 5
Related Work

  • (Conventional) HCC generation [1, 2]
  • Input: many images of each character (a, b, …, z). Output: an HCC distribution per character, q_HCC("a"), q_HCC("b"), …, q_HCC("z").
  • ☑ within-person variety, ☒ incomplete seed character set
  • Font generation [3, 4, 5]
  • Input: a few images of several seed characters (e.g. a, c, n, y) written in a certain style. Output: images of the other characters (b, d, e, …, x, z) that seem to be written in the same style.
  • ☑ incomplete seed character set, ☒ within-person variety

[1] T. S. Haines et al., “My Text in Your Handwriting,” ACM Trans. on Graphics, Vol. 35, No. 3, 2016.
[2] A. Graves, “Generating Sequences with Recurrent Neural Networks,” arXiv preprint arXiv:1308.0850, 2013.
[3] D. G. Balreira et al., “Handwriting Synthesis from Public Fonts,” in Proc. of 30th SIBGRAPI Conf. on Graphics, Patterns and Images (SIBGRAPI), pp. 246-253, 2017.
[4] J. W. Lin et al., “Complete Font Generation of Chinese Characters in Personal Handwriting Style,” in Proc. of 34th IEEE Int'l Performance Computing and Communications Conf. (IPCCC), pp. 1-5, 2015.
[5] Z. Lian et al., “Automatic Generation of Large-Scale Handwriting Fonts via Style Learning,” in Proc. of SIGGRAPH ASIA 2016 Technical Briefs, 2016.
SLIDE 6

Overview

  • Side dataset: collected from many other writers x_1, …, x_K. Each writer offers only a few images (e.g. one image per character).
  • Training dataset v: offered by the target user v; an incomplete seed character set.
  • Encoder (feature extractor): extracts a feature set g_x from the side dataset and a feature set g_v from the training dataset.
  • Parameter pool {ι_1, ι_2, …, ι_L}: candidate parameters of per-character feature distributions, built from the side dataset.
  • Parameter selection: for each character, select the parameter ι̂ that best fits g_v, yielding a feature distribution q(g; ι̂).
  • Sample a new feature g ~ q(g; ι̂) and feed it into the Decoder, which outputs an HCC.
  • HCC generation is performed character-wise and offline (using only images).
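The final sampling-and-decoding step of the overview can be sketched as follows. Here `decode` is a hypothetical stand-in for the trained decoder network, and treating q(g; ι̂) as a Gaussian with the selected mean and covariance is an assumption suggested by the (mean, covariance) pool parameters; sampling a fresh feature each time, rather than reusing one feature, is what produces within-person variety.

```python
import numpy as np

def generate_hccs(best_param, decode, n_samples=5, seed=0):
    """Sample new features g ~ q(g; iota_hat) and decode each into an HCC.

    best_param: the selected pool parameter (u, Sigma) for one character.
    decode:     stand-in for the trained decoder network (hypothetical).
    """
    u, sigma = best_param
    rng = np.random.default_rng(seed)
    # each sampled feature is a slightly different version of the
    # same character -> within-person variety among the clones
    feats = rng.multivariate_normal(u, sigma, size=n_samples)
    return [decode(g) for g in feats]
```

With a real decoder each call would return an image; the toy `decode` below just summarises the feature.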

SLIDE 7

Parameter Pool Construction

Hypothesis: It is not rare that the shapes of two writers’ handwriting are very similar for some characters. In other words, there are many writer pairs whose handwriting shapes are similar for some characters; for such pairs, not only the average shape but also the shape distribution of their handwriting would be similar.

Separately perform the following procedure for each character d:
  • 1. Extract a feature for each handwriting image of character d in the side dataset, using the Encoder (feature extractor).
  • 2. Cluster the set of extracted features.
  • 3. Compute the mean u_l^d and the covariance Σ_l^d of each cluster l, and store ι_l = (u_l^d, Σ_l^d) in the parameter pool.
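Steps 2-3 for one character can be sketched as below. The slides only say "cluster" without naming an algorithm, so the minimal k-means here is purely illustrative; the small ridge added to each covariance is likewise an implementation detail, not from the slides.

```python
import numpy as np

def build_parameter_pool(features, n_clusters=3, n_iters=20):
    """Cluster the side-dataset features of one character d and return
    one (mean u_l, covariance Sigma_l) pair per cluster l."""
    # crude farthest-point initialisation keeps the sketch deterministic
    centers = [features[0]]
    for _ in range(n_clusters - 1):
        d = np.min([np.linalg.norm(features - c, axis=1) for c in centers], axis=0)
        centers.append(features[d.argmax()])
    centers = np.array(centers, dtype=float)

    # step 2: minimal k-means over the feature set
    for _ in range(n_iters):
        dists = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for l in range(n_clusters):
            if np.any(labels == l):
                centers[l] = features[labels == l].mean(axis=0)

    # step 3: mean and covariance per cluster -> pool entries iota_l
    pool = []
    for l in range(n_clusters):
        members = features[labels == l]
        u = members.mean(axis=0)
        # small ridge keeps the covariance invertible for tiny clusters
        sigma = np.cov(members, rowvar=False) + 1e-6 * np.eye(features.shape[1])
        pool.append((u, sigma))
    return pool
```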

SLIDE 8

Parameter Selection for Seeds

For a seed character d, use the BestFit strategy:
  • Use the target user’s actual handwriting image J_v^d from the training dataset.
  • Encode J_v^d into a feature g_v^d and select the pool parameter (u_l^d, Σ_l^d) that best fits it, i.e. ι̂^d (e.g. ι̂^d = ι_4^d in the figure).
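The BestFit selection can be sketched as follows, under the assumption (implied by the mean/covariance pool parameters) that each q(g; ι_l) is a Gaussian: pick the pool entry under which the seed feature has the highest log-likelihood.

```python
import numpy as np

def best_fit_parameter(g, pool):
    """Return the index of the pool parameter (u_l, Sigma_l) that best
    fits feature g, i.e. the Gaussian with the highest log-likelihood.

    g:    encoder feature of the target user's seed image J_v^d.
    pool: list of (mean, covariance) pairs for character d.
    """
    def log_pdf(x, u, sigma):
        d = x - u
        _, logdet = np.linalg.slogdet(sigma)
        return -0.5 * (d @ np.linalg.solve(sigma, d) + logdet
                       + len(x) * np.log(2 * np.pi))

    scores = [log_pdf(g, u, sigma) for u, sigma in pool]
    return int(np.argmax(scores))
```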

SLIDE 9

Parameter Selection for Non-seeds

For a non-seed character d′:
  • there are no images of the target user’s actual handwriting, so
  • the BestFit strategy cannot be used.

SLIDE 10
Parameter Selection for Non-seeds (cont.)

  • For a non-seed character d′, employ collaborative filtering (CF).
  • To perform CF, first construct a writer-character matrix Φ.
  • Estimate the best-fit parameters not only for the target user but also for the other writers.
  • ϱ_kn ∈ {1, 2, ⋯, L}: ID of the best-fit parameter of the k-th writer’s feature distribution for the n-th character.

Writer-character matrix (rows: other writers x_1, …, x_K and target user v; columns: characters d_1, …, d_N, of which d_n is non-seed):

        d_1     d_2     ⋯   d_n     ⋯   d_N
  x_1   ϱ_11    ϱ_12    ⋯   ϱ_1n    ⋯   ϱ_1N
  x_2   ϱ_21    ϱ_22    ⋯   ϱ_2n    ⋯   ϱ_2N
  ⋮      ⋮       ⋮      ⋱    ⋮      ⋱    ⋮
  x_K   ϱ_K1    ϱ_K2    ⋯   ϱ_Kn    ⋯   ϱ_KN
  v     ϱ_v,1   ϱ_v,2   ⋯    ?      ⋯   ϱ_v,N

  • ϱ_v,1, ϱ_v,2, …, ϱ_v,N for seed characters: known (estimated by the BestFit strategy).
  • ϱ_v,n for the non-seed character: unknown; estimate it by collaborative filtering.
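Assembling Φ can be sketched as below. Marking the target user's unknown non-seed entries with -1 is an implementation choice for the sketch, not something the slides specify.

```python
import numpy as np

def build_writer_character_matrix(side_params, target_params, non_seed):
    """Assemble the writer-character matrix Phi.

    side_params:   list of K rows; row k holds the best-fit parameter
                   IDs rho_{kn} of writer x_k for all N characters.
    target_params: the target user's row (seed entries from BestFit;
                   non-seed entries are placeholders).
    non_seed:      column indices of the non-seed characters.
    """
    phi = np.array(side_params + [target_params], dtype=int)
    phi[-1, non_seed] = -1   # rho_{v,n} for non-seed characters: unknown
    return phi
```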

SLIDE 11

Collaborative Filtering

  • User-based collaborative filtering (UserCF)

Hypothesis: If the feature distributions of two writers are similar to each other for some characters, their distributions for another character also tend to be similar.

[Matrix example: rows x_1, x_2, x_3 (other writers) and v (target user) over characters d_1-d_4; ϱ_v,3 is unknown.]

Procedure:
  • Compute the similarity sim(v, x_k) between the target user v and each other writer x_k, based on the feature vectors of all the seed characters.
  • Choose the top-O most similar writers.
  • Majority voting: each similar writer x_k votes its similarity score sim(v, x_k) for the ϱ_k3-th parameter (the parameter it uses for the missing character); the parameter with the largest total vote is selected.
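The UserCF procedure can be sketched as below. Note one simplification: the slides compute writer similarity from the seed characters' feature vectors, whereas this sketch approximates it by the rate at which the writers' best-fit parameter IDs agree on the seed columns; the top-O selection and similarity-weighted majority vote follow the slides.

```python
import numpy as np

def usercf_select(phi, non_seed_col, seed_cols, n_similar=3):
    """User-based CF over the writer-character matrix phi
    (last row = target user v).  Returns the estimated parameter ID
    rho_{v,n} for the non-seed character column.
    """
    # similarity: fraction of seed characters on which writer x_k and v
    # chose the same best-fit parameter (assumed proxy measure)
    target = phi[-1, seed_cols]
    sims = (phi[:-1, seed_cols] == target).mean(axis=1)

    # choose the top-O most similar writers
    top = np.argsort(sims)[::-1][:n_similar]

    # majority voting: each similar writer votes sim(v, x_k) for the
    # parameter it uses on the non-seed character
    votes = {}
    for k in top:
        param_id = int(phi[k, non_seed_col])
        votes[param_id] = votes.get(param_id, 0.0) + sims[k]
    return max(votes, key=votes.get)
```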

SLIDE 12

Experiment

  • Dataset
  • ETL4: a set of Japanese Hiragana characters
  • 48 characters, 120 writers, 48×120 = 5760 images
  • ETL5: a set of Japanese Katakana characters
  • 48 characters, 208 writers, 48×208 = 9984 images
  • Setting
  • Randomly select 3 writers as the “target user” v, and regard the remaining writers as the “other writers” x_k.
  • Generate the following 5 characters, regarding the other 43 characters as seeds:
  • Hiragana: あ (a), し (shi), た (ta), は (ha), れ (re)
  • Katakana: ア (a), シ (shi), タ (ta), ハ (ha), レ (re)
  • Encoder & Decoder: Variational Autoencoder
  • Compared methods
  • BestFit: using all 48 characters as seeds (complete seed character set)
  • UserCF
  • ItemCF: item-based collaborative filtering
  • HybrCF: a method combining UserCF and ItemCF
  • Random: randomly selecting a parameter from the pool

SLIDE 13

Result (Hiragana in ETL4, K = 40)

[Plot: average distance between the feature of the original image and that of the generated HCCs, versus K, the number of clusters (size of the parameter pool).]

  • BestFit can generate HCCs quite similar to the originals.
  • UserCF and HybrCF can also generate good HCCs.
  • The performance of ItemCF is almost the same as that of Random: the co-occurrence probability M becomes statistically unreliable for large K.

SLIDE 14

Result (Katakana in ETL5, K = 40)

[Plot: average distance between the feature of the original image and that of the generated HCCs, versus K, the number of clusters (size of the parameter pool).]

A similar result was obtained:
  • HCCs generated by BestFit are quite similar to the originals.
  • UserCF and HybrCF also generate good HCCs.
  • ItemCF did not work well.

UserCF is more suitable for the HCC generation task.

SLIDE 15

Several Examples of HCC (ETL4)

[Figure: generated samples, columns Original / BestFit / UserCF.]

  • HCCs generated by BestFit slightly differ from each other while keeping a shape similar to the original: within-person variety.
  • This is also the case with UserCF.

SLIDE 16

Several Examples of HCC (ETL5)

[Figure: generated samples, columns Original / BestFit / UserCF.]

  • HCCs generated by BestFit slightly differ from each other while keeping a shape similar to the original: within-person variety.
  • This is also the case with UserCF.

SLIDE 17

Conclusion

  • Proposal: A method for generating HCCs from a limited amount of training data
  • The target writer offers at most one image per character (possibly zero), i.e., an incomplete seed character set.
  • To achieve within-person variety, the feature distribution of the target user’s handwriting is estimated for each character.
  • Idea:
  • For seed characters: BestFit strategy
  • For non-seed characters: collaborative filtering (UserCF)
  • Result:
  • Examined the proposed method on datasets of Japanese characters.
  • UserCF can generate good HCCs with a certain level of within-person variety.