
Finding Camouflaged Needle in a Haystack? Pornographic Products Detection via Berrypicking Tree Model



  1. Finding Camouflaged Needle in a Haystack? Pornographic Products Detection via Berrypicking Tree Model. Guoxiu He 1,2, Yangyang Kang 2, Zhe Gao 2, Zhuoren Jiang 3, Changlong Sun 2, Xiaozhong Liu *4,2, Wei Lu *1, Qiong Zhang 2, Luo Si 2. 1 Wuhan University, {guoxiu.he, weilu}@whu.edu.cn; 2 Alibaba Group, {yangyang.kangyy, gaozhe.gz, qz.zhang, luo.si}@alibaba-inc.com, changlong.scl@taobao.com; 3 Sun Yat-sen University, jiangzhr3@mail.sysu.edu.cn; 4 Indiana University Bloomington, liu237@indiana.edu. SIGIR 2019.

  2. Introduction. Background: In the past decade, decentralized eCommerce services, e.g., eBay, eBid, and Taobao, have been challenging traditional monopolistic intermediaries. Through these eCommerce ecosystems, anyone can easily become an e-merchant, and eCommerce platforms give sellers extra incentives through convenient marketing and buyer-access channels and resources. Problem: Because most decentralized eCommerce platforms don't have their own inventory, illegal products uploaded by problematic sellers can spread more easily than ever before. Such risk can be quite harmful to both buyers and cybermarkets.

  3. Detection System in an eCommerce Service. Challenges: More specifically, though almost every eCommerce service has its own detection system, this strategy doesn't work well online because sellers can easily hack the detection system. [Figure: detection workflow in an eCommerce service. Labels: seller, product, upload, submit, algorithm, audit, pass, reject, report, buyer, query, result list, return, purchase.]

  4. Pornographic Products Detection Dataset (PPDD). Strategy: locate target (purchased) products -> extract 2-hour buyers' seeking behavior logs before purchasing. Dataset labels shown: Accumulation in History; Online; Recalled Pool. Date ranges shown: from Aug. 1, 2016 to Sep. 1, 2018; from Nov. 3, 2018 to Nov. 16, 2018; from Dec. 3, 2018 to Dec. 16, 2018. Simple Idea: With the local training dataset, pornographic product detection can be a straightforward binary classification problem. Brutal Reality: When the current learning algorithm finds that a seller is listing a pornographic product, the seller can easily change the product title or description and release it again with a new seller/product ID. Pornographic products and their sellers thus hide like chameleons in the eCommerce ecosystem, and traditional learning algorithms can hardly detect them effectively. Code: https://github.com/GuoxiuHe/BIRD
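The extraction strategy on this slide (locate a purchase, then keep the buyer's behavior from the preceding 2 hours) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `seeking_window`, the event fields (`time`, `type`, `text`), and the sample log are all hypothetical.

```python
from datetime import datetime, timedelta

def seeking_window(events, purchase_time, hours=2):
    """Return the buyer's events in the `hours` before `purchase_time`."""
    start = purchase_time - timedelta(hours=hours)
    # Keep events inside [start, purchase_time); later events are ignored.
    return [e for e in events if start <= e["time"] < purchase_time]

# Hypothetical log for one buyer; field names are illustrative only.
purchase_time = datetime(2018, 11, 3, 12, 0)
events = [
    {"time": datetime(2018, 11, 3, 8, 30), "type": "query", "text": "phone case"},
    {"time": datetime(2018, 11, 3, 10, 45), "type": "query", "text": "usb drive"},
    {"time": datetime(2018, 11, 3, 11, 50), "type": "click", "text": "product-123"},
    {"time": datetime(2018, 11, 3, 12, 5), "type": "query", "text": "usb cable"},
]

window = seeking_window(events, purchase_time)
print([e["text"] for e in window])
```

Only the 10:45 query and the 11:50 click fall inside the window; the 8:30 query is too old and the 12:05 query happens after the purchase.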

  5. Product Content Statistics in PPDD. Question: Is it true that sellers often change the product content? Product content comparison (LN = local normal, LP = local pornography, ON = online normal, OP = online pornography): LN-LP 2.4515; ON-OP 0.3317; LN-ON 0.0802; LP-OP 2.6292. Annotations: Significant difference; Hard to distinguish. [Figure: proportion (y-axis) against product content word (x-axis) for local pornography, local normal, online pornography, and online normal.]
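The slide compares product content between dataset pairs but does not name the statistic it reports. As one possible way to quantify such a difference, the sketch below assumes KL divergence over smoothed word distributions; the metric choice, the add-one smoothing, the function names, and the toy titles are all assumptions made for illustration.

```python
import math
from collections import Counter

def word_distribution(titles, vocab):
    """Word distribution over `vocab` with add-one smoothing."""
    counts = Counter(w for t in titles for w in t.lower().split())
    total = sum(counts[w] for w in vocab) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

# Toy product titles standing in for two dataset splits (hypothetical).
local = ["usb drive", "usb cable"]
online = ["phone case", "phone cover"]
vocab = sorted({w for t in local + online for w in t.split()})

p_local = word_distribution(local, vocab)
p_online = word_distribution(online, vocab)
print(kl_divergence(p_local, p_online))
```

A value near zero would mean the two splits use similar wording, while a large value would indicate that the content has shifted.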

  6. Performance Comparison With Text Classification Baselines. Question: Do text classifiers work online?

  7. Interaction Statistics between Buyers and Products. Opportunity: Employing camouflaged content can be a double-edged sword. Example: When buyers search for 'porn video USB', which is an illegal query, via Taobao, they won't get any results. Berrypicking (Marcia Bates, 1989): To locate what they are looking for, buyers have to update the query content a few times and also check/consume the retrieved products carefully. [Figure: (a) numbers of queries in each session, (b) numbers of records in each session, (c) number of times buyers purchase products; series: local pornography, local normal.]

  8. Performance Comparison with Different Feature Combinations. Question: Is simple behavioral information sufficient to distinguish the pornographic products?

  9. Berrypicking Tree. [Figure: a berrypicking tree. Labels: root, buyer, branch1 to branch4, query, product, target.]
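The tree on this slide (buyer at the root, one branch per query with the products seen under it, and a purchased target product) can be sketched as a small data structure. The class names, field names, and the `(kind, text)` event format are hypothetical; the rule that each new query opens a new branch is an assumption drawn from the figure labels.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Branch:
    query: str
    products: List[str] = field(default_factory=list)

@dataclass
class BerrypickingTree:
    buyer: str                      # root of the tree
    branches: List[Branch] = field(default_factory=list)
    target: str = ""                # purchased product

def build_tree(buyer, events, target):
    """Group a buyer's ordered events into query-rooted branches."""
    tree = BerrypickingTree(buyer=buyer, target=target)
    for kind, text in events:
        if kind == "query":
            tree.branches.append(Branch(query=text))   # new branch
        elif kind == "product" and tree.branches:
            tree.branches[-1].products.append(text)    # attach to current query
    return tree

# Hypothetical session: the third query returns nothing the buyer opens.
events = [
    ("query", "q1"), ("product", "p11"), ("product", "p12"),
    ("query", "q2"), ("product", "p21"),
    ("query", "q3"),
    ("query", "q4"), ("product", "p41"),
]
tree = build_tree("buyer-1", events, target="p41")
print([(b.query, b.products) for b in tree.branches])
```

A branch may hold a query with no products, which models a reformulated query whose results the buyer did not open.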

  10. BerryPIcking TRee MoDel (BIRD): Branch Representation. Highlighted components: Attention, Gate. [Figure: BIRD architecture. Labels: root; branch1 to branch4; q1 to q4; p11 to p43; p̂; a11 to a43; x1 to x4 (vector); attention; gate; combine; bi-BPTRU; forward latent intent; backward latent intent; hidden semantics; prune mechanism (pm); average pooling; mlp; label y′.]

  11. BerryPIcking TRee MoDel (BIRD): Tree Representation. Highlighted component: Recurrent Neural Network. [Figure: BIRD architecture diagram with branch1 to branch4 encoded by bi-BPTRU into forward and backward latent intent and hidden semantics.]

  12. Berrypicking Tree Recurrent Unit (BPTRU). Motivation: As the buyer is the root of the berrypicking tree, besides the semantics hidden in the sequence of branches, we also explore the buyer's latent purchase intent across all the information-seeking efforts in the tree. BPTRU uses two hidden gates to determine the combination of the previous hidden state (latent intent) and the current branch, and an interact gate to supplement the joint information.
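The slide describes BPTRU only in words: two gates that weigh the previous hidden state against the current branch, plus an interact gate for joint information. The sketch below is one hypothetical reading of that description, not the published equations. The update rule, the element-wise product used as the joint term, and all parameter names (`W_h`, `W_x`, `W_i`, `W_j`) are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gated_step(h_prev, x, W_h, W_x, W_i, W_j):
    """One recurrent step over a branch vector x (assumed form)."""
    hx = h_prev + x  # concatenate previous state and current branch
    gate_h = [sigmoid(z) for z in matvec(W_h, hx)]   # weight on previous state
    gate_x = [sigmoid(z) for z in matvec(W_x, hx)]   # weight on current branch
    gate_i = [sigmoid(z) for z in matvec(W_i, hx)]   # interact gate
    # Joint term built from the element-wise product of state and branch.
    joint = [math.tanh(z)
             for z in matvec(W_j, [a * b for a, b in zip(h_prev, x)])]
    return [gh * h + gx * xi + gi * j
            for gh, h, gx, xi, gi, j
            in zip(gate_h, h_prev, gate_x, x, gate_i, joint)]

def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

# With all-zero weights every gate is 0.5 and the joint term is 0,
# so the new state is the average of the previous state and the branch.
d = 2
h = gated_step([1.0, -1.0], [0.0, 2.0],
               zeros(d, 2 * d), zeros(d, 2 * d), zeros(d, 2 * d), zeros(d, d))
print(h)
```

Running the same step over the branches in reverse order and pairing the two passes would give a bidirectional variant in the spirit of the bi-BPTRU label in the figures.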

  13. BerryPIcking TRee MoDel (BIRD): Pruning Mechanism. [Figure: BIRD architecture diagram showing the prune mechanism (pm) applied to branch1 to branch4 alongside bi-BPTRU, average pooling, attention, and mlp.]

  14. Pruning Mechanism. Motivation: User behavior in the eCommerce environment can be noisy. Example: In a 2-hour window, a buyer's search and browsing behavior may cover multiple information needs, e.g., looking for normal products and also a pornographic product, which might pollute the target berrypicking tree. The last branch contains the target (purchased) product. Components shown: cosine similarity, sigmoid.
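The slide names cosine similarity and a sigmoid, and notes that the last branch contains the target product. One way these pieces could fit together is to score every branch against the last branch and squash the score into a soft keep-weight. That combination, the `scale` parameter, and the function names are assumptions for illustration, not the authors' formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def prune_weights(branches, scale=5.0):
    """Soft keep-weight per branch, relative to the last (target) branch."""
    target = branches[-1]
    return [1.0 / (1.0 + math.exp(-scale * cosine(b, target)))
            for b in branches]

# Toy branch vectors: aligned, unrelated, opposite, and the target itself.
branches = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]
weights = prune_weights(branches)
print([round(w, 3) for w in weights])
```

Branches pointing in the same direction as the target keep a weight near 1, unrelated ones sit at 0.5, and opposing ones are pushed toward 0, which is the behavior wanted for filtering out searches unrelated to the purchase.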

  15. BerryPIcking TRee MoDel (BIRD): Overview. [Figure: a berrypicking tree (root: buyer; branch1 to branch4 with queries and products; target product) shown alongside the BIRD architecture (attention, gate, combine, bi-BPTRU, prune mechanism, average pooling, mlp, label y′).]

  16. Performance Comparison with Base Models Built on Berrypicking Tree. Question: Can BIRD, including BPTRU and PM, outperform other models built on the Berrypicking Tree?
