Email Thread Reassembly Using Similarity Matching
Jen-Yuan Yeh
- Dept. of Computer Science
National Chiao Tung University Hsinchu 30010, TAIWAN jyyeh@cis.nctu.edu.tw
Aaron Harnly
- Dept. of Computer Science
Columbia University New York 10027, USA
…
content-class: urn:content-classes:message
Subject: Message from Pug Winokur
Date: Tue, 27 Mar 2001 09:20:07 -0600
MIME-Version: 1.0
Content-Type: application/ms-tnef; name="winmail.dat"
X-MS-Has-Attach:
Content-Transfer-Encoding: binary
Thread-Topic: Message from Pug Winokur
Thread-Index: AcC20LeUM9ZkNCLDEdWw9ABQi+MJ2Q==
From: "\"Beth Grizzle\" <bgrizzle@capricornholdings.com>@ENRON"
To: "Fastow, Andrew S." <Andrew.S.Fastow@ENRON.com>, "Buy, Rick" <Rick.Buy@ENRON.com>, <rcausey@enron.com>
…
Email   Depth   Index Length
E1      0       L1=32
E2      1       L2=L1+4
E3      2       L3=L2+8
E4      3       L4=L3+8
… the 4-8-8 pattern repeats …

E1: AcGPKD4/2h3YBL/6R9Cpa1YkzGzkaQ==
E2: AcGPKD4/2h3YBL/6R9Cpa1YkzGzkaQAkldVU
E3: AcGPKD4/2h3YBL/6R9Cpa1YkzGzkaQAkldVUAAGA/ME=
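The length pattern in the Thread-Index example above (32 characters for E1, then +4, +8, +8) can be checked with a short script. This is a minimal sketch, not the authors' implementation. It assumes the layout Microsoft documents for the conversation index: a 22-byte root block, Base64-encoded to 32 characters, plus one 5-byte block per reply. The function names are hypothetical.

```python
import base64

HEADER_BYTES = 22  # assumed size of the root block (32 Base64 characters)
CHILD_BYTES = 5    # assumed size of the block appended by each reply


def decode_thread_index(index: str) -> bytes:
    """Decode a Base64 Thread-Index value into raw bytes."""
    return base64.b64decode(index)


def thread_depth(index: str) -> int:
    """Return the reply depth implied by the decoded length (root = 0)."""
    raw = decode_thread_index(index)
    extra = len(raw) - HEADER_BYTES
    if extra < 0 or extra % CHILD_BYTES != 0:
        raise ValueError(f"unexpected Thread-Index length: {len(raw)} bytes")
    return extra // CHILD_BYTES


def is_ancestor(candidate: str, message: str) -> bool:
    """True if candidate's decoded index is a proper prefix of message's."""
    a = decode_thread_index(candidate)
    b = decode_thread_index(message)
    return len(a) < len(b) and b.startswith(a)


# Thread-Index values taken from the slide.
E1 = "AcGPKD4/2h3YBL/6R9Cpa1YkzGzkaQ=="
E2 = "AcGPKD4/2h3YBL/6R9Cpa1YkzGzkaQAkldVU"
E3 = "AcGPKD4/2h3YBL/6R9Cpa1YkzGzkaQAkldVUAAGA/ME="

for name, value in (("E1", E1), ("E2", E2), ("E3", E3)):
    # Prints the name, the Base64 length, and the derived depth.
    print(name, len(value), thread_depth(value))
```

The prefix test is done on the decoded bytes rather than on the Base64 strings, because the trailing "==" padding of E1 does not appear in E2.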
Reply Part
From: James Wills jwills3@swbell.net@Enron
Sent: Wednesday, November 14, 2001 1:38 PM
To: pallen70@hotmail.com; pallen@enron.com
Subject: Re: new PO available
Quotation Part
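One way to separate the reply part from the quotation part is to look for the quoted header block shown above. The following is a rough sketch under a strong assumption: the quoted header appears as four consecutive lines starting with From:, Sent:, To:, and Subject:, in that order. The function name and the placeholder reply and quotation text are hypothetical; real messages use many other quoting styles that this does not handle.

```python
HEADER_FIELDS = ("From:", "Sent:", "To:", "Subject:")


def split_reply_and_quotation(body: str):
    """Split a body into (reply, quoted_header, quotation).

    Assumes the quoted header is four consecutive lines in the order
    given by HEADER_FIELDS. If no such block is found, the whole body
    is treated as the reply part.
    """
    lines = body.splitlines()
    n = len(HEADER_FIELDS)
    for start in range(len(lines) - n + 1):
        window = lines[start:start + n]
        if all(line.startswith(field) for line, field in zip(window, HEADER_FIELDS)):
            reply = "\n".join(lines[:start]).strip()
            header = {
                field.rstrip(":"): line[len(field):].strip()
                for line, field in zip(window, HEADER_FIELDS)
            }
            quotation = "\n".join(lines[start + n:]).strip()
            return reply, header, quotation
    return body.strip(), {}, ""


# The header lines come from the slide; the surrounding text is a placeholder.
body = """(reply text)

From: James Wills jwills3@swbell.net@Enron
Sent: Wednesday, November 14, 2001 1:38 PM
To: pallen70@hotmail.com; pallen@enron.com
Subject: Re: new PO available

(quoted text)"""

reply, header, quotation = split_reply_and_quotation(body)
print(reply)
print(header["Subject"])
print(quotation)
```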
[Figure: messages mi and mj with two missing nodes between them, mi+1 and mi+2 (n=2); labels Ri, q1, q2, q3]
Will you be at the meeting?
Too bad. See you there.
Will you be at the meeting? Yes. Too bad. See you there.
No.
Gold standard: (A, C), (A, G), (B, C), (B, G), (A, D), (A, E), (B, D), (B, E)
Similarity Matching: (A, C), (B, C), (A, D), (A, E), (B, D), (B, E)
R = 6/8 = 0.75
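The recall figure above can be reproduced by treating each result as a set of message pairs and dividing the number of gold-standard pairs that were recovered by the total number of gold-standard pairs. A small sketch follows; the function name is hypothetical and the pairs are copied from the slide.

```python
def recall(gold, predicted):
    """Fraction of gold-standard pairs that also appear in the prediction."""
    gold_set, predicted_set = set(gold), set(predicted)
    if not gold_set:
        raise ValueError("gold standard must not be empty")
    return len(gold_set & predicted_set) / len(gold_set)


gold = [("A", "C"), ("A", "G"), ("B", "C"), ("B", "G"),
        ("A", "D"), ("A", "E"), ("B", "D"), ("B", "E")]
predicted = [("A", "C"), ("B", "C"), ("A", "D"),
             ("A", "E"), ("B", "D"), ("B", "E")]

print(recall(gold, predicted))  # 0.75
```

The two missed pairs, (A, G) and (B, G), are the ones that lower recall from 1.0 to 0.75.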