
Progressive Interaction for Autonomous Entity Matching



  1. Progressive Interaction for Autonomous Entity Matching
     Ben McCamish, Arash Termehchy
     Oregon State University
     Information & Data Management and Analytics Laboratory (IDEA)

  2. User interacts with local data source
     [Figure: the user sends queries to DBMS A and receives results; DBMS B is a separate data source.]
     DBMS A, Products:  ID | Name
                        1  | Soda
                        2  | Beef
                        …  | …
     DBMS B, Sellers:   ID | Name      | Store
                        3  | Hamburger | 7/11
                        4  | Pop       | Kroger
                        …  | …         | …
     • User interacts with DBMS A through a query interface
       ‣ They express their intents, i.e., what they are looking for
     • The results are then presented to the user

  3. DBMS A not able to satisfy query
     [Figure: Products (DBMS A) and Sellers (DBMS B) tables as on slide 2; user query: "Store selling Soda"]
     • User queries its local data source, DBMS A
     • DBMS A does not have the desired information
     • Must find the desired information in an external data source, DBMS B

  4. DBMS A cannot query
     [Figure: Products (DBMS A) and Sellers (DBMS B) tables as on slide 2; user query: "Store selling Soda"; a "?" marks the link from DBMS A to DBMS B]
     • DBMS A needs to submit queries to DBMS B
     • DBMS B's schema and representation of entities are different
     • DBMS A does not know that schema or representation
       ‣ Cannot properly formulate queries

  5. DBMS A queries DBMS B
     [Figure: Products (DBMS A) and Sellers (DBMS B) tables as on slide 2; user query: "Store selling Soda"; a "Mapping" links DBMS A to DBMS B]
     • Traditionally, a mapping between the two DBMSs is used
     • However, this is costly
       ‣ The mapping must be updated manually whenever the schema changes
       ‣ Developing the mapping manually takes time

  6. What if DBMS A learns through interactions?
     [Figure: Products (DBMS A) and Sellers (DBMS B) tables as on slide 2; user query: "Store selling Soda"; DBMS A sends the keyword query "Soda" to DBMS B]
     • DBMS A wants to find similar entities in the other DBMS, so it sends some query
     • There is often a common query language
       ‣ Keyword queries
     • Other DBMSs understand this, but the results are not very effective

  7. Results are returned
     [Figure: as on slide 6; for the keyword query "Soda", DBMS B returns the result (Hamburger, 7/11)]
     • Results are returned to the user
     • User gives some feedback on the results
       ‣ This is not what the user is looking for

  8. Results are returned
     [Figure: as on slide 6; for the keyword query "Soda", DBMS B returns the result (Pop, Kroger)]
     • Results are returned to the user
     • User gives some feedback on the results
       ‣ This is the answer the user wanted

  9. Utilize the feedback and learn
     [Figure: as on slide 6, with keyword query and results exchanged between DBMS A and DBMS B]
     • The mapping can be built over time through interaction and feedback
     • Our goal: learn this mapping between DBMS A and DBMS B
     • Method: establish a common language or means of communication between the two DBMSs

  10. Our Framework
     [Figure: framework diagram with the labels Local, External, Mapping Query, Results, Feedback, User Feedback, Offline Training Data]
     • Local and External DBMS
     • Communicate via keyword queries and results

  11. Intents
     [Figure: framework diagram as on slide 10]
     Products:            ID | Name
                          1  | Soda
                          2  | Beef
     Local DBMS Intents:  Intent # | Intent
                          e1       | 1 Soda
                          e2       | 2 Beef
     • Local DBMS has intents
     • Defined by the user
     • However, a user is not required

  12. Mapping Queries
     [Figure: framework diagram as on slide 10]
     DBMS A Queries:  Query # | Query
                      s1      | 1 soda
                      s2      | 2 beef
                      s3      | soda
                      s4      | beef
     Strategy:        s1    s2    s3    s4
                  e1  0.5   0.1   0.4   0
                  e2  0     0.4   0.3   0.3
     • Sends keyword queries
     • Called Mapping Queries

  13. Returned Results
     [Figure: framework diagram as on slide 10]
     Sellers:  ID | Name      | Store
               3  | Hamburger | 7/11
               4  | Pop       | Kroger
               …  | …         | …
     Results:  Local Intent | External Result
               Soda         | Pop, Kroger
     • External DBMS returns some results
     • External DBMS can also learn

  14. Feedback
     [Figure: framework diagram as on slide 10]
     • Feedback on whether the returned results are correct
     • Can come from the user, but doesn't have to
     • Can use a model built on previous user feedback

  15. Local DBMS Strategy
     Local DBMS Intents:  Intent # | Intent
                          e1       | 1 Soda
                          e2       | 2 Beef
     Mapping Queries:     Query # | Query
                          s1      | 1 soda
                          s2      | 2 beef
                          s3      | soda
                          s4      | beef
     External DBMS Strategy:  s1    s2    s3    s4
                          e1  0.5   0.1   0.4   0
                          e2  0     0.4   0.3   0.3
     Products:            ID | Name
                          1  | Soda
                          2  | Beef
     Sellers:             ID | Name      | Store
                          3  | Hamburger | 7/11
                          4  | Pop       | Kroger
                          …  | …         | …
     • Local DBMS has a strategy to send queries for intents
     • External DBMS may also have a strategy

  16. Local DBMS Strategy
     [Figure: intents, mapping queries, strategy, Products and Sellers tables as on slide 15; strategy row e1 = 0.5, 0.1, 0.4, 0]
     • Suppose the local DBMS has the intent e1

  17. Local DBMS Strategy
     [Figure: intents, mapping queries, strategy, Products and Sellers tables as on slide 15; strategy row e1 = 0.5, 0.1, 0.4, 0]
     • Consults the strategy to see which mapping query to send
     • Sends s3 with probability 0.4
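
The selection step on slide 17 can be sketched as a weighted random draw over one row of the strategy. This is only an illustration: the dictionary layout and the function name `choose_query` are assumptions, not taken from the presentation; the probabilities are the ones shown on the slides.

```python
import random

# Strategy from the slides: one row per intent, one column per mapping query.
QUERIES = ["s1", "s2", "s3", "s4"]
STRATEGY = {
    "e1": [0.5, 0.1, 0.4, 0.0],
    "e2": [0.0, 0.4, 0.3, 0.3],
}

def choose_query(intent, rng=random):
    """Pick a mapping query for `intent` with probability given by its row.

    Hypothetical helper: the slides only say the strategy is consulted.
    """
    weights = STRATEGY[intent]
    return rng.choices(QUERIES, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    counts = {q: 0 for q in QUERIES}
    for _ in range(10_000):
        counts[choose_query("e1", rng)] += 1
    # s3 should come up in roughly 40% of draws; s4 has weight 0 and never appears.
    print(counts)
```

Passing a seeded `random.Random` keeps the draw reproducible, which is convenient when comparing runs.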

  18. Local DBMS Strategy
     [Figure: intents, mapping queries, strategy, Products and Sellers tables as on slide 15; strategy row e1 = 0.5, 0.1, 0.4, 0]
     • When results are returned and feedback is given, the strategy is updated
     • Uses a reinforcement learning method

  19. Reinforcement Learning
     • Select a query based on past success, i.e., exploitation
     • Explore and try new or less successful queries to gain new knowledge, i.e., exploration
       ‣ Sacrifice immediate success for more success in the long run

  20. Reinforcing Local Strategy
     [Figure: intents, mapping queries, strategy, Products and Sellers tables as on slide 15; strategy row e1 = 0.5, 0.1, 0.4, 0]
     • The probabilities of queries allow for exploration and exploitation
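
To see why probabilities give both behaviours, one can compare a purely greedy rule with a probability-proportional draw on the e1 row. The sketch below is an illustration under assumptions: the function names `greedy` and `sample` are hypothetical, and the greedy rule is included only as a contrast, not as something the presentation proposes.

```python
import random

QUERIES = ["s1", "s2", "s3", "s4"]
E1_ROW = [0.5, 0.1, 0.4, 0.0]  # strategy row for intent e1, from the slides

def greedy(row):
    """Pure exploitation: always send the currently most probable query."""
    return QUERIES[row.index(max(row))]

def sample(row, rng):
    """Probability-proportional choice: mostly exploits, sometimes explores."""
    return rng.choices(QUERIES, weights=row, k=1)[0]

rng = random.Random(1)
tried_greedy = {greedy(E1_ROW) for _ in range(1000)}
tried_sampled = {sample(E1_ROW, rng) for _ in range(1000)}
print(sorted(tried_greedy))   # only s1 is ever sent
print(sorted(tried_sampled))  # s3 is also tried; s4 (weight 0) is not
```

Note that a query with probability 0, such as s4 for e1, is never tried under proportional sampling either, so exploration here is limited to queries that already carry some weight.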

  21. Reinforcing Local Strategy
     [Figure: intents, mapping queries, strategy, Products and Sellers tables as on slide 15; strategy row e1 = 0.5, 0.1, 0.4, 0]
     • Suppose the feedback given for this query was positive
     • Then the strategy is reinforced as such

  22. Reinforcing Local Strategy
     [Figure: intents, mapping queries, Products and Sellers tables as on slide 15; strategy row e1 is now 0.5, 0.1, 0.45, 0]
     • Increase the probability of the mapping query that was sent

  23. Reinforcing Local Strategy
     [Figure: intents, mapping queries, Products and Sellers tables as on slide 15; strategy row e1 is now 0.45, 0.09, 0.45, 0]
     • Implicitly decreases the probability of the others
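
One simple way to get the effect shown on slides 22 and 23 is to add a reward to the entry of the query that was sent and then renormalize the row. The presentation does not state its exact update rule, so the rule below, the function name `reinforce`, and the reward value 0.05 are all assumptions; the output therefore differs slightly from the rounded figures on the slides.

```python
def reinforce(row, chosen, reward):
    """Add `reward` to the chosen entry, then renormalize so the row sums to 1.

    Assumed update rule for illustration; not confirmed by the slides.
    """
    updated = list(row)
    updated[chosen] += reward
    total = sum(updated)
    return [p / total for p in updated]

row_e1 = [0.5, 0.1, 0.4, 0.0]   # strategy row for e1 before feedback
new_row = reinforce(row_e1, chosen=2, reward=0.05)  # s3 received positive feedback
print([round(p, 3) for p in new_row])  # [0.476, 0.095, 0.429, 0.0]
```

Renormalizing is what makes the decrease for s1 and s2 implicit: only the s3 entry is touched directly, and the others shrink because the row must still sum to 1.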

  24. Reinforcing Local Strategy
     [Figure: intents, mapping queries, Products and Sellers tables as on slide 15; strategy row e1 = 0.45, 0.09, 0.45, 0]
     • The external DBMS may also learn, but we don't focus on that here
     • Whether or not the external DBMS learns, the process converges, according to our previous results

  25. Our experiments
     • Use two databases, each containing information on products
       ‣ One is an Amazon database and the other a Google database
     • Approximately 1400 tuples in the Amazon dataset and 3200 tuples in the Google dataset
     • We have the ground truth, which is used as simulated user feedback
     • Single tuples are used as intents, and each has a single match
     • The receiver does not learn
     • Simulated user feedback is cached
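
The experimental loop described above (ground truth as simulated feedback, a receiver that does not learn, cached feedback) can be sketched on toy data. Everything concrete here is hypothetical: the receiver's fixed answers, the single ground-truth pair, the reward of 0.05, and the add-and-renormalize update are assumptions for illustration and do not reproduce the Amazon and Google datasets or the authors' code.

```python
import random

QUERIES = ["s1", "s2", "s3", "s4"]
strategy = {"e1": [0.5, 0.1, 0.4, 0.0]}  # initial row from the slides

# Hypothetical fixed (non-learning) receiver: mapping query -> returned tuple.
EXTERNAL_ANSWER = {"s1": "Hamburger", "s2": "Hamburger", "s3": "Pop", "s4": "Hamburger"}
# Hypothetical ground truth: each intent has exactly one matching external tuple.
GROUND_TRUTH = {"e1": "Pop"}

feedback_cache = {}

def simulated_feedback(intent, result):
    """Check the ground truth once per (intent, result) pair, then reuse the answer."""
    key = (intent, result)
    if key not in feedback_cache:
        feedback_cache[key] = GROUND_TRUTH[intent] == result
    return feedback_cache[key]

def run(rounds, reward=0.05, seed=0):
    """Repeatedly send a sampled query for e1 and reinforce it on positive feedback."""
    rng = random.Random(seed)
    intent = "e1"
    for _ in range(rounds):
        row = list(strategy[intent])
        idx = rng.choices(range(len(QUERIES)), weights=row, k=1)[0]
        result = EXTERNAL_ANSWER[QUERIES[idx]]
        if simulated_feedback(intent, result):
            row[idx] += reward                      # reward the query that worked
            total = sum(row)
            strategy[intent] = [p / total for p in row]

run(500)
print([round(p, 3) for p in strategy["e1"]])
print(len(feedback_cache))  # at most one cache entry per distinct (intent, result)
```

In this toy setup only s3 ever earns positive feedback, so its probability grows over the rounds while the others shrink through renormalization.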

  26. Results for learning every time
