TagCentric: Open Source RFID Middleware Joseph E. Hoag, Craig W. - - PowerPoint PPT Presentation



slide-1
SLIDE 1

TagCentric: Open Source RFID Middleware

Joseph E. Hoag, Craig W. Thompson CSCE Department University of Arkansas

slide-2
SLIDE 2

What is TagCentric?

  • Java application, built on Ubiquity framework.
  • TagCentric collects event data from a heterogeneous set of RFID readers and deposits the data into a user-specified database.
  • Readers supported: Alien, Symbol, ThingMagic, plus a synthetic reader. (Impinj soon?)
  • Databases supported: Oracle, DB2, MySQL, DRB, Postgres.
  • Also capable of printing/encoding tags via the Zebra tag printer.
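
The collect-and-deposit flow described on this slide can be sketched as follows. This is an illustration only, not TagCentric's actual Java code: the names `TagEvent`, `Reader`, `SyntheticReader`, `collect` and the `tag_events` table are hypothetical, and SQLite stands in for the databases listed above so the example is self-contained.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class TagEvent:
    """Common event shape that every reader adapter produces (hypothetical)."""
    tag_id: str
    reader_name: str
    timestamp: float


class Reader(Protocol):
    """Interface a vendor-specific adapter would implement (hypothetical)."""
    def poll(self) -> Iterable[TagEvent]: ...


class SyntheticReader:
    """Stand-in for a vendor driver; emits a fixed list of tag IDs."""

    def __init__(self, name, tag_ids):
        self.name = name
        self.tag_ids = list(tag_ids)

    def poll(self):
        return [TagEvent(t, self.name, float(i)) for i, t in enumerate(self.tag_ids)]


def collect(readers, conn):
    """Poll every reader once and store all events in a single table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tag_events "
        "(tag_id TEXT, reader_name TEXT, ts REAL)"
    )
    rows = [(e.tag_id, e.reader_name, e.timestamp) for r in readers for e in r.poll()]
    conn.executemany("INSERT INTO tag_events VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of Oracle/DB2/MySQL/Postgres
stored = collect(
    [SyntheticReader("dock-door", ["A1", "A2"]), SyntheticReader("shelf", ["A1"])],
    conn,
)
```

The point of the sketch is that readers from different vendors only need to agree on the event shape; the database side never sees vendor-specific details.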

slide-3
SLIDE 3

Sample TagCentric Configuration

slide-4
SLIDE 4

Screen Shot: Admin Panel

slide-5
SLIDE 5

Screen Shot: Database Panel

slide-6
SLIDE 6

Screen Shot: Reader Panel

slide-7
SLIDE 7

TagCentric Release

  • TagCentric released on SourceForge in mid-February, 2007.
  • Fairly high number (hundreds) of downloads, but no feedback as of yet.

  • See www.tagcentric.info
slide-8
SLIDE 8

TagCentric Project: Benefits to Industry

  • Commercial RFID middleware can cost $30K-$50K and require an army of engineers to operate and maintain it.
  • TagCentric lacks “bells and whistles”, but provides basic RFID functionality for small businesses, universities and testing centers.
  • University of Arkansas part of a consortium of open source RFID software providers.
slide-9
SLIDE 9

TagCentric: Future Enhancements

  • Work with Pramari and Impinj to develop reference implementation for Low Level Reader Protocol, a new EPC standard reader interface.
  • Add smart devices – cameras, robot arms
  • Add device languages including MBNLI
  • Add analytics and spatial modeling
  • Add EPCglobal-compliant Application Level Event (ALE) server agent.
  • Add database federation logic for linking multiple TagCentric installations together.

slide-10
SLIDE 10

Synthetic Data Generation

Joseph Hoag, Craig Thompson CSCE Department University of Arkansas

slide-11
SLIDE 11

Why use synthetic data generation?

  • Regression testing. Repeatedly generate the same large data set for testing enterprise applications.
  • Secure application development. Develop enterprise applications using generated data that is realistic but not real.
  • Testing of data mining applications. Generate data sets with known characteristics to gauge whether or not data mining tools can discover those characteristics.

slide-12
SLIDE 12

SDDL Sample, Results

<database>
  <seed>18745913</seed>
  <pool name="colors">
    <choice name="red"/>
    <choice name="green"/>
    <choice name="blue"/>
  </pool>
  <table name="items">
    <field name="itemID" type="int">
      <iteration base="100" count="5"/>
    </field>
    <field name="color" type="CHAR(5)">
      <formula>colors</formula>
    </field>
    <field name="inventory" type="int">
      <min>3</min>
      <max>10</max>
    </field>
  </table>
</database>

itemID  color    inventory
100     ‘red’    7
101     ‘blue’   5
102     ‘green’  7
103     ‘red’    4
104     ‘green’  9
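
To make the sample concrete, here is a minimal sketch of how such a description could be interpreted. It is not the SDG implementation. It assumes that a `<formula>` naming a pool means "pick one choice uniformly from that pool" and that `<min>`/`<max>` mean a uniform integer in an inclusive range; the function name `generate` is hypothetical. Because it uses Python's random number generator, it will not reproduce the exact values shown on the slide, only rows of the same shape.

```python
import random
import xml.etree.ElementTree as ET

SDDL = """
<database>
  <seed>18745913</seed>
  <pool name="colors">
    <choice name="red"/><choice name="green"/><choice name="blue"/>
  </pool>
  <table name="items">
    <field name="itemID" type="int"><iteration base="100" count="5"/></field>
    <field name="color" type="CHAR(5)"><formula>colors</formula></field>
    <field name="inventory" type="int"><min>3</min><max>10</max></field>
  </table>
</database>
"""


def generate(sddl_text):
    """Return the rows of the single table described by sddl_text."""
    root = ET.fromstring(sddl_text)
    rng = random.Random(int(root.findtext("seed")))  # same seed, same rows
    pools = {
        p.get("name"): [c.get("name") for c in p.findall("choice")]
        for p in root.findall("pool")
    }
    fields = root.find("table").findall("field")
    # Assumption: the field carrying <iteration> decides how many rows to emit.
    count = next(
        int(f.find("iteration").get("count"))
        for f in fields
        if f.find("iteration") is not None
    )
    rows = []
    for i in range(count):
        row = {}
        for f in fields:
            name = f.get("name")
            if f.find("iteration") is not None:
                row[name] = int(f.find("iteration").get("base")) + i
            elif f.find("formula") is not None:
                row[name] = rng.choice(pools[f.findtext("formula")])
            else:
                row[name] = rng.randint(int(f.findtext("min")), int(f.findtext("max")))
        rows.append(row)
    return rows


rows = generate(SDDL)
```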

slide-13
SLIDE 13

Synthetic Data Generator -- Front End

  • Retrieve Tables, Attributes and Foreign Key Constraints
  • Edit SDDL description
slide-14
SLIDE 14

SDG Parallelism

  • Parallel processes all reference the same Synthetic Data Description Language (SDDL) file.
  • Each parallel process generates a single text output file, containing a portion of the generated table.
  • Database then imports the text files as data.
  • Lack of inter-process dependencies gives linear speedup. Speed of SDG is only limited by number and speed of processors.
  • Output is identical regardless of the number of generation processes utilized.
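
One way to obtain output that is identical for any number of processes is to derive each row only from the seed and the row's own index. The sketch below illustrates that idea under stated assumptions; the slides do not say how SDG achieves it, and `make_row`, `generate_partition` and `generate_all` are hypothetical names.

```python
import random

SEED = 18745913  # seed value borrowed from the SDDL sample


def make_row(index):
    # Seeding from (SEED, index) makes each row independent of all other rows,
    # so no worker needs to know what another worker generated.
    rng = random.Random(SEED * 1_000_003 + index)
    return (100 + index, rng.choice(["red", "green", "blue"]), rng.randint(3, 10))


def generate_partition(worker, workers, total_rows):
    """Rows that one worker would write to its own output file."""
    start = worker * total_rows // workers
    stop = (worker + 1) * total_rows // workers
    return [make_row(i) for i in range(start, stop)]


def generate_all(workers, total_rows):
    """Concatenate the partitions in worker order, as a bulk import would."""
    out = []
    for w in range(workers):  # each call could run in a separate process
        out.extend(generate_partition(w, workers, total_rows))
    return out
```

With this scheme, `generate_all(1, n)` and `generate_all(16, n)` return the same rows in the same order, which is the property the last bullet describes.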

slide-15
SLIDE 15

Application: Store-Item-Sales Data

  • Generated 10 years of store-item-sales data for a major retailer. Ran generator on grid of sixteen 1.6-GHz Itaniums.
    – 70 billion rows (4.5 TB) of data generated.
    – Peaked at over 500,000 rows/second.
    – Generated data set in about 2 days.
    – Row: Store#, Item#, Sun-Sat sales quantities, item price, total weekly sales for store/item.
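
As a rough consistency check on these figures, the short calculation below works out the minimum run time at the peak rate, the average rate over two days, and the average row size. It assumes decimal units (1 TB = 10^12 bytes) and treats "about 2 days" as exactly 48 hours.

```python
ROWS = 70e9             # 70 billion rows
PEAK_RATE = 500_000     # rows per second at peak
TOTAL_BYTES = 4.5e12    # 4.5 TB, assuming decimal terabytes
SECONDS_PER_DAY = 86_400

min_days = ROWS / PEAK_RATE / SECONDS_PER_DAY   # fastest possible run at peak rate
avg_rate = ROWS / (2 * SECONDS_PER_DAY)         # average rate if the run took 48 hours
bytes_per_row = TOTAL_BYTES / ROWS              # average row size

print(f"{min_days:.2f} days at peak, {avg_rate:,.0f} rows/s average, "
      f"{bytes_per_row:.0f} bytes/row")
```

Running at the peak rate the whole time would take about 1.6 days, so a two-day run implies an average of roughly 400,000 rows/second, and each row averages about 64 bytes.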

slide-16
SLIDE 16

Application: Simple RFID Supply Chain Data

  • Problem: Generate synthetic RFID events (“arrive” and “depart”) for 10 million unique objects traversing 100 read points (total = 2 billion events)
  • Row: TagID, ReaderNum, BizEvt, Timestamp
  • Total data generated: 86 GB (2B rows)

[Diagram: Reader 1, Reader 2, Reader 3, . . . , Reader 100]
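
The row layout above can be illustrated with a small generator. This is a sketch, not the SDDL model used in the project: the tag ID format and the `dwell` and `transit` times are invented for illustration. The last two lines check the slide's totals, assuming 1 GB = 10^9 bytes.

```python
def rfid_events(num_tags, num_readers, dwell=60, transit=600):
    """Yield (TagID, ReaderNum, BizEvt, Timestamp) rows, tag by tag.

    dwell and transit are hypothetical durations in seconds.
    """
    for tag in range(num_tags):
        tag_id = f"TAG{tag:08d}"   # hypothetical ID format
        t = tag                    # stagger start times; purely illustrative
        for reader in range(1, num_readers + 1):
            yield (tag_id, reader, "arrive", t)
            t += dwell
            yield (tag_id, reader, "depart", t)
            t += transit


small = list(rfid_events(num_tags=3, num_readers=4))

total_events = 10_000_000 * 100 * 2   # objects x read points x (arrive, depart)
bytes_per_row = 86e9 / total_events   # average row size implied by 86 GB
```

Two events per object per read point gives the 2 billion rows quoted above, and 86 GB over 2 billion rows works out to 43 bytes per row.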

slide-17
SLIDE 17

Application: Complex RFID Supply-Chain Data

  • Generate complex RFID supply-chain data. SDDL file would model:
    – Distances between sites
    – Pallet and contained cases travel as single unit
    – DCs supply specific stores
    – User-specifiable read rate
    – Internal readers for sites

slide-18
SLIDE 18

Application: Legal Strings from Grammars

  • We constructed an SDDL file that would generate legal strings from user-defined context-free grammars.
  • Successfully generated:
    – Palindromes over {0,1}*: (e.g., “01001010010”)
    – Valid mathematical expressions: (e.g., “(A+(B-C))/D”)
    – Strings over {0,1}* ending in 00 (regular): (e.g., “101100”)
    – Strings over {0,1}* with an even number of 0s and an even number of 1s (regular): (e.g., “001011010110”)
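
The idea of generating legal strings from a context-free grammar can be sketched in a few lines. The slides do this inside SDDL; the version below is plain Python and only illustrates the technique. The grammar encoding, the `derive` function and the depth limit are assumptions, and the sketch requires every nonterminal to have at least one production made only of terminals.

```python
import random

# Palindromes over {0,1}: S -> 0 S 0 | 1 S 1 | 0 | 1 | (empty)
PALINDROME = {
    "S": [["0", "S", "0"], ["1", "S", "1"], ["0"], ["1"], []],
}


def derive(grammar, symbol, rng, depth=0, max_depth=8):
    """Expand symbol by random derivation and return the resulting string."""
    if symbol not in grammar:  # terminal symbol
        return symbol
    options = grammar[symbol]
    if depth >= max_depth:
        # Force termination: keep only productions without nonterminals.
        options = [p for p in options if not any(s in grammar for s in p)]
    production = rng.choice(options)
    return "".join(derive(grammar, s, rng, depth + 1, max_depth) for s in production)


rng = random.Random(42)
samples = [derive(PALINDROME, "S", rng) for _ in range(20)]
```

Every string in `samples` reads the same forwards and backwards, because each production adds the same symbol to both ends.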

slide-19
SLIDE 19

Application: Parallel Mailing Lists

Canonical List:

936-44-2946 Anil Khatri 802-218-1914 4761 South Cleveland Boulevard PORT HUENEME, CA 93041
121-63-1403 Joshua X. Anderson XXX-350-5098 870 South McKinley Circle PENSACOLA, FL 32534
328-06-7372 Hiroshi Kato 811-366-1942 6203 East Sixth Street SHREVEPORT, LA 71107

List with Errors, Mixed Formats:

936-44I-2946 Anil Khatri 802-218-1914 4761 S. Cleveland Boulevard PORT HUENEME, CA 93041
121-63-1403 Joshua X. Anderson 350-5098 870 S. McKinley Cir. PENSACOLA, Florida 32534
328-06-B372 Kato, Hiroshi (811) 366-1942 6203 East Sixth Street SHREVEPORT, La. 71107

slide-20
SLIDE 20

Potential Future Work

  • Smart front-end that will automate the process of characterizing existing tables.
  • Import tool to automatically convert existing tabular data (example: census data) into SDDL pools.
  • Improve ability to simultaneously generate multiple tables and handle circular dependencies.

  • Add streaming output capability.
slide-21
SLIDE 21

Menu-based Natural Language Interfaces

Craig Thompson and Kyle Neumeier CSCE Department University of Arkansas

slide-22
SLIDE 22

Problem with traditional NLIs

Habitability Problem

Natural Language Interface

slide-23
SLIDE 23
slide-24
SLIDE 24

MBNLI for smart devices

– Predictive menu to guide user to correct sentence
– Solves Habitability Problem
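
A predictive menu can be sketched as "given the words chosen so far, list only the words the grammar allows next". The example below is an illustration under stated assumptions, not the LingoLogic implementation: the device-control grammar is invented, and the simple expansion strategy assumes the grammar has no left recursion.

```python
# Hypothetical device-control grammar; uppercase keys are nonterminals.
GRAMMAR = {
    "S": [["turn", "SWITCH", "the", "DEVICE"], ["move", "the", "robot-arm", "DIR"]],
    "SWITCH": [["on"], ["off"]],
    "DEVICE": [["camera"], ["reader"]],
    "DIR": [["left"], ["right"]],
}


def expand(state):
    """Rewrite the leftmost nonterminal until a terminal (or nothing) leads."""
    if state and state[0] in GRAMMAR:
        for production in GRAMMAR[state[0]]:
            yield from expand(tuple(production) + state[1:])
    else:
        yield state


def menu(prefix):
    """Return the words that may legally follow the given list of words."""
    states = set(expand(("S",)))
    for token in prefix:
        states = {nxt for s in states if s and s[0] == token for nxt in expand(s[1:])}
    return sorted({s[0] for s in states if s})
```

For example, `menu([])` offers `move` and `turn`, and `menu(["turn", "on", "the"])` offers `camera` and `reader`. Because the user can only pick from the menu, every completed sentence is one the system understands.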

slide-25
SLIDE 25
Menu Based Natural Language Interfaces

  • LingoLogic
    – Architecture
    – Grammars
    – Translations
  • Applications of MBNLI
    – Database querying
    – Agent control

[Diagram: Predictive Parser, Front End, Grammar, Target System]

slide-26
SLIDE 26

USF Business Plan Competition

  • MBA Entrepreneurship Class
  • Industry

    – Tyson
    – Arkansas Best Corporation

  • Placed in top 10 of over 100
  • Arkansas Governor's Cup
slide-27
SLIDE 27

MBNLI Spatial Queries

  • Project with Center for Advanced Spatial Technology (CAST)
  • Queries to Oracle Spatial
    – Map the intersection of roads and railroads in Fayetteville
  • Spatial functions are like theta joins in relational algebra. Have successfully shown that it is possible to generate SQL to Oracle Spatial
  • Exploring automating output of query to display using Google Maps.

slide-28
SLIDE 28

GRINDEX Grid Indexing Service

Craig Thompson, Reid Phillips, Evan Kirkconnell, Lisa Harrison, Matt Baker CSCE Department University of Arkansas

slide-29
SLIDE 29

Problem

  • Acxiom customers supply Acxiom files
    – widely varying sizes and layouts:
      • Millions of small (KB) files
      • Thousands of medium (MB) files
      • Hundreds of large (TB) files
    – Files and fields are typed and fixed width
    – Attributes may be raised to the file name
    – Only key attributes are permitted in queries
  • Customers want to query for subsets of files
slide-30
SLIDE 30

Synthetic Data Generator

  • Our initial test case is 130 KB
  • Records contain these fields
    – ABCDE composite key
    – Attribute A - ~2000 values
    – Attribute B - ~2500 values
    – Attribute C - ~ 2 values
    – Attribute D - ~ 150 values
    – Attribute E - ~ 100 values
    – RestOfRecord
  • Files
    – File name format: A/A_B_C.txt
    – 9,125,000 files = 2000 directories (one arrives every day) each containing 5000
    – File contains ~150 records of 100 bytes each
  • Using SDG, we generated data on 4-node ACE grid in ~4 hours
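
The figures on this slide can be cross-checked with a short calculation. It takes the approximate counts at face value (5000 files per directory, 150 records per file, 100 bytes per record), so the results are estimates.

```python
FILES = 9_125_000
FILES_PER_DIR = 5000          # matches ~2500 values of B times ~2 values of C
RECORDS_PER_FILE = 150
BYTES_PER_RECORD = 100

directories = FILES // FILES_PER_DIR           # directories implied by the file count
years_of_daily_dirs = directories / 365        # one directory arrives every day
total_records = FILES * RECORDS_PER_FILE
total_bytes = total_records * BYTES_PER_RECORD

print(directories, years_of_daily_dirs, total_records, total_bytes / 1e9)
```

The file count corresponds to 1825 directories, which is five years of daily arrivals and is consistent with the rounded "2000 directories" on the slide. The data volume comes to roughly 137 GB, which suggests the "130 KB" test-case size may have been meant as 130 GB; the slide does not confirm this.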
slide-31
SLIDE 31

Original Design Concepts

  • This problem could be handled by a Relational DBMS
    – CREATE table1 ON customerdata;
    – INDEX table1 ON { attributes };
    – INPUT fileNames USING (attributes);
    – SELECT table1 WHERE expression ;
  • Except
    – The data volume is huge and stored in files
    – Licensing costs for RDBMS are an issue
    – Only a subset of relational capability is needed
    – The Acxiom grid offers opportunities for parallelism

slide-32
SLIDE 32

Relational DBMS Architecture

[Diagram: three layers]
  • Query Layer: Queryi is optimized and translated to a Query Tree. Query Tree consists of relational algebra operators like join, project, select but also stored procedure calls. Query Execution results in tables.
  • Indexing Layer: HASHi, TREEj, …
  • Disk-based Storage Layer

slide-33
SLIDE 33

Workflow Grid Architecture + Indexing Layer

[Diagram: three layers]
  • Workflow Layer: Workflowi is automated and scheduled. Workflow Execution results in data sources.
  • Indexing Layer: …, HASHj, TREEk, …
  • Grid-node-based Storage Layer

slide-34
SLIDE 34

GRINDEX Prototype

  • The original grid indexing (GRINDEX¹) prototype
    – grid abstract machine
    – created a hash based index
  • GRINDEX2² extended the functionality
    – re-implemented grid abstract machine
    – re-implemented the hash with improved performance
    – implemented a B+ tree index
    – implemented a relational algebra command language to create tables, indices, insert, and query

¹ Jonathan Schisler, GRINDEX: Framework and Prototype for a Grid-based index, MS Thesis, August 2005.
² Reid Phillips, GRINDEX2: Framework and Prototype for a Grid Resident Database System, MS Thesis, August 2006.

slide-35
SLIDE 35

GRINDEX2 commands

  • CREATE TABLE customerdata (attributes);
  • CREATE INDEX cd SS {hash | tree …} ON attributes;
  • INSERT customerdata (value, …, value)
  • SELECT FROM customerdata WHERE attrib1 IN { x1, x2, …, xn } AND attrib2 IN { y1, y2, …, ym } AND ...
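
The four commands can be mimicked in memory to show how a hash index answers a conjunction of IN-lists. This is a sketch under stated assumptions, not GRINDEX2 code: the `Table` class and its method names are hypothetical, and the index is used only when its attributes match the queried attributes exactly.

```python
from collections import defaultdict
from itertools import product


class Table:
    """In-memory stand-in for CREATE TABLE / CREATE INDEX / INSERT / SELECT."""

    def __init__(self, attributes):
        self.attributes = tuple(attributes)
        self.rows = []
        self.indexes = {}  # tuple of attribute names -> {key tuple -> [row ids]}

    def create_hash_index(self, attrs):
        attrs = tuple(attrs)
        index = defaultdict(list)
        for rid, row in enumerate(self.rows):
            index[tuple(row[a] for a in attrs)].append(rid)
        self.indexes[attrs] = index

    def insert(self, *values):
        row = dict(zip(self.attributes, values))
        rid = len(self.rows)
        self.rows.append(row)
        for attrs, index in self.indexes.items():  # keep every index current
            index[tuple(row[a] for a in attrs)].append(rid)

    def select(self, **in_lists):
        """WHERE a IN {...} AND b IN {...}; probes an index when one matches."""
        attrs = tuple(a for a in self.attributes if a in in_lists)
        if attrs in self.indexes:
            index = self.indexes[attrs]
            # One hash probe per combination of the listed values.
            keys = product(*(set(in_lists[a]) for a in attrs))
            rids = sorted(r for key in keys for r in index.get(key, []))
        else:
            rids = [i for i, row in enumerate(self.rows)
                    if all(row[a] in in_lists[a] for a in attrs)]
        return [self.rows[r] for r in rids]


t = Table(["A", "B", "C"])
t.create_hash_index(["A", "B"])
t.insert(1, "x", "rec1")
t.insert(1, "y", "rec2")
t.insert(2, "x", "rec3")
hits = t.select(A=[1, 2], B=["x"])
```

Restricting queries to IN-lists on key attributes means each query reduces to a bounded number of hash probes, which fits the earlier statement that only key attributes are permitted in queries.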

slide-36
SLIDE 36

Subsetting WFG – Architecture

GRINDEX2 Service
    – Create Table T (a, a, a, …)
    – Insert T (v, v, v, …)
    – Create Index I {hash | split-tree1 | b-tree} (a, a, a, …)
    – Select T where expr
    – Tables = (Files, ……., ti, …….., tn)
    – Indexes = (i1, .ic…., ij, ik, il, …, im)
    – File(filei, fileindexi)

Front-end: Input Files
    – Create Table MyApplication (a, a, a, …)
    – Create Index split-tree (a, a, a, …)
    – For each file(segment), for each record
      • Extract key attributes of record
      • Insert T (key attributes, fileindex, offset(s))

Backend: Query
    – Select MyApplication where expression
    – For each <file, offset>
      • Extract records and append to output file

[Diagram labels: Index data nodes; File System Directory Structure; Files of records … filei; Synthetic Data Generation]

Opportunities for Parallelism
    – Front-end – parallel read / file segments / bulk insert
    – Within Grindex2
      • Bulk inserts
      • Split indexes
      • Simultaneous Input and Query
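
The front-end and back-end loops on this slide can be sketched as follows: index each fixed-width record by its key together with its file and byte offset, then answer a query by seeking to those offsets and appending the records to an output file. This is an illustration only. The record layout (`RECORD_LEN`, `KEY_LEN`), the function names and the sample file are assumptions, and a plain dictionary stands in for the GRINDEX2 index.

```python
import os
import tempfile
from collections import defaultdict

RECORD_LEN = 10  # assumed fixed record width in bytes
KEY_LEN = 2      # assumed: the key occupies the first 2 bytes of each record


def build_index(paths):
    """Front end: map each key to the (file, offset) pairs that hold it."""
    index = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                record = f.read(RECORD_LEN)
                if len(record) < RECORD_LEN:
                    break
                index[record[:KEY_LEN]].append((path, offset))
                offset += RECORD_LEN
    return index


def query(index, keys, out_path):
    """Back end: fetch matching records by offset and append them to one file."""
    with open(out_path, "ab") as out:
        for key in keys:
            for path, offset in index.get(key, []):
                with open(path, "rb") as f:
                    f.seek(offset)
                    out.write(f.read(RECORD_LEN))


tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "file1.dat")
with open(src, "wb") as f:
    f.write(b"AA00000001BB00000002AA00000003")  # three 10-byte records

idx = build_index([src])
result = os.path.join(tmp, "subset.dat")
query(idx, [b"AA"], result)
```

Because the index stores only keys and offsets, the original files stay where they are, and each input file (or file segment) can be indexed independently of the others.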

slide-37
SLIDE 37

Current and Future Work

  • Issues to more fully address
    – Parallelism at each stage
      • Input – file segments
      • Storage – parallelism in indexes and input/query
      • Output – assembling results
    – Scalability - storage capacity
      • Extensible Keyless Hash
      • Split tree