TagCentric: Open Source RFID Middleware
Joseph E. Hoag, Craig W. Thompson CSCE Department University of Arkansas
What is TagCentric?
- Java application, built on Ubiquity framework.
- TagCentric collects event data from a
heterogeneous set of RFID readers and deposits the data into a user-specified database.
- Readers supported: Alien, Symbol, ThingMagic,
plus a synthetic reader. (Impinj soon?)
- Databases supported: Oracle, DB2, MySQL,
DRB, Postgres.
- Also capable of printing/encoding tags via the
Zebra tag printer.
Sample TagCentric Configuration
Screen Shot: Admin Panel
Screen Shot: Database Panel
Screen Shot: Reader Panel
TagCentric Release
- TagCentric released on SourceForge in mid-February 2007.
- Fairly high number (hundreds) of
downloads, but no feedback as of yet.
- See www.tagcentric.info
TagCentric Project: Benefits to Industry
- Commercial RFID middleware can cost
$30K-$50K and require an army of engineers to operate and maintain it.
- TagCentric lacks “bells and whistles”, but
provides basic RFID functionality for small businesses, universities and testing centers.
- University of Arkansas is part of a consortium
of open source RFID software providers.
TagCentric: Future Enhancements
- Work with Pramari and Impinj to develop
reference implementation for Low Level Reader Protocol, a new EPC standard reader interface.
- Add smart devices – cameras, robot arms
- Add device languages including MBNLI
- Add analytics and spatial modeling
- Add EPCglobal-compliant Application Level
Event (ALE) server agent.
- Add database federation logic for linking multiple
TagCentric installations together.
Synthetic Data Generation
Joseph Hoag, Craig Thompson CSCE Department University of Arkansas
Why use synthetic data generation?
- Regression testing. Repeatedly generate the same
large data set for testing enterprise applications.
- Secure application development. Develop
enterprise applications using generated data that is realistic but not real.
- Testing of data mining applications. Generate
data sets with known characteristics to gauge whether or not data mining tools can discover those characteristics.
SDDL Sample, Results
<database>
  <seed>18745913</seed>
  <pool name="colors">
    <choice name="red"/>
    <choice name="green"/>
    <choice name="blue"/>
  </pool>
  <table name="items">
    <field name="itemID" type="int">
      <iteration base="100" count="5"/>
    </field>
    <field name="color" type="CHAR(5)">
      <formula>colors</formula>
    </field>
    <field name="inventory" type="int">
      <min>3</min>
      <max>10</max>
    </field>
  </table>
</database>
itemID  color    inventory
100     'red'    7
101     'blue'   5
102     'green'  7
103     'red'    4
104     'green'  9
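To make the sample concrete, here is a minimal sketch of how such a description could be interpreted. It is not the SDG implementation: it handles only the three constructs in the sample (an iteration, a pool reference in a formula, and a min/max range), the function name `generate` is hypothetical, and Python's random number generator will not reproduce the exact values in the results above.

```python
import random
import xml.etree.ElementTree as ET

SDDL = """<database>
  <seed>18745913</seed>
  <pool name="colors">
    <choice name="red"/> <choice name="green"/> <choice name="blue"/>
  </pool>
  <table name="items">
    <field name="itemID" type="int"><iteration base="100" count="5"/></field>
    <field name="color" type="CHAR(5)"><formula>colors</formula></field>
    <field name="inventory" type="int"><min>3</min><max>10</max></field>
  </table>
</database>"""


def generate(sddl_text):
    """Interpret a tiny subset of SDDL (assumed semantics) and return rows as dicts."""
    root = ET.fromstring(sddl_text)
    rng = random.Random(int(root.findtext("seed")))  # fixed seed gives repeatable output
    pools = {
        p.get("name"): [c.get("name") for c in p.findall("choice")]
        for p in root.findall("pool")
    }
    fields = root.find("table").findall("field")
    # The iteration element determines how many rows the table has.
    iteration = next(
        f.find("iteration") for f in fields if f.find("iteration") is not None
    )
    base, count = int(iteration.get("base")), int(iteration.get("count"))

    rows = []
    for i in range(count):
        row = {}
        for f in fields:
            name = f.get("name")
            if f.find("iteration") is not None:
                row[name] = base + i  # sequential key
            elif f.find("formula") is not None:
                # In this sketch a formula is just the name of a pool to draw from.
                row[name] = rng.choice(pools[f.findtext("formula").strip()])
            else:
                row[name] = rng.randint(int(f.findtext("min")), int(f.findtext("max")))
        rows.append(row)
    return rows


for row in generate(SDDL):
    print(row["itemID"], repr(row["color"]), row["inventory"])
```

Running it twice prints the same five rows, which is the property the regression-testing use case relies on.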
Synthetic Data Generator -- Front End
- Retrieve Tables, Attributes and Foreign Key Constraints
- Edit SDDL description
SDG Parallelism
- Parallel processes all reference the
same Synthetic Data Description Language (SDDL) file.
- Each parallel process generates a
single text output file, containing a portion of the generated table.
- Database then imports the text files
as data.
- Lack of inter-process dependencies
gives linear speedup. Speed of SDG is limited only by the number and speed
of processors.
- Output is identical regardless of the
number of generation processes utilized.
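One way to get output that is identical regardless of the process count is to make each row a pure function of the seed and the row index. The sketch below illustrates that idea only; it is an assumption about how such a property can be achieved, not a description of SDG internals, and `make_row`, `generate_partition` and the seed-mixing constant are hypothetical.

```python
import random

SEED = 18745913  # seed value borrowed from the SDDL sample


def make_row(row_index):
    """Build one row from an RNG derived only from (SEED, row_index)."""
    rng = random.Random(SEED * 1_000_003 + row_index)
    return (100 + row_index, rng.choice(["red", "green", "blue"]), rng.randint(3, 10))


def generate_partition(total_rows, num_procs, proc_id):
    """Rows owned by one process: a contiguous slice of the row index range."""
    start = total_rows * proc_id // num_procs
    end = total_rows * (proc_id + 1) // num_procs
    return [make_row(i) for i in range(start, end)]


def generate_all(total_rows, num_procs):
    """Concatenate the partitions in process order, as a bulk import would."""
    rows = []
    for proc_id in range(num_procs):
        rows.extend(generate_partition(total_rows, num_procs, proc_id))
    return rows


# The same table comes out whether 1, 4 or 7 generators are used.
print(generate_all(1000, 1) == generate_all(1000, 4) == generate_all(1000, 7))
```

Because no partition reads another partition's state, the slices can be produced on separate processors with no coordination beyond sharing the description file.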
Application: Store-Item-Sales Data
- Generated 10 years of store-item-sales data
for a major retailer. Ran generator on grid
of sixteen 1.6-GHz Itaniums.
– 70 billion rows (4.5 TB) of data generated.
– Peaked at over 500,000 rows/second.
– Generated data set in about 2 days.
– Row: Store#, Item#, Sun-Sat sales quantities, item price, total weekly sales for store/item.
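A quick consistency check of the figures above, assuming decimal terabytes and taking "about 2 days" as exactly 48 hours (both are assumptions made for this arithmetic):

```python
ROWS = 70e9                 # 70 billion rows
TOTAL_BYTES = 4.5e12        # 4.5 TB, assuming 1 TB = 10**12 bytes
SECONDS = 2 * 24 * 3600     # "about 2 days" taken as exactly 48 hours
PROCESSORS = 16

avg_rows_per_second = ROWS / SECONDS            # about 405,000 rows/second
bytes_per_row = TOTAL_BYTES / ROWS              # about 64 bytes per row
rows_per_second_per_cpu = avg_rows_per_second / PROCESSORS

print(round(avg_rows_per_second), round(bytes_per_row, 1), round(rows_per_second_per_cpu))
```

The average rate of roughly 405,000 rows/second sits below the quoted peak of over 500,000 rows/second, so the three numbers are mutually consistent.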
Application: Simple RFID Supply Chain Data
- Problem: Generate synthetic RFID events ("arrive" and "depart") for 10 million unique objects traversing 100 read points (total = 2 billion events)
- Row: TagID, ReaderNum, BizEvt, Timestamp
- Total data generated: 86 GB (2B rows)
[Figure: chain of read points: Reader 1, Reader 2, Reader 3, ..., Reader 100]
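The event stream described above can be sketched as a small generator. This is an illustration written directly in Python rather than SDDL; the dwell and transit times are made-up parameters, and the row layout follows the TagID, ReaderNum, BizEvt, Timestamp order given on the slide.

```python
def rfid_events(num_objects, num_readers, dwell=60, transit=3600, start_time=0):
    """Yield (TagID, ReaderNum, BizEvt, Timestamp) rows.

    Every object visits readers 1..num_readers in order and produces one
    "arrive" and one "depart" event per reader. dwell and transit (seconds)
    are hypothetical constants chosen for the sketch.
    """
    for tag_id in range(num_objects):
        timestamp = start_time
        for reader in range(1, num_readers + 1):
            yield (tag_id, reader, "arrive", timestamp)
            timestamp += dwell
            yield (tag_id, reader, "depart", timestamp)
            timestamp += transit


# Small-scale run: 3 objects x 4 readers x 2 events = 24 rows.
events = list(rfid_events(3, 4))
print(len(events), events[0], events[1])

# Full-scale row count and average row size implied by the slide.
total_events = 10_000_000 * 100 * 2      # 2,000,000,000
print(total_events, 86e9 / total_events)  # 43 bytes per row, assuming decimal GB
```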
Application: Complex RFID Supply-Chain Data
- Generate complex RFID
supply-chain data. SDDL file would model:
– Distances between sites
– Pallet and contained cases travel as single unit
– DCs supply specific stores
– User-specifiable read rate
– Internal readers for sites
Application: Legal Strings from Grammars
- We constructed an SDDL file that would
generate legal strings from user-defined context-free grammars.
- Successfully generated:
– Palindromes over {0,1}*: (e.g., “01001010010”)
– Valid mathematical expressions: (e.g., “(A+(B-C))/D”)
– Strings over {0,1}* ending in 00 (regular): (e.g., “101100”)
– Strings over {0,1}* with an even number of 0s and an even number of 1s (regular): (e.g., “001011010110”)
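The idea of generating legal strings from a context-free grammar can be sketched as random derivation from a start symbol. The code below is a plain Python illustration, not the SDDL encoding the authors built; the grammar representation, the `derive` function and the depth cap are assumptions, and the cap only works for grammars in which every nonterminal has at least one terminal-only production.

```python
import random

# Palindromes over {0,1}: P -> 0 P 0 | 1 P 1 | 0 | 1 | (empty)
PALINDROMES = {
    "P": [["0", "P", "0"], ["1", "P", "1"], ["0"], ["1"], []],
}


def derive(grammar, symbol, rng, depth=0, max_depth=6):
    """Expand symbol into a terminal string by picking random productions."""
    if symbol not in grammar:
        return symbol  # terminal symbol: emit as-is
    productions = grammar[symbol]
    if depth >= max_depth:
        # Force termination: keep only productions without nonterminals.
        productions = [p for p in productions if all(s not in grammar for s in p)]
    production = rng.choice(productions)
    return "".join(derive(grammar, s, rng, depth + 1, max_depth) for s in production)


rng = random.Random(18745913)
samples = [derive(PALINDROMES, "P", rng) for _ in range(5)]
print(samples)
```

Every string produced this way is in the language by construction, which is what makes the approach useful for generating test inputs with known properties.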
Application: Parallel Mailing Lists
Canonical List:
936-44-2946 Anil Khatri 802-218-1914 4761 South Cleveland Boulevard PORT HUENEME, CA 93041
121-63-1403 Joshua X. Anderson XXX-350-5098 870 South McKinley Circle PENSACOLA, FL 32534
328-06-7372 Hiroshi Kato 811-366-1942 6203 East Sixth Street SHREVEPORT, LA 71107
List with Errors, Mixed Formats:
936-44I-2946 Anil Khatri 802-218-1914 4761 S. Cleveland Boulevard PORT HUENEME, CA 93041
121-63-1403 Joshua X. Anderson 350-5098 870 S. McKinley Cir. PENSACOLA, Florida 32534
328-06-B372 Kato, Hiroshi (811) 366-1942 6203 East Sixth Street SHREVEPORT, La. 71107
Potential Future Work
- Smart front-end that will automate the process of
characterizing existing tables.
- Import tool to automatically convert existing
tabular data (example: census data) into SDDL pools.
- Improve ability to simultaneously generate
multiple tables and handle circular dependencies.
- Add streaming output capability.
Menu-based Natural Language Interfaces
Craig Thompson and Kyle Neumeier CSCE Department University of Arkansas
Problem with traditional NLIs
Habitability Problem
Natural Language Interface
MBNLI for smart devices
– Predictive menu to guide user to correct sentence
– Solves Habitability Problem
- LingoLogic
– Architecture
– Grammars
– Translations
- Applications of MBNLI
– Database querying
– Agent control
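The predictive-menu idea can be illustrated with a toy grammar: after each selection, the menu offers only the words that can legally come next, so the user cannot build a sentence the system does not understand. The sketch below is an assumption-laden illustration, not LingoLogic: the grammar, vocabulary and function names are hypothetical, and it enumerates all sentences of a small non-recursive grammar instead of using a predictive parser.

```python
GRAMMAR = {
    "S": [["find", "NOUN", "whose", "ATTR", "is", "VALUE"]],
    "NOUN": [["readers"], ["tags"]],
    "ATTR": [["location"], ["status"]],
    "VALUE": [["dock", "door"], ["active"]],
}


def sentences(symbol="S"):
    """Enumerate every token sequence derivable from symbol (finite grammar only)."""
    if symbol not in GRAMMAR:
        return [[symbol]]
    results = []
    for production in GRAMMAR[symbol]:
        partial = [[]]
        for part in production:
            partial = [head + tail for head in partial for tail in sentences(part)]
        results.extend(partial)
    return results


def next_choices(prefix):
    """Menu items: tokens that can legally follow the words chosen so far."""
    n = len(prefix)
    return sorted({s[n] for s in sentences() if s[:n] == prefix and len(s) > n})


def is_complete(prefix):
    """True when the words chosen so far already form a full sentence."""
    return prefix in sentences()


print(next_choices([]))                  # ['find']
print(next_choices(["find"]))            # ['readers', 'tags']
print(next_choices(["find", "tags", "whose", "status", "is"]))  # ['active', 'dock']
```

A completed sentence would then be handed to a translation step that maps it to the target system's language, such as SQL.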
Menu Based Natural Language Interfaces
[Figure: MBNLI architecture with components Predictive Parser, Front End, Grammar, Target System]
USF Business Plan Competition
- MBA Entrepreneurship Class
- Industry
– Tyson
– Arkansas Best Corporation
- Placed in top 10 of over 100
- Arkansas Governor's Cup
MBNLI Spatial Queries
- Project with Center for Advanced Spatial
Technology (CAST)
- Queries to Oracle Spatial
– Map the intersection of roads and railroads in Fayetteville
- Spatial functions are like theta joins in relational algebra.
- We have successfully shown that it is
possible to generate SQL for Oracle Spatial.
- Exploring automating output of query to display
using Google Maps.
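As an illustration of the theta-join analogy, the sketch below builds a SQL string in which the join condition is a spatial predicate instead of an equality. It is not the SQL the project generated: the table names, the `name`, `city` and `geom` columns, and the function name are hypothetical. `SDO_RELATE` with the `ANYINTERACT` mask is an Oracle Spatial operator, and it expects a spatial index on the first geometry column.

```python
def spatial_intersection_sql(table_a, table_b, city):
    """Build a query that pairs rows of two tables whose geometries interact.

    Table and column names are placeholders for this sketch; identifiers are
    interpolated directly, so they must come from trusted input.
    """
    city_literal = city.replace("'", "''")  # escape quotes inside the SQL literal
    return (
        f"SELECT a.name, b.name\n"
        f"FROM {table_a} a, {table_b} b\n"
        f"WHERE a.city = '{city_literal}'\n"
        f"  AND SDO_RELATE(a.geom, b.geom, 'mask=ANYINTERACT') = 'TRUE'"
    )


# "Map the intersection of roads and railroads in Fayetteville"
sql = spatial_intersection_sql("roads", "railroads", "Fayetteville")
print(sql)
```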
GRINDEX Grid Indexing Service
Craig Thompson, Reid Phillips, Evan Kirkconnell, Lisa Harrison, Matt Baker CSCE Department University of Arkansas
Problem
- Acxiom customers supply Acxiom files
– widely varying sizes and layouts:
- Millions of small (KB) files
- Thousands of medium (MB) files
- Hundreds of large (TB) files
– Files and fields are typed and fixed width
– Attributes may be raised to the file name
– Only key attributes are permitted in queries
- Customers want to query for subsets of files
Synthetic Data Generator
- Our initial test case is 130 GB
- Records contain these fields
– ABCDE composite key
– Attribute A: ~2000 values
– Attribute B: ~2500 values
– Attribute C: ~2 values
– Attribute D: ~150 values
– Attribute E: ~100 values
– RestOfRecord
- Files
– File name format: A/A_B_C.txt
– 9,125,000 files in roughly 2000 directories (one arrives every day), each containing 5000 files
– Each file contains ~150 records of 100 bytes each
- Using SDG, we generated data on 4-node ACE grid in ~4 hours
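A short arithmetic check of the test-case dimensions, using only the file count, files per directory, records per file and record size quoted above (decimal gigabytes are an assumption):

```python
FILES = 9_125_000
FILES_PER_DIRECTORY = 5000      # e.g. ~2500 values of B times ~2 values of C
RECORDS_PER_FILE = 150
RECORD_BYTES = 100

directories = FILES // FILES_PER_DIRECTORY          # 1825, i.e. five years of daily arrivals
bytes_per_file = RECORDS_PER_FILE * RECORD_BYTES    # 15,000 bytes per file
total_bytes = FILES * bytes_per_file                # 136,875,000,000 bytes
total_records = FILES * RECORDS_PER_FILE

print(directories, bytes_per_file, total_bytes / 1e9, total_records)
```

The exact directory count implied by the file total is 1825 (365 x 5), which the slide rounds to about 2000, and the data set comes to roughly 137 GB across about 1.4 billion records.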
Original Design Concepts
- This problem could be handled by a Relational DBMS
– CREATE table1 ON customerdata;
– INDEX table1 ON { attributes };
– INPUT fileNames USING (attributes);
– SELECT table1 WHERE expression;
- Except
– The data volume is huge and stored in files
– Licensing costs for RDBMS are an issue
– Only a subset of relational capability is needed
– The Acxiom grid offers opportunities for parallelism
Relational DBMS Architecture
[Figure: layered diagram. Query Layer: Query i is translated to a query tree and optimized; the query tree consists of relational algebra operators like join, project, select but also stored procedure calls; query execution results in tables. Indexing Layer: HASH i, TREE j, ... Disk-based Storage Layer.]
Workflow Grid Architecture + Indexing Layer
[Figure: layered diagram. Workflow Layer: Workflow i is automated and scheduled; workflow execution results in data sources. Indexing Layer: HASH j, TREE k, ... Grid-node-based Storage Layer.]
GRINDEX Prototype
- The original grid indexing (GRINDEX [1]) prototype
– grid abstract machine
– created a hash based index
- GRINDEX2 [2] extended the functionality
– re-implemented grid abstract machine
– re-implemented the hash with improved performance
– implemented a B+ tree index
– implemented a relational algebra command language to create tables, indices, insert, and query
[1] Jonathan Schisler, GRINDEX: Framework and Prototype for a Grid-based Index, MS Thesis, August 2005.
[2] Reid Phillips, GRINDEX2: Framework and Prototype for a Grid Resident Database System, MS Thesis, August 2006.
GRINDEX2 commands
- CREATE TABLE customerdata (attributes);
- CREATE INDEX cd SS {hash | tree …} ON attributes;
- INSERT customerdata (value, …, value)
- SELECT FROM customerdata
WHERE attrib1 IN { x1, x2, …, xn }
AND attrib2 IN { y1, y2, …, ym }
AND ...
General forms:
Create Table T (a, a, a, …)
Insert T (v, v, v, …)
Create Index I {hash | split-tree1 | b-tree} (a, a, a, …)
Select T where expr
GRINDEX2 Service
Tables = (Files, ……., ti, …….., tn)
Indexes = (i1, .ic…., ij, ik, il, …, im)
File(filei, fileindexi)
Front-end: Input Files
Create Table MyApplication (a, a, a, …)
Create Index split-tree (a, a, a, …)
For each file(segment), for each record
    Extract key attributes of record
    Insert T (key attributes, fileindex, offset(s))
Backend: Query
Select MyApplication where expression
For each <file, offset>
    Extract records and append to output file
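The front-end and back-end steps above can be sketched end to end with an in-memory hash index. This is a toy illustration, not GRINDEX2 code: the 8-character record layout, the class and function names, and the use of a Python dictionary in place of a grid-resident index are all assumptions. The query mirrors the `attrib IN { ... } AND ...` form by probing the index once per combination of listed values.

```python
from collections import defaultdict
from itertools import product

RECORD_LEN = 8  # toy fixed-width record: 3 one-character key attributes + 5 data characters


class HashIndex:
    """Maps a composite key to the (file name, offset) pairs where it occurs."""

    def __init__(self):
        self.entries = defaultdict(list)

    def insert(self, key, file_name, offset):
        self.entries[tuple(key)].append((file_name, offset))


def load(index, files):
    """Front end: scan each file, extract key attributes, insert into the index."""
    for file_name, data in files.items():
        for offset in range(0, len(data), RECORD_LEN):
            record = data[offset:offset + RECORD_LEN]
            index.insert((record[0], record[1], record[2]), file_name, offset)


def query(index, files, a_values, b_values, c_values):
    """Back end: probe every key combination, then pull records by file and offset."""
    output = []
    for key in product(a_values, b_values, c_values):
        for file_name, offset in index.entries.get(key, []):
            output.append(files[file_name][offset:offset + RECORD_LEN])
    return output


files = {"f1.txt": "a1xAAAAA" + "a2yBBBBB", "f2.txt": "b1xCCCCC"}
index = HashIndex()
load(index, files)
print(query(index, files, ["a", "b"], ["1"], ["x"]))
```

Only the key attributes and locations live in the index; the full records stay in the original files and are read only for matching keys, which is the point of subsetting large files without loading them into a DBMS.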
Index data nodes
[Figure: index data nodes over a file system directory structure; files of records …, filei, produced by synthetic data generation]
Opportunities for Parallelism
- Front-end – parallel read / file segments / bulk insert
- Within Grindex2
– Bulk inserts
– Split indexes
– Simultaneous Input and Query
Subsetting WFG – Architecture
Current and Future Work
- Issues to more fully address
– Parallelism at each stage
- Input – file segments
- Storage – parallelism in indexes and
input/query
- Output – assembling results
– Scalability - storage capacity
- Extensible Keyless Hash
- Split tree