SLIDE 12: Implementation Issues (3)
UDF functionality + traditional relational DB facilities (UDFs are closely tied to the relational DB engine):
- eliminates the communication cost between the two execution layers (functional/relational)

Naive Bayes classification example:
- use a UDF to split documents into words
- use the relational facilities to calculate word frequencies
- use aggregate UDFs to compute the sum of logs
- ALL done in one madSQL query, completely within madIS
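The three Naive Bayes steps above can be sketched outside madIS with Python's standard sqlite3 module. This is only an illustration under stated assumptions, not the madIS implementation: the names split_words and sumlog, the table train and the sample rows are hypothetical; the splitting UDF returns a JSON array that is expanded with SQLite's json_each (which assumes a build with JSON support); add-one smoothing and uniform class priors are assumed.

```python
import json
import math
import re
import sqlite3


def split_words(text):
    """Scalar UDF (hypothetical name): return a document's words as a JSON array."""
    return json.dumps(re.findall(r"[a-z]+", text.lower()))


class SumLog:
    """Aggregate UDF (hypothetical name): sum of natural logs over a group."""

    def __init__(self):
        self.total = 0.0

    def step(self, value):
        self.total += math.log(value)

    def finalize(self):
        return self.total


conn = sqlite3.connect(":memory:")
conn.create_function("split_words", 1, split_words)
conn.create_aggregate("sumlog", 1, SumLog)

# Made-up training data, for illustration only.
conn.executescript("""
    CREATE TABLE train(class TEXT, body TEXT);
    INSERT INTO train VALUES
        ('physics', 'quantum field theory and particle physics'),
        ('physics', 'particle collisions in quantum experiments'),
        ('biology', 'cell biology and protein folding'),
        ('biology', 'protein expression in cell cultures');
""")

# One query: UDF splitting, relational counting, aggregate UDF scoring.
NB_QUERY = """
WITH words AS (          -- UDF: one row per (class, term) occurrence
    SELECT t.class AS class, j.value AS term
    FROM train AS t, json_each(split_words(t.body)) AS j
),
freq AS (                -- relational: term frequency per class
    SELECT class, term, COUNT(*) AS n FROM words GROUP BY class, term
),
class_total AS (         -- relational: word occurrences per class
    SELECT class, COUNT(*) AS n FROM words GROUP BY class
),
vocab AS (SELECT COUNT(DISTINCT term) AS v FROM words),
doc_terms AS (           -- UDF again: words of the document to classify
    SELECT j.value AS term FROM json_each(split_words(:doc)) AS j
)
SELECT class_total.class,
       -- aggregate UDF: sum of log smoothed term probabilities
       sumlog((COALESCE(freq.n, 0) + 1.0) / (class_total.n + vocab.v)) AS score
FROM doc_terms
CROSS JOIN class_total
CROSS JOIN vocab
LEFT JOIN freq
       ON freq.class = class_total.class AND freq.term = doc_terms.term
GROUP BY class_total.class
ORDER BY score DESC
"""


def classify(doc):
    """Return (class, log-score) rows, best class first."""
    return conn.execute(NB_QUERY, {"doc": doc}).fetchall()


print(classify("quantum particle experiments"))
```

With this sample data the call ranks 'physics' first. The point of the sketch is that splitting, counting and scoring all run inside a single SQL statement, so no intermediate results cross between the functional and relational layers.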
Every process (classification, visualization) is implemented in terms of an (extended) SQLite query:

    create temp table if not exists resultsTable as
    select ontop(5, p, title, class, matches, p)
    from (
        select title, class, jgroup(term, p) as matches, sum(p) as p
        from (
            select *
            from (select title, textwindow((summary), 0, 0, 2) from abstractTable),
                 arxiv
            where middle = term
               or regexpr('(\S+)(\s)(\S+)', middle, '1') = term
        )
        group by title, class
    )
    group by title;