SLIDE 1

How to Explain Log-Linear Relation Between Amount of Computations and Effectiveness of the Result – a Relation that Motivates the Need for Big Data

Francisco Zapata, Olga Kosheleva, and Vladik Kreinovich

University of Texas at El Paso, El Paso, TX 79968
fazg74@gmail.com, olgak@utep.edu, vladik@utep.edu

SLIDE 2

1. Log-Linear Relation: A Brief Description

  • It is known that:
      – the more computations we perform,
      – the more efficient the decisions and designs resulting from these computations.
  • Empirical data shows that there is a log-linear dependence between:
      – the effectiveness e of an application and
      – the amount d of computations that led to this application.
  • Specifically, we have e = a + b · ln(d) for some constants a and b.
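
To illustrate what such a dependence looks like in practice, here is a minimal Python sketch that fits the constants a and b by ordinary least squares on ln(d). The data points are synthetic, generated from made-up values a = 2 and b = 0.5; they are not the empirical data mentioned above, and the function name fit_log_linear is our own.

```python
import math

def fit_log_linear(d_values, e_values):
    """Least-squares fit of e = a + b * ln(d); returns (a, b)."""
    xs = [math.log(d) for d in d_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_e = sum(e_values) / n
    # Slope and intercept of the regression line of e on ln(d).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxe = sum((x - mean_x) * (e - mean_e) for x, e in zip(xs, e_values))
    b = sxe / sxx
    a = mean_e - b * mean_x
    return a, b

# Synthetic, noise-free data (assumed values, for illustration only).
true_a, true_b = 2.0, 0.5
d_values = [10, 100, 1_000, 10_000, 100_000]
e_values = [true_a + true_b * math.log(d) for d in d_values]

a, b = fit_log_linear(d_values, e_values)
print(f"a = {a:.3f}, b = {b:.3f}")
```

On noise-free data the fit recovers the assumed constants up to rounding; with real measurements one would expect scatter around the fitted line.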

SLIDE 3

2. This Empirical Relation Explains Why We Need Big Data

  • Reminder: the formula e = a + b · ln(d) describes the relation between:
      – the effectiveness e of an application and
      – the amount d of computations that led to this application.
  • This empirical relation can be reformulated as d ∼ exp(const · e), where const = 1/b.
  • This reformulation explains why we need big data:
      – every time we want to increase effectiveness by one unit,
      – we need to multiply the amount of processed data by the same constant factor exp(1/b), e.g., to double it when b = 1/ln(2).
  • What we do: we provide an explanation for the empirical log-linear dependence.
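
The exponential growth can be checked numerically. The sketch below inverts e = a + b · ln(d) and prints how d grows with e; the constants a = 0 and b = 1/ln(2) are assumptions chosen so that each unit of e corresponds to a doubling, and the function name computations_needed is hypothetical.

```python
import math

def computations_needed(e, a, b):
    """Invert e = a + b * ln(d) to get d = exp((e - a) / b)."""
    return math.exp((e - a) / b)

# Assumed constants, for illustration only.
# With b = 1/ln(2), each extra unit of e costs a doubling of d.
a, b = 0.0, 1.0 / math.log(2.0)

for e in range(1, 6):
    d = computations_needed(e, a, b)
    ratio = computations_needed(e + 1, a, b) / d
    print(f"e = {e}: d = {d:.1f}, d(e+1)/d(e) = {ratio:.3f}")

# The same multiplicative factor applies to every unit step in e.
growth_factor = math.exp(1.0 / b)
```

For other values of b the factor exp(1/b) differs from 2, but it is still the same for every unit step, which is what makes the required amount of computations grow exponentially.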

SLIDE 4

3. How to Estimate Effectiveness

  • The effectiveness e of an application is proportional to the number m of useful features that this design has.
  • For example, let us look at a headache medicine.
  • Its first – and most important – feature is that it should cure headaches.
  • If it also avoids negative effects on the stomach, this is better.
  • If it also clears your sinuses, even better, etc.
  • Let us denote the average probability that a randomly selected substance (or design) has a feature by p.
  • The features are usually independent.
  • So, the probability that a randomly selected design has m features is p^m.
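
The independence argument can be checked with a small Monte Carlo simulation. This is only a sketch: the values p = 0.5 and m = 3 are assumptions, and the function name estimate_all_features is our own.

```python
import random

def estimate_all_features(p, m, trials, seed=0):
    """Fraction of random designs that have all m independent features."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Each feature is present independently with probability p.
        if all(rng.random() < p for _ in range(m)):
            hits += 1
    return hits / trials

p, m = 0.5, 3  # assumed values, for illustration only
estimate = estimate_all_features(p, m, trials=200_000)
exact = p ** m
print(f"simulated: {estimate:.4f}, p^m: {exact:.4f}")
```

The simulated fraction should be close to p^m, with the gap shrinking as the number of trials grows.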

SLIDE 5

4. Towards an Explanation

  • The probability that a randomly selected design has m features is p^m.
  • According to statistics:
      – if a rare event has probability q,
      – then we need, on average, a sample of size ≈ 1/q to observe at least one such event.
  • So, to find a design with m features, we need to test 1/p^m different designs.
  • The resulting amount of computations d is proportional to the number of tested designs, i.e., to 1/p^m:
        d = c′ · (1/p)^m.
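
The 1/q rule can also be illustrated by simulation: we repeatedly test random designs until one has all m features, and average the number of tests. The values p = 0.5 and m = 3 are assumptions, and the function names are hypothetical.

```python
import random

def trials_until_first_success(q, rng):
    """Number of independent tests until an event of probability q occurs."""
    count = 1
    while rng.random() >= q:
        count += 1
    return count

def average_trials(q, runs, seed=0):
    """Average number of tests over many repeated searches."""
    rng = random.Random(seed)
    total = sum(trials_until_first_success(q, rng) for _ in range(runs))
    return total / runs

p, m = 0.5, 3   # assumed values, for illustration only
q = p ** m      # probability that a random design has all m features
avg = average_trials(q, runs=50_000)
print(f"average number of tested designs: {avg:.2f}, 1/p^m = {1 / q:.2f}")
```

The average should come out close to 1/p^m, so the number of designs to test grows exponentially with m.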

SLIDE 6

5. Resulting Explanation

  • The amount of computations is
        d = c′ · (1/p)^m.
  • By taking logarithms of both sides, we get
        ln(d) = ln(c′) + m · ln(1/p).
  • So m = A + B · ln(d), where
        B = 1/ln(1/p) and A = −ln(c′)/ln(1/p).
  • On the other hand, the effectiveness e of a design is proportional to m: e = c″ · m for some constant c″ (in general different from c′).
  • Hence e = a + b · ln(d), where a = c″ · A and b = c″ · B.
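
As a sanity check on this derivation, the sketch below compares the effectiveness computed directly from m with the value given by the log-linear formula. All numerical constants are assumptions; c_d and c_e are our own names for the proportionality constants in the formulas for d and for e, respectively.

```python
import math

# Assumed constants, chosen only for illustration.
p = 0.25    # probability that a random design has a given feature
c_d = 3.0   # proportionality constant in d = c_d * (1/p)**m
c_e = 2.0   # proportionality constant in e = c_e * m

# Constants of m = A + B * ln(d), obtained by solving the log equation for m.
B = 1.0 / math.log(1.0 / p)
A = -math.log(c_d) / math.log(1.0 / p)
a, b = c_e * A, c_e * B

max_error = 0.0
for m in range(1, 11):
    d = c_d * (1.0 / p) ** m           # amount of computations for m features
    e_direct = c_e * m                 # effectiveness from the feature count
    e_formula = a + b * math.log(d)    # effectiveness from the log-linear law
    max_error = max(max_error, abs(e_direct - e_formula))

print(f"a = {a:.4f}, b = {b:.4f}, max |difference| = {max_error:.2e}")
```

The two values of e agree up to floating-point rounding, which is consistent with both A and B having ln(1/p) in the denominator.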

SLIDE 7

6. Conclusions

  • Thus, we indeed get a log-linear dependence between:
      – the effectiveness e of an application and
      – the amount d of computations that led to this application.

SLIDE 8

7. References

  • M. Banko and E. Brill, “Scaling to very very large corpora for natural language disambiguation”, Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, 2001, pp. 26–33.
  • T. Brants et al., “Large language models in machine translation”, Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007, pp. 858–867.
  • J. Lin, “Is big data a transient problem?”, IEEE Internet Computing, September/October 2015, pp. 86–90.