Q1: Lift Analysis

Please calculate the following lift values for the table correlating burger and chips below:

◦ Lift(Burgers, Chips)

◦ Lift(Burgers, ^Chips)

◦ Lift(^Burgers, Chips)

◦ Lift(^Burgers, ^Chips)

Please also indicate whether each of your answers suggests independence, positive correlation, or negative correlation.

Lift(Burgers, Chips)

= s(B ∪ C)/(s(B) × s(C))

s(B ∪ C) = 600/1400 = 0.43

s(B) = 1000/1400 = 0.71

s(C) = 800/1400 = 0.57

Lift(B, C) = (600/1400)/((1000/1400) × (800/1400))

= 1.05

= Positive correlation (lift is greater than 1)

Lift(Burgers, ^Chips)

= s(B ∪ ^C)/(s(B) × s(^C))

s(B ∪ ^C) = 400/1400 = 0.29

s(B) = 1000/1400 = 0.71

s(^C) = 600/1400 = 0.43

Lift(B, ^C) = (400/1400)/((1000/1400) × (600/1400))

= 0.93

= Negative correlation (lift is less than 1)

Lift(^Burgers, Chips)

= s(^B ∪ C)/(s(^B) × s(C))

s(^B ∪ C) = 200/1400 = 0.14

s(^B) = 400/1400 = 0.29

s(C) = 800/1400 = 0.57

Lift(^B, C) = (200/1400)/((400/1400) × (800/1400))

= 0.88

= Negative correlation (lift is less than 1)

Lift(^Burgers, ^Chips)

= s(^B ∪ ^C)/(s(^B) × s(^C))

s(^B ∪ ^C) = 200/1400 = 0.14

s(^B) = 400/1400 = 0.29

s(^C) = 600/1400 = 0.43

Lift(^B, ^C) = (200/1400)/((400/1400) × (600/1400))

= 1.17

= Positive correlation (lift is greater than 1)
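The four lift calculations above can be checked with a short script. This is a minimal sketch, assuming the 2×2 joint counts implied by the supports in the working (600, 400, 200 and 200 out of 1400 transactions); the function name `lift` and the dictionary layout are illustrative choices, not part of the exercise.

```python
# Joint counts assumed from the supports used above (total = 1400).
counts = {
    ("B", "C"): 600,    # burgers and chips
    ("B", "^C"): 400,   # burgers, no chips
    ("^B", "C"): 200,   # chips, no burgers
    ("^B", "^C"): 200,  # neither
}
total = sum(counts.values())


def lift(a, b):
    """Return s(a ∪ b) / (s(a) × s(b)) computed from the joint counts."""
    joint = counts[(a, b)] / total
    # Marginal supports: sum the row for a and the column for b.
    s_a = sum(v for (x, _), v in counts.items() if x == a) / total
    s_b = sum(v for (_, y), v in counts.items() if y == b) / total
    return joint / (s_a * s_b)


for a, b in counts:
    value = lift(a, b)
    # Above 1 suggests positive correlation, below 1 negative, exactly 1 independence.
    label = "positive" if value > 1 else "negative" if value < 1 else "independent"
    print(f"Lift({a}, {b}) = {value:.2f} ({label})")
```

Working from the counts rather than from supports rounded to two decimals avoids rounding drift in the final ratio.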

Q2:

Please calculate the following lift values for the table correlating shampoo and ketchup below:

◦ Lift(Ketchup, Shampoo)

◦ Lift(Ketchup, ^Shampoo)

◦ Lift(^Ketchup, Shampoo)

◦ Lift(^Ketchup, ^Shampoo)

Please also indicate whether each of your answers suggests independence, positive correlation, or negative correlation.

Lift(Ketchup, Shampoo)

= s(K ∪ S)/(s(K) × s(S))

s(K ∪ S) = 100/900 = 0.11

s(K) = 300/900 = 0.33

s(S) = 300/900 = 0.33

Lift(K, S) = 0.11/(0.33 × 0.33)

= 1

= Independent

Lift(Ketchup, ^Shampoo)

= s(K ∪ ^S)/(s(K) × s(^S))

s(K ∪ ^S) = 200/900 = 0.22

s(K) = 300/900 = 0.33

s(^S) = 600/900 = 0.67

Lift(K, ^S) = 0.22/(0.33 × 0.67)

= 1

= Independent

Lift(^Ketchup, Shampoo)

= s(^K ∪ S)/(s(^K) × s(S))

s(^K ∪ S) = 200/900 = 0.22

s(^K) = 600/900 = 0.67

s(S) = 300/900 = 0.33

Lift(^K, S) = 0.22/(0.67 × 0.33)

= 1

= Independent

Lift(^Ketchup, ^Shampoo)

= s(^K ∪ ^S)/(s(^K) × s(^S))

s(^K ∪ ^S) = 400/900 = 0.44

s(^K) = 600/900 = 0.67

s(^S) = 600/900 = 0.67

Lift(^K, ^S) = 0.44/(0.67 × 0.67)

= 1

= Independent
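Because the supports above are rounded to two decimals, each quotient only comes out at approximately 1. A minimal sketch with exact fractions, assuming the joint counts implied by the working (100, 200, 200 and 400 out of 900 transactions), confirms that every lift is exactly 1; the variable names are illustrative.

```python
from fractions import Fraction

# Joint counts assumed from the working above: (ketchup?, shampoo?) -> count.
counts = {
    (True, True): 100,
    (True, False): 200,
    (False, True): 200,
    (False, False): 400,
}
total = sum(counts.values())  # 900 transactions

lifts = {}
for (k, s), n in counts.items():
    # Marginal counts for this ketchup value and this shampoo value.
    k_total = sum(v for (kk, _), v in counts.items() if kk == k)
    s_total = sum(v for (_, ss), v in counts.items() if ss == s)
    # Lift = joint support / (product of marginal supports), kept exact.
    lifts[(k, s)] = Fraction(n, total) / (
        Fraction(k_total, total) * Fraction(s_total, total)
    )

for key, value in lifts.items():
    print(key, value)
```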

Q3: Chi Squared Analysis

Please calculate the following chi squared values for the table correlating burger and chips below (Expected values in brackets).

◦ Burgers & Chips

◦ Burgers & Not Chips

◦ Chips & Not Burgers

◦ Not Burgers and Not Chips

For the above options, please also indicate whether each of your answers suggests independence, positive correlation, or negative correlation.

χ² = Σ (Actual - Expected)² / Expected

χ² Burgers & Chips

χ² = (900-800)²/800 + (100-200)²/200 + (300-400)²/400 + (200-100)²/100

= 12.5 + 50 + 25 + 100 = 187.5

= Positive correlation (actual is greater than expected)

χ² Burgers & Not Chips

χ² = (100-200)²/200 + (300-400)²/400 + (200-100)²/100

= 50 + 25 + 100 = 175

= Negative correlation (expected is greater than actual)

χ² Chips & Not Burgers

= (300-400)²/400 + (200-100)²/100

= 25 + 100 = 125

= Negative correlation (expected is greater than actual)

χ² Not Chips & Not Burgers

= (200-100)²/100

= 100

= Positive correlation (actual is greater than expected)
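The χ² statistic is usually reported once for the whole table, with each cell contributing one term (Actual - Expected)²/Expected, where the denominator is that cell's own expected count. A minimal sketch, assuming the observed and expected counts that appear in the terms above and assuming the cells follow the order of the question's bullets, prints every cell's term, its direction, and the total; the names `cells` and `contribution` are illustrative.

```python
# (observed, expected) pairs taken from the terms in the working above.
# The cell labels are an assumption based on the order of the bullets.
cells = {
    "Burgers & Chips": (900, 800),
    "Burgers & Not Chips": (100, 200),
    "Chips & Not Burgers": (300, 400),
    "Not Burgers & Not Chips": (200, 100),
}


def contribution(observed, expected):
    """One cell's term of the chi-squared sum: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


chi_squared = 0.0
for name, (obs, exp) in cells.items():
    term = contribution(obs, exp)
    chi_squared += term
    # Observed above expected suggests positive correlation, below negative.
    if obs > exp:
        direction = "positive"
    elif obs < exp:
        direction = "negative"
    else:
        direction = "independent"
    print(f"{name}: term = {term}, {direction}")

print("chi-squared =", chi_squared)
```

The size of χ² indicates how strongly the table departs from independence; the sign of each cell's Actual - Expected difference indicates the direction for that cell.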

Q4: Chi Squared Analysis

Please calculate the following chi squared values for the table correlating burger and sausages below (Expected values in brackets).

◦ Burgers & Sausages

◦ Burgers & Not Sausages

◦ Sausages & Not Burgers

◦ Not Burgers and Not Sausages

For the above options, please also indicate whether each of your answers suggests independence, positive correlation, or negative correlation.

χ² Burgers & Sausages

= (800-800)²/800 + (200-200)²/200 + (400-400)²/400 + (100-100)²/100

= 0 + 0 + 0 + 0 = 0

= Independent

χ² Burgers & Not Sausages

= (200-200)²/200 + (400-400)²/400 + (100-100)²/100

= 0 + 0 + 0 = 0

= Independent

χ² Sausages & Not Burgers

= (400-400)²/400 + (100-100)²/100

= 0 + 0 = 0

= Independent

χ² Not Burgers & Not Sausages

= (100-100)²/100

= 0

= Independent

Q5:

Under what conditions would Lift and Chi Squared analysis prove to be a poor algorithm to evaluate correlation/dependency between two events?

Please suggest another algorithm that could be used to rectify the flaw in Lift and Chi Squared?

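Both Lift and Chi Squared are poor measures of correlation or dependency between two events when the data contains a large number of null transactions (transactions that contain neither event), because both values change as the null count changes.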

Alternatively, one can use a null-invariant measure such as:

AllConf(A, B)

Jaccard(A, B)

Cosine(A, B)

Kulczynski(A, B)

MaxConf(A, B)
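The effect of null transactions can be illustrated with a short comparison. This is a minimal sketch using hypothetical counts (100 transactions with both A and B, 100 with A only, 100 with B only) that are not taken from any table in this exercise; only the number of null transactions changes between the two runs, and the function name `measures` is illustrative. The formulas follow the standard textbook definitions of each measure.

```python
from math import sqrt


def measures(both, a_only, b_only, null):
    """Return lift and five null-invariant measures from 2x2 counts."""
    sup_a = both + a_only   # transactions containing A
    sup_b = both + b_only   # transactions containing B
    total = both + a_only + b_only + null
    return {
        # Lift depends on the total, so it depends on the null count.
        "lift": both * total / (sup_a * sup_b),
        # The measures below never use `null` or `total`.
        "all_conf": both / max(sup_a, sup_b),
        "jaccard": both / (sup_a + sup_b - both),
        "cosine": both / sqrt(sup_a * sup_b),
        "kulczynski": 0.5 * (both / sup_a + both / sup_b),
        "max_conf": both / min(sup_a, sup_b),
    }


# Hypothetical counts: identical except for the null transactions.
few_nulls = measures(both=100, a_only=100, b_only=100, null=0)
many_nulls = measures(both=100, a_only=100, b_only=100, null=10_000)

for name in few_nulls:
    print(f"{name}: {few_nulls[name]:.3f} -> {many_nulls[name]:.3f}")
```

With these assumed counts, lift moves from below 1 to well above 1 purely because null transactions were added, while the other five measures stay the same.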