Why does inverting a covariance matrix yield partial correlations between random variables?



I have heard that partial correlations between random variables can be found by inverting the covariance matrix and taking the appropriate cells from the resulting precision matrix (this fact is mentioned at http://en.wikipedia.org/wiki/Partial_correlation, but without a proof).

Why is this the case?


If you mean the partial correlation of a pair of variables controlling for all the other variables, then the last paragraph here may shed some light.
ttnphns

Answers:



When a multivariate random variable $(X_1, X_2, \ldots, X_n)$ has a nondegenerate covariance matrix $C = (\gamma_{ij}) = (\operatorname{Cov}(X_i, X_j))$, the set of all real linear combinations of the $X_i$ forms an $n$-dimensional real vector space with basis $E = (X_1, X_2, \ldots, X_n)$ and a non-degenerate inner product given by

$$\langle X_i, X_j \rangle = \gamma_{ij}\ .$$

Its dual basis with respect to this inner product, $E^* = (X_1^*, X_2^*, \ldots, X_n^*)$, is uniquely defined by the relationships

$$\langle X_i^*, X_j \rangle = \delta_{ij}\ ,$$

the Kronecker delta (equal to $1$ when $i = j$ and $0$ otherwise).

The dual basis is of interest here because the partial correlation of $X_i$ and $X_j$ is obtained as the correlation between the part of $X_i$ that is left after projecting it into the space spanned by all the other vectors (let's simply call it its "residual," $X_{i\circ}$) and the comparable part of $X_j$, its residual $X_{j\circ}$. Yet $X_i^*$ is a vector that is orthogonal to all vectors besides $X_i$ and has positive inner product with $X_i$, whence $X_{i\circ}$ must be some non-negative multiple of $X_i^*$, and likewise for $X_j$. Let us therefore write

$$X_{i\circ} = \lambda_i X_i^*, \quad X_{j\circ} = \lambda_j X_j^*$$

for positive real numbers $\lambda_i$ and $\lambda_j$.

The partial correlation is the normalized dot product of the residuals, which is unchanged by rescaling:

$$\rho_{ij} = \frac{\langle X_{i\circ}, X_{j\circ} \rangle}{\sqrt{\langle X_{i\circ}, X_{i\circ} \rangle \, \langle X_{j\circ}, X_{j\circ} \rangle}} = \frac{\lambda_i \lambda_j \langle X_i^*, X_j^* \rangle}{\sqrt{\lambda_i^2 \langle X_i^*, X_i^* \rangle \, \lambda_j^2 \langle X_j^*, X_j^* \rangle}} = \frac{\langle X_i^*, X_j^* \rangle}{\sqrt{\langle X_i^*, X_i^* \rangle \, \langle X_j^*, X_j^* \rangle}}\ .$$

(In either case the partial correlation will be zero whenever the residuals are orthogonal, whether or not they are nonzero.)

We need to find the inner products of dual basis elements. To this end, expand the dual basis elements in terms of the original basis $E$:

$$X_i^* = \sum_{j=1}^{n} \beta_{ij} X_j\ .$$

Then by definition

$$\delta_{ik} = \langle X_i^*, X_k \rangle = \sum_{j=1}^{n} \beta_{ij} \langle X_j, X_k \rangle = \sum_{j=1}^{n} \beta_{ij} \gamma_{jk}\ .$$

In matrix notation, with $I = (\delta_{ij})$ the identity matrix and $B = (\beta_{ij})$ the change-of-basis matrix, this states

$$I = BC\ .$$

That is, $B = C^{-1}$, which is exactly what the Wikipedia article is asserting. The previous formula for the partial correlation gives

$$\rho_{ij} = \frac{\beta_{ij}}{\sqrt{\beta_{ii} \beta_{jj}}} = \frac{(C^{-1})_{ij}}{\sqrt{(C^{-1})_{ii} (C^{-1})_{jj}}}\ .$$
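As a quick numerical check (my own addition, not part of the original answer): the sketch below, assuming NumPy is available, regresses each variable on all of the other variables, correlates the residuals, and compares the result with the normalized entries of $C^{-1}$. It uses this answer's notion of residual (projection onto the span of all the other variables); the sign convention is taken up in the answers below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data with a non-degenerate covariance structure.
n, p = 100_000, 4
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X -= X.mean(axis=0)              # center so the sample covariance matches the residual geometry exactly

C = np.cov(X, rowvar=False)      # sample covariance matrix
B = np.linalg.inv(C)             # precision matrix, B = C^{-1}

def residual(k):
    """Residual of column k after projecting onto the span of all the other columns."""
    others = np.delete(X, k, axis=1)
    coef, *_ = np.linalg.lstsq(others, X[:, k], rcond=None)
    return X[:, k] - others @ coef

i, j = 0, 1
rho_residuals = np.corrcoef(residual(i), residual(j))[0, 1]
rho_dual = B[i, j] / np.sqrt(B[i, i] * B[j, j])
print(np.isclose(rho_residuals, rho_dual))   # True (up to floating-point error)
```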

+1, great answer. But why do you call this dual basis "dual basis with respect to this inner product" -- what does "with respect to this inner product" exactly mean? It seems that you use the term "dual basis" as defined here mathworld.wolfram.com/DualVectorSpace.html in the second paragraph ("Given a vector space basis $v_1, \ldots, v_n$ for $V$ there exists a dual basis...") or here en.wikipedia.org/wiki/Dual_basis, and it's independent of any scalar product.
amoeba says Reinstate Monica

@amoeba There are two kinds of duals. The (natural) dual of any vector space $V$ over a field $R$ is the set of linear functions $\phi: V \to R$, called $V^*$. There is no canonical way to identify $V^*$ with $V$, even though they have the same dimension when $V$ is finite-dimensional. Any inner product $\gamma$ corresponds to such a map $g: V \to V^*$, and vice versa, via $g(v)(w) = \gamma(v, w)$. (Nondegeneracy of $\gamma$ ensures $g$ is a vector space isomorphism.) This gives a way to view elements of $V$ as if they were elements of the dual $V^*$ -- but it depends on $\gamma$.
whuber

@mpettis Those dots were hard to notice. I have replaced them with small open circles to make the notation easier to read. Thanks for pointing this out.
whuber

@Andy Ron Christensen's Plane Answers to Complex Questions might be the sort of thing you are looking for. Unfortunately, his approach makes (IMHO) undue reliance on coordinate arguments and calculations. In the original introduction (see p. xiii), Christensen explains that's for pedagogical reasons.
whuber

@whuber, your proof is awesome. I wonder whether any book or article contains such a proof, so that I can cite it.
Harry


Here is a proof with just matrix calculations.

I appreciate the answer by whuber. It is very insightful about the math behind the scenes. However, it is still not obvious how to use his answer to obtain the minus sign in the formula stated in the Wikipedia article (Partial_correlation#Using_matrix_inversion):
$$\rho_{X_iX_j \cdot V \setminus \{X_i, X_j\}} = -\frac{p_{ij}}{\sqrt{p_{ii} p_{jj}}}\ .$$

To get this minus sign, here is a different proof, found in Lauritzen, "Graphical Models" (1995), page 130. It is done with simple matrix calculations.

The key is the following matrix identity:
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} E^{-1} & -E^{-1}G \\ -FE^{-1} & D^{-1} + FE^{-1}G \end{pmatrix}$$
where $E = A - BD^{-1}C$, $F = D^{-1}C$, and $G = BD^{-1}$.
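As a hedged aside (not in the original post), here is a minimal NumPy sketch that checks this block-inversion identity on a random positive-definite matrix:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random symmetric positive-definite matrix, partitioned into blocks of sizes (2, 3).
M = rng.normal(size=(5, 5))
S = M @ M.T + 5 * np.eye(5)
A, B = S[:2, :2], S[:2, 2:]
C, D = S[2:, :2], S[2:, 2:]

Dinv = np.linalg.inv(D)
E = A - B @ Dinv @ C          # Schur complement of D
F = Dinv @ C
G = B @ Dinv
Einv = np.linalg.inv(E)

block_inverse = np.block([[Einv,      -Einv @ G],
                          [-F @ Einv, Dinv + F @ Einv @ G]])

print(np.allclose(block_inverse, np.linalg.inv(S)))   # True
```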

Write down the covariance matrix as
$$\Omega = \begin{pmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix}$$
where $\Omega_{11}$ is the covariance matrix of $(X_i, X_j)$ and $\Omega_{22}$ is the covariance matrix of $V \setminus \{X_i, X_j\}$.

Let $P = \Omega^{-1}$. Similarly, write down $P$ as
$$P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}.$$

By the key matrix identity,
$$P_{11}^{-1} = \Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21}\ .$$

We also know that $\Omega_{11} - \Omega_{12} \Omega_{22}^{-1} \Omega_{21}$ is the covariance matrix of $(X_i, X_j) \mid V \setminus \{X_i, X_j\}$ (from Multivariate_normal_distribution#Conditional_distributions). The partial correlation is therefore
$$\rho_{X_iX_j \cdot V \setminus \{X_i, X_j\}} = \frac{[P_{11}^{-1}]_{12}}{\sqrt{[P_{11}^{-1}]_{11} [P_{11}^{-1}]_{22}}}\ .$$
I use the notation that the $(k, l)$th entry of the matrix $M$ is denoted by $[M]_{kl}$.

By the simple inversion formula for a 2-by-2 matrix,
$$\begin{pmatrix} [P_{11}^{-1}]_{11} & [P_{11}^{-1}]_{12} \\ [P_{11}^{-1}]_{21} & [P_{11}^{-1}]_{22} \end{pmatrix} = P_{11}^{-1} = \frac{1}{\det P_{11}} \begin{pmatrix} [P_{11}]_{22} & -[P_{11}]_{12} \\ -[P_{11}]_{21} & [P_{11}]_{11} \end{pmatrix}.$$

Therefore,
$$\rho_{X_iX_j \cdot V \setminus \{X_i, X_j\}} = \frac{[P_{11}^{-1}]_{12}}{\sqrt{[P_{11}^{-1}]_{11} [P_{11}^{-1}]_{22}}} = \frac{-\frac{1}{\det P_{11}} [P_{11}]_{12}}{\sqrt{\frac{1}{\det P_{11}} [P_{11}]_{22} \cdot \frac{1}{\det P_{11}} [P_{11}]_{11}}} = -\frac{[P_{11}]_{12}}{\sqrt{[P_{11}]_{11} [P_{11}]_{22}}}\ ,$$
which is exactly what the Wikipedia article is asserting.
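As a numerical illustration (my own sketch, assuming NumPy), the following inverts a covariance matrix, takes the $2 \times 2$ block $P_{11}$ belonging to $(X_i, X_j)$, and confirms that the correlation computed from the conditional covariance $P_{11}^{-1}$ equals $-p_{ij}/\sqrt{p_{ii} p_{jj}}$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Random positive-definite covariance matrix; treat the first two variables as (X_i, X_j).
M = rng.normal(size=(5, 5))
Omega = M @ M.T + 5 * np.eye(5)          # covariance matrix
P = np.linalg.inv(Omega)                 # precision matrix
P11 = P[:2, :2]                          # block for (X_i, X_j)

cond_cov = np.linalg.inv(P11)            # equals Omega_11 - Omega_12 Omega_22^{-1} Omega_21

rho_from_conditional = cond_cov[0, 1] / np.sqrt(cond_cov[0, 0] * cond_cov[1, 1])
rho_from_precision = -P[0, 1] / np.sqrt(P[0, 0] * P[1, 1])
print(np.isclose(rho_from_conditional, rho_from_precision))   # True
```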

If we let $i = j$, then $\rho_{X_iX_i \cdot V \setminus \{X_i, X_i\}} = -1$. How do we interpret those diagonal elements in the precision matrix?
Jason

Good point. The formula is only valid for $i \neq j$. In the proof, the minus sign comes from the 2-by-2 matrix inversion; it would not appear if $i = j$.
Po C.

So the diagonal numbers can't be associated with partial correlation. What do they represent? They are not just inverses of the variances, are they?
Jason

This formula is valid for $i \neq j$. It is meaningless for $i = j$.
Po C.


Note that the sign of the answer actually depends on how you define partial correlation. There is a difference between regressing $X_i$ and $X_j$ on the other $n - 1$ variables separately vs. regressing $X_i$ and $X_j$ on the other $n - 2$ variables together. Under the second definition, let the correlation between the residuals $\epsilon_i$ and $\epsilon_j$ be $\rho$. Then the partial correlation of the two (regressing $\epsilon_i$ on $\epsilon_j$ and vice versa) is $-\rho$.

This explains the confusion in the comments above, as well as on Wikipedia. The second definition is used universally from what I can tell, so there should be a negative sign.
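To make the sign difference concrete, here is a small sketch (my own addition, assuming NumPy) that computes both versions of the partial correlation on simulated data and compares them with the normalized precision-matrix entry:

```python
import numpy as np

rng = np.random.default_rng(3)

n, p = 100_000, 4
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X -= X.mean(axis=0)

def residual(y, Z):
    """Residual of y after least-squares regression on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return y - Z @ coef

i, j = 0, 1
others = np.delete(X, [i, j], axis=1)    # the remaining n - 2 variables
all_but_i = np.delete(X, i, axis=1)      # the other n - 1 variables (includes X_j)
all_but_j = np.delete(X, j, axis=1)      # the other n - 1 variables (includes X_i)

# Second definition: regress both variables on the other n - 2 variables together.
rho_2 = np.corrcoef(residual(X[:, i], others), residual(X[:, j], others))[0, 1]

# First definition: regress each variable on all n - 1 remaining variables separately.
rho_1 = np.corrcoef(residual(X[:, i], all_but_i), residual(X[:, j], all_but_j))[0, 1]

P = np.linalg.inv(np.cov(X, rowvar=False))
print(np.isclose(rho_2, -P[i, j] / np.sqrt(P[i, i] * P[j, j])))   # True: the usual convention
print(np.isclose(rho_1, -rho_2))                                  # True: the two definitions differ in sign
```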

I originally posted an edit to the other answer, but made a mistake - sorry about that!

Licensed under cc by-sa 3.0 with attribution required.