Geometric understanding of PCA in the subject (dual) space

I am trying to get an intuitive understanding of how principal component analysis (PCA) works in the subject (dual) space.

Consider a 2D dataset with two variables, $x_1$ and $x_2$, and $n$ data points (the data matrix $\mathbf X$ is $n\times 2$ and is assumed to be centered). The usual presentation of PCA is that we consider $n$ points in $\mathbb R^2$, write down the $2\times 2$ covariance matrix, and find its eigenvectors and eigenvalues; the first PC corresponds to the direction of maximum variance, etc. Here is an example with covariance matrix $\mathbf C = \left(\begin{array}{cc}4&2\\2&2\end{array}\right)$. Red lines show the eigenvectors scaled by the square roots of the respective eigenvalues.

[Figure: the $n$ points in $\mathbb R^2$, with the scaled eigenvectors drawn as red lines]
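As a minimal sketch of the standard computation described above, the eigen-decomposition of the example covariance matrix can be reproduced with NumPy. The variable names (`eigvals`, `eigvecs`, `scaled_axes`) are illustrative choices, not part of the original post.

```python
import numpy as np

# Covariance matrix from the example in the question.
C = np.array([[4.0, 2.0],
              [2.0, 2.0]])

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)

# Reorder so that the direction of maximum variance comes first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Each column is an eigenvector scaled by the square root of its eigenvalue,
# which is how the red lines are described in the text.
scaled_axes = eigvecs * np.sqrt(eigvals)

print("eigenvalues:", eigvals)
print("scaled principal axes (columns):")
print(scaled_axes)
```

For this matrix the eigenvalues are $3 \pm \sqrt 5$, so the squared lengths of the two scaled axes add up to the total variance, $4 + 2 = 6$.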

Now consider what happens in the subject space (I learned this term from @ttnphns), also known as dual space (the term used in machine learning). This is an $n$-dimensional space where the samples of our two variables (the two columns of $\mathbf X$) form two vectors $\mathbf x_1$ and $\mathbf x_2$. The squared length of each variable vector is equal to its variance, and the cosine of the angle between the two vectors is equal to the correlation between them. This representation, by the way, is very standard in treatments of multiple regression. In my example, the subject space looks like this (I only show the 2D plane spanned by the two variable vectors):

[Figure: the two variable vectors $\mathbf x_1$ and $\mathbf x_2$ in the plane they span]
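The following sketch draws two plane vectors whose squared lengths and angle reproduce the example covariance matrix. It assumes the variable vectors have been scaled so that squared length equals variance, and it uses a Cholesky factor as one convenient (not unique) way to obtain plane coordinates; the name `A` is hypothetical.

```python
import numpy as np

C = np.array([[4.0, 2.0],
              [2.0, 2.0]])

# Assumption: any 2x2 matrix A with A.T @ A == C gives valid plane coordinates
# for the two variable vectors. The Cholesky factor is one convenient choice.
L = np.linalg.cholesky(C)      # lower triangular, C == L @ L.T
A = L.T                        # columns of A are x1 and x2 in the plane
x1, x2 = A[:, 0], A[:, 1]

# Squared lengths should match the variances on the diagonal of C.
var1, var2 = x1 @ x1, x2 @ x2

# The cosine of the angle should match the correlation.
cos_angle = (x1 @ x2) / (np.linalg.norm(x1) * np.linalg.norm(x2))
corr = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])

print("x1 =", x1, " x2 =", x2)
print("squared lengths:", var1, var2)
print("cos(angle) =", cos_angle, " correlation =", corr)
```

Any rotation or reflection of these two vectors in the plane would serve equally well, since only their lengths and the angle between them matter.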

The principal components, being linear combinations of the two variables, will form two vectors $\mathbf p_1$ and $\mathbf p_2$ in the same plane. My question is: what is the geometric understanding or intuition of how to form the principal component variable vectors from the original variable vectors on such a plot? Given $\mathbf x_1$ and $\mathbf x_2$, what geometric procedure would yield $\mathbf p_1$?

Below is my current partial understanding of it.

First of all, I can compute the principal components/axes through the standard method and plot them on the same figure:

[Figure: the subject-space plane with $\mathbf x_1$, $\mathbf x_2$ and the computed principal component vectors]

Moreover, we can note that $\mathbf p_1$ is chosen such that the sum of squared distances between the $\mathbf x_i$ (blue vectors) and their projections on $\mathbf p_1$ is minimal; those distances are reconstruction errors and they are shown with black dashed lines. Equivalently, $\mathbf p_1$ maximizes the sum of the squared lengths of both projections. This fully specifies $\mathbf p_1$ and of course is completely analogous to the similar description in the primary space (see the animation in my answer to Making sense of principal component analysis, eigenvectors & eigenvalues). See also the first part of @ttnphns's answer here.

However, this is not geometric enough! It doesn't tell me how to find such a $\mathbf p_1$ and does not specify its length.

My guess is that $\mathbf x_1$, $\mathbf x_2$, $\mathbf p_1$, and $\mathbf p_2$ all lie on one ellipse centered at $0$, with $\mathbf p_1$ and $\mathbf p_2$ being its main axes. Here is how it looks in my example:

[Figure: an ellipse centered at $0$ passing through $\mathbf x_1$, $\mathbf x_2$, $\mathbf p_1$ and $\mathbf p_2$]
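The conjecture can at least be checked numerically for the example. The sketch below assumes the same Cholesky-based plane coordinates as a stand-in for the figure (the names `A`, `P` and `ellipse_radius` are hypothetical) and tests whether all four vectors satisfy the equation of the ellipse whose semi-axes are $\mathbf p_1$ and $\mathbf p_2$.

```python
import numpy as np

C = np.array([[4.0, 2.0],
              [2.0, 2.0]])

# Plane coordinates of x1, x2 as the columns of A, chosen so that A.T @ A == C.
# (Assumption: the Cholesky factor is just one convenient choice.)
A = np.linalg.cholesky(C).T

# Eigen-decomposition of C, largest eigenvalue first.
eigvals, V = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

# Component vectors p_k = V[0, k] * x1 + V[1, k] * x2, drawn in the same plane.
# They are orthogonal, with squared lengths equal to the eigenvalues.
P = A @ V                      # columns are p1 and p2

def ellipse_radius(q):
    """Return ||P^{-1} q||; q is on the ellipse with semi-axes p1, p2 iff this is 1."""
    return np.linalg.norm(np.linalg.solve(P, q))

radii = [ellipse_radius(A[:, 0]), ellipse_radius(A[:, 1]),
         ellipse_radius(P[:, 0]), ellipse_radius(P[:, 1])]
print(radii)
```

One way to read the result: $\mathbf P^{-1}\mathbf A = \mathbf V^\prime$, and the columns of $\mathbf V^\prime$ (the rows of the orthogonal matrix $\mathbf V$) have unit length, which is why each $\mathbf x_i$ lands on the ellipse. This is only a check on one example, not the geometric argument the question asks for.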

Q1: How to prove that? A direct algebraic demonstration seems to be very tedious; how can one see that this must be the case?

But there are many different ellipses centered at $0$ and passing through $\mathbf x_1$ and $\mathbf x_2$:

[Figure: several ellipses centered at $0$, all passing through $\mathbf x_1$ and $\mathbf x_2$]

Q2: What specifies the "correct" ellipse? My first guess was that it is the ellipse with the longest possible main axis; but that seems to be wrong (there are ellipses with a main axis of any length).

If there are answers to Q1 and Q2, then I would also like to know whether they generalize to the case of more than two variables.

— amoeba says Reinstate Monica

Is it true that there are many possible ellipses centered at the origin (where x1 and x2 intersect) and touching the far ends of x1 and x2? I would have thought there would be only one. Certainly there could be many if you relaxed 1 of those 3 criteria (center and 2 ends).

— gung - Reinstate Monica

There are lots of ellipses centered at the origin going through two vectors. But for non-collinear vectors $(a,b)$ and $(c,d)$, only one is the unit circle in the dual basis. It is the locus of $x(a,b)+y(c,d)$ where $\left|\pmatrix{a&c\\b&d}^{-1}\pmatrix{x\\y}\right|^2=1.$ Much can be learned from its principal axes.

— whuber

"variable space (I borrowed this term from ttnphns)" - @amoeba, you must be mistaken. Variables as vectors in (originally) n-dimensional space is called subject space (the n subjects, as axes, "define" the space, while the p variables "span" it). Variable space is, on the contrary, the reverse: that is, the usual scatterplot. This is how the terminology is established in multivariate statistics. (If it is different in machine learning - I don't know about that - then it

— শিখার

Note that both are vector spaces: vectors (= points) are what span, axes are what define directions and bear the notches of measurement. Note also the dialectic: both "spaces" are actually the same space (only formulated differently for the current purposes). This is seen, for instance, in the last picture in this answer. When you overlay the two formulations, you get a biplot, or dual space.

— ttnphns

"My guess is that x1, x2, p1, p2 all lie on one ellipse" What could be the heuristic aid from the ellipse here? I doubt it.

— ttnphns

All the summaries of $\mathbf X$ displayed in the question depend only on its second moments; or, equivalently, on the matrix $\mathbf{X^\prime X}$. Because we are thinking of $\mathbf X$ as a point cloud (each point is a row of $\mathbf X$), we may ask what simple operations on these points preserve the properties of $\mathbf{X^\prime X}$.

One is to left-multiply $\mathbf X$ by an $n\times n$ matrix $\mathbf U$, which produces another $n\times 2$ matrix $\mathbf{UX}$. For this to work, it is essential that

$$\mathbf{X^\prime X} = \mathbf{(UX)^\prime UX} = \mathbf{X^\prime (U^\prime U) X}.$$

Equality is guaranteed when $\mathbf{U^\prime U}$ is the $n\times n$ identity matrix: that is, when $\mathbf{U}$ is orthogonal.
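A quick numerical illustration of this invariance, as a sketch: the random data and the way the orthogonal matrix is generated (QR factorization of a Gaussian matrix) are assumptions made here for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical point cloud: n = 6 points (rows) in two variables (columns).
X = rng.normal(size=(6, 2))

# One way to obtain an orthogonal n x n matrix: the Q factor of a random matrix.
U, _ = np.linalg.qr(rng.normal(size=(6, 6)))

UX = U @ X

# U'U is the identity, so the second-moment matrix is unchanged.
print(np.allclose(U.T @ U, np.eye(6)))
print(np.allclose(UX.T @ UX, X.T @ X))
```

The individual rows of `UX` generally look nothing like the rows of `X`, yet every summary that depends only on $\mathbf{X^\prime X}$ is the same for both.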

It is well known (and easy to demonstrate) that orthogonal matrices are products of Euclidean reflections and rotations (they form a reflection group in $\mathbb{R}^n$). By choosing rotations wisely, we can dramatically simplify $\mathbf{X}$. One idea is to focus on rotations that affect only two points in the cloud at a time. These are particularly simple, because we can visualize them.

Specifically, let $(x_i, y_i)$ and $(x_j, y_j)$ be two distinct nonzero points in the cloud, constituting rows $i$ and $j$ of $\mathbf{X}$. A rotation of the column space $\mathbb{R}^n$ affecting only these two points converts them to

$$\cases{(x_i^\prime, y_i^\prime) = (\cos(\theta)x_i + \sin(\theta)x_j, \cos(\theta)y_i + \sin(\theta)y_j) \\ (x_j^\prime, y_j^\prime) = (-\sin(\theta)x_i + \cos(\theta)x_j, -\sin(\theta)y_i + \cos(\theta)y_j).}$$

What this amounts to is drawing the vectors $(x_i, x_j)$ and $(y_i, y_j)$ in the plane and rotating them by the angle $\theta$. (Notice how the coordinates get mixed up here! The $x$'s go with each other and the $y$'s go together. Thus, the effect of this rotation in $\mathbb{R}^n$ will not usually look like a rotation of the vectors $(x_i, y_i)$ and $(x_j, y_j)$ as drawn in $\mathbb{R}^2$.)

By choosing the angle just right, we can zero out any one of these new components. To be concrete, let's choose $\theta$ so that

$$\cases{\cos(\theta) = \pm \frac{x_i}{\sqrt{x_i^2 + x_j^2}} \\ \sin(\theta) = \pm \frac{x_j}{\sqrt{x_i^2 + x_j^2}}}.$$

This makes $x_j^\prime=0$. Choose the sign to make $y_j^\prime \ge 0$. Let's call this operation, which changes points $i$ and $j$ in the cloud represented by $\mathbf X$, $\gamma(i,j)$.
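The operation $\gamma(i,j)$ can be sketched in a few lines. The function name `gamma`, its signature, and the three-point test cloud are illustrative assumptions; rows are indexed from 0 in the code, whereas the text counts from 1.

```python
import numpy as np

def gamma(X, i, j):
    """Rotate rows i and j (0-based) so that X[j, 0] == 0 and X[j, 1] >= 0."""
    X = np.array(X, dtype=float)          # work on a copy
    r = np.hypot(X[i, 0], X[j, 0])
    if r == 0.0:
        c, s = 1.0, 0.0                   # both x's are already zero
    else:
        c, s = X[i, 0] / r, X[j, 0] / r   # the "+" choice of sign in the text
    row_i = c * X[i] + s * X[j]
    row_j = -s * X[i] + c * X[j]
    if row_j[1] < 0:                      # switch to the "-" choice so y_j' >= 0
        row_i, row_j = -row_i, -row_j
    row_j[0] = 0.0                        # clean up rounding error
    X[i], X[j] = row_i, row_j
    return X

# A small hypothetical cloud of three points.
X0 = np.array([[ 0.09,  0.12],
               [-0.31, -0.63],
               [ 0.74, -0.23]])
X1 = gamma(X0, 0, 1)

print(X1)
print(np.allclose(X1.T @ X1, X0.T @ X0))   # second moments are preserved
```

Only rows `i` and `j` change; every other point in the cloud is left where it was.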

Recursively applying $\gamma(1,2), \gamma(1,3), \ldots, \gamma(1,n)$ to $\mathbf{X}$ will cause the first column of $\mathbf{X}$ to be nonzero only in the first row. Geometrically, we will have moved all but one point in the cloud onto the $y$ axis. Now we may apply a single rotation, potentially involving coordinates $2, 3, \ldots, n$ in $\mathbb{R}^n$, to squeeze those $n-1$ points down to a single point. Equivalently, $\mathbf X$ has been reduced to a block form

$$\mathbf{X} = \pmatrix{x_1^\prime & y_1^\prime \\ \mathbf{0} & \mathbf{z}},$$

with $\mathbf{0}$ and $\mathbf{z}$ both column vectors with $n-1$ coordinates, in such a way that

$$\mathbf{X^\prime X} = \pmatrix{\left(x_1^\prime\right)^2 & x_1^\prime y_1^\prime \\ x_1^\prime y_1^\prime & \left(y_1^\prime\right)^2 + ||\mathbf{z}||^2}.$$

This final rotation further reduces $\mathbf{X}$ to its upper triangular form

$$\mathbf{X} = \pmatrix{x_1^\prime & y_1^\prime \\ 0 & ||\mathbf{z}|| \\ 0 & 0 \\ \vdots & \vdots \\ 0 & 0}.$$

In effect, we can now understand $\mathbf{X}$ in terms of the much simpler $2\times 2$ matrix $\pmatrix{x_1^\prime & y_1^\prime \\ 0 & ||\mathbf{z}||}$ created by the last two nonzero points left standing.

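To illustrate, I drew four iid points from a bivariate Normal distribution and rounded their values to

$$\mathbf{X} = \pmatrix{ 0.09 & 0.12 \\ -0.31 & -0.63 \\ 0.74 & -0.23 \\ -1.8 & -0.39}$$

This initial point cloud is shown at the left of the next figure using solid black dots, with arrows pointing from the origin to each dot (to help us visualize them as vectors).

[Figure: the initial point cloud and the clouds obtained after each successive operation]

The sequence of operations effected on these points by $\gamma(1,2), \gamma(1,3),$ and $\gamma(1,4)$ results in the intermediate clouds. In the final step, the three points lying along the $y$ axis are coalesced into a single point, leaving a representation of the reduced form of $\mathbf X$. The length of the resulting vertical vector is $||\mathbf{z}||$; the other vector is $(x_1^\prime, y_1^\prime)$.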
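The reduction of the four-point example can be reproduced as follows. This is a sketch: the helper `gamma` is a hypothetical implementation of the operation $\gamma(i,j)$ with 0-based row indices, and the final coalescing step is done by simply taking the norm of the points left on the $y$ axis.

```python
import numpy as np

X = np.array([[ 0.09,  0.12],
              [-0.31, -0.63],
              [ 0.74, -0.23],
              [-1.80, -0.39]])

def gamma(X, i, j):
    """Rotate rows i and j (0-based) so that X[j, 0] == 0 and X[j, 1] >= 0."""
    X = np.array(X, dtype=float)
    r = np.hypot(X[i, 0], X[j, 0])
    c, s = (1.0, 0.0) if r == 0.0 else (X[i, 0] / r, X[j, 0] / r)
    row_i = c * X[i] + s * X[j]
    row_j = -s * X[i] + c * X[j]
    if row_j[1] < 0:                      # pick the sign that makes y_j' >= 0
        row_i, row_j = -row_i, -row_j
    row_j[0] = 0.0                        # clean up rounding error
    X[i], X[j] = row_i, row_j
    return X

# gamma(1,2), gamma(1,3), gamma(1,4) in the 1-based labels of the text.
R = X.copy()
for j in range(1, R.shape[0]):
    R = gamma(R, 0, j)

# All points except the first now lie on the y axis; coalesce them.
z = R[1:, 1]
reduced = np.array([[R[0, 0], R[0, 1]],
                    [0.0,     np.linalg.norm(z)]])

print(reduced)
print(np.allclose(reduced.T @ reduced, X.T @ X))
```

Up to the signs of its rows, `reduced` should agree with the triangular factor returned by `np.linalg.qr(X)`, which is a convenient cross-check.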

One last flexibility remains in representing $\mathbf X$: we may still rotate its two remaining nonzero rows. As the angle $\theta$ varies, the first vector traces out the path

$$\theta\ \to\ (\cos(\theta)x_1^\prime, \cos(\theta) y_1^\prime + \sin(\theta)||\mathbf{z}||)\tag{1}$$

while the second vector traces out the same path according to

$$\theta\ \to\ (-\sin(\theta)x_1^\prime, -\sin(\theta) y_1^\prime + \cos(\theta)||\mathbf{z}||).\tag{2}$$
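A small check of the two parameterizations, as a sketch. The reduced values `x1p`, `y1p`, `z_norm` are hypothetical, chosen so that $\mathbf{X^\prime X}$ equals the covariance matrix used in the question; the function name `pair` is likewise an illustrative choice.

```python
import numpy as np

# Hypothetical reduced form: rows (x1', y1') and (0, ||z||).
# These values give X'X = [[4, 2], [2, 2]], as in the question's example.
x1p, y1p, z_norm = 2.0, 1.0, 1.0

target = np.array([[x1p**2,    x1p * y1p],
                   [x1p * y1p, y1p**2 + z_norm**2]])

def pair(theta):
    """Return the two vectors given by paths (1) and (2) at angle theta."""
    c, s = np.cos(theta), np.sin(theta)
    v1 = np.array([c * x1p, c * y1p + s * z_norm])       # path (1)
    v2 = np.array([-s * x1p, -s * y1p + c * z_norm])     # path (2)
    return v1, v2

thetas = np.linspace(0.0, 2.0 * np.pi, 25)

# Every pair of vectors reproduces the same second-moment matrix.
gram_ok = all(
    np.allclose(np.vstack(pair(t)).T @ np.vstack(pair(t)), target)
    for t in thetas
)

# Path (2) is path (1) shifted by a quarter turn, so both trace the same curve.
shift_ok = all(
    np.allclose(pair(t)[1], pair(t + np.pi / 2)[0])
    for t in thetas
)

print(gram_ok, shift_ok)
```

The quarter-turn relationship is what makes the two vectors reach the ends of the two different axes of the curve at the same value of $\theta$.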

We may avoid tedious algebra by noting that because this curve is the image of the set of points $\{(\cos(\theta), \sin(\theta))\,:\, 0 \le \theta\lt 2\pi\}$ under the linear transformation determined by

$$(1,0)\ \to\ (x_1^\prime, y_1^\prime);\quad (0,1)\ \to\ (0, ||\mathbf{z}||),$$

it must be an ellipse. (Question 2 has now been fully answered.) Thus there will be four critical values of $\theta$ in the parameterization $(1)$, of which two correspond to the ends of the major axis and two correspond to the ends of the minor axis; and it immediately follows that simultaneously $(2)$ gives the ends of the minor axis and major axis, respectively. If we choose such a $\theta$, the corresponding points in the point cloud will be located at the ends of the principal axes, like this:

[Figure: the two vectors located at the ends of the principal axes of the ellipse]

Because these are orthogonal and are directed along the axes of the ellipse, they correctly depict the principal axes: the PCA solution. That answers Question 1.
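A critical value of $\theta$ can be found in closed form by requiring the two vectors to be orthogonal. The sketch below uses hypothetical reduced values (chosen so that $\mathbf{X^\prime X}$ matches the question's covariance matrix); the formula for `theta` is derived here from the orthogonality condition and is not taken from the original answer.

```python
import numpy as np

# Hypothetical reduced form: rows a = (x1', y1') and b = (0, ||z||).
x1p, y1p, z_norm = 2.0, 1.0, 1.0
a = np.array([x1p, y1p])
b = np.array([0.0, z_norm])

# With v1 = cos(t) a + sin(t) b and v2 = -sin(t) a + cos(t) b,
#   v1 . v2 = 0.5 * sin(2t) * (|b|^2 - |a|^2) + cos(2t) * (a . b),
# which vanishes when tan(2t) = 2 (a . b) / (|a|^2 - |b|^2).
theta = 0.5 * np.arctan2(2.0 * (a @ b), a @ a - b @ b)
c, s = np.cos(theta), np.sin(theta)
v1 = c * a + s * b          # end of the major axis
v2 = -s * a + c * b         # end of the minor axis

# Compare with the eigen-decomposition of X'X.
XtX = np.outer(a, a) + np.outer(b, b)
eigvals = np.linalg.eigvalsh(XtX)[::-1]      # largest first

print("v1 . v2 =", v1 @ v2)
print("squared lengths:", v1 @ v1, v2 @ v2)
print("eigenvalues of X'X:", eigvals)
```

At this angle the squared lengths of the two vectors equal the eigenvalues of $\mathbf{X^\prime X}$, and each vector points along the corresponding eigenvector.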

The analysis given here complements that of my answer at Bottom to top explanation of the Mahalanobis distance. There, by examining rotations and rescalings in $\mathbb{R}^2$ , I explained how any point cloud in $p=2$ dimensions geometrically determines a natural coordinate system for $\mathbb{R}^2$ . Here, I have shown how it geometrically determines an ellipse which is the image of a circle under a linear transformation. This ellipse is, of course, an isocontour of constant Mahalanobis distance.

Another thing accomplished by this analysis is to display an intimate connection between QR decomposition (of a rectangular matrix) and the Singular Value Decomposition, or SVD. The $\gamma(i,j)$ are known as Givens rotations. Their composition constitutes the orthogonal, or " $Q$ ", part of the QR decomposition. What remained--the reduced form of $\mathbf{X}$ --is the upper triangular, or " $R$ " part of the QR decomposition. At the same time, the rotation and rescalings (described as relabelings of the coordinates in the other post) constitute the $\mathbf{D}\cdot \mathbf{V}^\prime$ part of the SVD, $\mathbf{X} = \mathbf{U\, D\, V^\prime}$ . The rows of $\mathbf{U}$ , incidentally, form the point cloud displayed in the last figure of that post.
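The connection can be checked numerically with NumPy's built-in factorizations, as a sketch using the four-point example from this answer. Both the triangular factor $R$ and the product $\mathbf{D}\cdot\mathbf{V}^\prime$ are two-point representations of the same $\mathbf{X^\prime X}$.

```python
import numpy as np

X = np.array([[ 0.09,  0.12],
              [-0.31, -0.63],
              [ 0.74, -0.23],
              [-1.80, -0.39]])

# QR decomposition: X = Q R with R a 2x2 upper triangular matrix.
Q, R = np.linalg.qr(X)

# Thin SVD: X = U diag(d) Vt.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
DVt = np.diag(d) @ Vt

# Both reduced forms carry the same second moments as X.
print(np.allclose(R.T @ R, X.T @ X))
print(np.allclose(DVt.T @ DVt, X.T @ X))

# R and X share their singular values.
print(np.allclose(np.linalg.svd(R, compute_uv=False), d))
```

The rows of `DVt` are orthogonal, so they sit at the ends of the principal axes, whereas the rows of `R` are another pair of points on the same ellipse.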

Finally, the analysis presented here generalizes in obvious ways to the cases $p\ne 2$ : that is, when there are just one or more than two principal components.

— whuber

Though your answer may be exemplary on its own, it is unclear - to me - how it relates to the question. You are speaking throughout about the data cloud X (and the vectors you rotate are data points, rows of X). But the question was about the reduced subject space. In other words, we don't have any data X, we have only the 2x2 covariance or scatter matrix X'X.

— ttnphns

(cont.) We represent the 2 variables summarized by it as 2 vectors with lengths = sqrt(diagonal elements) and angle = their correlation. Then the OP asks how we can solve for the principal components purely geometrically. In other words, the OP wants to explain geometrically the eigendecomposition (eigenvalues & eigenvectors or, better, loadings) of a 2x2 symmetric covariance matrix.

— ttnphns

(cont.) Please look on the second picture there. What the OP of the current question seeks for is to find geometric (trigonometric etc) tools or tricks to draw the vectors P1 and P2 on that pic, having only vectors X and Y as given.

— ttnphns

@ttnphns. It doesn't matter what the starting point is: the first half of this answer shows that you can reduce any point cloud $\mathbf{X}$ to a pair of points which contain all the information about $\mathbf{X^\prime X}$. The second half demonstrates that pair of points is not unique, but nevertheless each lies on the same ellipse. It gives an explicit construction of that ellipse beginning with any two-point representation of $\mathbf{X^\prime X}$ (such as the pair of blue vectors shown in the question). Its major and minor axes yield the PCA solution (the red vectors).

— whuber

Thanks, I'm beginning to understand your thought. (I wish you added subtitles / synopsis right in your answer about the two "halves" of it, just to structure it for a reader.)

— ttnphns