Does it make sense to perform a one-tailed Kolmogorov-Smirnov test?



Is it meaningful and possible to perform a one-tailed KS test? What would the null hypothesis of such a test be? Or is the KS test inherently a two-tailed test?

I would benefit from any answers that help me understand the distribution of $D$. (I am working through Massey's 1951 paper and finding the description challenging. For example, are $D^+$ and $D^-$ the supremum and infimum of the signed, that is non-absolute-value, differences between the empirical CDFs?)

Follow-up question: how are p-values for $D^+$ and $D^-$ obtained? So many of the publications I am encountering are presenting tabled values, rather than CDF of $D_n$, $D^+$ and $D^-$.

Update: I just discovered the related question "What's the null hypothesis in a one-sided Kolmogorov-Smirnov test?", which I missed on my initial scan before writing this one.

Answers:



Is it meaningful and possible to perform a one-tailed KS test?

Definitely.

Is the KS test inherently a two-tailed test?

Not at all.

What would the null hypothesis of such a test be?

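Take $F_X$ to be the cdf of the population the $X$ sample was drawn from for the two-sample test; you get the one-sample case by regarding $F_X$ as some hypothesized distribution ($F_0$, if you prefer).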

In some cases you could write the null as an equality (for example, if it wasn't seen as possible for things to go the other way), but if you want to write a directional null for a one-tailed alternative, you could write something like this:

$H_0: F_Y(t) \geq F_X(t)$

$H_1: F_Y(t) < F_X(t)$, for at least one $t$

(or its converse for the other tail, naturally)

If we add an assumption when we use the test that they're either equal or that $F_Y$ will be smaller, then rejection of the null implies (first order) stochastic ordering / first order stochastic dominance. In large enough samples, it's possible for the F's to cross - even several times, and still reject the one-sided test, so the assumption is strictly needed for stochastic dominance to hold.

Loosely, if $F_Y(t) \leq F_X(t)$ with strict inequality for at least some $t$, then $Y$ 'tends to be bigger' than $X$.

Adding assumptions like this is not weird; it's standard. It's not particularly different from assuming (say in an ANOVA) that a difference in means is because of a shift of the whole distribution (rather than a change in skewness, where some of the distribution shifts down and some shifts up, but in such a way that the mean has changed).
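To make the earlier point about crossing cdfs concrete, here's a minimal sketch in plain Python. It is not part of the original argument: it assumes two simulated normal samples with the same centre but different spread (so the true cdfs cross at zero), and it uses the large-sample approximation $\exp\left(-2\,\frac{mn}{m+n}\,d^2\right)$ for the one-sided p-value. The helper names `one_sided_stats` and `approx_p` are made up for illustration.

```python
import bisect
import math
import random


def one_sided_stats(x, y):
    """Return (sup(Fx_hat - Fy_hat), sup(Fy_hat - Fx_hat)) for two samples."""
    xs, ys = sorted(x), sorted(y)
    fx_above = fy_above = 0.0
    for t in xs + ys:  # the ECDF difference only changes at data points
        fx = bisect.bisect_right(xs, t) / len(xs)
        fy = bisect.bisect_right(ys, t) / len(ys)
        fx_above = max(fx_above, fx - fy)
        fy_above = max(fy_above, fy - fx)
    return fx_above, fy_above


def approx_p(d, m, n):
    """Large-sample one-sided p-value (an approximation, not exact)."""
    return math.exp(-2.0 * m * n / (m + n) * d * d)


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
y = [rng.gauss(0.0, 3.0) for _ in range(2000)]  # same centre, larger spread

d_fy_below, d_fy_above = one_sided_stats(x, y)
p_fy_below = approx_p(d_fy_below, len(x), len(y))  # H1: F_Y < F_X somewhere
p_fy_above = approx_p(d_fy_above, len(x), len(y))  # H1: F_Y > F_X somewhere
print("F_Y below F_X somewhere:", d_fy_below, p_fy_below)
print("F_Y above F_X somewhere:", d_fy_above, p_fy_above)
```

With these settings both one-sided tests reject, which is the kind of result that tells you the stochastic-dominance assumption isn't tenable for the data at hand.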


So let's consider, for example, a shift in mean for a normal:

[Figure: distributions of X and Y for a normal, with Y shifted right relative to X]

The fact that the distribution for $Y$ is shifted right by some amount from that for $X$ implies that $F_Y$ is lower than $F_X$. The one-sided Kolmogorov-Smirnov test will tend to reject in this situation.

Similarly, consider a scale shift in a gamma:

[Figure: distributions of X and Y for a gamma, with Y having the larger scale]

Again, the shift to a larger scale produces a lower F. Again, the one-sided Kolmogorov-Smirnov test will tend to reject in this situation.
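Both situations are easy to try by simulation. The following is a rough sketch under assumed settings that are mine, not taken from the plots: a shift of 1 for the normal, scale 2 versus 1 for a gamma with shape 2, and 300 observations per sample. It uses the large-sample approximation for the one-sided p-value, and `one_sided_stats` and `approx_p` are hypothetical helper names.

```python
import bisect
import math
import random


def one_sided_stats(x, y):
    """Return (sup(Fx_hat - Fy_hat), sup(Fy_hat - Fx_hat)) for two samples."""
    xs, ys = sorted(x), sorted(y)
    fx_above = fy_above = 0.0
    for t in xs + ys:  # evaluate both ECDFs at every observed value
        fx = bisect.bisect_right(xs, t) / len(xs)
        fy = bisect.bisect_right(ys, t) / len(ys)
        fx_above = max(fx_above, fx - fy)
        fy_above = max(fy_above, fy - fx)
    return fx_above, fy_above


def approx_p(d, m, n):
    """Large-sample one-sided p-value (an approximation, not exact)."""
    return math.exp(-2.0 * m * n / (m + n) * d * d)


rng = random.Random(1)
n = 300
pairs = {
    "normal shift": ([rng.gauss(0.0, 1.0) for _ in range(n)],
                     [rng.gauss(1.0, 1.0) for _ in range(n)]),
    # random.gammavariate takes (shape, scale)
    "gamma scale": ([rng.gammavariate(2.0, 1.0) for _ in range(n)],
                    [rng.gammavariate(2.0, 2.0) for _ in range(n)]),
}

results = {}
for name, (x, y) in pairs.items():
    d_fy_below, d_fy_above = one_sided_stats(x, y)
    # first entry: H1 is F_Y < F_X somewhere; second entry: the other tail
    results[name] = (approx_p(d_fy_below, n, n), approx_p(d_fy_above, n, n))
    print(name, results[name])
```

In both cases the test against "F_Y is lower somewhere" gives a tiny p-value, while the test in the other direction gives no evidence at all.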

There are numerous situations where such a test may be useful.


So what are $D^+$ and $D^-$?

In the one-sample test, $D^+$ is the maximum positive deviation of the sample cdf from the hypothesized curve (that is, the biggest distance the ECDF is above $F_0$), while $D^-$ is the maximum negative deviation (the biggest distance the ECDF is below $F_0$). Both $D^+$ and $D^-$ are positive quantities:

[Figure: an ECDF plotted against the hypothesized cdf $F_0$, with $D^+$ and $D^-$ marked]
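As a small illustration of those two definitions, here is a sketch of how $D^+$ and $D^-$ can be computed for one sample. The ECDF jumps from $(i-1)/n$ to $i/n$ at the $i$-th ordered observation, so it's enough to compare $F_0$ with those two heights at each data point. The function names are made up for this example, and the uniform null is just an assumption chosen so the numbers can be checked by hand.

```python
def one_sample_d_plus_minus(sample, cdf):
    """Return (D+, D-) for a sample against a hypothesized cdf F0."""
    xs = sorted(sample)
    n = len(xs)
    # D+: biggest distance the ECDF is above F0 (ECDF height just after a jump)
    d_plus = max(i / n - cdf(x) for i, x in enumerate(xs, start=1))
    # D-: biggest distance the ECDF is below F0 (ECDF height just before a jump)
    d_minus = max(cdf(x) - (i - 1) / n for i, x in enumerate(xs, start=1))
    return d_plus, d_minus


def uniform_cdf(t):
    """cdf of the uniform distribution on [0, 1], used as an example F0."""
    return min(max(t, 0.0), 1.0)


d_plus, d_minus = one_sample_d_plus_minus([0.1, 0.5, 0.9], uniform_cdf)
print(d_plus, d_minus)  # both equal 1/3 - 0.1, up to floating-point rounding
```

The usual two-sided statistic is just the larger of the two, `max(d_plus, d_minus)`.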

A one-tailed Kolmogorov-Smirnov test would look at either $D^+$ or $D^-$ depending on the direction of the alternative. Consider the one-tailed one-sample test:

$H_0: F_Y(t) \geq F_0(t)$

$H_1: F_Y(t) < F_0(t)$, for at least one $t$

To test this one, we want sensitivity to $Y$ being stochastically larger than hypothesized (its true $F$ is lower than $F_0$). So unusually large values of $D^-$ will tend to occur when the alternative is true. As a result, to test against the alternative $F_Y(t) < F_0(t)$, we use $D^-$ in our one-tailed test.


Follow-up question: how are p-values for $D^+$ and $D^-$ obtained?

It's not a simple thing. There are a variety of approaches that have been used.

If I recall correctly, one of the ways the distribution was obtained was via the use of Brownian bridge processes (this document seems to support that recollection).

I believe this paper, and the paper by Marsaglia et al here both cover some of the background and give computational algorithms with lots of references.

Between those, you'll get a lot of the history and various approaches that have been used. If they don't cover what you need, you'll probably need to ask this as a new question.
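For a rough feel of what such a computation looks like in the one-sided, one-sample case, here is a sketch. It is not taken from the papers above; it assumes a continuous hypothesized distribution and a true null of equality, in which case $D^+$ and $D^-$ have the same distribution. The finite-sample sum is the closed form usually attributed to Birnbaum and Tingey (1951), and the large-sample limit for the one-sided statistic is $\exp(-2nd^2)$. Function names are hypothetical.

```python
import math


def exact_one_sided_sf(d, n):
    """P(D+ >= d) for sample size n under a continuous null (same for D-)."""
    if d <= 0.0:
        return 1.0
    if d >= 1.0:
        return 0.0
    total = 0.0
    for j in range(int(math.floor(n * (1.0 - d))) + 1):
        a = d + j / n
        b = max(1.0 - d - j / n, 0.0)  # guard against tiny negative rounding
        total += math.comb(n, j) * a ** (j - 1) * b ** (n - j)
    return d * total


def asymptotic_one_sided_sf(d, n):
    """Large-sample approximation to the same tail probability."""
    return math.exp(-2.0 * n * d * d)


for n, d in [(10, 0.3), (100, 0.1)]:
    print(n, d, exact_one_sided_sf(d, n), asymptotic_one_sided_sf(d, n))
```

The direct sum is fine for modest `n`; for very large samples the binomial coefficients get too big for floating point, which is one reason the published algorithms are more involved than this.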

So many of the publications I am encountering are presenting tabled values, rather than CDF of $D_n$, $D^+$ and $D^-$

That's not particularly a surprise. If I remember right, even the asymptotic distribution is obtained as a series (this recollection could well be wrong), and in finite samples it's discrete and not in any simple form. In either case there's no convenient way to present the information except as either a graph or a table.


"In large enough samples, it's possible for the F's to cross - even several times, and still reject the one-sided test" – note that this means that you can reject the one-sided test in both directions for the same data!
Hao Ye

@HaoYe Yes, that's possible. It would be a clear indication that stochastic dominance would be untenable.
Glen_b -Reinstate Monica