Is an exam result binomial?


31

Here is a simple statistics question I was given. I'm not really sure I understand it.

X = the number of points obtained in an exam (multiple choice, and a correct answer is worth one point). Is X binomially distributed?

The professor's answer was:

Yes, because there are only right or wrong answers.

My answer:

No, because each question has a different "success probability" p. As I understand it, a binomial distribution describes a series of Bernoulli experiments, each of which has a simple outcome (success or failure) with a given success probability p (and all of them are "identical" with respect to p). For example, flipping a (fair) coin 100 times gives 100 Bernoulli experiments, all with p = 0.5. But here the questions have different values of p, right?


14
+1 More to the point: unless this is a truly strange exam, the responses to the questions will be strongly correlated. If X is one person's total score, that rules out a binomial distribution. Is the question perhaps working under a "null hypothesis" assumption that all test takers guess every answer independently and at random?
whuber

2
Snark aside, I would at least lobby for partial credit on this, though the "answer" seems to reflect a reluctance to award it :) (I think you are right on the mark here.)
AdamO

1
Yes, thank you :D, I think it is more of a Poisson binomial distribution (if anything).
Paul

2
@জাহাভা See stats.stackexchange.com/search?q=poisson+binomial
whuber

2
I agree with everyone that the question was not a good one, but there is a framing issue here. If this is an introductory course and the format is short-answer (so you have a chance to explain your reasoning), then I would say the best answer is probably "yes (assuming independence and equal difficulty of each question)"; this signals to the professor that (1) you understand the limitations of the question and (2) you are not trying to be a smart-ass.
Ben Bolker

Answers:


25

I would agree with your answer. Nowadays this kind of data would usually be modeled with some kind of Item Response Theory (IRT) model. For example, if you used the Rasch model, the binary answer $X_{ni}$ would be modeled as

$$\Pr\{X_{ni}=1\} = \frac{e^{\beta_n - \delta_i}}{1 + e^{\beta_n - \delta_i}}$$

where $\beta_n$ can be thought of as the $n$-th person's ability and $\delta_i$ as the $i$-th question's difficulty. So the model lets you capture the fact that persons vary in ability and questions vary in difficulty, and it is the simplest of the IRT models.
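To make the Rasch formula concrete, here is a minimal Python sketch. The function name `rasch_probability` and all ability and difficulty values are made up for illustration; they do not come from any real exam or fitted model.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """P(correct answer) under the Rasch model: logistic in (ability - difficulty)."""
    # Algebraically equal to exp(b - d) / (1 + exp(b - d)).
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical abilities (beta_n) and question difficulties (delta_i).
abilities = {"weak": -1.0, "average": 0.0, "strong": 1.5}
difficulties = [-1.0, 0.0, 2.0]  # easy, medium, hard

for name, beta in abilities.items():
    probs = [rasch_probability(beta, delta) for delta in difficulties]
    print(name, [round(p, 3) for p in probs])
```

Each person gets a different success probability on each question, which is exactly what a single shared p cannot express.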

Your professor's answer assumes that all questions have the same probability of "success" and are independent, since the binomial is the distribution of a sum of n i.i.d. Bernoulli trials. It ignores the two kinds of dependencies described above.

As noted in the comments, if you looked at the distribution of answers of a particular person (so you don't have to care about between-person variability), or the answers of different people on the same item (so there is no between-item variability), then the distribution would be Poisson-binomial, i.e. the distribution of the sum of n independent but not identically distributed Bernoulli trials. That distribution could be approximated by a binomial or a Poisson, but only approximated; treating it as exactly binomial means making the i.i.d. assumption.
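As a rough illustration of the difference, the sketch below computes the exact Poisson-binomial distribution of one student's score and compares it with a binomial that uses the average success probability. The helper names and the per-question probabilities are assumptions chosen only for the example.

```python
from math import comb

def poisson_binomial_pmf(probs):
    """PMF of the number of successes in independent Bernoulli trials
    whose success probabilities `probs` need not be equal."""
    pmf = [1.0]  # with zero trials the score is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this question is answered wrongly
            nxt[k + 1] += mass * p      # this question is answered correctly
        pmf = nxt
    return pmf

def binomial_pmf(n, p):
    """PMF of a Binomial(n, p) score."""
    return [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

# Hypothetical success probabilities of one student on five questions.
probs = [0.95, 0.9, 0.5, 0.2, 0.1]
mean_p = sum(probs) / len(probs)

exact = poisson_binomial_pmf(probs)
approx = binomial_pmf(len(probs), mean_p)
for k, (a, b) in enumerate(zip(exact, approx)):
    print(k, round(a, 4), round(b, 4))
```

Both distributions have the same mean, but the Poisson-binomial has the smaller variance whenever the probabilities differ, so the binomial is only an approximation here.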

Even under the "null" assumption about guessing, this assumes that there are no guessing patterns, so people do not differ in how they guess and items do not differ in how they are guessed; that is, the guessing is purely random.


That makes sense! Although I guess you could estimate the success probability of a question, the "person's ability" sounds difficult :) Another idea I had is to model this as a sum of Bernoulli variables. E.g. let's say there are 2 questions, and therefore 2 success probabilities p1 and p2, and analogously two variables X1 and X2 (so 2 Bernoulli experiments). Then, for example, the probability of getting a total score of 1 is P(X1=1)*P(X2=0)+P(X1=0)*P(X2=1)=p1(1-p2)+(1-p1)p2. Does that sound reasonable?
Paul

2
@Paul The sum of two Bernoulli variables with different p's is Poisson-binomial.
Tim

4
The "null" assumption is basically a spherical-cow thing; you can always quibble about exactly how spherical the cow is.
Hong Ooi

5

The answer to this problem depends on the framing of the question and when information is gained. Overall, I tend to agree with the professor but think the explanation of his/her answer is poor and the professor's question should include more information up front.

If you consider an infinite number of potential exam questions, and you draw one at random for question 1, draw one at random for question 2, etc., then going into the exam:

  1. Each question has two outcomes (right or wrong)
  2. There are a fixed number of trials (questions)
  3. Each trial could be considered independent (going into question two, your probability p of getting it right is the same as when going into question one)

Under this framework, the assumptions of a binomial experiment are met.
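A small numerical check of this framing, under loudly stated assumptions: the "infinite pool" is stood in for by drawing with replacement from a hypothetical two-level pool (hard questions with p = 0.2, easy ones with p = 0.8), the draws are independent, and the student's answers are independent given the drawn questions. None of these numbers come from the original problem.

```python
from itertools import product
from math import comb

# Hypothetical pool: a drawn question is hard (p = 0.2) or easy (p = 0.8), equally likely.
pool = [0.2, 0.8]
n = 3  # number of questions on the exam

# Brute force: sum over every draw of questions and every right/wrong pattern.
marginal = [0.0] * (n + 1)
for draw in product(pool, repeat=n):
    for outcome in product([0, 1], repeat=n):
        prob = 1.0 / len(pool) ** n  # probability of this particular draw
        for p, correct in zip(draw, outcome):
            prob *= p if correct else (1.0 - p)
        marginal[sum(outcome)] += prob

# Binomial with the pool-average success probability.
mean_p = sum(pool) / len(pool)
binomial = [comb(n, k) * mean_p**k * (1.0 - mean_p) ** (n - k) for k in range(n + 1)]

print([round(x, 4) for x in marginal])
print([round(x, 4) for x in binomial])
```

Before the questions are drawn, the score distribution matches the binomial with the average p. Once a particular set of questions is fixed, or if the draws are correlated, that match is no longer guaranteed.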

Alas, ill-posed statistical problems are very common in practice, not just on exams. I wouldn't hesitate to defend your rationale to your professor.


Yeah, I guess that is right too. The question is just "bad", since you could argue both ways given how little information is provided. But I was just very unhappy with my professor's answer.
Paul

4
@Paul, it's actually pretty hard to write good statistical questions. I know I've flubbed it on many occasions.
gung - Reinstate Monica

1
If you consider an infinite number of potential exam questions, and you draw one at random for question 1, draw one at random for question 2, etc. -- I think you should make explicit the assumption that exam questions are drawn independently from the pool of potential questions. It would be more realistic for them to be correlated: if question 1 is easy, it is likely that you are being given an easy exam and that question 2 will be easy.
Adrian

0

If there are n questions, and I can answer any one question correctly with probability p, and there is enough time to attempt all questions, and I did 100 of these tests, then my scores would be normally distributed with a mean of np.

But it's not me repeating the test 100 times, it's 100 different candidates doing one test, each with their own probability p. The distribution of these p's will be the overriding factor. You might have a test where p = 0.9 if you studied the subject well and p = 0.1 if you didn't, with very few people between 0.1 and 0.9. The distribution of points will have very strong maxima at 0.1n and 0.9n and will be nowhere near a normal distribution.
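The two-group scenario can be sketched as a mixture of two binomials. The exam length, the 50/50 split, and the assumption that answers are independent within each candidate are all hypothetical choices made for the illustration.

```python
from math import comb

def binomial_pmf(n, p):
    """PMF of a Binomial(n, p) score."""
    return [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]

n = 20  # hypothetical number of questions
# Hypothetical population: half studied (p = 0.9), half did not (p = 0.1).
groups = [(0.5, 0.9), (0.5, 0.1)]  # (share of candidates, success probability)

mixture = [0.0] * (n + 1)
for weight, p in groups:
    for k, mass in enumerate(binomial_pmf(n, p)):
        mixture[k] += weight * mass

# Crude text histogram of the score distribution across candidates.
for k, mass in enumerate(mixture):
    print(f"{k:2d} {'#' * round(mass * 100)}")
```

The histogram has two separate humps, near 0.1n and 0.9n, with almost no mass in between, which no single binomial (or normal) distribution can reproduce.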

On the other hand, there are tests where everybody can answer any question, but take different amounts of time, so some will answer all n questions, and others will answer fewer because they run out of time. If we can assume that the speed of the candidates is normally distributed, then the points will be close to normally distributed.

But many tests will contain some very hard and some very easy questions, intentionally so that we can distinguish between the best candidates (who will answer all questions up to some degree of difficulty) and the worst candidates (who will only be able to answer very simple questions). This would change the distribution of points quite strongly.


2
The normal distribution that you describe here is the normal approximation to the binomial. Obviously the sum of zeros and ones wouldn't be continuous and range between $-\infty$ and $\infty$.
Tim

2
@Tim Despite the unnecessary reliance on normal distributions and the mystery of taking 100 tests, this answer has merit in attempting to demonstrate how a particular case can lead to an obviously non-binomial distribution. As such it could be a valuable contribution to the answers if these technical issues were addressed.
whuber

0

By definition, a binomial distribution is the distribution of the number of successes in n independent and identically distributed Bernoulli trials. In the case of a multiple choice exam, each of the n questions would be one of the Bernoulli trials.

The issue here arises because we can't reasonably assume that the n questions:

  • Are identically distributed. As you said, the probability a student knows the answer to question 1 is almost certainly not going to be the same as the probability they know the answer to question 2, and so on.
  • Are independent. Many exams ask questions that are built upon the answers to the previous question(s). Who's to say for sure that that wouldn't happen on the exam in this question? There are other factors that could make answers to exam questions not independent of one another, but I think this one is the most intuitively obvious.

I have seen questions in Statistics classes that model exam questions as binomials, but they are framed something along the lines of:

What probability distribution would model the number of questions answered correctly on a multiple choice exam where every question has four choices, and the student taking the exam is guessing every answer at random?

In this scenario, of course it would be represented as a binomial distribution with $p = \frac{1}{4}$.
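For the pure-guessing scenario, the binomial probabilities are easy to compute directly. This is a minimal sketch; the exam length of 10 questions and the function name `guessing_pmf` are assumptions made for the example.

```python
from math import comb

def guessing_pmf(n_questions, n_choices=4):
    """Score distribution when every answer is an independent uniform guess."""
    p = 1.0 / n_choices
    return [
        comb(n_questions, k) * p**k * (1.0 - p) ** (n_questions - k)
        for k in range(n_questions + 1)
    ]

n = 10  # hypothetical exam length
pmf = guessing_pmf(n)
expected_score = sum(k * mass for k, mass in enumerate(pmf))

print("expected score:", expected_score)
print("P(score >= 5):", round(sum(pmf[5:]), 4))
```

Under these assumptions the expected score is n/4, and a guesser only rarely reaches half marks.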


There's nothing the matter with your facts, but the logic is incorrect: it doesn't suffice to demonstrate that some assumptions may not hold, because (logically) the distribution could still be binomial in any case. You also need to demonstrate that these assumptions can fail in ways that cause the score distribution definitely to be non-binomial.
whuber
Licensed under cc by-sa 3.0 with attribution required.