Origin of the "$5\sigma$" threshold for accepting evidence in particle physics?

33

News reports say that CERN will announce tomorrow that the Higgs boson has been experimentally detected with $5\sigma$ evidence. According to that article:

$5\sigma$ equates to a 99.99994% chance that the data the CMS and ATLAS detectors are seeing aren't just random noise, and a 0.00006% chance that they've been hoodwinked; $5\sigma$ is the necessary certainty for something to be officially labelled as a scientific "discovery."

This isn't super rigorous, but it seems to say that physicists use the standard "hypothesis testing" statistical methodology, setting $\alpha$ to $0.0000006$, which corresponds to $z=5$ (two-tailed)? Or is there some other meaning?

In much of science, of course, setting alpha to 0.05 is done routinely. This would be equivalent to "two-$\sigma$" evidence, although I've never heard it called that. Are there other fields (besides particle physics) where a much stricter definition of alpha is standard? Does anyone know a reference for how the five-$\sigma$ rule got accepted by particle physics?
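The correspondence between a $z$ threshold and $\alpha$ can be checked numerically. The following is only a minimal sketch, assuming the test statistic is standard normal; the helper names `one_tailed_p` and `two_tailed_p` are illustrative and do not come from any of the sources discussed here.

```python
import math

def one_tailed_p(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_tailed_p(z):
    """Two-sided probability P(|Z| > z) for a standard normal Z."""
    return math.erfc(z / math.sqrt(2))

# Compare the conventional "two sigma" level with the five-sigma level.
for z in (2, 5):
    print(f"z = {z}: one-tailed p = {one_tailed_p(z):.3g}, "
          f"two-tailed p = {two_tailed_p(z):.3g}")
```

Under this assumption, $z=5$ gives a two-tailed probability of roughly $5.7 \times 10^{-7}$ (the 0.00006% in the quote), and $z=2$ gives about 0.046, close to the usual 0.05.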

Update: I'm asking this question for a simple reason. My book Intuitive Biostatistics (like most statistics books) has a section explaining how arbitrary the usual "P<0.05" rule is. I'd like to add this example of a scientific field where a much (much!) smaller value of $\alpha$ is considered necessary. But if the example is actually more complicated, with use of Bayesian methods (as some comments below suggest), then it wouldn't be quite appropriate, or it would require a lot more explanation.

hypothesis-testing p-value history

— Harvey Motulsky
Source

2

Have you ever heard of "Six Sigma"?

— Daniel R Hicks

Six sigma is used in quality control, as Daniel suggests with his question/comment. These rejection probabilities all assume sampling from a normal distribution, and the tail probabilities could be larger for other distributions. The use of extremes like 5 or 6 sigma may only be useful in special circumstances. In practice, sample sizes and variability in the data make going beyond 2 or 3 sigma impractical.

— Michael R. Chernick

1

Basically, most particle physicists are more comfortable with Bayesian ideas when it comes to estimating parameters, so what they are actually stating is that they are "$X\%$ sure, given the data and the priors, that the Higgs signal is non-zero", which of course is different from saying that there is only a "0.01 percent chance that the signal is random noise" (it could be a weird non-random bump from systematics too!). [1]: physics.stackexchange.com/questions/8752/…

— Néstor

3

@Néstor: I'm watching the live webcast of the Higgs press conference right now and no one is mentioning the Bayesian interpretation. "P-values" and "significance level" are used, but only a horribly misinformed Bayesian would interpret them as the probability that the signal is random noise. I think that the text quoted in the OP's question misinterprets what a p-value actually is.

— MånsT

1

BTW, I've written a post about this issue on my blog: randomastronomy.wordpress.com.

— Néstor

13

In most applications of statistics there is that old chestnut that 'all models are wrong, some are useful'. That being the case, we would only expect a model to perform to a certain level, since we are describing an incredibly complex process using some simple model.

Physics is very different, so intuition developed from statistical models isn't quite as appropriate. In physics, and in particular in particle physics, which deals directly with fundamental physical laws, the model is supposed to be an exact description of reality. Any departure from what the model predicts must be entirely explained by experimental noise, not by a limitation of the model. This means that if the model is good and correct and the experimental apparatus is understood, the statistical significance should be very high, hence the high bar that is set.

The other reason is historical: the particle physics community has been burnt in the past by 'discoveries' at lower significance levels that were later retracted, so they are more cautious now.

— Bogdanovist
Source

1

Do you agree that physics uses standard statistical hypothesis testing with a very low alpha (in this case, anyway)? Or do they use some kind of Bayesian approach, as Néstor says in a comment above?

— Harvey Motulsky

2

My understanding from talking to a few folks who work on ATLAS is that the analysis is very Bayesian. However, they are low-level guys (i.e. they actually do the work). It wouldn't be surprising if some of the talking heads have a poor understanding of the interpretation. That being said, the presentation of the LHC results was very bad, and really didn't come across as being very Bayesian, as others have noted.

— Bogdanovist

2

I always thought it was also because particle physics in particular deals with billions of events, so you have to place the bar very high.

— Wayne

11

History and origin

According to Robert D Cousins $^{1}$ and Tommaso Dorigo $^{2}$, the origin of the $5\sigma$ threshold lies in the particle physics work of the early 60s, when numerous histograms from scattering experiments were investigated and searched for peaks/bumps that might indicate some newly discovered particle. The threshold is a rough rule to account for the multiple comparisons that are being made.

Both authors refer to a 1968 article by Rosenfeld $^3$, which dealt with the question of whether or not there are far-out mesons and baryons, for which several $4 \sigma$ effects were measured. The article answered the question negatively, arguing that the number of published claims corresponds to the statistically expected number of fluctuations. Along with several calculations supporting this argument, the article promoted the use of the $5\sigma$ level:

$(K\pi\pi)_{3/2},(\pi \rho)^{--}$ $3\sigma$ $>4\sigma$

and later in the paper (emphasis mine):

Rosenfeld: "Then to repeat my warning at the beginning of this section; we are generating at least 100 000 potential bumps per year, and should expect several $4\sigma$ and hundreds of $3\sigma$ fluctuations. What are the implications? To the theoretician or phenomenologist the moral is simple; wait for $5\sigma$ effects."
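Rosenfeld's orders of magnitude can be reproduced with a short calculation. This is only a sketch under stated assumptions: each of the 100 000 potential bumps is treated as an independent look at pure Gaussian noise, and only upward fluctuations are counted (one-tailed). The names `one_tailed_p` and `expected_fluctuations` are illustrative, not taken from the article.

```python
import math

# "at least 100 000 potential bumps per year" (figure from the Rosenfeld quote)
N_BUMPS = 100_000

def one_tailed_p(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def expected_fluctuations(z, n=N_BUMPS):
    """Expected number of pure-noise bumps exceeding z sigma among n independent looks."""
    return n * one_tailed_p(z)

for z in (3, 4, 5):
    print(f"{z} sigma: about {expected_fluctuations(z):.3g} expected per year")
```

With these assumptions the expected yearly counts come out on the order of a hundred at $3\sigma$, a few at $4\sigma$, and well below one at $5\sigma$, which is consistent with the advice to wait for $5\sigma$ effects.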

Tommaso is cautious about stating that it started with the Rosenfeld article:

Tommaso: "However, we should note that the article was written in 1968, but the strict criterion of five standard deviations for discovery claims was not adopted in the seventies and eighties. For instance, no such thing as a five-sigma criterion was used for the discovery of the W and Z bosons, which earned Rubbia and Van der Meer the Nobel Prize in physics in 1984."

But in the 80s the use of $5\sigma$ had already spread. For instance, the astronomer Steve Schneider $^4$ mentions in 1989 that it is something being taught (emphasis mine in the quote below):

Schneider: "Frequently, 'levels of confidence' of 95% or 99% are quoted for apparently discrepant data, but this amounts to only two or three statistical sigmas. I was taught not to believe anything less than five sigma, which if you think about it is an absurdly stringent requirement --- something like a 99.9999% confidence level. But of course, such a limit is used because the actual size of sigma is almost never known. There are just too many free variables in astronomy that we can't control or don't know about."

Yet, in the field of particle physics many publications were still based on $4\sigma$ discrepancies up until the late 90s. This only changed to $5\sigma$ at the beginning of the 21st century. It was probably prescribed as a guideline for publications around 2003 (see the prologue in Franklin's book Shifting Standards $^5$):

Franklin: By 2003 the 5-standard-deviation criterion for "observation of" seems to have been in effect

...

A member of the BaBar collaboration recalls that about this time the 5-sigma criterion was issued as a guideline by the editors of the Physical Review Letters

Modern use

Currently, the $5\sigma$ threshold is a textbook standard. For instance, it is the subject of an explainer article on physics.org $^6$ and it appears in some of Glen Cowan's works, such as the statistics section of the Review of Particle Physics from the Particle Data Group $^7$ (albeit with several critical sidenotes):

Glen Cowan: Often in HEP, the level of significance where an effect is said to qualify as a discovery is $Z = 5$, i.e., a $5\sigma$ effect, corresponding to a p-value of $2.87 \times 10^{-7}$. One's actual degree of belief that a new process is present, however, will depend in general on other factors as well, such as the plausibility of the new signal hypothesis and the degree to which it can describe the data, one's confidence in the model that led to the observed p-value, and possible corrections for multiple observations out of which one focuses on the smallest p-value obtained (the "look-elsewhere effect").
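The relation between $Z$ and the p-value in this quote is the one-sided Gaussian tail. A minimal sketch using Python's standard library, assuming that one-sided convention; the function names `z_to_p` and `p_to_z` are made up for illustration:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_to_p(z):
    """One-sided p-value: upper-tail probability beyond z (computed as the lower tail at -z)."""
    return STD_NORMAL.cdf(-z)

def p_to_z(p):
    """Significance Z whose one-sided upper-tail probability equals p."""
    return -STD_NORMAL.inv_cdf(p)

print(f"Z = 5      ->  p = {z_to_p(5):.3g}")
print(f"p = 2.87e-7 ->  Z = {p_to_z(2.87e-7):.3f}")
```

Using the lower tail at $-z$ rather than `1 - cdf(z)` is a small design choice that avoids losing precision when the tail probability is tiny.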

The use of the $5\sigma$ level is now ascribed to four reasons:

History: based on practice, one found that $5\sigma$ is a good threshold (exotic stuff seems to happen randomly, even between $3\sigma$ and $4\sigma$, like the recent 750 GeV diphoton excess).
The look-elsewhere effect (or multiple comparisons): either because multiple hypotheses are tested, or because experiments are performed many times, people adjust for this (very roughly) by moving the bound to $5\sigma$. This relates to the history argument.
Systematic effects and uncertainty in $\sigma$: often the uncertainty of the experiment outcome is not well known. The $\sigma$ is derived, but the derivation includes weak assumptions such as the absence of systematic effects, or the possibility of ignoring them. Increasing the threshold seems to be a way to protect, to some degree, against these events. (This is a bit strange though. The computed $\sigma$ has no relation to the size of systematic effects and the logic breaks down; an example is the "discovery" of superluminal neutrinos, which was reported as having a $6\sigma$ significance.)
Extraordinary claims require extraordinary evidence: scientific results are reported in a frequentist way, for instance using confidence intervals or p-values, but they are often interpreted in a Bayesian way. The $5\sigma$ level is claimed to account for this.

Several criticisms of the $5\sigma$ threshold have been written by Louis Lyons $^{8,9}$, and the earlier-mentioned articles by Robert D Cousins $^{1}$ and Tommaso Dorigo $^{2}$ also provide critique.

Other Fields

It is interesting to note that many other scientific fields do not have similar thresholds or do not, somehow, deal with the issue. I imagine this makes some sense in the case of experiments with humans, where it is very costly (or impossible) to extend an experiment that gave a .05 or .01 significance.

The result of not taking these effects into account is that over half of the published results may be wrong or at least not reproducible. (This has been argued for the case of psychology by Monya Baker $^{10}$, and I believe many others have made similar arguments. I personally think that the situation may be even worse in nutritional science.) And now, people from fields other than physics are thinking about how they should deal with this issue (the case of medicine/pharmacology $^{11}$).

1. Cousins, R. D. (2017). The Jeffreys–Lindley paradox and discovery criteria in high energy physics. Synthese, 194(2), 395-432. arxiv link
2. Dorigo, T. (2013). Demystifying The Five-Sigma Criterion. From science20.com, 2019-03-07
3. Rosenfeld, A. H. (1968). Are there any far-out mesons or baryons? Web source: escholarship
4. Burbidge, G., Roberts, M., Schneider, S., Sharp, N., & Tifft, W. (1990, November). Panel discussion: Redshift related problems. In NASA Conference Publication (Vol. 3098, p. 462). Link to photocopy on harvard.edu
5. Franklin, A. (2013). Shifting standards: Experiments in particle physics in the twentieth century. University of Pittsburgh Press.
6. What does the 5 sigma mean? From physics.org, 2019-03-07
7. Beringer, J., Arguin, J. F., Barnett, R. M., Copic, K., Dahl, O., Groom, D. E., ... & Yao, W. M. (2012). Review of particle physics. Physical Review D-Particles, Fields, Gravitation and Cosmology, 86(1), 010001. (Section 36.2.2, Significance tests, page 394, link aps.org)
8. Lyons, L. (2013). Discovering the Significance of 5 sigma. arXiv preprint arXiv:1310.1284. arxiv link
9. Lyons, L. (2014). Statistical Issues in Searches for New Physics. arXiv preprint arxiv link
10. Baker, M. (2015). Over half of psychology studies fail reproducibility test. Nature News. From nature.com, 2019-03-07
11. Horton, R. (2015). Offline: what is medicine's 5 sigma? The Lancet, 385(9976), 1380. From thelancet.com, 2019-03-07

— Sextus Empiricus
Source

4

For a reason entirely different from that of physics, there are other fields with much stricter alphas when they engage in hypothesis testing. Genetic epidemiology is among them, especially when it uses "GWAS" (Genome-Wide Association Studies) to look at various genetic markers for disease.

Because a GWAS is a massive exercise in multiple hypothesis testing, the state-of-the-art analysis techniques are all built around much stricter alphas than 0.05. Other such "candidate screening" study techniques that follow in the wake of the genomics studies will likely do the same.

— Fomite
Source

2

These are only tiny local $\alpha$s. GWAS still have an overall type I error of 5% for claiming a success that isn't there in reality.

— Horst Grünbusch
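The relation between tiny local $\alpha$s and a 5% overall error rate can be sketched with a Bonferroni-style correction. This is an illustration only: the number of tests (`n_tests`, one million) is a hypothetical figure, the tests are assumed independent with every null hypothesis true, and Bonferroni is just one common correction, not necessarily what any particular GWAS pipeline uses.

```python
def bonferroni_local_alpha(family_alpha, n_tests):
    """Per-test alpha that keeps the family-wise error rate at or below family_alpha."""
    return family_alpha / n_tests

def family_wise_error(local_alpha, n_tests):
    """Chance of at least one false positive among n independent tests, all nulls true."""
    return 1.0 - (1.0 - local_alpha) ** n_tests

n_tests = 1_000_000  # hypothetical number of markers tested
local_alpha = bonferroni_local_alpha(0.05, n_tests)

print(f"local alpha per test: {local_alpha:.1e}")
print(f"overall type I error: {family_wise_error(local_alpha, n_tests):.4f}")
print(f"uncorrected (0.05 per test): {family_wise_error(0.05, n_tests):.4f}")
```

Under these assumptions the per-test threshold is tiny while the overall type I error stays just under 5%, whereas testing every marker at 0.05 would make at least one false positive practically certain.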

3

The level is so high to avoid premature announcements of news that later turns out to be spurious. For more discussion on this, see

https://physics.stackexchange.com/questions/8752/standard-deviation-in-particle-physics?rq=1

https://physics.stackexchange.com/questions/31126/how-many-sigma-did-the-discovery-of-the-w-boson-have

— Arnold Neumaier
Source