Why is adding log probabilities faster than multiplying probabilities?

To frame the question: in computer science we often want to compute the product of several probabilities:

P(A,B,C) = P(A) * P(B) * P(C)

The simplest approach is just to multiply these numbers, and that is what I was going to do. However, my boss said it is better to add the logs of the probabilities:

log(P(A,B,C)) = log(P(A)) + log(P(B)) + log(P(C))

This gives the log probability, but we can recover the probability afterwards if needed:

P(A,B,C) = e^log(P(A,B,C))

Adding logs is considered better for two reasons:

1. It prevents "underflow", where the product of probabilities is so small that it gets rounded to zero. This is often a risk because probabilities are often very small.
2. It is faster, because many computer architectures can perform addition faster than multiplication.
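The underflow point can be illustrated with a minimal Python sketch. It assumes IEEE 754 double-precision floats (what CPython uses on common platforms), and the probability value and event count are made up for illustration:

```python
import math

# Hypothetical data: 1000 independent events, each with probability 1e-5.
probs = [1e-5] * 1000

# Naive product: the true value is 1e-5000, far below the smallest
# positive double (about 5e-324), so the result underflows to 0.0.
product = math.prod(probs)

# Sum of logs: the same quantity in log space is an ordinary float.
log_product = sum(math.log(p) for p in probs)

print(product)       # underflows to zero
print(log_product)   # roughly 1000 * log(1e-5)
```

Once the product has underflowed, the information is lost; the log-space value can still be compared or combined with other log probabilities.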

My question is about the second point. This is how I have seen it described, but it does not account for the extra cost of taking the logs! We should be comparing "cost of log + cost of addition" with "cost of multiplication". Is it still cheaper once that is taken into account?

Also, the Wikipedia page (Log probability) is confusing on this point, stating "The conversion to log form is expensive, but is only incurred once." I don't understand this, because I think you would need to take the log of every term independently before adding. What am I missing?

Finally, the justification that "computers perform addition faster than multiplication" is kind of vague. Is that specific to the x86 instruction set, or is it some more fundamental trait of processor architectures?

algorithm-analysis probability-theory

— Stephen

The first benefit (avoiding underflow) is often far more important than any performance gain, so we would use log probabilities even if they were not faster.

— D.W.

To expand on what @D.W. said, there is a similar "log-sum-exp trick" that is used specifically to address underflow, regardless of performance. In fact, this is the first time I have seen anyone consider taking logarithms as a performance-improvement technique!
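For readers unfamiliar with the log-sum-exp trick, here is a minimal Python sketch of the usual formulation: to add probabilities that are stored as logs, subtract the largest log value before exponentiating. The function name `log_sum_exp` and the sample values are illustrative assumptions, not taken from the thread:

```python
import math

def log_sum_exp(log_values):
    """Return log(sum(exp(v) for v in log_values)) without underflow."""
    m = max(log_values)
    if m == -math.inf:
        # Every term is log(0), so the sum is 0 and its log is -inf.
        return -math.inf
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    return m + math.log(sum(math.exp(v - m) for v in log_values))

# Hypothetical log probabilities too small to exponentiate directly.
log_p = [-1000.0, -1001.0, -1002.0]
naive_terms = [math.exp(v) for v in log_p]   # each underflows to 0.0
result = log_sum_exp(log_p)
```

The sketch expects a non-empty sequence; `max` raises `ValueError` on an empty one.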

— Mehrdad

Answers:

Also, the Wikipedia page ( https://en.wikipedia.org/wiki/Log_probability ) is confusing on this point, stating "The conversion to log form is expensive, but is only incurred once." I don't understand this, because I think you would need to take the log of every term independently before adding. What am I missing?

If you only want to compute $P(A_1)\ldots P(A_n)$ once, then you are right. You have to compute $n$ logarithms and $n-1$ additions, whereas the naive method requires $n-1$ multiplications.

However, it is very common that you want to answer queries of the form:

Compute $\prod_{i \in I} P(A_i)$ for some subset $I$ of $\{1, \ldots, n\}$.

In that case you can preprocess your data to compute all the $\log P(A_i)$ only once, and then answer each query with $|I|$ additions.
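The preprocessing idea can be sketched in Python as follows. The probability values, the 0-based indexing, and the helper name `log_product` are assumptions made for illustration:

```python
import math

# Hypothetical probabilities P(A_1), ..., P(A_n), stored 0-indexed.
probs = [0.5, 0.25, 0.1, 0.8]

# Preprocessing: n logarithms, computed once.
log_probs = [math.log(p) for p in probs]

def log_product(subset):
    """Return the log of the product of P(A_i) for i in subset, using additions only."""
    return sum(log_probs[i] for i in subset)

# Each query costs only additions; no logarithm is recomputed.
q1 = log_product([0, 1])        # log(0.5 * 0.25)
q2 = log_product([1, 2, 3])     # log(0.25 * 0.1 * 0.8)
```

This is the sense in which the conversion cost is "incurred once": the `math.log` calls happen during preprocessing, and every later query reuses `log_probs`.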

Finally, the justification that "computers perform addition faster than multiplication" is kind of vague. Is that specific to the x86 instruction set, or is it some more fundamental trait of processor architectures?

This is a broad question. In general, multiplication is (probably?) harder than addition. Computing $a+b$ is linear in the size of $a$ and $b$ (using the trivial algorithm), whereas we currently do not know how to compute $a\times b$ with the same time complexity (check the best algorithms here).

Of course there is no clear-cut answer: for instance, if you are dealing only with integers and you multiply by powers of $2$, then you should instead compare shifts with add operations.

Nevertheless, it is a reasonable claim for all common computer architectures: multiplication of floating-point numbers will be slower than addition.

— md5

$P(A_i)$ final exp()? Isn't that slow?

— Mehrdad

$\Theta(M(n)\log n)$, where $M(n)$ is the complexity of a multiplication algorithm. So it would give a $\Theta(nM(n)\log n+n\sum_{q\in Q}|I_q|)$ complexity (where $Q$ is the set of queries).

— md5

@Mehrdad: It is as difficult as computing a logarithm. However, I'm not sure you'll ever need to do that. For instance, if you only compare probabilities you'd rather not compute the final $\exp$. The product of $n$ numbers in $(0,1)$ may quickly become very small, so for the same reason we try to avoid underflow by using log probabilities, we should stay in the logarithmic form at the end (e.g. by computing $\log$ in base $10$, so that it's even more "human-readable").
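A small Python sketch of staying in log space for a comparison, using base-10 logs as suggested. The two hypotheses and their per-event probabilities are invented for illustration:

```python
import math

# Hypothetical per-event probabilities for two competing hypotheses.
probs_h1 = [1e-3] * 200
probs_h2 = [2e-3] * 200

# Base-10 log probabilities: both products would underflow as plain floats.
log10_h1 = sum(math.log10(p) for p in probs_h1)
log10_h2 = sum(math.log10(p) for p in probs_h2)

# log is monotonic, so comparing the logs orders the probabilities,
# and no final exp (or 10**x) is needed.
h2_more_likely = log10_h2 > log10_h1
```

Here `log10_h1` is close to -600, which reads directly as "a probability of about $10^{-600}$", a value that cannot be represented as a double at all.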

— md5

Is addition still faster than multiplication if you use IEEE floats, which you certainly will in this case? Modern CPUs are pretty good at multiplying numbers, whereas float addition has a couple of steps that can't be executed simultaneously: align mantissas (shift left based on the result of subtraction), then actually add them, then normalize (which may trigger both underflow and overflow, yay). In circuit it's quite a lot of die; in microcode each step costs a cycle or a few.

— John Dvorak

By "incurred once" it probably means that if you have $N$ probabilities $p_1, \ldots, p_N$, then you switch to log space only once by taking the log of each $p_i$, perform the probability multiplications in log space by adding (which is less time-consuming), and then switch back to your initial space using exponentiation.

If the number of operations is only slightly greater than $N$, then I think there is no point in switching to log space (from the performance point of view). However, if the number of operations is much larger, then I think it is worth switching to log space. For example, say you have 50 variables, and your computation involves 1000 multiplications. Then I think you should work in log space.

Finally, addition is faster than multiplication not because of machine architecture. Addition is inherently faster than multiplication. In terms of complexity, it takes $O(n)$ (linear) time to perform addition of two $n$-bit integers, while multiplication takes $O(n^2)$ (quadratic) time.
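To make the linear versus quadratic counts concrete, here is a Python sketch that counts single-bit steps for the schoolbook algorithms only. It is an illustration under that assumption: it does not model how hardware adders or multipliers are built, and asymptotically faster multiplication algorithms exist. All function names are hypothetical.

```python
def add_bits(a, b):
    """Schoolbook addition of two little-endian bit lists of equal length.
    Returns (sum_bits, number_of_single_bit_steps)."""
    result, carry, steps = [], 0, 0
    for x, y in zip(a, b):
        total = x + y + carry
        result.append(total & 1)
        carry = total >> 1
        steps += 1
    result.append(carry)
    return result, steps

def multiply_bits(a, b):
    """Schoolbook multiplication of two little-endian bit lists of equal length:
    one shifted partial product is accumulated per bit of b."""
    n = len(a)
    result = [0] * (2 * n)
    steps = 0
    for j, y in enumerate(b):
        carry = 0
        for i, x in enumerate(a):
            total = result[i + j] + x * y + carry
            result[i + j] = total & 1
            carry = total >> 1
            steps += 1
        result[j + n] = carry
    return result, steps

def to_bits(value, n):
    """Little-endian list of the low n bits of value."""
    return [(value >> k) & 1 for k in range(n)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << k for k, bit in enumerate(bits))

# Example with n = 8 bits: addition takes n steps, multiplication n * n.
a, b = to_bits(200, 8), to_bits(123, 8)
sum_bits, add_steps = add_bits(a, b)
prod_bits, mul_steps = multiply_bits(a, b)
```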

By the way, this idea is similar to Montgomery modular multiplication, where multiplications are performed in Montgomery form, which is considerably faster than the usual multiplication followed by reduction.

— fade2black

-1 Multiplication does not take quadratic time...

— Mehrdad

@Mehrdad, I hope you learnt school multiplication of two numbers. That algorithm is still widely used on computer chips; please look here. What you mean is software-level algorithms, which are still worse than linear time. Are these multiplication algorithms as widely used in multiplication circuits?

— fade2black

en.wikipedia.org/wiki/Carry-save_adder#The_basic_concept

— Mehrdad

The spirit of the answer is still correct though, right? If none of the multiplication algorithms are going to match the linear time of addition?

— Stephen

@Stephen, in fact the question was not about what the exact best complexity of multiplication algorithm is. I could provide additional information on this subject if commenters required. I think that a long discussion on that would be off-topic here. )))

— fade2black