Is there a stable heap?

32

Is there a priority queue data structure that supports the following operations?

Insert(x, p): Add a new record x with priority p.
StableExtractMin(): Return and delete the record with minimum priority, breaking ties by insertion order.

Thus, after Insert(a, 1), Insert(b, 2), Insert(c, 1), Insert(d, 2), a sequence of StableExtractMin calls would return a, then c, then b, then d.

Obviously one could use any priority queue data structure by storing the pair $(p, time)$ as the actual priority, but I'm interested in data structures that do not explicitly store the insertion times (or insertion order), by analogy to stable sorting.
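For reference, the explicit $(p, time)$ baseline that the question wants to avoid can be sketched as follows. This is only an illustration: the class name `TimestampedQueue` and the `_ticket` counter are hypothetical, and Python's `heapq` stands in for "any priority queue".

```python
import heapq
from itertools import count


class TimestampedQueue:
    """Stable priority queue that stores the insertion order explicitly."""

    def __init__(self):
        self._heap = []
        self._ticket = count()  # hypothetical name: increasing insertion counter

    def insert(self, x, p):
        # The pair (p, ticket) is the actual key; the ticket breaks ties.
        heapq.heappush(self._heap, (p, next(self._ticket), x))

    def stable_extract_min(self):
        _, _, x = heapq.heappop(self._heap)
        return x


q = TimestampedQueue()
for x, p in [("a", 1), ("b", 2), ("c", 1), ("d", 2)]:
    q.insert(x, p)
order = [q.stable_extract_min() for _ in range(4)]
print(order)  # ['a', 'c', 'b', 'd']
```

The ticket costs one extra word per record, which is exactly the $\Omega(n)$ overhead under discussion.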

Equivalently (?): Is there a stable version of heapsort that does not require $\Omega(n)$ extra space?

ds.data-structures

— Jeffε

I think you mean "a, then c, then b, then d"?

— Ross Snider

Won't a heap with a linked list of records, plus a balanced binary tree keyed on priority and pointing to the corresponding linked list, work? What am I missing?

— Aryabhata

Moron: That's storing the insertion order explicitly, which is precisely what I want to avoid. I've clarified the problem statement (and fixed Ross's typo).

— Jeffε

16

The Bentley-Saxe method gives a fairly natural stable priority queue.

Store your data in a sequence of sorted arrays $A_0,\ldots,A_k$. $A_i$ has size $2^i$. Each array also maintains a counter $c_i$. The array entries $A_i[c_i],\ldots,A_i[2^i-1]$ contain data.

For each $i$, all elements in $A_i$ were added more recently than those in $A_{i+1}$, and within each $A_i$ elements are ordered by value, with ties broken by placing older elements ahead of newer elements. Note that this means we can merge $A_i$ and $A_{i+1}$ and preserve this ordering. (In the case of ties during the merge, take the element from $A_{i+1}$.)

To insert a value $x$, find the smallest $i$ such that $A_i$ contains 0 elements, merge $A_0,\ldots,A_{i-1}$ and $x$, store the result in $A_i$, and set $c_0,\ldots,c_i$ appropriately.

To extract the min, find the largest index $i$ such that $A_i[c_i]$, the first element of $A_i$, is minimum over all $i$, and increment $c_i$.

By the standard argument, this gives $O(\log n)$ amortized time per operation, and it is stable because of the ordering described above.
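The insert and extract steps above can be sketched in Python. This is a minimal illustration under a few assumptions that are not in the answer: the class name `StableLogQueue` is made up, each level is a Python list holding only its merged records (rather than a fixed array of size $2^i$), and a level counts as empty once all of its records have been extracted.

```python
class StableLogQueue:
    def __init__(self):
        self.levels = []  # levels[i] plays the role of A_i (sorted by priority)
        self.heads = []   # heads[i] plays the role of the counter c_i

    @staticmethod
    def _merge(older, newer):
        """Stable merge: on equal priority the older record goes first."""
        out, a, b = [], 0, 0
        while a < len(older) and b < len(newer):
            if older[a][0] <= newer[b][0]:
                out.append(older[a])
                a += 1
            else:
                out.append(newer[b])
                b += 1
        return out + older[a:] + newer[b:]

    def insert(self, x, p):
        run, i = [(p, x)], 0
        # Merge x with A_0, A_1, ... until the first empty level is reached.
        while i < len(self.levels) and self.heads[i] < len(self.levels[i]):
            run = self._merge(self.levels[i][self.heads[i]:], run)
            self.levels[i], self.heads[i] = [], 0
            i += 1
        if i == len(self.levels):
            self.levels.append([])
            self.heads.append(0)
        self.levels[i], self.heads[i] = run, 0

    def stable_extract_min(self):
        best = None
        for i in range(len(self.levels)):
            if self.heads[i] < len(self.levels[i]):
                p = self.levels[i][self.heads[i]][0]
                # "<=" sends ties to the largest index, i.e. the oldest level.
                if best is None or p <= self.levels[best][self.heads[best]][0]:
                    best = i
        if best is None:
            raise IndexError("extract from empty queue")
        _, x = self.levels[best][self.heads[best]]
        self.heads[best] += 1
        return x


q = StableLogQueue()
for x, p in [("a", 1), ("b", 2), ("c", 1), ("d", 2)]:
    q.insert(x, p)
order = [q.stable_extract_min() for _ in range(4)]
print(order)  # ['a', 'c', 'b', 'd']
```

No insertion time is stored anywhere: the age of a record is implied by which level it sits in and by its position among equal priorities within that level.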

For a sequence of $n$ insertions and extractions, this uses $n$ array entries (don't keep empty arrays) plus $O(\log n)$ words of bookkeeping data. This doesn't answer Mihai's version of the question, but it shows that the stability constraint doesn't require a lot of space overhead. In particular, it shows that there is no $\Omega(n)$ lower bound on the extra space needed.

Update: Rolf Fagerberg points out that if we can store null (non-data) values, then this whole data structure can be packed into an array of size $n$, where $n$ is the number of insertions so far.

First, notice that we can pack $A_k,\ldots,A_0$ into an array in that order (with $A_k$ first, followed by $A_{k-1}$ if it is non-empty, and so on). The structure of this is completely encoded by the binary representation of $n$, the number of elements inserted so far. If the binary representation of $n$ has a 1 at position $i$, then $A_i$ occupies $2^i$ array locations; otherwise it occupies no array locations.
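A small sketch of how the block boundaries follow from the bits of $n$ (the function name `block_layout` is hypothetical):

```python
def block_layout(n):
    """Map each present level i to (start offset, size), packed as A_k, ..., A_0."""
    layout, start = {}, 0
    for i in reversed(range(n.bit_length())):
        if (n >> i) & 1:          # bit i set: A_i occupies 2^i slots
            layout[i] = (start, 1 << i)
            start += 1 << i
    return layout


print(block_layout(13))  # {3: (0, 8), 2: (8, 4), 0: (12, 1)}
```

For $n = 13 = 1101_2$ the array holds $A_3$, then $A_2$, then $A_0$, and $A_1$ takes no space.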

When inserting, $n$ and the length of our array increase by 1, and we can merge $A_0,\ldots,A_i$ plus the new element using existing in-place stable merging algorithms.

Now, where we use null values is in getting rid of the counters $c_i$. In $A_i$, we store the first value, followed by $c_i$ null values, followed by the remaining $2^i-c_i-1$ values. During an extract-min, we can still find the value to extract in $O(\log n)$ time by examining $A_0[0],\ldots,A_k[0]$. When we find this value in $A_i[0]$, we set $A_i[0]$ to null, then do a binary search on $A_i$ to find the first non-null value $A_i[c_i]$, and swap $A_i[0]$ and $A_i[c_i]$.
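The extraction step inside a single block can be sketched as follows, assuming Python's `None` plays the role of the null value (so `None` must never be a real data value); the function name `extract_from_block` is hypothetical.

```python
def extract_from_block(block):
    """block is one A_i laid out as [first value, c_i Nones, remaining values]."""
    value = block[0]
    block[0] = None
    # With slot 0 cleared, the Nones form a prefix, so binary search
    # finds the first non-null slot.
    lo, hi = 0, len(block)
    while lo < hi:
        mid = (lo + hi) // 2
        if block[mid] is None:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(block):  # the block still holds data: move the new min to slot 0
        block[0], block[lo] = block[lo], block[0]
    return value


block = [3, 5, 8, 9]
extracted = [extract_from_block(block) for _ in range(3)]
print(extracted, block)  # [3, 5, 8] [9, None, None, None]
```

The number of nulls between slot 0 and the next value encodes $c_i$, which is why no separate counter is needed.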

The end result: the entire structure can be implemented with one array whose length is incremented with each insertion, and one counter, $n$, that counts the number of insertions.

— Pat Morin

1

This potentially uses O(n) extra space at a given instant after O(n) extractions, no? At that point you might as well store the priority too...

— Mehrdad

10

I'm not sure what your constraints are; does the following qualify? Store the data in an array, which we interpret as an implicit binary tree (like a binary heap), but with the data items at the bottom level of the tree rather than at its internal nodes. Each internal node of the tree stores the smaller of the values copied from its two children; in case of ties, copy the left child.

To find the minimum, look at the root of the tree.

To delete an element, mark it as deleted (lazy deletion) and propagate up the tree (each node on the path to the root that held a copy of the deleted element should be replaced with a copy of its other child). Maintain a count of deleted elements, and if it ever becomes too large a fraction of all elements, rebuild the structure, preserving the order of the elements at the bottom level. The rebuild takes linear time, so this part adds only constant amortized time to the operation complexity.

To insert an element, add it to the next free position on the bottom row of the tree and update the path to the root. Or, if the bottom row becomes full, double the size of the tree (again with an amortization argument; note that this part is no different from the need to rebuild when a standard binary heap outgrows its array).
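A rough Python sketch of this leaf-based tree, with some simplifications that are assumptions rather than part of the answer: the class name `TournamentQueue` is made up, internal nodes store the index of the winning leaf instead of a copy of its value, and compaction of deleted slots is folded into the step that runs when the bottom row is full.

```python
class TournamentQueue:
    def __init__(self):
        self._rebuild([], 1)

    def _better(self, a, b):
        """Pick the winning leaf index; a tie goes to the left child a."""
        if a is None:
            return b
        if b is None:
            return a
        return a if self.leaves[a][0] <= self.leaves[b][0] else b

    def _rebuild(self, items, cap):
        # Leaves keep their left-to-right (insertion) order; linear time.
        self.cap, self.used = cap, len(items)
        self.leaves = items + [None] * (cap - len(items))
        self.win = [None] * (2 * cap)          # node v has children 2v, 2v+1
        for j in range(len(items)):
            self.win[cap + j] = j
        for v in range(cap - 1, 0, -1):
            self.win[v] = self._better(self.win[2 * v], self.win[2 * v + 1])

    def _update_path(self, j):
        v = (self.cap + j) // 2
        while v >= 1:
            self.win[v] = self._better(self.win[2 * v], self.win[2 * v + 1])
            v //= 2

    def insert(self, x, p):
        if self.used == self.cap:              # bottom row is full
            live = [e for e in self.leaves if e is not None]
            grow = 2 * len(live) > self.cap    # otherwise compacting is enough
            self._rebuild(live, 2 * self.cap if grow else self.cap)
        j = self.used
        self.leaves[j] = (p, x)
        self.win[self.cap + j] = j
        self.used += 1
        self._update_path(j)

    def stable_extract_min(self):
        j = self.win[1]                        # the root names the minimum leaf
        if j is None:
            raise IndexError("extract from empty queue")
        _, x = self.leaves[j]
        self.leaves[j] = None                  # lazy deletion
        self.win[self.cap + j] = None
        self._update_path(j)
        return x


q = TournamentQueue()
for x, p in [("a", 1), ("b", 2), ("c", 1), ("d", 2)]:
    q.insert(x, p)
order = [q.stable_extract_min() for _ in range(4)]
print(order)  # ['a', 'c', 'b', 'd']
```

Stability comes from two facts: new records always go to the right of older ones on the bottom row, and every tie is resolved in favour of the left child.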

It's not an answer to Mihai's stricter version of the question, though, because it uses twice as much memory as a true implicit data structure should, even if we ignore the space cost of handling deletions lazily.

— David Eppstein

I like this. Just as with a regular implicit-tree min-heap, a 3-ary or 4-ary implicit tree will probably be faster due to cache effects (even though you need more comparisons).

— Jonathan Graehl

8

Is the following a valid interpretation of your problem:

You have to store N keys in an array A[1..N], with no auxiliary information, such that you can support:
* insert key
* delete min, which picks the earliest inserted element if there are multiple minima

This appears quite hard, given that most implicit data structures play the trick of encoding bits in the local ordering of some elements. Here, if multiple guys are equal, their ordering must be preserved, so no such tricks are possible.

Interesting.

— mihai

1

I think this should be a comment, not an answer, as it doesn't really answer the original question. (You can delete it and add it as a comment.)

— Jukka Suomela

5

Yeah, this website is a bit ridiculous. We have reputations, bonuses, rewards, all sorts of ways to comment that I can't figure out. I wish this would look less like a kids' game.

— Mihai

1

I think he needs more rep to post a comment. that's the problem.

— Suresh Venkat

@Suresh: Oh, right, I didn't remember that. How are we actually supposed to handle this kind of situation (i.e., a new user needs to ask for clarifications before answering a question)?

— Jukka Suomela

2

no easy way out. I've seen this often on MO. Mihai will have no trouble gaining rep, if it's the Mihai I think it is :)

— Suresh Venkat

4

Short answer : You can't.

Slightly longer answer :

You'll need $\Omega(n)$ extra space to store the "age" of your entry which will allow you to discriminate between identical priorities. And you'll need $\Omega(n)$ space for information that will allow fast insertions and retrievals. Plus your payload (value and priority).

And, for each payload you store, you'll be able to "hide" some information in the address (e.g. $addr(X) < addr(Y)$ means Y is older than X). But in that "hidden" information, you'll either hide the "age", OR the "fast retrieval" information. Not both.

Very long answer with inexact flaky pseudo-math :

Note : the very end of the second part is sketchy, as mentioned. If some math guy could provide a better version, I'd be grateful.

Let's think about the amount of data that is involved on an X-bit machine (say 32 or 64-bit), with records (value and priority) $P$ machine words wide.

You have a set of potential records that is partially ordered : $(a,1) < (a,2)$ and $(a,1) = (a,1)$ but you can't compare $(a,1)$ and $(b,1)$ .

However you want to be able to compare two non-comparable values from your set of records, based on when they were inserted. So you have here another set of values : those that have been inserted, and you want to enhance it with a partial order : $X < Y$ iff $X$ was inserted before $Y$ .

In the worst-case scenario, your memory will be filled with records of the form $(?,1)$ (with $?$ different for each one), so you'll have to rely entirely upon the insertion time in order to decide which one goes out first.

The insertion time (relative to other records still in the structure) requires $X - log_2(P)$ bits of information (with P-byte payload and $2^X$ accessible bytes of memory).
The payload (your record's value and priority) requires $P$ machine words of information.

That means that you must somehow store $X - log_2(P)$ extra bits of information for each record you store. And that's $O(n)$ for $n$ records.
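One way to read the $X - log_2(P)$ figure, assuming a memory of $2^X$ bytes completely filled with records of $P$ bytes each (the answer uses $P$ for both words and bytes; bytes are assumed here):

$$n_{\max} = \frac{2^X}{P} = 2^{X - \log_2 P}, \qquad \log_2 n_{\max} = X - \log_2 P .$$

For instance, $X = 32$ and $P = 8$ give $2^{29}$ records, so 29 bits are enough to rank any one record among the others by insertion time.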

Now, how many bits of information does each memory "cell" provide us ?

$W$ bits of data ( $W$ being the machine word width).
$X$ bits of address.

Now, let's assume $P \geq 1$ (payload is at least one machine word wide (usually one octet)). This means that $X - log_2(P) < X$ , so we can fit the insertion order information in the cell's address. That's what happens in a stack : cells with the lowest address entered the stack first (and will get out last).

So, to store all our information, we have two possibilities :

Store the insertion order in the address, and the payload in memory.
Store both in memory and leave the address free for some other usage.

Obviously, in order to avoid waste, we'll use the first solution.

Now for the operations. I suppose you wish to have :

$Insert(task, priority)$ with $O(log n)$ time complexity.
$StableExtractMin()$ with $O(log n)$ time complexity.

Let's look at $StableExtractMin()$ :

The really really general algorithm goes like this :

Find the record with minimum priority and minimum "insertion time" in $O(log n)$ .
Remove it from the structure in $O(log n)$ .
Return it.

For example, in the case of a heap, it will be slightly differently organized, but the work is the same :
1. Find the min record in $O(1)$
2. Remove it from the structure in $O(1)$
3. Fix everything so that next time #1 and #2 are still $O(1)$ i.e. "repair the heap". This needs to be done in $O(log n)$
4. Return the element.

Going back to the general algorithm, we see that to find the record in $O(log n)$ time, we need a fast way to choose the right one between $2^{X - log_2(P)}$ candidates (worst case, memory is full).

This means that we need to store $X - log_2(P)$ bits of information in order to retrieve that element (each bit bisects the candidate space, so we have $O(log n)$ bisections, meaning $O(log n)$ time complexity).

These bits of information might be stored as the address of the element (in the heap, the min is at a fixed address), or with pointers, for example (in a binary search tree (with pointers), you need to follow $O(log n)$ pointers on average to get to the min).

Now, when deleting that element, we'll need to augment the next min record so it has the right amount of information to allow $O(log n)$ retrieval next time, that is, so it has $X - log_2(P)$ bits of information discriminating it from the other candidates.

That is, if it doesn't have already enough information, you'll need to add some. In a (non-balanced) binary search tree, the information is already there : You'll have to put a NULL pointer somewhere to delete the element, and without any further operation, the BST is searchable in $O(log n)$ time on average.

After this point, it's slightly sketchy, I'm not sure about how to formulate that. But I have the strong feeling that each of the remaining elements in your set will need to have $X - log_2(P)$ bits of information that will help find the next min and augment it with enough information so that it can be found in $O(log n)$ time next time.

The insertion algorithm usually just needs to update part of this information, I don't think it will cost more (memory-wise) to have it perform fast.

Now, that means that we'll need to store $X - log_2(P)$ more bits of information for each element. So, for each element, we have :

The insertion time, $X - log_2(P)$ bits.
The payload $P$ machine words.
The "fast search" information, $X - log_2(P)$ bits.

Since we already use the memory contents to store the payload, and the address to store the insertion time, we don't have any room left to store the "fast search" information. So we'll have to allocate some extra space for each element, and so "waste" $\Omega(n)$ extra space.

— Suzanne Dupéron

did you really intend to make your answer CW ?

— Suresh Venkat

Yes. My answer isn't 100% correct, as stated within, and it'd be good if anybody could correct it even if I'm not on SO anymore or whatever. Knowledge should be shared, knowledge should be changeable. But maybe I misunderstood the usage of CW, if so, please tell me :) . EDIT : whoops, indeed I just discovered that I won't get any rep from CW posts and that the content is CC-wiki licensed anyway... Too bad :).

— Suzanne Dupéron

3

If you implement your priority queue as a balanced binary tree (a popular choice), then you just have to make sure that when you add an element to the tree, it gets inserted to the left of any elements with equal priority.
This way, the insertion order is encoded in the structure of the tree itself.

— TonyK

1

But this adds O(n) space for the pointers, which I think is what the questioner wants to avoid?

— Jeremy

-1

I don't think that's possible

concrete case:

min heap with all x > 1

heapifying will eventually give something a choice like so

now which 1 to propagate to root?
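The tie can be observed with a small sketch, assuming Python's `heapq` as the binary heap and a hypothetical `Record` class that compares by priority only, so the heap has nothing but the key to go on:

```python
import heapq


class Record:
    """Compares by priority only, so equal priorities are true ties."""

    def __init__(self, name, priority):
        self.name, self.priority = name, priority

    def __lt__(self, other):
        return self.priority < other.priority


heap = []
for name in "abcd":
    heapq.heappush(heap, Record(name, 1))  # four records, all with priority 1

order = [heapq.heappop(heap).name for _ in range(4)]
print(order)  # under CPython's heapq this is not the insertion order
```

This only shows that a plain binary heap does not preserve insertion order among equal keys; it does not by itself rule out a cleverer in-place structure.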

— ratchet freak