Why would anyone ever use an octree over a kd-tree?

32

I have some experience in scientific computing, and have used kd-trees extensively for BSP (binary space partitioning) applications. I have recently become more familiar with octrees, a similar data structure for partitioning 3-D Euclidean spaces, but one that, from what I gather, subdivides at fixed regular intervals.

A bit of independent research seems to indicate that kd-trees typically perform better for most datasets: they are quicker to construct and quicker to query. My question is, what are the advantages of octrees in spatial/temporal performance or otherwise, and in what situations are they most applicable (I've heard 3D graphics programming)? I would most appreciate a summary of the advantages and problems of both types.

As an extra, if anyone could elaborate on the use of the R-tree data structure and its advantages, I would be grateful for that too. R-trees (more so than octrees) seem to be applied in much the same way as kd-trees for k-nearest-neighbour or range searches.

ds.data-structures tree

— Noldorin

I should note that both kd-trees and R-trees (but not octrees) seem to be designed specifically to facilitate k-nearest-neighbour searches. How do they compare in this sense?

— Noldorin

One note is that kd-trees are guaranteed to have small depth. Compressed quadtrees can get you there, but are less convenient.

— Suresh Venkat

@Suresh Venkat: Thanks for that. I'm not familiar with compressed quadtrees, but would they really be suitable for 3-D spatial representations? Perhaps there's a "compressed octree" analogue.

— Noldorin

I've also heard that octrees are more appropriate when one has a Z-order (space-filling) curve.

— Noldorin

23

The cells in a $kD$-tree can have high aspect ratio, whereas octree cells are guaranteed to be cubical. Since this is a theory board, I'll give you the theoretical reason why high aspect ratio is a problem: it makes it impossible to use volume bounds to control the number of cells that you have to examine when solving approximate nearest neighbor queries.

In more detail: if you ask for an $\epsilon$-approximate nearest neighbor to a query point $q$, and the actual nearest neighbor is at distance $d$, you typically end up with a search that examines every data structure cell that reaches from inside to outside of an annulus or annular shell with inner radius $d$ and outer radius $(1+\epsilon)d$. If the cells have bounded aspect ratio, as they do in a quadtree, then there can be at most $1/\epsilon^{d-1}$ such cells (here the exponent $d$ is the dimension of the space, not the distance), and you can prove good bounds on the time for the query. If the aspect ratio is not bounded, as in a $kD$-tree, these bounds do not apply.
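
For reference, the two conditions in this argument can be written out explicitly. The notation is an assumption made for this note and is not in the original answer: $P$ is the data set, $\|\cdot\|$ is the Euclidean norm, and $d$ is the distance from $q$ to its true nearest neighbor. A point $p \in P$ is an acceptable $\epsilon$-approximate answer when

$$\|p - q\| \le (1+\epsilon)\,d, \qquad d = \min_{p' \in P} \|p' - q\|,$$

and the shell whose crossing cells the search has to examine is

$$A = \{\, x : d \le \|x - q\| \le (1+\epsilon)\,d \,\}.$$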

$kD$-trees have a different advantage over quadtrees, in that they are guaranteed to have at most logarithmic depth, which also contributes to the time for a nearest neighbor query. But the depth of a quadtree is at most the number of bits of precision of the input which is generally not large, and there are theoretical methods for controlling the depth to be essentially logarithmic (see the skip quadtree data structure).

— David Eppstein

4

See Sariel Har-Peled's recent textbook for a modern summary of compressed quadtrees.

— Jeffε

Thanks for a good quantitative summary, David. Just to confirm: is your use of "aspect ratio" synonymous with "branching ratio"? I'll definitely have to look into skip quadtrees/octrees, and perhaps compressed quadtrees/octrees too.

— Noldorin

1

The aspect ratio of a rectangular box can be defined as the ratio of its longest edge length to its shortest edge length. I don't know what branching ratio is supposed to mean in this context but aspect ratio is unrelated to the branching factor of the trees (which is constant for both data structures).
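
As a minimal sketch of that definition (the function name and the `lo`/`hi` corner representation of a box are assumptions made for illustration), the aspect ratio of an axis-aligned box can be computed like this:

```python
def aspect_ratio(lo, hi):
    """Ratio of the longest edge to the shortest edge of an axis-aligned box.

    lo and hi are the minimum and maximum corners of the box.
    """
    edges = [h - l for l, h in zip(lo, hi)]
    if min(edges) <= 0:
        raise ValueError("box must have positive extent on every axis")
    return max(edges) / min(edges)


# An octree cell is a cube, so its aspect ratio is always 1.
cube = aspect_ratio((0.0, 0.0, 0.0), (2.0, 2.0, 2.0))

# A kd-tree cell can be a thin slab, here with edges 8, 0.5 and 2.
slab = aspect_ratio((0.0, 0.0, 0.0), (8.0, 0.5, 2.0))
```

Nothing in a median-split kd-tree bounds this ratio, whereas it stays at 1 for every octree cell.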

— David Eppstein

I missed the "cells in". Makes sense now.

— Noldorin

15

A group of friends and I are working on a space-RTS game as a fun side project. We're using a lot of the stuff we've learned at Computer Science to make it highly efficient, enabling us to make massive armies later on.

For this purpose we've considered using kd-trees, but we quickly dismissed them: insertions and deletions are extremely common in our program (consider a ship flying through space), and this is an unholy mess with kd-trees. We therefore picked octrees for our game.

— Alex ten Brink

Ah yes, I've heard this before too. Insertion/deletion with kd-trees is a costly operation (due to re-balancing). I believe the best-case time complexities are still the same however...

— Noldorin

2

It depends how you go about fixing the kd-tree. A good best-case time complexity is not something I generally aim for: for example bogosort has an O(1) best-case complexity, but I hope no one uses it.

— Alex ten Brink

Unfortunately I can't seem to find any good summaries of time complexities for common operations on these data structures, but not to mind. Average-case time complexity is often insightful...

— Noldorin

1

I really think you'd still do better if you just used a KD-tree that cycled axes and simply divided the space down the middle. Skip the bulky SAH and other expensive median cuts and you'll end up with something that not only searches faster than an octree but also builds faster. Since you're partitioning the space evenly as you would with an octree, but with a binary tree rather than an 8-ary tree, whatever you were doing before for removals shouldn't be any more complex with the KD-tree, as it'll be evenly-spaced in a similar way. Ex: you might simply remove empty nodes beyond a depth of N.
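
A rough sketch of the kind of tree described in this comment, assuming 3-D points, a fixed bounding box, and hypothetical limits `MAX_POINTS` and `MAX_DEPTH` (none of these names come from the comment). Each level splits the cell exactly in the middle along the next axis in the cycle, so no median search or rebalancing is involved. Removal is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

MAX_POINTS = 2   # hypothetical leaf capacity
MAX_DEPTH = 16   # hypothetical depth limit, stops endless splitting


@dataclass
class Node:
    lo: Point                     # minimum corner of the cell
    hi: Point                     # maximum corner of the cell
    depth: int
    points: List[Point] = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node: Node, p: Point) -> None:
    # Walk down to the leaf whose cell contains p.
    while node.left is not None:
        axis = node.depth % 3                      # cycle x, y, z
        mid = (node.lo[axis] + node.hi[axis]) / 2  # spatial midpoint, not a median
        node = node.left if p[axis] < mid else node.right
    node.points.append(p)
    if len(node.points) > MAX_POINTS and node.depth < MAX_DEPTH:
        split(node)


def split(node: Node) -> None:
    axis = node.depth % 3
    mid = (node.lo[axis] + node.hi[axis]) / 2
    left_hi = tuple(mid if i == axis else v for i, v in enumerate(node.hi))
    right_lo = tuple(mid if i == axis else v for i, v in enumerate(node.lo))
    node.left = Node(node.lo, left_hi, node.depth + 1)
    node.right = Node(right_lo, node.hi, node.depth + 1)
    # Push the stored points down into the two new children.
    pts, node.points = node.points, []
    for p in pts:
        insert(node, p)


root = Node((0.0, 0.0, 0.0), (8.0, 8.0, 8.0), depth=0)
for p in [(1.0, 1.0, 1.0), (7.0, 1.0, 1.0), (1.0, 7.0, 1.0)]:
    insert(root, p)
```

Three consecutive levels of this tree cover the same eight subcells as one octree level, which is why the comment expects removals to be no harder than in an octree.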

— Dragon Energy

8

what are the advantages of octrees in spatial/temporal performance or otherwise, and in what situations are they most applicable (I've heard 3D graphics programming)?

k-D trees are balanced binary trees and octrees are tries so the advantages and disadvantages are probably inherited from those more general data structures. Specifically:

Rebalancing can be expensive (octrees don't need rebalancing).
Balancing handles heterogeneity better because it is adaptive.
Higher branching factor in octrees means shallower trees (fewer indirections and allocations) for homogeneous distributions.

Also, bisection (as in octrees) lends itself to trivial implementation in terms of bit-twiddling. Similarly, I imagine octrees can benefit greatly from precomputed distances when doing range lookups.

EDIT

Apparently my references to tries and homogeneity need clarification.

Tries are a family of data structures represented by trees of dictionaries and are used as dictionaries for keys that are sequences (most notably strings but also DNA sequences and the bits in a hash value for hash tries). If each dictionary maps one bit of each of x, y and z coordinates (most significant bit in the first level of the trie, next significant bit in the second level etc.) then the trie is an octree that uniformly subdivides 3D space. Hence octrees inherit the characteristics of tries which are, in general:

High branching factor can mean shallow trees that incur few indirections so searching is fast, e.g. 20 levels of binary tree can be stored in 4 levels of a tree with a branching factor of 256.
Tries don't get rebalanced during insertions and deletions, saving an expensive operation required for balanced binary trees.
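
The bit-per-level mapping described above can be sketched as follows, assuming coordinates that have already been quantized to non-negative integers below `2**bits` (the function names and the x/y/z bit order within a child index are choices made for this sketch). Each level consumes one bit from each coordinate, most significant first, and combines them into a child index from 0 to 7.

```python
def octant_path(x: int, y: int, z: int, bits: int):
    """Child indices (0-7) from root to leaf for integer coordinates in [0, 2**bits)."""
    path = []
    for level in range(bits - 1, -1, -1):   # most significant bit first
        bx = (x >> level) & 1
        by = (y >> level) & 1
        bz = (z >> level) & 1
        path.append((bx << 2) | (by << 1) | bz)
    return path


def morton_code(path):
    """Concatenate the 3-bit child indices into a single integer key."""
    code = 0
    for child in path:
        code = (code << 3) | child
    return code


# x = 0b101, y = 0b010, z = 0b111
path = octant_path(5, 2, 7, bits=3)
code = morton_code(path)
```

The path has exactly `bits` entries, so the depth of the trie is bounded by the precision of the input. The concatenated key is the point's position along a Z-order (Morton) curve.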

The disadvantage is that heterogeneity can result in imbalanced tries/octrees so searches can require many indirections. The equivalent problem in tries is solved using edge compression to collapse multiple levels of indirection into a single level. Octrees don't do this but there is nothing to stop you from compressing an octree (but I don't think you could call the result an octree!).

For comparison, consider a specialized dictionary for string keys that is represented as a trie. The first level of the trie branches on the first character in the key. The second level on the second character and so on. Any string can be looked up by searching for the first character from the key in the dictionary to obtain a second dictionary that is used to look up the second character from the key and so on. A set of random key strings would be a homogeneous distribution. A set of key strings that all share some prefix (e.g. all words beginning with "anti") is a heterogeneous distribution. In the latter case, the first dictionary contains only one binding, for "a", the second only one for "n" and so on. Searching for any mapping in the trie always begins by searching the same four dictionaries with the same four keys. This is inefficient and this is what octrees do if, for example, they are used to store heterogeneous particle distributions where the vast majority of particles lie in a tiny volume within the vector space.
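
The "anti" example can be made concrete with a small trie of nested dictionaries. This is only an illustrative sketch: the helper names, the sample words, and the use of `None` as an end-of-key marker are assumptions, not part of the answer.

```python
def trie_insert(root: dict, key: str, value) -> None:
    node = root
    for ch in key:
        node = node.setdefault(ch, {})   # one dictionary per character position
    node[None] = value                   # None marks "a key ends here" (assumed convention)


def single_child_chain(root: dict) -> str:
    """Return the prefix along which every dictionary holds exactly one binding."""
    prefix, node = [], root
    while len(node) == 1:
        (ch, child), = node.items()
        if ch is None:                   # reached the end of a stored key
            break
        prefix.append(ch)
        node = child
    return "".join(prefix)


trie = {}
for word in ["antibody", "antidote", "antimatter"]:
    trie_insert(trie, word, True)

chain = single_child_chain(trie)
```

Every lookup walks through the four single-entry dictionaries for "a", "n", "t" and "i" before reaching a level that actually discriminates between keys. Edge compression would collapse that chain into a single step.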

— Jon Harrop

"octrees are tries"? Also, what do you mean by "handles heterogeneity better"? Homogeneous isn't a word I've encountered with respect to trees.

— Noldorin

2

"Octtrees don't need rebalancing"? That's absolutely not true for octtrees that store heterogeneous point distributions. Alternately, depending on how generally you define "octtree": Rebalancing an octtree is simply impossible, no matter how desirable it might be.

— Jeffε

@Noldorin "octrees are tries". Yes. Do you know what a trie is? en.wikipedia.org/wiki/Trie

— Jon Harrop

@Noldorin "Homogeneous isn't a word I've encountered with respect to trees". I'm referring to the homogeneity of the distribution that is being partitioned. For example, when partitioning particles in a 3D space then atoms in a solid are homogeneously distributed whereas stars in the universe are heterogeneously distributed. k-D trees are more likely to be preferable for heterogeneous distributions because their subdivision of space is adaptive.

— Jon Harrop

@JɛﬀE "Rebalancing an octtree is simply impossible". That's exactly what I was referring to. Apologies if my wording was confusing.

— Jon Harrop

2

Octrees are useful as a base datatype for continuum models, see for example the Gerris flow solver. Life is difficult enough in fluid dynamics, so knowing that the sizes of all of your subcubes depend only on their depth must be a simplifying factor.

Caveat: I am not a fluid dynamicist!

— jjg

Interesting. I can definitely appreciate that octrees are simpler to work with in continuum models... I wonder what the reason is for graphics programming though?

— Noldorin