performance - Could I use a faster data structure than a tree for this? -


I have a binary decision tree. It takes an array of floats as input, and each branch node splits on an input index and value until I reach a leaf.

I'm performing a massive number of lookups on the tree (about 17% of execution time according to performance analysis (edit: having optimised other areas, it's now at 40%)), and I'm wondering if I could/should be using a different data structure to improve lookup speed.

Some kind of hash table can't be used, since the inputs don't map directly to a leaf node, but I was wondering if anyone had suggestions for methods and data structures I could use in place of the tree (or alongside it?) to improve lookup speeds.

Memory is a concern, but less of a concern than speed.

The code is written in C#, in case that affects which methods can be applied.

Edit: There's a bit too much code to post, so I'll give some more detail on the tree.

The tree is generated using information gain calculations, so it's not a 50/50 split, and the split value can be any float value. A single input can also be split multiple times, increasing the resolution on that input.
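To make that concrete, a lookup over such a tree presumably looks something like the sketch below (simplified and with made-up names, not the actual code):

    // Simplified sketch of the tree described above; all names are illustrative.
    public sealed class Node
    {
        public int InputIndex;     // which element of the input array this node tests
        public float SplitValue;   // threshold chosen by the information gain step
        public Node Left, Right;   // both null at a leaf
        public int LeafDecision;   // only meaningful at a leaf

        public int Lookup(float[] inputs)
        {
            Node node = this;
            while (node.Left != null)   // descend until a leaf is reached
                node = inputs[node.InputIndex] <= node.SplitValue ? node.Left : node.Right;
            return node.LeafDecision;
        }
    }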

I posted a question about the performance of the iterator here:

Micro optimisations iterating through a tree in C#

But I think I might need to look at the data structure itself to improve performance further.

I'm aiming for as much performance as possible here. I'm working on a new method of machine learning, and the tree grows using a feedback loop. For the process I'm working on, I estimate it'll be running for several months, so a few % saved here and there would be massive. The ultimate goal is speed without using too much memory.

If I understand you correctly, you have floating point ranges that have to be mapped to a decision, like this:

           x <= 0.0 : decision a
    0.0 <  x <= 0.5 : decision b
    0.5 <  x <= 0.6 : decision c
    0.6 <  x        : decision d

A binary tree is a pretty good way to handle that. As long as the tree is balanced and the input values are evenly distributed across the ranges, you can expect O(log2 n) comparisons, where n is the number of possible decisions.

If the tree is not balanced, you could be doing far more comparisons than necessary, in the worst case O(n). Look at your trees and see how deep they are. If the same tree is used again and again, the cost of rebalancing it once may be amortized over many lookups.
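For example, a quick recursive depth check (assuming a node type like the sketch in the question) lets you compare the deepest path against log2 of the number of leaves:

    // Illustrative balance check; a well-balanced tree's depth should be close
    // to log2 of the number of leaves.
    public static class TreeStats
    {
        public static int MaxDepth(Node node)
        {
            if (node == null || node.Left == null)
                return 0;   // empty tree or leaf
            return 1 + System.Math.Max(MaxDepth(node.Left), MaxDepth(node.Right));
        }
    }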

If the input values are not evenly distributed (and you know this ahead of time), you might want to special-case the order of the comparisons so that the most common cases are detected early. You can do this by manipulating the tree or by adding special cases in the code before checking the tree.
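As a purely hypothetical illustration (the condition and constant below are invented, and it reuses the Node sketch from the question), that can be as simple as a guard in front of the tree walk:

    // Hypothetical fast path: if profiling shows one outcome dominates, test for
    // it directly before walking the tree. Condition and constant are invented.
    public static int LookupWithFastPath(Node root, float[] inputs)
    {
        const int CommonDecision = 3;    // placeholder for the dominant leaf's decision
        if (inputs[0] > 0.6f)            // placeholder for the dominant input pattern
            return CommonDecision;
        return root.Lookup(inputs);
    }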

If you've exhausted the algorithmic improvements and still need to optimize, you might look into a data structure with better locality than a general binary tree. For example, you could put the partition boundaries into a contiguous array and perform a binary search on it. (And, if the array isn't too long, you could try a linear search on the array, which may be friendlier to the cache and branch prediction.)
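A minimal sketch of that flattened layout, for the single-input view above (array names are illustrative; note that Array.BinarySearch returns the bitwise complement of the insertion point when there is no exact match):

    using System;

    // Sorted partition boundaries in a contiguous array, one decision per interval.
    // For the example above: boundaries = { 0.0f, 0.5f, 0.6f }, decisions = { a, b, c, d }.
    public static class RangeTable
    {
        // boundaries must be sorted ascending; decisions has boundaries.Length + 1 entries.
        public static int Lookup(float[] boundaries, int[] decisions, float x)
        {
            int i = Array.BinarySearch(boundaries, x);
            int interval = i >= 0 ? i : ~i;   // exact match keeps x in the "<=" interval
            return decisions[interval];
        }
    }

Everything sits in one small contiguous block of memory, so repeated lookups stay cache-friendly.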

Lastly, I'd consider building a coarse index that gives you a head start into the tree (or the array). For example, use a few of the most significant bits of the input value as an index and see if that lets you cut off the first few layers of the tree. This may help more than you'd imagine, since the skipped comparisons would have had a low chance of getting correct branch predictions.
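Here is a sketch of that idea built on the flattened boundary array from the previous example rather than on the tree itself, assuming the inputs are known to fall in [0, 1); instead of taking raw bits of the float it simply quantizes the value into buckets, which is the same trick in a simpler form:

    using System;

    // Coarse first-level index over a sorted boundary array (illustrative names;
    // assumes inputs lie in [0, 1)). Each bucket records the first boundary that
    // can still matter for values in that bucket, so a lookup only scans a few
    // contiguous elements.
    public sealed class BucketedRangeTable
    {
        private readonly float[] boundaries;   // sorted ascending
        private readonly int[] decisions;      // boundaries.Length + 1 entries
        private readonly int[] bucketStart;    // first candidate boundary index per bucket
        private readonly int bucketCount;

        public BucketedRangeTable(float[] boundaries, int[] decisions, int bucketCount)
        {
            this.boundaries = boundaries;
            this.decisions = decisions;
            this.bucketCount = bucketCount;
            bucketStart = new int[bucketCount];
            int b = 0;
            for (int i = 0; i < bucketCount; i++)
            {
                float bucketLow = (float)i / bucketCount;
                while (b < boundaries.Length && boundaries[b] < bucketLow)
                    b++;                       // boundaries below bucketLow can't match x >= bucketLow
                bucketStart[i] = b;
            }
        }

        public int Lookup(float x)
        {
            int bucket = Math.Min(bucketCount - 1, Math.Max(0, (int)(x * bucketCount)));
            int i = bucketStart[bucket];
            while (i < boundaries.Length && x > boundaries[i])
                i++;                           // short linear scan within the bucket
            return decisions[i];
        }
    }

With a reasonably large bucketCount the scan typically touches only one or two boundaries, so most of the comparisons (and their potential branch mispredictions) disappear.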

