Suppose we have a massive number of chunks (small ones that arrive in random order).
How can we linearize them, i.e. generate a linearized chunk index? There are two algorithms to achieve this goal.
ONE
1. Create a list for holding index streams.
2. Retrieve one index from the unsorted index array and check whether it can be attached to the first index stream in the list; if so, append it, otherwise check the next index stream. If none of the existing index streams fits, create a new index stream, put the index into it, and attach the new index stream to the index streams list.
3. Loop until the index array is exhausted (a sketch follows below).
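A minimal runnable sketch of this first-fit approach, assuming each index is a (first, last) pair of timestamps (the function name is illustrative, not from any real code):

def linearize_first_fit(indices):
    index_streams = []
    for first, last in indices:
        for stream in index_streams:
            if stream[-1][1] < first:  # the stream's 'last time' ends before this index starts
                stream.append((first, last))
                break
        else:
            # no existing stream fits, so open a new one
            index_streams.append([(first, last)])
    return index_streams

For example, linearize_first_fit([(3, 4), (1, 2), (5, 6)]) yields [[(3, 4), (5, 6)], [(1, 2)]]. Note that each index may scan every existing stream before a new one is created, which is where the quadratic worst case comes from.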
In the best case, the running time is about O(n).
In the worst case, it is about O(n^2): we may have to check the first m index streams before we know we must create the (m + 1)-th one, and in the worst case we end up creating n index streams, which costs about n * n / 2 checks in total.
The average performance is about O(m * n), where m is the final number of index streams.
TWO
1. Create a list for holding index streams (named indexStreams).
2. Keep a reference to the index stream whose 'last time' is the smallest among the index streams in the list, named minIndexStream.
3. Retrieve one index from the unsorted index array and check whether it can be attached to the minIndexStream; if it can, append it there. If it cannot, then no stream can take it (every other stream's 'last time' is even larger), so create a new index stream, put the index into it, and attach the new index stream to the index streams list.
With the help of the minIndexStream, we can improve the performance to about O(n); the problem, though, is how to maintain such a minIndexStream. Consider this sketch:

indexStreams = []
newIndexStream = []
indexStreams.append(newIndexStream)
minIndexStream = 0  # position of the stream with the smallest 'last time'
for index in indices:
    oldIndexStream = indexStreams[minIndexStream]
    if not oldIndexStream or oldIndexStream[-1].last < index.first:
        oldIndexStream.append(index)
        # but how do we determine the new minIndexStream then?
    else:
        newIndexStream = [index]
        indexStreams.append(newIndexStream)
        # minIndexStream stays the same: the old minimum is still the smallest
As for maintaining a data structure that keeps each stream's id and 'last time', we can think of an AVL tree or a Red-Black tree, either of which lets us insert, delete, or query a node in O(log n) time.
We choose the Red-Black tree over the AVL tree since insertions and deletions will be frequent, and a Red-Black tree rebalances with fewer rotations per update.
What information should each tree node hold? Two fields, sketched below:
1. id (the position of the index stream in the indexStreams list)
2. last (the 'last time' of that index stream)
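As a minimal sketch, such a node might look like this in Python (field names are illustrative, not taken from any real implementation):

class StreamNode:
    def __init__(self, stream_id, last):
        self.stream_id = stream_id  # position of the index stream in indexStreams
        self.last = last            # the stream's current 'last time' (the sort key)
        self.left = None            # children, managed by the tree's balancing logic
        self.right = None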
What do we need from the tree?
1. We need the tree to give us the minIndexStream.
How can we get the minIndexStream from the tree?
1. By our construction of the tree, the bottom-left (leftmost) node is always the minIndexStream, as shown in the sketch below.
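Concretely, reaching it is just a walk down the left spine; a sketch using the hypothetical StreamNode above:

def min_node(root):
    node = root
    while node.left is not None:
        node = node.left
    return node  # the node of the stream with the smallest 'last time'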
So how do we construct (maintain) the tree?
1. By insertion and deletion. We insert nodes according to their 'last time' values, smaller on the left and larger on the right, and we delete or query a node by its id (the position of the index stream in the indexStreams list). Note that appending an index to a stream changes that stream's 'last time', so its node must be deleted and re-inserted with the new key.
2. So, with each incoming index, we may have to insert into, delete from, and query the tree, each of which costs O(log n), where n is the number of index streams. In the old algorithm we had to check every existing index stream, which costs O(n).
3. In conclusion, if the non-linearity is high, and hence the number of index streams is large, use of the new algorithm may dramatically improve performance (see the sketch after this list).
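Here is a runnable sketch of the new algorithm. Python's standard library has no Red-Black tree, so a binary min-heap (heapq) stands in for it: it also gives O(log m) updates, and the pop-then-push on append plays the role of the tree's delete-and-reinsert. Indices are assumed to be (first, last) pairs, and the function name is illustrative.

import heapq

def linearize_min_stream(indices):
    index_streams = []
    heap = []  # entries: (last_time, stream_id), smallest 'last time' on top
    for first, last in indices:
        if heap and heap[0][0] < first:
            # the stream with the smallest 'last time' can take this index;
            # if it could not, no stream could, since all other 'last times' are larger
            _, stream_id = heapq.heappop(heap)
            index_streams[stream_id].append((first, last))
        else:
            # not even the minimum fits, so open a new stream
            stream_id = len(index_streams)
            index_streams.append([(first, last)])
        # re-key the stream by its new 'last time'
        heapq.heappush(heap, (last, stream_id))
    return index_streams

Each index costs O(log m) heap work, so the whole run is O(n log m).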
In conclusion, the old algorithm has a time complexity of about O(n * m), degrading to O(n^2) in the worst case, while the new one runs in O(n * log m), where n is the number of indices and m is the number of index streams; m can be regarded as the degree of non-linearity of the TCAS file. From this analysis we can predict that when m is relatively small, that is, when the TCAS file is not badly non-linear, the old algorithm may still work well, but that is no longer the case as m grows large. Below are some real tests.
For more complex TCAS files (where the number of index streams is large), the old algorithm takes much longer to finish generating the index streams. Although the new algorithm produces a larger number of index streams, the impact on the rendering phase is trivial.