TCAX 字幕特效制作工具官方论坛 | ASS | TCAS | Python | Aegisub | Lua

 找回密码
 新人加入
查看: 4249|回复: 0
打印 上一主题 下一主题

[libtcas] About the new techniques in libtcas - chunk cache [复制链接]

Administrator

TCAX Dev.

Rank: 7Rank: 7Rank: 7

跳转到指定楼层
楼主
发表于 2011-10-22 14:34:36 |只看该作者 |倒序浏览
Conceptual Guideline

Chunk Pipeline

1. http://www.tcax.org/forum.php?mo ... d=483&fromuid=2
2. Use a worker thread to cache the most possible chunks
3. a consumer thread consumes the cached chunks
4. Thus can we save the time for waiting and keep the I/O as busy as possible.

Data Structures

1. The first problem we faced is that how can we find the chunks which we want to cache.
    a) query the index streams by the given frame number (we can use binary search directly)
    b) then we may obtain m chunk indices, which we do want to cache.

2. Another important problem is that how can we tell which chunk belongs to which frame (we generate TCAS frames due to frame number rather than chunk offset).
    a) we need an array (here we use our own implementation of vector) to store the pointers to chunks which belongs to the frame associated with the current array. The reason that we use pointers over real chunk objects is obvious, since a chunk may belongs to many frames, thus we only need one instance of the chunk, and give its pointer to the frames.
    b) We need a queue to store n number of such vectors (each vector holds several chunk pointers which belongs to the frame). And such vector is the basic unit of the chunk pipeline, (every time the consumer consumes one frame of chunks, the worker produces another new frame of chunks)

3. Then how can we store the real chunk objects.
    a) Of course, we can use a linked list (since insertion and deletion are common operations due to our need). But we should aware of another important issue, several frames may share the same chunk, so it may be read multiple times, which is just a waste of time and memory, and we need to eliminate this. How? the solution is simple, just check if the chunk has been cached!
    b) since insertion deletion and query are all common operations, and chunks are unique which can be identified by their offset, Binary Search Tree is a good approach. But we really need to avoid the degenerating of binary search trees, so AVL tree or Red-Black tree is a better choice.
    c) Inserting a new chunk to the tree is simple, just check if it has already in the tree, if not, insert it. But it's not the case when deleting a chunk from the tree, because of the fact that the relationship between chunks and frames are many-to-many.
    d) So, how to delete? We can think of the reference count technique. Assign a reference count to each chunk, when we finish a frame of chunks decrease the reference count by 1, then try to delete all the chunks, if the reference count is still above 0, nothing happened, if not, they are deleted. The reference count will be increased if a chunk is referenced by a frame.

4. What about the key frame chunks?
    a) We need another tree to hold the real key frame chunk objects. Since insertion and deletion are more common in normal frame chunks we choose red-black tree to hold them, and query is more common in key frame chunks, so we choose AVL tree to hold key frame chunks.
    b) Inserting a new key chunk pair to the tree is still simple, but we should generate the intermediate frame chunks and put them to the vector, so in order to distinguish intermediate frame chunks from normal frame chunks, we need a new data member in the frame chunk structure, and removing an intermediate chunk from the vector does not need to delete it from the tree, for the tree didn't hold the chunk, but the vector. And deleting an intermediate chunk from the vector will actually decrease the reference count of the key frame chunk in the tree.




您需要登录后才可以回帖 登录 | 新人加入

GitHub|TCAX 主页

GMT+8, 2024-12-4 16:10

Powered by Discuz! X2

© 2001-2011 Comsenz Inc.

回顶部
RealH