This thesis analyzes and studies suffix tree indexing techniques for biological sequence data and multiple sequence alignment algorithms in bioinformatics.
Implementation of a Chinese and English Clustering Engine Based on an Improved Suffix Tree Algorithm
At the same time, using a partitioned search, the algorithm constructs a suffix tree for each frequent page node and then mines continuous frequent access paths by traversing that tree.
Subsequently, we briefly analyze the design, selection, and computation of the sample kernel for different applications. (2) To compute the string kernel, we design and adopt a data structure called the pruned suffix tree.
The pruned suffix tree combines the suffix links of the suffix tree with the trie's approach of computing the kernel value in the leaf nodes. It uses less space than a suffix tree and computes faster than a trie.
An Improved Text Clustering Algorithm Based on the Generalized Suffix Tree
The paper describes STC, a text clustering algorithm based on the suffix tree. After analyzing its flaws, it proposes an improved approach that combines PAT-array with fuzzy clustering to raise clustering quality.
After analyzing the suffix tree and suffix array string-matching algorithms, it describes in detail a suffix-array-based method for identifying exact tandem repeats.
Suffix Tree Based Label Generation Method for Web Search Results Clustering
The computation of the fuzzy spectrum kernel likewise uses the pruned suffix tree to speed up character matching. Finally, we design and implement a classification model for proteins.
Based on these criteria, three document clustering algorithms (k-Means, STC (Suffix Tree Clustering), and Ant-based clustering) are compared experimentally. The comparison and analysis show that STC performs better than the k-Means and Ant-based clustering algorithms.
We present a fast suffix-tree-based method for computing users' frequent browsing paths and the reachable sets and reachability probabilities of web pages;
Examples include the path expression template-matching method, the B+ tree based structural join method, the suffix tree index method, the method that rewrites path expressions into special index queries, and the method that uses XML schema to optimize path expressions.
Based on a study of these algorithms' design and a comparative analysis of their performance, a suffix-tree-based pairwise sequence alignment algorithm, SPLSA, is proposed and implemented.
For aggregate functions whose input data is of string type, this paper implements a string frequency statistics method based on the generalized suffix tree (GST) representation.
The SHOC and Lingo algorithms combine the vector space model (VSD Model) with the suffix tree document model, taking into account both the positional information and the statistical properties of words, and thus develop well on the foundation of STC.
With network content security as the background and the representation of streaming text in stream data processing as the problem, we review existing text representation methods and propose and implement a representation method for streaming text based on the suffix tree model (STM).
According to the different storage methods for sub-suffix trees, we introduce several search techniques.
Then, building on the partial suffix tree, a new parallel suffix tree algorithm is presented; it relieves the memory bottleneck of suffix tree construction, so large suffix trees can be built in memory and the approach suits very large sequences better.
Taking as input a suffix tree built from DNA sequences, and using a suffix-tree-based search algorithm, the method finally outputs a classification table of the elementary repeats in the input DNA sequences.
Research on a Suffix Tree Based Automatic Wrapper Generation Method
The algorithm has three major steps: build the suffix tree, search for common substrings, and link the common substrings.
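The first two of these steps can be sketched as follows. This is only an illustration, not the cited algorithm: it assumes a naive, non-compact generalized suffix trie over two strings (quadratic in size, unlike a true suffix tree), the name `longest_common_substring` is hypothetical, and the linking step is omitted because the sentence does not say how common substrings are linked.

```python
def longest_common_substring(a, b):
    """Return the longest substring shared by a and b (naive suffix trie)."""
    # Step 1: build one trie holding every suffix of both strings.
    # Each node records which of the two strings passes through it.
    root = {"children": {}, "owners": set()}
    for owner, text in enumerate((a, b)):
        for start in range(len(text)):
            node = root
            for ch in text[start:]:
                node = node["children"].setdefault(
                    ch, {"children": {}, "owners": set()}
                )
                node["owners"].add(owner)

    # Step 2: walk only through nodes reached by both strings; the
    # deepest such node spells the longest common substring.
    best = ""
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if len(prefix) > len(best):
            best = prefix
        for ch, child in node["children"].items():
            if child["owners"] == {0, 1}:
                stack.append((child, prefix + ch))
    return best
```

For example, `longest_common_substring("xabcdy", "zabcdw")` returns `"abcd"`, and two strings with no character in common give the empty string.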
The model keeps only the most recent access sequences that fall within a sliding window, so that it reflects changes in user interest; to improve update speed, it uses a non-compact suffix tree to incrementally insert new user requests and delete outdated browsing information.
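A minimal sketch of this idea, under stated assumptions: the class name `WindowedSuffixTrie`, the window measured in number of sequences, and the per-node counters are all hypothetical choices made for illustration, not details taken from the cited model. Each trie node counts how many suffixes of the sequences currently in the window pass through it, so insertion and deletion are the same walk with a different sign.

```python
from collections import deque


class WindowedSuffixTrie:
    """Non-compact suffix trie over the last `window_size` access sequences."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.root = {}  # page -> [count, children dict]

    def _update(self, seq, delta):
        # Add (delta=+1) or remove (delta=-1) every suffix of seq.
        for start in range(len(seq)):
            children = self.root
            for page in seq[start:]:
                entry = children.setdefault(page, [0, {}])
                entry[0] += delta
                if entry[0] == 0:
                    # Nothing in the window uses this branch any more.
                    del children[page]
                    break
                children = entry[1]

    def add(self, seq):
        """Insert a new access sequence and evict the oldest if needed."""
        seq = tuple(seq)
        self.window.append(seq)
        self._update(seq, +1)
        if len(self.window) > self.window_size:
            self._update(self.window.popleft(), -1)

    def count(self, path):
        """Number of occurrences of the contiguous path in the window."""
        children, found = self.root, 0
        for page in path:
            entry = children.get(page)
            if entry is None:
                return 0
            found, children = entry[0], entry[1]
        return found
```

Because the trie is not path-compressed, an update never splits or merges edges; it only touches counters along the affected paths, which is one plausible reason a non-compact structure makes incremental updates simpler.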
A suffix-tree-based high-frequency word extraction algorithm is used to extract the content features of test items, which avoids reliance on a thesaurus.
For the case where the data file is indexed beforehand, the thesis proposes three types of index: a basic suffix tree index, an optimal suffix tree index, and a cluster index.
Second, a generalized suffix tree is constructed from all of the users' navigation paths to mine key paths, and the key paths are then used to cluster users into different interest groups.
The second part is the index: it contains the index of the sub-suffix trees and other information used to quickly locate each sub-suffix tree on disk.
To further improve the algorithm's efficiency, the suffix tree was modified: leaf lists are stored in its internal nodes, which saves the algorithm from traversing subtrees.
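The effect of storing leaf lists in internal nodes can be illustrated with a small sketch. It assumes a naive, non-compact suffix trie rather than the improved suffix tree the sentence refers to, and the names `Node`, `build_suffix_trie`, and `occurrences` are hypothetical. Every node keeps the start positions of all suffixes that pass through it, so a lookup returns the occurrence list directly without visiting the subtree.

```python
class Node:
    def __init__(self):
        self.children = {}
        # Start positions of every suffix that passes through this node.
        self.leaf_list = []


def build_suffix_trie(text):
    """Insert each suffix of text, recording its start at every node visited."""
    root = Node()
    for start in range(len(text)):
        node = root
        node.leaf_list.append(start)
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
            node.leaf_list.append(start)
    return root


def occurrences(root, pattern):
    """Return all start positions of pattern, read from the stored leaf list."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.leaf_list
```

For the text `"banana"`, `occurrences(build_suffix_trie("banana"), "ana")` returns `[1, 3]`. The trade-off is extra memory for the stored lists in exchange for skipping the subtree traversal.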
To further improve clustering efficiency, a simple but effective measure for base cluster selection is presented; it excludes generalized suffix tree nodes that contribute little to the clustering.
A Spam Detection Method for Backbone Networks Based on the Suffix Tree
US [ˈsʌfɪks tri] UK [ˈsʌfɪks triː]
[Computing] suffix tree