要找工作了感觉自己好久没有码代码了,今天看了珠矶就来实现一个上面提到的寻找给定的字符串最大重复子串的算法。
想到这个问题其实大家一般肯定都能想到的是循环查找整个字符串然后比较输出结果,但是在这样的操作中存在大量的无用操作。
好了我们先来看下,常规的方法:
-
/*************************************************************************
-
> File Name: maxMutiSubStr.cpp
-
> Author: dongdaoxiang
-
> Func: find the Max Mutiple SubString
-
> Mail: dongdaoxiang@ncic.ac.cn
-
> Created Time: 2013年04月28日 星期日 13时58分11秒
-
************************************************************************/
-
#include<iostream>
-
#include<string>
-
#include<cstring>
-
#include<cstdlib>
-
using namespace std;
-
-
int compareTwoStr(char *a, char *b) //return the length of the same subtring between two strings
-
{
-
int len = 0; //the first char of the two strings is same
-
while(*a && *b)
-
{
-
if(*a == *b)
-
{
-
++len;
-
}
-
else
-
{
-
break;
-
}
-
a++;
-
b++;
-
}
-
return len;
-
}
-
-
void find(string dest)
-
{
-
int maxlen = 0;
-
int temp;
-
string::size_type pos, it, jt;
-
for(it = 0; it != dest.size(); it++)
-
{
-
for(jt = it + 1; jt != dest.size(); jt++)
-
{
-
if(dest[it] == dest[jt]) //find the first same char's pos
-
{
-
temp = compareTwoStr(&dest[it], &dest[jt]);//call the compareTwoStr and return the len
-
if(temp > maxlen)
-
{
-
maxlen = temp;
-
pos = it;
-
}
-
continue;
-
}
-
}
-
}
-
-
cout << "The length of the max mutiple substring is: " << maxlen << endl;
-
cout << "The substring is: ";
-
cout << "pos: " << pos << endl;
-
for(it = 0; it != maxlen; it++)
-
{
-
cout << dest[it + pos];
-
}
-
cout << endl;
-
}
-
int main()
{
string dest("what is your name is dong dao xiang your name is dong dao xiang");
find(dest);
return 0;
}
相信大家一看就能明白上面的算法。这里不再多说,我们看下面的算法,因为在上述算法中我们要两重循环遍历整个字符数组,然后比较字串的长度。其实有很多比较是没有必要的,所以我们可以利用一个后缀数组,将该字符数组所有的顺序子串的首地址传递给该后缀数组的每个元素,然后按照字典顺序将所有的顺序子串排序,比较相邻的子串,输出最大的子串
-
/*************************************************************************
-
> File Name: maxMutiSubStr.cpp
-
> Author: dongdaoxiang
-
> Func: find the Max Mutiple SubString
-
> Mail: dongdaoxiang@ncic.ac.cn
-
> Created Time: 2013年04月28日 星期日 13时58分11秒
-
************************************************************************/
-
#include<iostream>
-
#include<string>
-
#include<cstring>
-
#include<cstdlib>
-
using namespace std;
-
-
int compareTwoStr(char *a, char *b) //return the length of the same subtring between two strings
-
{
-
int len = 0; //the first char of the two strings is same
-
while(*a && *b)
-
{
-
if(*a == *b)
-
{
-
++len;
-
}
-
else
-
{
-
break;
-
}
-
a++;
-
b++;
-
}
-
return len;
-
}
-
-
int pstrcmp(const void *p, const void *q)
-
{
-
const char **_p = (const char**)p;
-
const char **_q = (const char**)q;
-
return strcmp((*_p), (*_q));
-
}
-
-
void quickFind(string dest)
-
{
-
int maxlen = 0;
-
int temp = 0;
-
int flag;
-
char* suffixArr[dest.size()]; //create a suffix array
-
string::size_type it, pos;
-
for(it = 0; it != dest.size(); it++)
-
{
-
suffixArr[it] = &dest[it];
-
}
-
qsort(suffixArr, dest.size(), sizeof(char*), pstrcmp); //sort the array by the order of the word
-
cout << "%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%" << endl;
-
for(it = 0; it != dest.size(); it++)
-
{
-
cout << suffixArr[it] << endl;
-
}
-
cout << "%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%" << endl;
-
for(it = 0; it < dest.size(); it++)
-
{
-
temp = compareTwoStr(suffixArr[it], suffixArr[it + 1]); //cmp the nearby strings
-
if(temp > maxlen)
-
{
-
maxlen = temp;
-
pos = it;
-
}
-
}
-
cout << "The length of the max mutiple substring is: " << maxlen << endl;
-
cout << "The substring is: ";
-
cout << suffixArr[pos];
-
cout << endl;
-
-
}
-
-
int main()
-
{
-
string dest("what is your name is dong dao xiang your name is dong dao xiang");
-
string dest2("abcefabcdhabcefabcdh");
-
find(dest2);
-
quickFind(dest2);
-
return 0;
-
}
阅读(1836) | 评论(0) | 转发(0) |