Determine the byte length of the leading character. A return value less than 2 means the leading character is not Chinese.
#include <stddef.h> /* NULL */
/******************************************************************************
Function: GetCharSize
Description:
This function returns the size of the lead character of the specified data string.
Input:
1. string to get character size
2. string size
Return:
-1. invalid character
0. NULL pointer or empty string (size <= 0)
>0. character size
******************************************************************************/
int GetCharSize( const char *Data, int Size )
{
    const unsigned char *p = (const unsigned char *) Data;
    // check arguments
    if ( p == NULL || Size <= 0 )
        return 0;
    // Chinese 1st byte 0x81-0xFE; any other value is a single-byte character
    if ( p[0] < 0x81 || p[0] > 0xFE )
        return 1;
    // Chinese code size = 2, 4
    if ( Size < 2 )
        return -1;
    // Chinese 2nd byte 0x30-0x39, 0x40-0x7E, 0x80-0xFE
    if ( p[1] < 0x30 || ( p[1] > 0x39 && p[1] < 0x40 ) || p[1] == 0x7F
         || p[1] > 0xFE )
        return -1;
    // 2 bytes Chinese code
    if ( p[1] >= 0x40 )
        return 2;
    // Chinese code size = 4 (2nd byte is 0x30-0x39)
    if ( Size < 4 )
        return -1;
    // Chinese 3rd byte 0x81-0xFE
    if ( p[2] < 0x81 || p[2] > 0xFE )
        return -1;
    // Chinese 4th byte 0x30-0x39
    if ( p[3] < 0x30 || p[3] > 0x39 )
        return -1;
    return 4;
}
Test whether the leading character is a Chinese (multi-byte) character:
#define IsChinese(x,y) ( GetCharSize( (x), (y) ) > 1 )
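To make the byte ranges concrete, here is a minimal sketch that runs GetCharSize on a few hand-picked byte sequences. The ranges in the comments match GB18030 (GBK two-byte codes plus four-byte codes); the original text does not name the encoding, so that identification is an inference. GetCharSize is repeated in condensed form so the block compiles alone, and CheckSamples is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Condensed copy of GetCharSize, repeated so this block compiles alone. */
int GetCharSize(const char *Data, int Size)
{
    const unsigned char *p = (const unsigned char *)Data;

    if (p == NULL || Size <= 0)
        return 0;                   /* nothing to inspect */
    if (p[0] < 0x81 || p[0] > 0xFE)
        return 1;                   /* single-byte character */
    if (Size < 2)
        return -1;                  /* lead byte with nothing after it */
    if (p[1] >= 0x40 && p[1] <= 0xFE && p[1] != 0x7F)
        return 2;                   /* two-byte code */
    if (p[1] < 0x30 || p[1] > 0x39)
        return -1;                  /* invalid second byte */
    if (Size < 4 || p[2] < 0x81 || p[2] > 0xFE || p[3] < 0x30 || p[3] > 0x39)
        return -1;                  /* truncated or malformed four-byte code */
    return 4;
}

#define IsChinese(x, y) (GetCharSize((x), (y)) > 1)

/* Hypothetical self-check: each assert documents one expected result. */
void CheckSamples(void)
{
    assert(GetCharSize("A", 1) == 1);                 /* ASCII */
    assert(GetCharSize("\xD6\xD0", 2) == 2);          /* two-byte code */
    assert(GetCharSize("\x81\x30\x81\x30", 4) == 4);  /* four-byte code */
    assert(GetCharSize("\xD6", 1) == -1);             /* truncated */
    assert(GetCharSize("\xD6\x7F", 2) == -1);         /* 0x7F is not a valid 2nd byte */
    assert(GetCharSize("", 0) == 0);                  /* empty input */
    assert(IsChinese("\xD6\xD0", 2));
    assert(!IsChinese("A", 1));
}
```

Note that IsChinese is also false for an invalid sequence, since -1 is not greater than 1; callers that need to tell "single-byte" from "invalid" should check the GetCharSize result directly.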
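Find an invalid character (an incomplete Chinese character) in the string. Returns a pointer to that character, or NULL if there is none.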
/******************************************************************************
Function: FindInvalidChar
Description:
This function returns a pointer to the first invalid character of the
specified data string, or NULL if there is none.
Input:
1. string to check invalid character
2. string size
Return:
NULL. not found
Other. pointer to invalid character
******************************************************************************/
const char *FindInvalidChar( const char *Data, int Size )
{
    const char *p = Data;
    int r, s = Size;
    while ( ( r = GetCharSize( p, s ) ) > 0 ) {
        p += r;
        s -= r;
    }
    if ( r == 0 )
        return NULL;
    return p;
}
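One possible use of FindInvalidChar is to trim a buffer that was cut in the middle of a character, for example after a fixed-size read. The sketch below assumes that use case; ValidPrefixLength is a hypothetical helper, not part of the original code, and GetCharSize is repeated in condensed form so the block compiles alone.

```c
#include <stddef.h>

/* Condensed copy of GetCharSize, repeated so this block compiles alone. */
int GetCharSize(const char *Data, int Size)
{
    const unsigned char *p = (const unsigned char *)Data;

    if (p == NULL || Size <= 0)
        return 0;
    if (p[0] < 0x81 || p[0] > 0xFE)
        return 1;
    if (Size < 2)
        return -1;
    if (p[1] >= 0x40 && p[1] <= 0xFE && p[1] != 0x7F)
        return 2;
    if (p[1] < 0x30 || p[1] > 0x39)
        return -1;
    if (Size < 4 || p[2] < 0x81 || p[2] > 0xFE || p[3] < 0x30 || p[3] > 0x39)
        return -1;
    return 4;
}

/* Same loop as FindInvalidChar above: step character by character. */
const char *FindInvalidChar(const char *Data, int Size)
{
    const char *p = Data;
    int r, s = Size;

    while ((r = GetCharSize(p, s)) > 0) {
        p += r;
        s -= r;
    }
    return r == 0 ? NULL : p;
}

/* Hypothetical helper: number of leading bytes that form complete,
   valid characters. Equals Size when the whole buffer is valid. */
int ValidPrefixLength(const char *Data, int Size)
{
    const char *bad;

    if (Data == NULL || Size <= 0)
        return 0;
    bad = FindInvalidChar(Data, Size);
    return bad == NULL ? Size : (int)(bad - Data);
}
```

For the three bytes 0x61 0x62 0xD6 the helper returns 2: the trailing 0xD6 is a lead byte with no second byte, so only "ab" is kept.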
Count the characters in the string:
/******************************************************************************
Function: GetCharCount
Description:
This function counts the characters in the specified data string.
Input:
1. string to count characters
2. string size
Return:
-1. found an invalid character
>=0. count of characters (0 for a NULL or empty string)
******************************************************************************/
int GetCharCount( const char *Data, int Size )
{
    const char *p = Data;
    int r, c = 0, s = Size;
    while ( ( r = GetCharSize( p, s ) ) > 0 ) {
        c ++;
        p += r;
        s -= r;
    }
    if ( r == 0 )
        return c;
    return -1;
}
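The difference between byte length and character count is easiest to see on a mixed string. The following is a small sketch under the same assumptions as the code above; Sample is a hypothetical test string (two two-byte characters followed by "abc"), and GetCharSize is repeated in condensed form so the block compiles alone.

```c
#include <stddef.h>

/* Condensed copy of GetCharSize, repeated so this block compiles alone. */
int GetCharSize(const char *Data, int Size)
{
    const unsigned char *p = (const unsigned char *)Data;

    if (p == NULL || Size <= 0)
        return 0;
    if (p[0] < 0x81 || p[0] > 0xFE)
        return 1;
    if (Size < 2)
        return -1;
    if (p[1] >= 0x40 && p[1] <= 0xFE && p[1] != 0x7F)
        return 2;
    if (p[1] < 0x30 || p[1] > 0x39)
        return -1;
    if (Size < 4 || p[2] < 0x81 || p[2] > 0xFE || p[3] < 0x30 || p[3] > 0x39)
        return -1;
    return 4;
}

/* Same loop as GetCharCount above. */
int GetCharCount(const char *Data, int Size)
{
    int r, c = 0;

    while ((r = GetCharSize(Data, Size)) > 0) {
        c++;
        Data += r;
        Size -= r;
    }
    return r == 0 ? c : -1;
}

/* Hypothetical sample: 0xD6D0 0xCEC4 followed by "abc", 7 bytes in total.
   The literal is split so that "\xC4" is not merged with the hex-like "abc". */
const char Sample[] = "\xD6\xD0\xCE\xC4" "abc";
```

Here GetCharCount( Sample, 7 ) gives 5, while GetCharCount( Sample, 3 ) gives -1 because the third byte starts a two-byte character that is cut off.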