A Roundup of Common Techniques

I. Detecting a file's encoding

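File input and output streams often produce garbled text, and the usual cause is a mismatch in character encodings.

Take copying a file as an example:

We read the file with an input stream (FileInputStream) and then write it to another file with an output stream (FileOutputStream).

If the encoding used when writing differs from the encoding of the source file, the result may be garbled. (This happens when the bytes are decoded into text and re-encoded along the way, not when they are copied unchanged.)

So we need to find out the source file's encoding in order to use the same encoding when writing.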
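To make the point above concrete, here is a minimal sketch of a text copy in which both charsets are stated explicitly. The class name `CharsetCopy` and the method `copyText` are hypothetical names chosen for illustration, and the sketch assumes the source encoding is already known (GBK in the demo).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original article.
class CharsetCopy {
    /** Decodes src with srcCharset and re-encodes the text into dst with dstCharset. */
    static void copyText(Path src, Charset srcCharset, Path dst, Charset dstCharset)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(src, srcCharset);
             BufferedWriter writer = Files.newBufferedWriter(dst, dstCharset)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".txt");
        Path dst = Files.createTempFile("dst", ".txt");
        // Two Chinese characters, written as Unicode escapes so that the
        // encoding of this source file does not matter.
        Files.write(src, "\u7f16\u7801".getBytes(Charset.forName("GBK")));
        copyText(src, Charset.forName("GBK"), dst, StandardCharsets.UTF_8);
        System.out.println(new String(Files.readAllBytes(dst), StandardCharsets.UTF_8));
    }
}
```

If the wrong source charset is passed, the reader either throws on malformed input or silently produces the wrong characters, which is why the source encoding has to be known first.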

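Here is a simple way to detect a file's encoding. It is reliable when the file begins with a byte order mark (BOM) and only a heuristic otherwise:

In general, the first two to four bytes of a file are enough to recognize a BOM, and with it the encoding: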

EF BB BF       UTF-8
FE FF          UTF-16/UCS-2, big endian
FF FE          UTF-16/UCS-2, little endian
FF FE 00 00    UTF-32/UCS-4, little endian
00 00 FE FF    UTF-32/UCS-4, big endian
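The table can be turned into a small lookup. The sketch below is an illustration only: `BomSniffer` and `detectBom` are hypothetical names, and it checks the 4-byte UTF-32 marks before the 2-byte UTF-16 marks because FF FE is a prefix of FF FE 00 00. Note that FF FE 00 00 could in principle also be a UTF-16LE file whose first character is U+0000; the sketch assumes UTF-32LE in that case.

```java
// Hypothetical helper, not part of the original article.
class BomSniffer {
    /** Returns the charset name implied by a leading BOM, or null when no BOM is found. */
    static String detectBom(byte[] head) {
        // Longer marks first: FF FE is a prefix of the UTF-32LE mark.
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        return null;
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare as unsigned values; Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detectBom(utf8));
    }
}
```

A `null` result only means that no BOM was found; the file may still be UTF-8 without a BOM, GBK, or something else.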

So reading the first few bytes of a file lets us determine its encoding. The code is as follows:

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class GetFileEncode {
    public static void main(String[] args) {
        // Backslashes must be escaped inside a Java string literal.
        String filePath = "D:\\javaTest\\test.txt";
        File sourceFile = new File(filePath);
        getFilecharset(sourceFile);
    }

    private static String getFilecharset(File sourceFile) {
        String charset = "GBK"; // default when no BOM or UTF-8 sequence is found
        byte[] first3Bytes = new byte[3];
        // try-with-resources closes the stream on every path, including the early return
        try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(sourceFile))) {
            boolean checked = false;
            bis.mark(3); // the read limit must cover the 3 bytes read before reset()
            int read = bis.read(first3Bytes, 0, 3);
            if (read == -1) {
                return charset; // empty file: treat as ANSI (GBK)
            } else if (first3Bytes[0] == (byte) 0xFF
                    && first3Bytes[1] == (byte) 0xFE) {
                charset = "UTF-16LE"; // FF FE: UTF-16 little endian ("Unicode" on Windows)
                checked = true;
            } else if (first3Bytes[0] == (byte) 0xFE
                    && first3Bytes[1] == (byte) 0xFF) {
                charset = "UTF-16BE"; // FE FF: UTF-16 big endian
                checked = true;
            } else if (first3Bytes[0] == (byte) 0xEF
                    && first3Bytes[1] == (byte) 0xBB
                    && first3Bytes[2] == (byte) 0xBF) {
                charset = "UTF-8"; // EF BB BF: UTF-8 with BOM
                checked = true;
            }
            bis.reset();
            if (!checked) {
                // No BOM: scan for the first multi-byte sequence and guess from it.
                while ((read = bis.read()) != -1) {
                    if (read >= 0xF0) // not handled (4-byte UTF-8 lead bytes also land here); keep GBK
                        break;
                    if (0x80 <= read && read <= 0xBF) // a lone byte in 0x80-0xBF also counts as GBK
                        break;
                    if (0xC0 <= read && read <= 0xDF) {
                        read = bis.read();
                        // Two bytes (0xC0-0xDF)(0x80-0xBF) fit UTF-8 but may also be GBK: keep scanning.
                        if (0x80 <= read && read <= 0xBF)
                            continue;
                        else
                            break;
                    } else if (0xE0 <= read && read <= 0xEF) { // can still misfire, but rarely
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF) {
                            read = bis.read();
                            if (0x80 <= read && read <= 0xBF) {
                                charset = "UTF-8";
                                break;
                            } else
                                break;
                        } else
                            break;
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.print(charset);
        return charset;
    }
}
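The scan above decides from the first multi-byte sequence it meets. As a cross-check, and as a different technique from the one in the article, the whole content can be run through a strict UTF-8 decoder from `java.nio.charset`. The class name `Utf8Check` and the method `isValidUtf8` are hypothetical names for this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original article.
class Utf8Check {
    /** Returns true when the whole byte array decodes as UTF-8 without errors. */
    static boolean isValidUtf8(byte[] data) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Pure ASCII content passes this check and is also valid GBK, and a few GBK byte sequences happen to be valid UTF-8 as well, so a `true` result is strong evidence rather than proof.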

II. A simple way to compress files

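import java.io.File;

import org.apache.tools.ant.Project;
import org.apache.tools.ant.taskdefs.Zip;
import org.apache.tools.ant.types.FileSet;

// The class name is illustrative; the original shows only the two methods.
public class CompressUtil {
    public static void compressExe(String filePath, String descPath) {
        File srcdir = new File(filePath);  // directory to compress
        File zipFile = new File(descPath); // zip file to create

        Project prj = new Project();
        Zip zip = new Zip();
        zip.setProject(prj);
        zip.setDestFile(zipFile);
        FileSet fileSet = new FileSet();
        fileSet.setProject(prj);
        fileSet.setDir(srcdir);
        zip.addFileset(fileSet);
        zip.execute();
        System.out.println("Compression succeeded!");
        if (deleteFile(srcdir)) {
            System.out.println("Deleted: " + srcdir);
        }
    }

    private static boolean deleteFile(File dir) {
        if (dir.isDirectory()) {
            String[] children = dir.list();
            if (children == null) {
                return false; // list() returns null on an I/O error
            }
            for (int i = 0; i < children.length; i++) {
                boolean success = deleteFile(new File(dir, children[i]));
                if (!success) {
                    return false;
                }
            }
        }
        // The directory is empty at this point and can be deleted.
        return dir.delete();
    }
}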

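1. Pass in the source directory and the path of the zip file to create. After compression, deleteFile is called to delete the source files.

2. The Apache Ant jar, which provides the org.apache.tools.ant package, must be on the classpath.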
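When adding the Ant jar is not an option, roughly the same result can be had with the JDK's own `java.util.zip` classes. This is a different technique from the Ant-based method above, shown only as a sketch: `JdkZip` and `zipDirectory` are hypothetical names, the zip file is assumed to live outside the source directory, empty directories are not stored, and the source files are not deleted.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of the original article.
class JdkZip {
    /** Zips every regular file under srcDir into zipFile, with paths relative to srcDir. */
    static void zipDirectory(Path srcDir, Path zipFile) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(srcDir)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names use '/' regardless of the platform separator.
                String name = srcDir.relativize(file).toString().replace('\\', '/');
                zos.putNextEntry(new ZipEntry(name));
                Files.copy(file, zos);
                zos.closeEntry();
            }
        }
    }
}
```

A call such as `JdkZip.zipDirectory(Paths.get("D:\\javaTest"), Paths.get("D:\\javaTest.zip"))` would mirror `compressExe`, minus the deletion step; the paths here are placeholders.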

Original source: https://www.cnblogs.com/FZ1314/p/6024852.html