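I recently worked on a project that needed to read the contents of tables in Word documents. Most of the material I found online was unsatisfactory, so I collected some of that code, reworked it, kept what was useful and dropped the rest. I am now sharing the result with everyone in the cnblogs community.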
Method for reading Word table data

```csharp
// Requires: using System; using System.Reflection;
//           using Microsoft.Office.Interop.Word;

// Wrap the Word table read in a single method.
// rowIndex and colIndex are 1-based, as in the Word object model.
public string ReadWord(string fileName, int rowIndex, int colIndex)
{
    ApplicationClass cls = null;
    Document doc = null;
    Table table = null;
    object missing = Missing.Value;
    object path = fileName;

    cls = new ApplicationClass();

    try
    {
        doc = cls.Documents.Open(
            ref path, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing);
        table = doc.Tables[1]; // first table in the document
        string text = table.Cell(rowIndex, colIndex).Range.Text;
        text = text.Substring(0, text.Length - 2); // strip the trailing end-of-cell mark
        return text;
    }
    catch (Exception ex)
    {
        // On failure the error message is returned in place of the cell text.
        return ex.Message;
    }
    finally
    {
        // Always close the document and quit Word, even after an error.
        if (doc != null)
            doc.Close(ref missing, ref missing, ref missing);
        cls.Quit(ref missing, ref missing, ref missing);
    }
}
```
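This method reads the data in a single cell of the first table in a Word document. Its parameters are the file name (including the path), the row number, and the column number.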
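The `Substring(0, text.Length - 2)` call in the method removes the two-character mark that Word appends to a cell's `Range.Text` (a carriage return followed by the `\a` character). As a rough sketch only, the same step could be written as a small helper that checks for the mark before cutting, so that a string without it is returned unchanged. The names `CellText` and `StripCellMarker` are made up for this illustration and are not part of the original code.

```csharp
using System;

// Hypothetical helper, not part of the original post.
static class CellText
{
    // Removes the trailing end-of-cell mark ("\r\a") if it is present.
    // A null or empty input yields an empty string; any other text
    // without the mark is returned as-is.
    public static string StripCellMarker(string raw)
    {
        if (string.IsNullOrEmpty(raw))
            return string.Empty;

        const string marker = "\r\a";
        if (raw.EndsWith(marker, StringComparison.Ordinal))
            return raw.Substring(0, raw.Length - marker.Length);

        return raw;
    }
}
```

With this helper, the two lines in the method could be reduced to `return CellText.StripCellMarker(table.Cell(rowIndex, colIndex).Range.Text);`.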
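With code reuse in mind, I turned the code into a class. Looking at the method above, it is also clear that reading several different cells from the same file would open and close Word over and over, so I optimized the code. While doing so I was reminded of the SqlConnection and SqlCommand classes in ADO.NET. I use those two classes a lot for database work, and the usual call order is: open the database connection, run the query, close the connection. I did not use those two classes themselves; I borrowed the pattern and wrapped the code in a class of my own. Open and Close control the opening and closing of the Word document, and the ReadWord method reads the data from the table. To read several cells, Word now only has to be opened and closed once, which saves a good deal of resources and time and makes reading more efficient. A further improvement is that the class can read from a specified table. The figure below shows the structure of the class; the code and its comments follow the figure: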
![](https://images.cnblogs.com/cnblogs_com/superwulei/2008_10/Class1.png)
Class for reading table cells in Word

```csharp
using System.Reflection;
using Microsoft.Office.Interop.Word;

class WordTableRead
{
    private string fileName;
    private ApplicationClass cls = null;
    private Document doc = null;
    private Table table = null;
    private object missing = Missing.Value;
    // Whether the Word document is currently open
    private bool openState;

    /// <summary>
    /// Constructor
    /// </summary>
    /// <param name="fileName">File name, including the path</param>
    public WordTableRead(string fileName)
    {
        this.fileName = fileName;
    }

    /// <summary>
    /// Opens the Word document
    /// </summary>
    public void Open()
    {
        object path = fileName;
        cls = new ApplicationClass();
        try
        {
            doc = cls.Documents.Open(
                ref path, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing,
                ref missing, ref missing, ref missing, ref missing);
            openState = true;
        }
        catch
        {
            openState = false;
            // Quit the Word instance started above so it is not left running.
            cls.Quit(ref missing, ref missing, ref missing);
            cls = null;
        }
    }

    /// <summary>
    /// Returns the data in the specified cell
    /// </summary>
    /// <param name="tableIndex">Table number (1-based)</param>
    /// <param name="rowIndex">Row number (1-based)</param>
    /// <param name="colIndex">Column number (1-based)</param>
    /// <returns>The data in the cell</returns>
    public string ReadWord(int tableIndex, int rowIndex, int colIndex)
    {
        try
        {
            if (openState == true)
            {
                table = doc.Tables[tableIndex];
                string text = table.Cell(rowIndex, colIndex).Range.Text;
                text = text.Substring(0, text.Length - 2); // strip the trailing end-of-cell mark
                return text;
            }
            else
            {
                // The document is not open: return an empty string.
                return "";
            }
        }
        catch
        {
            // Any failure (for example an index out of range) is reported as "Error".
            return "Error";
        }
    }

    /// <summary>
    /// Closes the Word document
    /// </summary>
    public void Close()
    {
        if (openState == true)
        {
            if (doc != null)
                doc.Close(ref missing, ref missing, ref missing);
            cls.Quit(ref missing, ref missing, ref missing);
            openState = false; // a second call to Close() does nothing
        }
    }
}
```
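A minimal sketch of how the class might be called from a console program, following the open, read, close order described earlier. It assumes the `WordTableRead` class above is compiled into the same project, that Word and its interop assembly are available, and that the document has at least three rows in its first table. The file path and the row count are placeholders made up for this example. Since `ReadWord` only returns the string, the caller writes it to the console itself.

```csharp
using System;

class Program
{
    static void Main()
    {
        // Hypothetical path: point this at a real document that contains a table.
        WordTableRead reader = new WordTableRead(@"C:\temp\sample.doc");

        reader.Open(); // starts Word once for all of the reads below
        try
        {
            // Read column 1 of the first three rows of table 1 (indexes are 1-based).
            for (int row = 1; row <= 3; row++)
            {
                string value = reader.ReadWord(1, row, 1);
                Console.WriteLine("Row {0}: {1}", row, value);
            }
        }
        finally
        {
            // Close the document and quit Word even if a read throws.
            reader.Close();
        }
    }
}
```

Putting `Close()` in a `finally` block mirrors the way a database connection is usually released, so the Word process is not left running when something goes wrong between `Open()` and `Close()`.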
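Even so, I think the design of this class still has flaws. Whenever I test it, the read speed is not as fast as I would like. Also, when the class is used in a console application, no values appear on the screen, and I do not know how the code should be improved. I hope readers can offer some suggestions for improving this class.
P.S. For the UI design and the data used in this post, see "How to extract the data of a specified cell from a table in MS Word".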