http://www.biaodianfu.com/a-newapproach-to-content-extraction-from-web-page.html
http://www.docin.com/p-131616050.html#
http://hi.baidu.com/vcprogrammer/blog/item/dc8ce1c44b9d9ac638db4952.html
http://blog.chinaunix.net/uid-13030755-id-2909453.html
http://blog.csdn.net/tingya/article/details/601836