  • Datasets for Link-based Classification


    Datasets

    Document Classification Datasets:

    • CiteSeer: The CiteSeer dataset consists of 3312 scientific publications, each classified into one of six classes. The citation network consists of 4732 links. Each publication is described by a 0/1-valued word vector indicating the absence or presence of the corresponding dictionary word. The dictionary consists of 3703 unique words. The README file in the dataset provides more details. The dataset is available for download as a tarball.
    • Cora: The Cora dataset consists of 2708 scientific publications, each classified into one of seven classes. The citation network consists of 5429 links. Each publication is described by a 0/1-valued word vector indicating the absence or presence of the corresponding dictionary word. The dictionary consists of 1433 unique words. The README file in the dataset provides more details. The dataset is available for download as a tarball.
    • WebKB: The WebKB dataset consists of 877 web pages, each classified into one of five classes. The link network consists of 1608 links between pages. Each web page is described by a 0/1-valued word vector indicating the absence or presence of the corresponding dictionary word. The dictionary consists of 1703 unique words. The README file in the dataset provides more details. The dataset is available for download as a tarball.
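The three datasets above share one shape: a binary word vector and a class label per node, plus a list of links. The sketch below shows one way to load that shape into memory. The tab-separated layouts (`<id> <0/1 flags...> <label>` for nodes, `<cited_id> <citing_id>` for links), the sample rows, and the function names are assumptions made for illustration; the README in each tarball is the authority on the actual file layout.

```python
from collections import Counter

# Hypothetical sample rows. Assumed node layout:
#   <node_id> TAB <0/1 word flag> ... TAB <class_label>
CONTENT_LINES = [
    "p1\t1\t0\t1\t0\tNeural_Networks",
    "p2\t0\t0\t1\t1\tTheory",
    "p3\t1\t1\t0\t0\tNeural_Networks",
]

# Assumed link layout: <cited_id> TAB <citing_id>
CITES_LINES = ["p1\tp2", "p1\tp3", "p2\tp3"]


def parse_content(lines):
    """Return ({node_id: [0/1, ...]}, {node_id: label})."""
    features, labels = {}, {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        node_id, flags, label = parts[0], parts[1:-1], parts[-1]
        features[node_id] = [int(flag) for flag in flags]
        labels[node_id] = label
    return features, labels


def parse_cites(lines, known_ids):
    """Return (cited, citing) pairs, skipping links to unknown nodes."""
    edges = []
    for line in lines:
        cited, citing = line.split()
        if cited in known_ids and citing in known_ids:
            edges.append((cited, citing))
    return edges


features, labels = parse_content(CONTENT_LINES)
edges = parse_cites(CITES_LINES, set(features))
class_counts = Counter(labels.values())

print(len(features), "nodes,", len(edges), "links")
print(dict(class_counts))
```

Dropping links whose endpoints have no feature row is a design choice here, not something the dataset description prescribes; it keeps the node and edge sets consistent for downstream classifiers.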

    Social Network Datasets:

    • Terrorists: This dataset contains information about terrorists and their relationships. Unlike the previous datasets, it was designed for experiments that classify the relationships among terrorists. The dataset contains 851 relationships, each described by a 0/1-valued vector of attributes in which each entry indicates the absence or presence of a feature. There are 1224 distinct features in total. Each relationship can be assigned one or more of up to four labels, which makes this dataset suitable for multi-label classification tasks. The README file provides more details. The dataset is available for download as a tarball.
    • Terrorist Attacks: This dataset consists of 1293 terrorist attacks, each assigned one of six labels indicating the type of the attack. Each attack is described by a 0/1-valued vector of attributes whose entries indicate the absence or presence of a feature. There are 106 distinct features in total. The files in the dataset can be used to create two distinct graphs. The README file in the dataset provides more details. The dataset is available for download as a tarball.
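Because a relationship in the Terrorists dataset may carry several labels at once, a common way to prepare multi-label targets is a 0/1 indicator vector with one slot per label. The sketch below illustrates that encoding only; the label names, relationship ids, and the `to_indicator` helper are hypothetical and do not come from the dataset files.

```python
# Hypothetical label names; the dataset defines its own four labels.
LABELS = ["label_a", "label_b", "label_c", "label_d"]


def to_indicator(assigned, all_labels=LABELS):
    """Encode a set of labels as a 0/1 vector ordered like all_labels."""
    unknown = set(assigned) - set(all_labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in assigned else 0 for name in all_labels]


# Hypothetical relationships, each with one or more labels.
relationships = {
    "r1": {"label_a"},
    "r2": {"label_b", "label_d"},
}

targets = {rid: to_indicator(names) for rid, names in relationships.items()}
print(targets)
```

With this encoding, each of the four positions can be treated as its own binary classification problem, which is one simple way to approach a multi-label task.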

    More: http://www.cs.umd.edu/~sen/lbc-proj/LBC.html

  • Original source: https://www.cnblogs.com/celia01/p/4645761.html