zoukankan      html  css  js  c++  java
  • stop words list

    参见:http://www.lextek.com/manuals/onix/stopwords1.html

    stop words,称为无意义的词或无效词,在文本挖掘中,作为特征词来讲,没有贡献,这里是onix整理的基本涵盖无效词的列表(429):

    a
    
    about
    
    above
    
    across
    
    after
    
    again
    
    against
    
    all
    
    almost
    
    alone
    
    along
    
    already
    
    also
    
    although
    
    always
    
    among
    
    an
    
    and
    
    another
    
    any
    
    anybody
    
    anyone
    
    anything
    
    anywhere
    
    are
    
    area
    
    areas
    
    around
    
    as
    
    ask
    
    asked
    
    asking
    
    asks
    
    at
    
    away
    
    b
    
    back
    
    backed
    
    backing
    
    backs
    
    be
    
    became
    
    because
    
    become
    
    becomes
    
    been
    
    before
    
    began
    
    behind
    
    being
    
    beings
    
    best
    
    better
    
    between
    
    big
    
    both
    
    but
    
    by
    
    c
    
    came
    
    can
    
    cannot
    
    case
    
    cases
    
    certain
    
    certainly
    
    clear
    
    clearly
    
    come
    
    could
    
    d
    
    did
    
    differ
    
    different
    
    differently
    
    do
    
    does
    
    done
    
    down
    
    down
    
    downed
    
    downing
    
    downs
    
    during
    
    e
    
    each
    
    early
    
    either
    
    end
    
    ended
    
    ending
    
    ends
    
    enough
    
    even
    
    evenly
    
    ever
    
    every
    
    everybody
    
    everyone
    
    everything
    
    everywhere
    
    f
    
    face
    
    faces
    
    fact
    
    facts
    
    far
    
    felt
    
    few
    
    find
    
    finds
    
    first
    
    for
    
    four
    
    from
    
    full
    
    fully
    
    further
    
    furthered
    
    furthering
    
    furthers
    
    g
    
    gave
    
    general
    
    generally
    
    get
    
    gets
    
    give
    
    given
    
    gives
    
    go
    
    going
    
    good
    
    goods
    
    got
    
    great
    
    greater
    
    greatest
    
    group
    
    grouped
    
    grouping
    
    groups
    
    h
    
    had
    
    has
    
    have
    
    having
    
    he
    
    her
    
    here
    
    herself
    
    high
    
    high
    
    high
    
    higher
    
    highest
    
    him
    
    himself
    
    his
    
    how
    
    however
    
    i
    
    if
    
    important
    
    in
    
    interest
    
    interested
    
    interesting
    
    interests
    
    into
    
    is
    
    it
    
    its
    
    itself
    
    j
    
    just
    
    k
    
    keep
    
    keeps
    
    kind
    
    knew
    
    know
    
    known
    
    knows
    
    l
    
    large
    
    largely
    
    last
    
    later
    
    latest
    
    least
    
    less
    
    let
    
    lets
    
    like
    
    likely
    
    long
    
    longer
    
    longest
    
    m
    
    made
    
    make
    
    making
    
    man
    
    many
    
    may
    
    me
    
    member
    
    members
    
    men
    
    might
    
    more
    
    most
    
    mostly
    
    mr
    
    mrs
    
    much
    
    must
    
    my
    
    myself
    
    n
    
    necessary
    
    need
    
    needed
    
    needing
    
    needs
    
    never
    
    new
    
    new
    
    newer
    
    newest
    
    next
    
    no
    
    nobody
    
    non
    
    noone
    
    not
    
    nothing
    
    now
    
    nowhere
    
    number
    
    numbers
    
    o
    
    of
    
    off
    
    often
    
    old
    
    older
    
    oldest
    
    on
    
    once
    
    one
    
    only
    
    open
    
    opened
    
    opening
    
    opens
    
    or
    
    order
    
    ordered
    
    ordering
    
    orders
    
    other
    
    others
    
    our
    
    out
    
    over
    
    p
    
    part
    
    parted
    
    parting
    
    parts
    
    per
    
    perhaps
    
    place
    
    places
    
    point
    
    pointed
    
    pointing
    
    points
    
    possible
    
    present
    
    presented
    
    presenting
    
    presents
    
    problem
    
    problems
    
    put
    
    puts
    
    q
    
    quite
    
    r
    
    rather
    
    really
    
    right
    
    right
    
    room
    
    rooms
    
    s
    
    said
    
    same
    
    saw
    
    say
    
    says
    
    second
    
    seconds
    
    see
    
    seem
    
    seemed
    
    seeming
    
    seems
    
    sees
    
    several
    
    shall
    
    she
    
    should
    
    show
    
    showed
    
    showing
    
    shows
    
    side
    
    sides
    
    since
    
    small
    
    smaller
    
    smallest
    
    so
    
    some
    
    somebody
    
    someone
    
    something
    
    somewhere
    
    state
    
    states
    
    still
    
    still
    
    such
    
    sure
    
    t
    
    take
    
    taken
    
    than
    
    that
    
    the
    
    their
    
    them
    
    then
    
    there
    
    therefore
    
    these
    
    they
    
    thing
    
    things
    
    think
    
    thinks
    
    this
    
    those
    
    though
    
    thought
    
    thoughts
    
    three
    
    through
    
    thus
    
    to
    
    today
    
    together
    
    too
    
    took
    
    toward
    
    turn
    
    turned
    
    turning
    
    turns
    
    two
    
    u
    
    under
    
    until
    
    up
    
    upon
    
    us
    
    use
    
    used
    
    uses
    
    v
    
    very
    
    w
    
    want
    
    wanted
    
    wanting
    
    wants
    
    was
    
    way
    
    ways
    
    we
    
    well
    
    wells
    
    went
    
    were
    
    what
    
    when
    
    where
    
    whether
    
    which
    
    while
    
    who
    
    whole
    
    whose
    
    why
    
    will
    
    with
    
    within
    
    without
    
    work
    
    worked
    
    working
    
    works
    
    would
    
    x
    
    y
    
    year
    
    years
    
    yet
    
    you
    
    young
    
    younger
    
    youngest
    
    your
    
    yours
    
    z
  • 相关阅读:
    【CSS 第五天】背景,边框
    CSS Sprite雪碧图
    【ASP】session实现购物车
    【ASP】response和sever对象实现用户登录
    【操作系统】银行家算法
    【操作系统】先来先服务
    【操作系统】多级反馈队列算法
    【页面置换算法】LRC算法和FIFS算法
    Alpha版(内部测试版)发布
    项目结束--事后诸葛亮会议总结
  • 原文地址:https://www.cnblogs.com/KarayLee/p/4466056.html
Copyright © 2011-2022 走看看