  • Correlation rule tuning

    Lots of organizations are deploying SIEM systems either to do their due diligence or
    because it’s part of a regulatory requirement.  One of the misconceptions that
    typically is derived from marketing material is that you plug it in, turn it on,
    and voila, instant security.  This couldn’t be further from the truth.  I look
    at SIEM like a meta-IDS (Intrusion Detection System).  It is attempting to find
    those needles in the haystack.  Most of the deployments I’ve worked on receive
    millions of events per day.  Many of the events are informational.  Sometimes it
    is mandatory to send those events to the SIEM because of regulatory
    requirements, so my goal is always to maximize our resources and make the best
    of the situation.  When you’re getting millions of firewall events per day, for
    example, you can either let them take up space on your SAN uselessly or
    try to detect misuse with them.


    The first thing you need to do is identify which systems will be forwarding events.
    Typically this means all switches, routers, servers, applications, and security systems
    (network/host intrusion prevention, firewalls, anti-malware, etc.).  The number
    of devices you forward events from to the SIEM will depend on how much money you
    are willing to spend on event collectors that receive and normalize events, and
    the storage necessary to keep all of this data around.  Deciding what events to
    send to your SIEM is often challenging.  The system you are investigating is
    going to have two capacity limits to be aware of.  The first is storage.  How
    much space will your events take?  To get a rough estimate, go to every
    system that will be forwarding events, report on how much log data it wrote
    in a day, multiply that by your retention period in days, and add the results together.
    For instance, with 90-day retention: (firewall logs for the day * 90) + (IPS logs for the
    day * 90) = required storage.  The second is events per second.  At the very
    least, go to all of the devices that will be forwarding
    events, report on how many events they generated in a day, and divide that by 86400
    (the number of seconds in a day).  This gives an approximate average number of total
    events per second, which will determine the number and size of event
    collectors.
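
    The two estimates above can be sketched in a few lines of Python.  The device
    names and daily figures here are made up for illustration; substitute the numbers
    you report from your own systems.  Note that this gives an average rate, so
    bursts will run higher.

```python
SECONDS_PER_DAY = 86400


def required_storage_gb(daily_log_gb, retention_days):
    """Sum each source's daily log volume multiplied by the retention period."""
    return sum(gb * retention_days for gb in daily_log_gb.values())


def average_eps(daily_event_counts):
    """Total events generated in one day divided by the seconds in a day."""
    return sum(daily_event_counts.values()) / SECONDS_PER_DAY


# Hypothetical one-day measurements, for illustration only.
daily_log_gb = {"firewall": 12.0, "ips": 3.5}
daily_event_counts = {"firewall": 40_000_000, "ips": 3_200_000}

storage = required_storage_gb(daily_log_gb, retention_days=90)
eps = average_eps(daily_event_counts)
print(f"required storage: {storage:.0f} GB, average EPS: {eps:.0f}")
```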

    The purpose of this post is to help develop ideas for custom correlation rule use
    cases.  Maybe a SIEM sizing and requirements guide can come later.  So for now
    let’s assume that you already have a SIEM in place and you want to get started
    with it.

    Vendor Provided Correlation Rules

    My general methodology with SIEM (and any Intrusion Prevention System for that
    matter) is to enable everything, see what happens, and tune back what you are
    not interested in.  In many cases you have paid for the content, and what better
    way to get the best bang for your buck than to see how it works in your
    environment.  The idea would be to enable the correlation rules once your events
    are being forwarded to see how they react.  If, for example, firewall events
    from your network monitoring system polling system information via SNMP on UDP
    port 161 trigger a port scanning detection rule, you would not turn off the
    entire correlation rule.  The idea would be
    to find the mechanism to ignore that specific traffic for that specific rule.


    I have seen rules that need to be modified slightly to become effective.  For
    example, a backdoor correlation rule that monitors for TCP port 31337 as either
    a source or a destination port will occasionally be triggered by firewall events
    for ordinary outbound connections.  Not to get too detailed here, but when a
    computer initiates a connection to a web server on TCP port 80 it has to open a
    random local source port between 1024 and 65535, which could happen to be 31337.
    Modifying the rule to
    monitor for 31337 only as a destination port may be a good way to tune this
    rule.

    Using the same example, McAfee Rogue System Detector scans hosts for TCP 31337 during
    service discovery of the network.  Even though internal firewalls/routers may be
    permitting and logging this traffic, the target hosts may not (hopefully not) be
    running these services.  In this case you may want the rule to ignore events where
    a Rogue System Detector is the source and the destination port is TCP 31337.
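
    As a rough sketch of that tuning, here is the logic in Python.  Real SIEM
    products express this in their own rule editors; the event field names and the
    detector addresses below are assumptions made up for the example.

```python
BACKDOOR_PORT = 31337

# Hypothetical addresses of the Rogue System Detector sensors.
ROGUE_SYSTEM_DETECTORS = {"10.0.5.10", "10.0.5.11"}


def backdoor_rule_matches(event):
    """Return True when the tuned backdoor rule should fire.

    `event` is a dict with assumed normalized fields:
    protocol, src_ip, src_port, dst_ip, dst_port.
    """
    if event["protocol"] != "tcp":
        return False
    # Only the destination port counts, so a client that happens to pick
    # 31337 as its random source port no longer triggers the rule.
    if event["dst_port"] != BACKDOOR_PORT:
        return False
    # Known service-discovery scanners are ignored for this rule only.
    if event["src_ip"] in ROGUE_SYSTEM_DETECTORS:
        return False
    return True


outbound_web = {"protocol": "tcp", "src_ip": "10.1.1.5", "src_port": 31337,
                "dst_ip": "203.0.113.7", "dst_port": 80}
scanner = {"protocol": "tcp", "src_ip": "10.0.5.10", "src_port": 40000,
           "dst_ip": "10.1.1.5", "dst_port": 31337}
suspicious = {"protocol": "tcp", "src_ip": "10.1.1.9", "src_port": 40000,
              "dst_ip": "203.0.113.7", "dst_port": 31337}

for name, ev in [("outbound_web", outbound_web), ("scanner", scanner),
                 ("suspicious", suspicious)]:
    print(name, backdoor_rule_matches(ev))
```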

    Potential Malware Calling Home

    The way malware behaves in our networks is a moving target, but it does tend to move
    like cars on a highway rather than at light speed.  So today there are several
    indicators we can monitor for that would allow us to infer that there is either
    an infection or misuse internally by an employee or contractor.

    Resolving domain names can be important for keeping the malware stable and allowing for
    quick changes of IP addresses.  For example if I program my malware to connect
    to a web server at pwnd.example.net it would be nice for me as the malware
    administrator to change the IP of my web server in the event that someone pulls
    the plug on the one I’m using.  If the malware is programmed to use a static IP
    to connect to I will lose that malware network.  If I use DNS I may be able to
    mitigate some of this risk by getting a new web server, setting up shop, and
    changing the IP of pwnd.example.net to the new IP.  In most environments I’ve
    been in, there are only a handful of DNS servers that all systems internally are
    configured to use.  Part of this correlation rule would be: if the source or
    destination port is UDP or TCP 53 and neither the source nor the destination
    IP is in your list of approved DNS servers, trigger the alert.

    Another stanza to add to this rule could be approved proxy servers, if you are using one
    that is not in transparent mode.  On your border firewalls, the only traffic
    from the LAN subnet to anywhere on TCP port 80 should come from the proxy
    server.  Anything else could be an attempt to subvert this control by an employee or
    contractor, or malware configured to do so.  In addition to the above rule, if the
    source IP is NOT your proxy and the destination port is TCP 80, trigger the
    alert.  You may also want to add an AND condition that the logging device
    is the border firewall, to reduce the number of logs that need to be
    investigated.

    Another stanza may be to monitor for IRC traffic.  If IRC is permitted you will see
    pretty quickly how many people are using it (it won’t be many) and can hopefully
    tune the rule to only trigger when a certain number of events are found in a
    certain amount of time.  Then you could look for a source or destination port of
    TCP 6666, 6667, 7777 and a few others.  Another thing I like to do with this is
    configure a rule on my Network Intrusion Prevention System to look for any
    packets with IRC as the protocol and trigger an IPS event.  Then look for that
    IPS event in this stanza of the rule too, which should make sure you catch
    anything at your egress point.

    Yet another stanza could be hosts attempting to use an SMTP server other than
    yours.
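
    Putting the stanzas together, a minimal sketch of the rule might look like the
    following.  All addresses, the device name, and the event field names are
    hypothetical, and SMTP is assumed to be on TCP port 25.  The IRC event-count
    threshold and the IPS protocol event mentioned above are left out to keep the
    example short.

```python
# Hypothetical approved infrastructure for the example.
APPROVED_DNS = {"10.0.0.53", "10.0.0.54"}
APPROVED_PROXIES = {"10.0.1.80"}
APPROVED_SMTP = {"10.0.2.25"}
BORDER_FIREWALLS = {"fw-border-1"}
IRC_PORTS = {6666, 6667, 7777}


def calling_home_stanzas(event):
    """Return the names of the stanzas that this firewall event matches."""
    ports = {event["src_port"], event["dst_port"]}
    ips = {event["src_ip"], event["dst_ip"]}
    is_tcp = event["protocol"] == "tcp"
    hits = []
    # DNS (UDP or TCP 53) where neither endpoint is an approved DNS server.
    if event["protocol"] in ("udp", "tcp") and 53 in ports and not (ips & APPROVED_DNS):
        hits.append("dns")
    # Web traffic logged by the border firewall that did not come from the proxy.
    if (is_tcp and event["device"] in BORDER_FIREWALLS
            and event["dst_port"] == 80
            and event["src_ip"] not in APPROVED_PROXIES):
        hits.append("proxy_bypass")
    # IRC on common ports, as source or destination.
    if is_tcp and ports & IRC_PORTS:
        hits.append("irc")
    # SMTP that involves none of the approved mail servers.
    if is_tcp and event["dst_port"] == 25 and not (ips & APPROVED_SMTP):
        hits.append("smtp")
    return hits


def ev(protocol, src_ip, dst_ip, dst_port, src_port=40000, device="fw-border-1"):
    """Build a sample event dict (helper for the example only)."""
    return {"protocol": protocol, "src_ip": src_ip, "src_port": src_port,
            "dst_ip": dst_ip, "dst_port": dst_port, "device": device}


print(calling_home_stanzas(ev("udp", "10.1.1.5", "198.51.100.8", 53)))
print(calling_home_stanzas(ev("tcp", "10.1.1.5", "203.0.113.7", 80)))
print(calling_home_stanzas(ev("tcp", "10.0.1.80", "203.0.113.7", 80)))
```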

    Misuse of Administration Account

    Every environment I have been in has Windows and *nix servers.  These systems have
    default administration accounts, administrator and root respectively.  It is
    best practice to provide actual system administrators with dedicated
    administration user accounts so that there is accountability during
    administration.  If someone were to login as root and shut down a service how
    would you know who it was?  You may be able to track it back by IP, but not
    with certainty.  Typically organizations also don’t want administrators’
    regular user accounts to carry administrative privileges, which helps
    prevent mistakes.  Administrators will instead have a separate user account
    for administration, for example username_a, to provide some assurance that
    the changes are deliberate.  The default administration account credentials are
    then printed and locked in a fireproof box somewhere and used for emergencies
    only.

    That means that if we see someone logging into a system with the username administrator
    or root, either an administrator is misusing the default account or it may have
    been compromised.  It is important to alert specifically when the login was
    successful.  This rule can easily be tested.  Most environments will have
    systems and/or scripts that automate administration tasks so you will need to
    filter those out of the correlation rule.  This does leave residual risk, but we
    are doing the most with what we have available to us.  If you don’t like the
    risk with that, then do the right thing and change the user account
    ;).
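
    A minimal sketch of this rule, assuming normalized authentication events with
    hypothetical `username`, `outcome`, and `src_ip` fields, and a made-up list of
    automation hosts to filter out:

```python
DEFAULT_ADMIN_ACCOUNTS = {"administrator", "root"}

# Hypothetical hosts that run approved automation under a default account.
AUTOMATION_SOURCES = {"10.0.9.20"}


def default_admin_login_alert(event):
    """Return True for a successful default-account login from a non-automation host."""
    # Alert only on successful logins, as described above.
    if event["outcome"] != "success":
        return False
    if event["username"].lower() not in DEFAULT_ADMIN_ACCOUNTS:
        return False
    # Filtering automation leaves residual risk: misuse from these hosts is not seen.
    return event["src_ip"] not in AUTOMATION_SOURCES


logins = [
    {"username": "root", "outcome": "success", "src_ip": "10.1.1.5"},
    {"username": "root", "outcome": "failure", "src_ip": "10.1.1.5"},
    {"username": "Administrator", "outcome": "success", "src_ip": "10.0.9.20"},
    {"username": "jsmith_a", "outcome": "success", "src_ip": "10.1.1.5"},
]
for login in logins:
    print(login["username"], login["outcome"], default_admin_login_alert(login))
```

    Testing is then just a matter of logging in as root or administrator on a lab
    system and confirming that the alert fires.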

    HTTP Tunneling

    This rule is similar to the malware calling home rule in the sense that we are
    looking for potential misuse by first looking at strange behavior.  If a network
    is enforcing least privilege, the user network will only be able to send HTTP and
    HTTPS from the inside network out to the Internet.  All of their SMTP traffic
    should go to the internal mail relay.  If users are tunneling other protocols
    through HTTP they are likely attempting to evade controls, or it could be
    malware attempting to evade controls.  This rule requires a Network Intrusion
    Detection/Prevention System or Application Layer Firewall.  You will need to
    create a rule that is monitoring for TCP port 80 OR 443 traffic that is NOT HTTP
    protocol.  On the SIEM you would just have to monitor for one of these events to
    be received to trigger the alert.  Again when you first create this rule you may
    need to tune the rule on the log generating device(s) and/or filter certain
    hosts from triggering the correlation rule.
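
    On the SIEM side the logic is small.  In this sketch the signature name is a
    placeholder for whatever your IPS or application firewall calls its own rule,
    and the excluded host and event fields are likewise assumptions.

```python
# Placeholder name for the IPS/application-firewall rule that detects
# traffic on TCP 80 or 443 that is not the expected protocol.
TUNNEL_SIGNATURE = "NON_HTTP_ON_WEB_PORT"

# Hypothetical hosts filtered out after tuning.
EXCLUDED_HOSTS = {"10.0.3.15"}


def http_tunneling_alert(event):
    """Return True when a single matching detection event should raise the alert."""
    return (event["signature"] == TUNNEL_SIGNATURE
            and event["dst_port"] in (80, 443)
            and event["src_ip"] not in EXCLUDED_HOSTS)


detection = {"signature": "NON_HTTP_ON_WEB_PORT", "src_ip": "10.1.1.5", "dst_port": 443}
tuned_out = {"signature": "NON_HTTP_ON_WEB_PORT", "src_ip": "10.0.3.15", "dst_port": 80}
print(http_tunneling_alert(detection), http_tunneling_alert(tuned_out))
```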


    Potential Server Compromise

    This rule can be time consuming to create for your environment, but I have to say
    that this is one of my favorites.  It could be that you create this type of rule
    only for critical hosts.  Here is the concept.  We will use a public facing web
    server as the example but this obviously applies to any server.

    A typical web server is listening for connections on TCP port 80.  The only
    connections you should see in firewall logs are random source IP addresses being
    permitted to access TCP port 80 on your server as the destination.  When you
    open up a web browser and connect to a website, your computer opens a local
    source port between 1024 and 65535 and makes a connection to TCP port 80 on
    the web server.  So if you see a firewall log that shows your web server making
    a connection from a high source port to any other system, someone is initiating a
    connection from that web server.  If they are browsing websites or hopping to
    other systems from here, that should be frowned upon and corrected.  Maybe this
    is someone who has already compromised the system and is sending information
    back to their website or FTP server.  Similarly, if you see someone connect to a
    port other than 80 on that web server, then you have another service running.
    Either someone set something new up, or maybe this is a backdoor
    running.
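
    Here is one way the web server example could be sketched.  It assumes each
    firewall event describes a connection initiated by `src_ip`, and that an
    `action` field records whether the firewall permitted it; the server address
    and field names are hypothetical.

```python
WEB_SERVER = "192.0.2.10"      # hypothetical public-facing web server
ALLOWED_INBOUND_PORTS = {80}   # services the server is supposed to offer


def server_compromise_reason(event):
    """Return a short reason if the event is unexpected for the server, else None."""
    # The server itself opening a connection to another system.
    if event["src_ip"] == WEB_SERVER:
        return "server initiated an outbound connection"
    # Someone successfully reaching a port the server should not be serving.
    if (event["dst_ip"] == WEB_SERVER and event["action"] == "permit"
            and event["dst_port"] not in ALLOWED_INBOUND_PORTS):
        return "inbound connection to an unexpected port"
    return None


normal = {"src_ip": "198.51.100.4", "dst_ip": WEB_SERVER, "dst_port": 80, "action": "permit"}
outbound = {"src_ip": WEB_SERVER, "dst_ip": "203.0.113.9", "dst_port": 21, "action": "permit"}
odd_port = {"src_ip": "198.51.100.4", "dst_ip": WEB_SERVER, "dst_port": 4444, "action": "permit"}

for e in (normal, outbound, odd_port):
    print(e["dst_port"], server_compromise_reason(e))
```

    Legitimate outbound traffic such as patch downloads or DNS lookups will need
    to be filtered out the same way as in the earlier rules.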

    In conclusion, these are some ideas to get you started with developing correlation
    rules.  Be creative.  When building these rules you are always going to get a
    lot of false positives in the beginning.  Do not get discouraged.  Create your
    rule, then either replay several weeks’ worth of data through it or let it run and keep
    an eye on it.

    There are many other things to consider when deploying a SIEM.  One of the things that
    senior engineers should be doing with the SIEM at least a couple of times per
    week is perusing the base events to look at the logs that are NOT getting
    correlated.  There could be a lot of things happening that you don’t want to
    have happen but just don’t have a correlation rule yet.   Importing
    Vulnerability Assessment results can really help to increase effectiveness and
    efficiency.  Events need to be monitored to ensure that they are getting
    normalized correctly.  Perhaps we will dig into some of these issues another
    time.

    Strange Bandwidth Utilization

    There are a couple of ways to look at this: potential DDoS detection and potential
    exfiltration.  The most common way to get this data would be to use switch and
    router flow events.  There may be other ways depending on the environment, such
    as forwarding Arbor Networks events or Network Intrusion Prevention events, etc.
    to the SIEM.  Regardless, this can take some time to benchmark and tune because
    bandwidth utilization is typically somewhat sporadic.

    To detect potential DDoS attacks, a good start would be monitoring
    ingress traffic targeted at a handful of critical system assets
    whose loss would prevent the organization from functioning.
    The rule would look something like: if the bandwidth directed to
    my web servers is greater than 40Mb/s for 10 minutes or more, trigger an
    alert.

    Exfiltration is the act of pulling data out of the network after it has been compromised.  As
    an example, egress bandwidth from a file
    share server may increase.  The rule would look similar to the DDoS rule: if traffic
    leaving an asset is greater than 3Mb/s for 10 minutes or more, trigger an
    event.
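
    Both rules reduce to the same check: has the rate stayed above a threshold for
    long enough?  A minimal sketch, assuming you already have per-asset bandwidth
    samples as (timestamp in seconds, Mb/s) pairs derived from flow data; the sample
    values are made up.

```python
def sustained_above(samples, threshold_mbps, min_duration_s):
    """Return True if the rate stays above the threshold for min_duration_s.

    `samples` is an iterable of (timestamp_s, mbps) pairs sorted by time.
    Duration is measured from the first sample of an unbroken run above the
    threshold to the latest sample in that run.
    """
    run_start = None
    for timestamp_s, mbps in samples:
        if mbps > threshold_mbps:
            if run_start is None:
                run_start = timestamp_s
            if timestamp_s - run_start >= min_duration_s:
                return True
        else:
            # Any dip at or below the threshold resets the run.
            run_start = None
    return False


TEN_MINUTES = 600

# One sample per minute, hypothetical values.
ddos_like = [(i * 60, 55.0) for i in range(11)]
short_spike = [(i * 60, 55.0 if i < 5 else 10.0) for i in range(11)]
exfil_like = [(i * 60, 4.2) for i in range(11)]

print(sustained_above(ddos_like, 40, TEN_MINUTES))    # ingress to web servers
print(sustained_above(short_spike, 40, TEN_MINUTES))
print(sustained_above(exfil_like, 3, TEN_MINUTES))    # egress from a file server
```

    The 40Mb/s and 3Mb/s thresholds are only the example values from above; expect
    to benchmark your own baseline before settling on numbers.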


    The purpose of these rules is to provide you with some guidance on how to further
    leverage your SIEM solution.  Even if they do not apply to your network
    specifically, I hope they help you to think about some custom correlation events
    you can create to fit your environment.  Feel free to reach out if you want to
    discuss further.  Some of my favorite SIEM systems are ArcSight and Q1 Labs
    (QRadar).

  • Original article: https://www.cnblogs.com/diyunpeng/p/3855961.html