  • Some SQL basics

    1, Index

    An index is a set of data pointers stored on disk and associated with a single table. The main advantage is that indexes greatly speed up SELECT statements, as well as UPDATE and DELETE statements that locate rows through a WHERE clause, because the query optimizer only has to search the index rather than the entire table to find the relevant rows. There are two potential disadvantages: they slow down INSERT statements slightly, as an index entry must be created for every new row that is inserted, and they increase the amount of storage required for the database. In most cases the speed gained for SELECT, UPDATE, and DELETE statements far outweighs the slight increase in insert time and the additional storage requirements.
    CREATE INDEX index_name
    ON table_name (column_name);

    By default, when you create a table with a primary key (in SQL Server, for example), the data is stored on disk sorted by the primary key column, such as "Id". This default sort order is called the "clustered index". It determines the physical order of the data, so there can be only one clustered index per table.

    If most of your queries on the table search by a different, non-primary-key column, you might want to consider making that column the clustered index instead.

    There are a few things to keep in mind when changing the default clustered index in a table:

    1. Lookups through a non-clustered index must first find the clustered key and then look that key up in the clustered index to reach the actual data record, instead of going directly to the data on disk (usually this performance hit is negligible).
    2. Inserts will be slower because the insert must be added in the exact right place in the clustered index. (NOTE: This does not re-order the data pages. It just inserts the record in the correct order in the page that it corresponds to. Data pages are stored as doubly-linked lists so each page is pointed to by the previous and next. Therefore, it is not important to reorder the pages, just their pointers and that is only in the case where the newly inserted row causes a new data page to be created.)

    Non-clustered indexes are not copies of the table but a sorting of the columns you specify that "point" back to the data pages in the clustered index. With a non-clustered index there is a second list that has pointers to the physical rows. You can have many non-clustered indexes, although each new index will increase the time it takes to write new records.

    It is generally faster to read from a clustered index if you want to get back all the columns. You do not have to go first to the index and then to the table.

    Writing to a table with a clustered index can be slower, if there is a need to rearrange the data.

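    Summary: a clustered index means picking some column A and storing all of the table's records sorted by it. When you query with a condition on A, the records are physically located in the same order as the index, so the clustered index finds the matching records quickly.

    A non-clustered index on the same column A stores the values of A together with a pointer to the place in the table where each value is actually stored, whereas a clustered index stores the whole record in its leaf nodes. That is why a lookup through the clustered index is faster.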
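
As a rough illustration of the optimizer "searching the index rather than the entire table", the sketch below uses Python's built-in sqlite3 module and asks SQLite for its query plan before and after CREATE INDEX. The table contents and the index name idx_customers_city are made up for the example, and the exact plan wording differs between SQLite versions and between database engines.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the 'detail' text of SQLite's EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Customers (Name, City) VALUES (?, ?)",
    [("customer%d" % i, "city%d" % (i % 50)) for i in range(1000)],
)

lookup = "SELECT * FROM Customers WHERE City = ?"

# Without an index on City, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, ("city7",))

# Hypothetical index name; same syntax as the CREATE INDEX statement above.
conn.execute("CREATE INDEX idx_customers_city ON Customers (City)")

# With the index in place, the plan should mention a search using the index.
plan_after = query_plan(conn, lookup, ("city7",))

print(plan_before)
print(plan_after)
```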

    2, SQL aggregate functions:

    AVG(), MIN(), MAX(), COUNT(), LAST(), SUM(), FIRST()...

    SELECT COUNT(CustomerID) AS OrdersFromCustomerID7 FROM Orders WHERE CustomerID=7;

    SELECT AVG(Price) AS PriceAverage FROM Products;

    1. DDL – Data Definition Language
    DDL is used to define the structures that hold the data, for example tables.
    
    2. DML – Data Manipulation Language
    DML is used to manipulate the data itself. Typical operations are INSERT, DELETE, UPDATE, and retrieving data from a table.

    3, Transactions

    A transaction is a unit of work performed within a database management system against a database and treated in a coherent and reliable way, independent of other transactions. It can be rolled back if the system fails.

    1. Atomicity
    A transaction consists of many steps. Either all of the steps complete and their effects are reflected in the database, or, if any step fails, all of the steps are rolled back.
    
    2. Consistency
    The database moves from one consistent state to another if the transaction succeeds, and remains in its original state if the transaction fails.
    
    3. Isolation
    Every transaction should operate as if it were the only transaction in the system.
    
    4. Durability
    Once a transaction has completed successfully, the updated rows/records must be available to all other transactions on a permanent basis.

    A database lock tells a transaction whether the data item in question is being used by other transactions. A shared lock lets other transactions read the item, while an exclusive lock does not.
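
Atomicity can be sketched with Python's built-in sqlite3 module. The Accounts table, the CHECK constraint, and the transfer helper below are assumptions made for this example only. The second transfer fails halfway through (the credit succeeds, the debit violates the constraint), and the rollback undoes the credit as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    "Name TEXT PRIMARY KEY, "
    "Balance INTEGER NOT NULL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one unit of work; return True on success."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit broke the CHECK constraint, so the credit was rolled back too.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT Name, Balance FROM Accounts"))

ok = transfer(conn, "alice", "bob", 30)       # both steps succeed
failed = transfer(conn, "alice", "bob", 500)  # second step fails, first is undone
print(ok, failed, balances(conn))
```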

    4, Normalization

    Normalization is the process of removing redundant data by splitting a table up in a well-defined fashion.
    1. First Normal Form (1NF)
    A relation is in first normal form if and only if all underlying domains contain atomic values only. After 1NF, we can still have redundant data.
    
    2. Second Normal Form (2NF)
    A relation is in 2NF if and only if it is in 1NF and every non-key attribute is fully dependent on the primary key. After 2NF, we can still have redundant data.
    
    3. Third Normal Form (3NF)
    A relation is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
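
A small sketch of removing a transitive dependency, again with Python's sqlite3 module. The FlatOrders, Customers, and Orders tables are hypothetical: in FlatOrders the city depends on CustomerID, which in turn depends on the key OrderID, so the city is repeated for every order of the same customer. Splitting the table stores each city once, and a join recovers the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flat table: CustomerCity depends on CustomerID, not directly on the key OrderID.
conn.execute(
    "CREATE TABLE FlatOrders ("
    "OrderID INTEGER PRIMARY KEY, CustomerID INTEGER, CustomerCity TEXT, Amount INTEGER)"
)
conn.executemany(
    "INSERT INTO FlatOrders VALUES (?, ?, ?, ?)",
    [(1, 7, "Hamburg", 20), (2, 7, "Hamburg", 35), (3, 9, "Berlin", 10)],
)

# Split: customer attributes move to their own table, keyed by CustomerID.
conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, City TEXT)")
conn.execute(
    "CREATE TABLE Orders ("
    "OrderID INTEGER PRIMARY KEY, "
    "CustomerID INTEGER REFERENCES Customers(CustomerID), "
    "Amount INTEGER)"
)
conn.execute("INSERT INTO Customers SELECT DISTINCT CustomerID, CustomerCity FROM FlatOrders")
conn.execute("INSERT INTO Orders SELECT OrderID, CustomerID, Amount FROM FlatOrders")

flat = conn.execute(
    "SELECT OrderID, CustomerID, CustomerCity, Amount FROM FlatOrders ORDER BY OrderID"
).fetchall()

# Joining the two smaller tables reproduces the flat rows without storing a city twice.
rejoined = conn.execute(
    "SELECT o.OrderID, o.CustomerID, c.City, o.Amount "
    "FROM Orders o JOIN Customers c ON c.CustomerID = o.CustomerID "
    "ORDER BY o.OrderID"
).fetchall()

customer_rows = conn.execute("SELECT COUNT(*) FROM Customers").fetchone()[0]
print(rejoined == flat, customer_rows)
```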

    5, Primary Key & Foreign Key

    A primary key is a column whose values uniquely identify every row in a table. It cannot be NULL, and its value should not be modified.
    A composite primary key is a set of columns whose values together uniquely identify every row in a table.
    A foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table, typically by referring to that table's primary key.
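
To see these keys being enforced, here is a minimal sketch with Python's sqlite3 module. The two tables and the try_insert_order helper are assumptions for the example. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and that error messages differ between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys unenforced by default
conn.execute(
    "CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE Orders ("
    "OrderID INTEGER PRIMARY KEY, "
    "CustomerID INTEGER NOT NULL REFERENCES Customers(CustomerID))"
)
conn.execute("INSERT INTO Customers VALUES (7, 'Alfreds Futterkiste')")

def try_insert_order(order_id, customer_id):
    """Insert an order; return 'ok' or the database's error message."""
    try:
        conn.execute("INSERT INTO Orders VALUES (?, ?)", (order_id, customer_id))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

valid = try_insert_order(1, 7)          # customer 7 exists
orphan = try_insert_order(2, 99)        # no customer 99: foreign key violation
duplicate_key = try_insert_order(1, 7)  # OrderID 1 already used: primary key violation

print(valid)
print(orphan)
print(duplicate_key)
```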

    6, Select/Insert/Update/Delete

    INSERT INTO table_name
    VALUES (value1,value2,value3,...);

    INSERT INTO table_name (column1,column2,column3,...)
    VALUES (value1,value2,value3,...);

    SELECT * FROM Customers
    WHERE Country='Mexico';

    SELECT Name, COUNT(Name) FROM table_name
    GROUP BY Name HAVING COUNT(Name) > 1;   -- HAVING filters on aggregate functions

    SELECT DISTINCT City FROM Customers;

    SELECT * FROM Customers
    ORDER BY Country;   -- or ORDER BY Country DESC for descending order

    UPDATE Customers
    SET ContactName='Alfred Schmidt', City='Hamburg'
    WHERE CustomerName='Alfreds Futterkiste';

    DELETE FROM Customers
    WHERE CustomerName='Alfreds Futterkiste' AND ContactName='Maria Anders';

    SELECT * FROM Customers
    WHERE City LIKE 's%';

    SELECT Shippers.ShipperName,COUNT(Orders.OrderID) AS NumberOfOrders FROM Orders
    LEFT JOIN Shippers
    ON Orders.ShipperID=Shippers.ShipperID
    GROUP BY ShipperName;

    SELECT sID, sName FROM Student WHERE sID IN (SELECT sID FROM Apply WHERE major='CS') AND sID NOT IN (SELECT sID FROM Apply WHERE major='EE');

    Other keywords: IN, NOT IN, ALL, ANY, EXISTS, =, <=, <, >

    SELECT sID FROM Student WHERE sizeHS>any(SELECT sizeHS FROM Student);
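
The GROUP BY ... HAVING and IN / NOT IN queries from this section can be tried out with Python's sqlite3 module. The sample rows below are invented for the example. The > ANY form is left out because SQLite does not implement the ANY operator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (sID INTEGER PRIMARY KEY, sName TEXT, sizeHS INTEGER)")
conn.execute("CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT, decision TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Amy", 1000), (2, "Bob", 1500), (3, "Amy", 200)],
)
conn.executemany(
    "INSERT INTO Apply VALUES (?, ?, ?, ?)",
    [
        (1, "Stanford", "CS", "Y"),
        (1, "MIT", "EE", "N"),
        (2, "Stanford", "CS", "Y"),
        (3, "Cornell", "EE", "Y"),
    ],
)

# Names that occur more than once: HAVING filters the groups after aggregation.
duplicate_names = conn.execute(
    "SELECT sName, COUNT(sName) FROM Student GROUP BY sName HAVING COUNT(sName) > 1"
).fetchall()

# Students who applied to CS somewhere but never to EE.
cs_not_ee = conn.execute(
    "SELECT sID, sName FROM Student "
    "WHERE sID IN (SELECT sID FROM Apply WHERE major = 'CS') "
    "AND sID NOT IN (SELECT sID FROM Apply WHERE major = 'EE') "
    "ORDER BY sID"
).fetchall()

print(duplicate_names)
print(cs_not_ee)
```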

    7, Join

    • INNER JOIN: Returns rows that have a match in BOTH tables
    • LEFT JOIN: Returns all rows from the left table, and the matched rows from the right table
    • RIGHT JOIN: Returns all rows from the right table, and the matched rows from the left table
    • FULL JOIN: Returns all rows from BOTH tables, matched where possible and filled with NULLs where there is no match

    Cross join: returns the Cartesian product of the two tables.

    SELECT * FROM employee CROSS JOIN department; or SELECT * FROM employee, department;

    Inner join: SELECT * FROM employee, department WHERE employee.DepartmentID = department.DepartmentID; or SELECT * FROM employee INNER JOIN department ON employee.DepartmentID = department.DepartmentID;

    Outer join: SELECT * FROM employee LEFT OUTER JOIN department ON employee.DepartmentID = department.DepartmentID;

    Self join: SELECT F.EmployeeID, F.LastName, S.EmployeeID, S.LastName FROM Employee F INNER JOIN Employee S ON F.Country = S.Country WHERE F.EmployeeID < S.EmployeeID;

    SELECT Customers.CustomerName, Orders.OrderID
    FROM Customers
    LEFT JOIN Orders  -- This will display all rows from table Customers.
    ON Customers.CustomerID=Orders.CustomerID
    ORDER BY Customers.CustomerName;

    SELECT City FROM Customers
    UNION      -- Use UNION ALL to also keep duplicate values.
    SELECT City FROM Suppliers
    ORDER BY City;
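
The row counts produced by the different joins can be compared with a small sqlite3 sketch. The employee and department rows are made up for the example: one employee has no department, and one department has no employees. RIGHT and FULL joins are omitted because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT)")
conn.execute("CREATE TABLE employee (LastName TEXT, DepartmentID INTEGER)")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(31, "Sales"), (33, "Engineering"), (35, "Marketing")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Rafferty", 31), ("Jones", 33), ("Heisenberg", 33), ("Williams", None)],
)

def row_count(sql):
    """Run a query and return how many rows it produced."""
    return len(conn.execute(sql).fetchall())

on_clause = "ON employee.DepartmentID = department.DepartmentID"

# Cartesian product: every employee paired with every department (4 * 3 rows).
cross = row_count("SELECT * FROM employee CROSS JOIN department")

# Only employees whose DepartmentID matches a department.
inner = row_count("SELECT * FROM employee INNER JOIN department " + on_clause)

# Every employee, with NULLs in the department columns when nothing matches.
left = row_count("SELECT * FROM employee LEFT OUTER JOIN department " + on_clause)

# The employee that the LEFT JOIN kept although no department matched.
unmatched = conn.execute(
    "SELECT employee.LastName, department.DepartmentName "
    "FROM employee LEFT OUTER JOIN department " + on_clause + " "
    "WHERE department.DepartmentID IS NULL"
).fetchall()

print(cross, inner, left, unmatched)
```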

    8, Constraints: NOT NULL, PRIMARY KEY, FOREIGN KEY, UNIQUE.

    SQL constraints are used to specify rules for the data in a table.

    If there is any violation between the constraint and the data action, the action is aborted by the constraint.

    Constraints can be specified when the table is created (inside the CREATE TABLE statement) or after the table is created (inside the ALTER TABLE statement).

    9, Table create/alter/drop

    CREATE TABLE Persons(
    PersonID int,
    LastName varchar(255),
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255)
    );

    ALTER TABLE Persons
    ADD DateOfBirth date;

    DROP TABLE table_name;

    CREATE INDEX PIndex
    ON Persons (LastName);

    Drop index (MySQL syntax): ALTER TABLE table_name DROP INDEX index_name;

    10, View

    Views are virtual tables. Unlike tables, which contain data, views simply contain queries that dynamically retrieve data when used. A modification made to a view is rewritten into a modification of the base tables.
    create view CSaccept as
    select sID, cName
    from Apply
    where major = 'CS' and decision='Y';
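
That a view stores a query rather than data can be checked with a short sqlite3 sketch. The sample rows are invented for the example. Rows inserted into the base table after the view was created still show up when the view is queried. Modifying data through the view is not shown, because SQLite views are read-only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT, decision TEXT)")
conn.execute(
    "CREATE VIEW CSaccept AS "
    "SELECT sID, cName FROM Apply WHERE major = 'CS' AND decision = 'Y'"
)

conn.execute("INSERT INTO Apply VALUES (1, 'Stanford', 'CS', 'Y')")
before = conn.execute("SELECT * FROM CSaccept ORDER BY sID").fetchall()

# New base-table rows: one matches the view's condition, one does not.
conn.execute("INSERT INTO Apply VALUES (2, 'MIT', 'CS', 'Y')")
conn.execute("INSERT INTO Apply VALUES (3, 'MIT', 'EE', 'Y')")

# The view's query is re-run each time the view is used, so it sees the new CS row.
after = conn.execute("SELECT * FROM CSaccept ORDER BY sID").fetchall()

print(before)
print(after)
```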

    FIX:
    The FIX (Financial Information eXchange) protocol is the global protocol used for electronic trading of different asset classes, e.g. equity, fixed income, FX (foreign exchange), and derivatives such as futures and options. Knowledge of it is essential for understanding electronic trading and FIX messages. FIX is widely used by both the buy side (institutions) and the sell side (brokers/dealers) of the financial markets.
    Since different exchanges use their own proprietary protocols (e.g. HKSE uses OG, TSE uses the Arrowhead protocol, and NASDAQ uses the OUCH protocol), FIX engines are used less on the exchange side than on the client side, where clients and firms prefer the FIX protocol for sending orders and receiving executions.

    FIX messages are composed of a header, a body, and a trailer.

    Up to FIX.4.4, the header began with three fields: 8 (BeginString), 9 (BodyLength), and 35 (MsgType). The body of the message is entirely dependent on the message type defined in the header (tag 35, MsgType). The last field of the message is tag 10, the FIX message checksum (CheckSum). It is always expressed as a three-digit number (e.g. 10=002). The checksum algorithm sums the decimal values of the ASCII representation of all the bytes up to, but not including, the checksum field (which is last) and returns the value modulo 256.
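
The checksum rule can be sketched in a few lines of Python. The helper names below are made up for the example, and the heartbeat message is illustrative only: it carries just a few header fields and is not a complete, valid FIX message. The sketch assumes the usual SOH (0x01) field delimiter and that BodyLength counts the bytes from tag 35 up to and including the delimiter before tag 10.

```python
SOH = "\x01"  # field delimiter used between tag=value pairs

def fix_checksum(data: bytes) -> str:
    """Sum all bytes before the checksum field, modulo 256, as three digits."""
    return "%03d" % (sum(data) % 256)

def build_fix_message(msg_type, body_fields, begin_string="FIX.4.2"):
    """Assemble 8=...|9=...|35=...|<fields>|10=... (illustrative, not a full engine)."""
    body = "".join(
        f"{tag}={value}{SOH}" for tag, value in [(35, msg_type)] + list(body_fields)
    )
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    raw = (head + body).encode("ascii")
    return raw + f"10={fix_checksum(raw)}{SOH}".encode("ascii")

def is_checksum_valid(raw: bytes) -> bool:
    """Recompute the checksum over everything before the trailing 10= field."""
    head, sep, trailer = raw.rpartition(b"\x0110=")
    if not sep:
        return False
    # rpartition removed the delimiter that precedes tag 10; it is part of the sum.
    return trailer == fix_checksum(head + b"\x01").encode("ascii") + b"\x01"

# Hypothetical heartbeat (MsgType=0) between made-up CompIDs.
message = build_fix_message("0", [(49, "CLIENT"), (56, "BROKER"), (34, 2)])
tampered = message.replace(b"CLIENT", b"CLIENU")

print(message.replace(b"\x01", b"|").decode("ascii"))
print(is_checksum_valid(message), is_checksum_valid(tampered))
```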

    FIX engine:
    1) Establishes FIX connectivity by sending session-level messages.
    2) Manages the FIX session.
    3) Recovers if the FIX session is lost.
    4) Creates, sends, and parses FIX messages for electronic trading.
    5) Handles replay.
    6) Supports different FIX protocol versions and tags.
    FIX session parameters: SocketConnectHost, SocketConnectPort, DataDictionary, SenderCompID, TargetCompID, StartTime, EndTime, HeartBtInt, BeginString.

    ClOrdID (tag 11) and OrderID (tag 37)?
    ClOrdID is a unique ID assigned by the buy side, while OrderID is assigned by the sell side. OrderID normally remains the same for a message chain (e.g. across cancel and modify requests), while ClOrdID changes with each cancel or modification.
    TransactTime (tag 60) and SendingTime (tag 52)?
    TransactTime: time of execution/order creation, expressed in UTC (Coordinated Universal Time, also known as "GMT").
    SendingTime: time of message transmission, always expressed in UTC.
    MsgSeqNum (tag 34)?
    All FIX messages are identified by a unique sequence number. Sequence numbers are initialized at the start of each FIX session, starting at 1, and increment throughout the session. Monitoring sequence numbers enables parties to identify and react to missed messages and to gracefully synchronize applications when reconnecting during a FIX session.
    Each session establishes independent incoming and outgoing sequence series: participants maintain one series to assign to outgoing messages and a separate series to monitor for sequence gaps on incoming messages. Logically, sequence numbers can be divided into incoming and outgoing sequence numbers.
    The incoming sequence number is the number a FIX engine expects from its counterparty, and the outgoing sequence number is the number a FIX engine sends to its counterparty.

    A NewOrderSingle message is denoted by MsgType=D and is used to place an order. An OrderCancelReplaceRequest is a modification request denoted by MsgType=G and is used to modify an order, e.g. to change its quantity or price.
    OrderCancelRequest is the third in this category, denoted by MsgType=F, and is used to cancel an order placed in the market. OrderCancelReject is MsgType=9 and ExecutionReport is MsgType=8.

    What are the most common issues encountered when two FIX engines communicate?
    When a client connects to a broker via the FIX protocol, their FIX engines connect to each other. Many issues can occur during setup and later communication; below are some of the most common ones:
    Issues related to network connectivity
    Issues related to firewall rules
    Issues related to an incorrect host/port while connecting
    Incorrect SenderCompID (tag 49) and TargetCompID (tag 56)
    Sequence number mismatch
    Issues related to FIX version mismatch
    What happens if the client connects with a sequence number higher than expected?
    Suppose the client FIX engine connects to the broker FIX engine with a sequence number higher than expected (e.g. the broker is expecting sequence number 10 and the client sends 15). As per the FIX protocol, the broker accepts the connection and issues a Resend Request (MsgType=2) asking the client to resend the missing messages (10 through 14). The client can then either replay those messages or issue a Gap Fill message (Sequence Reset, MsgType=4) in case replaying them doesn't make sense (they could be admin messages, e.g. Heartbeat).
    What happens if the client connects with a sequence number lower than expected?
    If the client FIX engine connects to the broker FIX engine with a sequence number lower than expected, then the broker FIX engine will disconnect. As per the FIX protocol, the client may then retry with an increased sequence number until the broker accepts its connection.
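
The two sequence series and the higher/lower cases can be sketched as a small Python class. SequenceTracker and the shape of the returned dictionaries are made up for this example and are not part of any real FIX engine's API. The sketch also assumes the resend range ends just before the message that revealed the gap; a real engine may ask for an open-ended range instead.

```python
class SequenceTracker:
    """Tracks the outgoing and incoming sequence series of one FIX session."""

    def __init__(self):
        self.next_outgoing = 1      # number to stamp on the next message we send
        self.expected_incoming = 1  # number we expect on the next message we receive

    def stamp_outgoing(self) -> int:
        """Return the sequence number for the next outgoing message."""
        seq = self.next_outgoing
        self.next_outgoing += 1
        return seq

    def check_incoming(self, received: int) -> dict:
        """Decide how to react to the sequence number of an incoming message."""
        if received > self.expected_incoming:
            # Gap detected: ask the counterparty to resend what we missed.
            return {
                "action": "resend_request",  # MsgType=2
                "begin_seq_no": self.expected_incoming,
                "end_seq_no": received - 1,
            }
        if received < self.expected_incoming:
            # Lower than expected: treated as a serious error, drop the connection.
            return {"action": "disconnect"}
        self.expected_incoming += 1
        return {"action": "accept"}

broker = SequenceTracker()
broker.expected_incoming = 10  # the broker expects 10, as in the example above

too_high = broker.check_incoming(15)
too_low = broker.check_incoming(7)
in_order = broker.check_incoming(10)
print(too_high, too_low, in_order)
```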

    Which orders are automatically cancelled if not executed immediately?
    Fill or Kill (FOK) and Immediate or Cancel (IOC) orders are order types that are either executed immediately or cancelled by the exchange. TimeInForce (tag 59) is used in the FIX protocol to mark an order as FOK or IOC.

    What is the difference between a FOK order and an IOC order?
    The main difference is that a FOK order demands full execution, i.e. the whole quantity has to be filled, while an IOC order also accepts partial fills.
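
The difference can be written down as a tiny Python function. execute_immediately is a hypothetical helper that reduces an exchange's matching to a single "quantity available right now" number; the TimeInForce values 3 (IOC) and 4 (FOK) are the standard tag 59 codes.

```python
IOC = "3"  # TimeInForce (tag 59) value for Immediate or Cancel
FOK = "4"  # TimeInForce (tag 59) value for Fill or Kill

def execute_immediately(time_in_force: str, order_qty: int, available_qty: int):
    """Return (filled_qty, cancelled_qty) for an order that cannot rest in the book."""
    if time_in_force == FOK:
        # All or nothing: fill only if the whole quantity is available now.
        filled = order_qty if available_qty >= order_qty else 0
    elif time_in_force == IOC:
        # Take whatever is available now and cancel the remainder.
        filled = min(order_qty, available_qty)
    else:
        raise ValueError("only IOC and FOK are handled in this sketch")
    return filled, order_qty - filled

print(execute_immediately(FOK, 100, 60))   # nothing filled, everything cancelled
print(execute_immediately(IOC, 100, 60))   # partial fill, remainder cancelled
print(execute_immediately(FOK, 100, 150))  # fully filled
```
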
    Initial connection
    Once the network connection is established, you are ready to connect to the client. At the TCP layer, a socket connection is first established between the client's IP and yours, with your FIX engine listening on the specified port. The client then sends a Logon request (MsgType=A) with sequence number 1 (at the start of the day) and with the agreed SenderCompID and TargetCompID. Once your FIX engine receives the Logon request, it validates the content and the authenticity of the client, and if all is OK it replies with a Logon message of its own. Now the FIX session is established and orders can be sent over this connection.
     
     
  • Original article: https://www.cnblogs.com/chayu3/p/3878562.html