
    Fast Inserts to PostgreSQL with JDBC and COPY FROM

    I was reading some materials on how to make database inserts as efficient as possible from Java. This was motivated by an existing application that stores measurements in PostgreSQL. So I decided to compare the known approaches to see whether there was some way to improve an application that already used batched inserts.

    For the purpose of the test I created the following table:

    CREATE TABLE measurement
    (
      measurement_id bigint NOT NULL,
      valid_ts timestamp with time zone NOT NULL,
      measurement_value numeric(19,4) NOT NULL,
      CONSTRAINT pk_mv_raw PRIMARY KEY (measurement_id, valid_ts)
    )
    WITH (OIDS=FALSE);
    

    I decided to test the insertion of 1000 records into the table. The data for the records was generated before any of the test methods ran. Four test methods were created to reflect the usual approaches:
    • VSI (Very Stupid Inserts) - executing queries made of concatenated Strings one by one
    • SPI  (Stupid Prepared Inserts) - similar to VSI but using prepared statements
    • BPI (Batched Prepared Inserts) - prepared inserts, executed in batches of various length
    • CPI (Copy Inserts) - inserts based on COPY FROM, executed in batches of various length
    The table is cleared before each test, and again after all data are successfully inserted. Commit is called only once in each test method, after all the insert calls. The following code excerpts illustrate the approaches listed above:

    VSI

    Statement insert = conn.createStatement();
    for (int i = 0; i < testSize; i++)
    {
      // build each INSERT by plain string concatenation and execute it on its own
      String insertSQL = "insert into measurement values ("
                + measurementIds[i] + ",'" + timestamps[i] + "'," + values[i] + ")";
      insert.execute(insertSQL);
    }
    

    SPI
    PreparedStatement insert = conn.prepareStatement("insert into measurement values (?,?,?)");
    for (int i=0; i<testSize; i++)
    {
      insert.setLong(1,measurementIds[i]);
      insert.setTimestamp(2, timestamps[i]);
      insert.setBigDecimal(3, values[i]);
      insert.execute();
    }
    

    BPI

    PreparedStatement insert = conn.prepareStatement("insert into measurement values (?,?,?)");
    
    for (int i = 0; i < testSize; i++)
    {
      insert.setLong(1, measurementIds[i]);
      insert.setTimestamp(2, timestamps[i]);
      insert.setBigDecimal(3, values[i]);
      insert.addBatch();
      // flush once a full batch has accumulated; (i + 1) avoids a one-row batch at i == 0
      if ((i + 1) % batchSize == 0) { insert.executeBatch(); }
    }
    insert.executeBatch(); // flush any remaining rows
    
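As a side note beyond the original benchmark: recent versions of the PostgreSQL JDBC driver can rewrite batched inserts into multi-row INSERT statements on the wire via the reWriteBatchedInserts connection parameter, which narrows the gap to COPY considerably. A minimal sketch (the URL and credentials are placeholders, not from the original):

```java
import java.util.Properties;

public class BatchRewriteConfig {
    // Build connection properties that ask pgjdbc to rewrite batched INSERTs
    // into multi-row statements (supported by recent driver versions).
    public static Properties pgProps(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        props.setProperty("reWriteBatchedInserts", "true");
        return props;
    }

    public static void main(String[] args) {
        // Usage with DriverManager (URL is a placeholder):
        // Connection conn = DriverManager.getConnection(
        //     "jdbc:postgresql://localhost:5432/testdb", pgProps("test", "secret"));
        System.out.println(pgProps("test", "secret").getProperty("reWriteBatchedInserts"));
    }
}
```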

    CPI

    StringBuilder sb = new StringBuilder();
    CopyManager cpManager = ((PGConnection)conn).getCopyAPI();
    PushbackReader reader = new PushbackReader( new StringReader(""), 10000 );
    for (int i = 0; i < testSize; i++)
    {
        sb.append(measurementIds[i]).append(",'")
          .append(timestamps[i]).append("',")
          .append(values[i]).append("\n");
        // flush once a full batch has accumulated; (i + 1) avoids copying a single row at i == 0
        if ((i + 1) % batchSize == 0)
        {
          reader.unread(sb.toString().toCharArray());
          cpManager.copyIn("COPY measurement FROM STDIN WITH CSV", reader);
          sb.delete(0, sb.length());
        }
    }
    reader.unread(sb.toString().toCharArray());
    cpManager.copyIn("COPY measurement FROM STDIN WITH CSV", reader); // remaining rows
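
The benchmark data above contains no commas or quotes, so raw string concatenation is safe there. Real data fed to COPY ... WITH CSV needs quoting; a small hypothetical helper (the names csvField and row are mine, not from the original) could format each row:

```java
public class CsvRow {
    // Quote a field for COPY ... WITH CSV: wrap in double quotes and double any
    // embedded quotes when the value contains a comma, quote, or newline.
    static String csvField(String v) {
        if (v.contains(",") || v.contains("\"") || v.contains("\n")) {
            return "\"" + v.replace("\"", "\"\"") + "\"";
        }
        return v;
    }

    // Join fields into one CSV line, newline-terminated, ready for copyIn.
    static String row(Object... fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(csvField(String.valueOf(fields[i])));
        }
        return sb.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(row(1, "2013-05-27 10:00:00+02", 3.14));
        // prints "1,2013-05-27 10:00:00+02,3.14"
    }
}
```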
    

    I hoped for some improvement from using COPY FROM instead of batched inserts, but I did not expect a big gain. The results were a pleasant surprise: for a batch size of 50 (as defined in the original application I wanted to improve), COPY FROM gave a 40% improvement. I expect further improvement when the data comes from a stream, skipping the StringBuilder-with-PushbackReader exercise.
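That streaming idea can be sketched as a custom Reader that produces CSV rows on demand, which could then be handed straight to CopyManager.copyIn with no intermediate buffering. This is a sketch under the assumption that rows arrive as an Iterator of non-empty, newline-terminated CSV lines (in a real application the source could be a result set or a message queue):

```java
import java.io.Reader;
import java.util.Iterator;

// A Reader that pulls CSV rows lazily from an iterator, so rows are
// generated on demand instead of being accumulated in a StringBuilder.
// Assumes every row is a non-empty, newline-terminated CSV line.
public class CsvRowReader extends Reader {
    private final Iterator<String> rows;
    private String current = "";
    private int pos = 0;

    public CsvRowReader(Iterator<String> rows) {
        this.rows = rows;
    }

    @Override
    public int read(char[] buf, int off, int len) {
        if (pos >= current.length()) {
            if (!rows.hasNext()) {
                return -1; // end of stream: all rows consumed
            }
            current = rows.next();
            pos = 0;
        }
        int n = Math.min(len, current.length() - pos);
        current.getChars(pos, pos + n, buf, off);
        pos += n;
        return n;
    }

    @Override
    public void close() {
        // nothing to release
    }
}
```

With such a reader, the copy call would become cpManager.copyIn("COPY measurement FROM STDIN WITH CSV", new CsvRowReader(rowIterator)), where rowIterator is whatever produces the formatted lines.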

    See the graphs for yourself - the number following the method abbreviation is the batch size.
    Original article: https://www.cnblogs.com/liuyuanyuanGOGO/p/3066572.html