zoukankan      html  css  js  c++  java
  • mysql 造1亿条记录的单表--大数据表

    读写文件

    背景及木:现有数据1000w单表,为压力测试准备1亿条数据。

    步骤:

    1.将1000w条记录,除id外都导入到多个文件中:

    //DELIMITER
    DROP PROCEDURE if EXISTS createManyTable;
    create PROCEDURE createManyTable()
    BEGIN
    DECLARE i int;
    DECLARE fileName VARCHAR(30);
    set i=1;
    
    
    while i<251
    DO
    SET fileName=CONCAT('f_log_',i,'.txt');
    
    SET @STMT :=CONCAT("select `xx`,`xx`,`xx`,`xx`,.... into outfile 'temp/",fileName,
    "' lines terminated by '
    ' from `f_log` WHERE id>= ",40000*(i-1)," AND id< ",40000*i);
    
    PREPARE STMT FROM @STMT;
    EXECUTE STMT;
    
    
    set i=i+1;
    end while;
    END;
    //DELIMITER
    CALL createManyTable();

    2. 将上述多个文件合并到同一个文件,并且在第一列加入id列:

        public static void main(String[] args) throws IOException {
            int i=10000000;
            int step=40000;
            File out=new File("E:/data/f_log_data.txt");
            for(int k=1;k<251;k++){
                File file=new File("E:/data/temp/f_log_"+k+".txt");
                StringBuffer sb=new StringBuffer();
                if(file.exists()){
                    sb=readFile(file,i+step*k);
                    writeFile(out,sb);
                }
            }
            
        }
        
        public static StringBuffer readFile(File file,int start) throws IOException{
            StringBuffer sb=new StringBuffer();
            BufferedReader reader=new BufferedReader(new FileReader(file));
            String line="";    
            
    
            while(line != null){
                line = reader.readLine();
                if(line == null){
                    break;
                }
                if(line.trim().equalsIgnoreCase("")){
                    continue;
                }
                start++;
                sb.append(start+"	"+line.trim()+"
    ");
            }
            reader.close();
            return sb;
        }
        
        public static void writeFile(File file,StringBuffer sb) throws IOException{
            BufferedWriter writer = new BufferedWriter(new FileWriter(file, true));
            writer.write(sb.toString());
            writer.close();
        }
        
        
        
        public void writeFile11() throws IOException{
    
            // TODO Auto-generated method stub
            BufferedWriter writer = new BufferedWriter(new FileWriter(new File("D:/driver/data.txt"), true));
            for(int i=0;i<1000000;i++){
                if(i%10==0){
                    writer.write("赵"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }if(i%10==1){
                    writer.write("钱"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }
                if(i%10==2){
                    writer.write("孙"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }if(i%10==3){
                    writer.write("李"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }
                if(i%10==4){
                    writer.write("郑"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }if(i%10==5){
                    writer.write("吴"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }
                if(i%10==6){
                    writer.write("周"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }if(i%10==7){
                    writer.write("王"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }
                if(i%10==8){
                    writer.write("张"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }if(i%10==9){
                    writer.write("刘"+(i/10)+"	"+ (int)(Math.random()*100)+"
    ");
                }
            }
            writer.close();
        
        }

    3. 将合并后的文件导入到数据表中:

    load data local infile '/tmp/finance_log_data.txt' into table f_log(`id`,`xx`,
    `xx`,.........................
    );

    注意事项:开始考虑使用存储过程来逐步导入到数据表中,但load data命令不能在存储过程中使用。

      另外,数据的合并也可以以shell脚本完成,但习惯使用java了,因此以java来完成,显得比较复杂。不过,可以随便复习一下java的读写文件,有算不错的经历。

    Q&A

    时间问题:生成1亿条数据(在有索引的情况下),用时3个小时。如果使用insert语句,估计会疯掉!

  • 相关阅读:
    leetcode33. Search in Rotated Sorted Array
    pycharm 设置sublime text3 monokai主题
    django class Meta
    leetcode30, Substring With Concatenation Of All Words
    Sublime text3修改tab键为缩进为四个空格,
    sublime text3 python打开图像的问题
    安装上imesupport输入法依然不跟随的解决办法,
    sublime text3 的插件冲突弃用问题,
    sublime text3 BracketHighlighter括号匹配的设置
    windows 下wget的使用
  • 原文地址:https://www.cnblogs.com/davidwang456/p/4691549.html
Copyright © 2011-2022 走看看