LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
    [REPLACE | IGNORE]
    INTO TABLE tbl_name
    [CHARACTER SET charset_name]
    [{FIELDS | COLUMNS}
        [TERMINATED BY 'string']
        [[OPTIONALLY] ENCLOSED BY 'char']
        [ESCAPED BY 'char']
    ]
    [LINES
        [STARTING BY 'string']
        [TERMINATED BY 'string']
    ]
    [IGNORE number {LINES | ROWS}]
    [(col_name_or_user_var,...)]
    [SET col_name = expr,...]
The LOAD DATA INFILE statement reads rows from a text file into a table at a very high speed. LOAD DATA INFILE is the complement of SELECT ... INTO OUTFILE. (See Section 13.2.9.1, “SELECT ... INTO Syntax”.) To write data from a table to a file, use SELECT ... INTO OUTFILE. To read the file back into a table, use LOAD DATA INFILE. The syntax of the FIELDS and LINES clauses is the same for both statements. Both clauses are optional, but FIELDS must precede LINES if both are specified.
You can also load data files by using the mysqlimport utility; it operates by sending a LOAD DATA INFILE statement to the server. The --local option causes mysqlimport to read data files from the client host. You can specify the --compress option to get better performance over slow networks if the client and server support the compressed protocol. See Section 4.5.5, “mysqlimport — A Data Import Program”.
For more information about the efficiency of INSERT versus LOAD DATA INFILE and speeding up LOAD DATA INFILE, see Section 8.2.2.1, “Speed of INSERT Statements”.
The file name must be given as a literal string. On Windows, specify backslashes in path names as forward slashes or doubled backslashes. The character_set_filesystem system variable controls the interpretation of the file name.
The character set indicated by the character_set_database system variable is used to interpret the information in the file. SET NAMES and the setting of character_set_client do not affect interpretation of input. If the contents of the input file use a character set that differs from the default, it is usually preferable to specify the character set of the file by using the CHARACTER SET clause. A character set of binary specifies “no conversion.”
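As a minimal sketch of the CHARACTER SET clause, the statement below assumes a hypothetical file /tmp/data_latin1.txt written in latin1 and a hypothetical table t1. Neither name comes from this section.

```sql
-- Hypothetical file and table names, for illustration only.
-- CHARACTER SET follows INTO TABLE and precedes the FIELDS clause.
LOAD DATA INFILE '/tmp/data_latin1.txt'
  INTO TABLE t1
  CHARACTER SET latin1
  FIELDS TERMINATED BY ',';
```

Naming the character set in the statement avoids depending on the current value of character_set_database.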
LOAD DATA INFILE interprets all fields in the file as having the same character set, regardless of the data types of the columns into which field values are loaded. For proper interpretation of file contents, you must ensure that the file was written with the correct character set. For example, if you write a data file with mysqldump -T or by issuing a SELECT ... INTO OUTFILE statement in mysql, be sure to use a --default-character-set option so that output is written in the character set to be used when the file is loaded with LOAD DATA INFILE.
It is not possible to load data files that use the ucs2, utf16, or utf32 character set.
If you use LOW_PRIORITY, execution of the LOAD DATA statement is delayed until no other clients are reading from the table. This affects only storage engines that use only table-level locking (such as MyISAM, MEMORY, and MERGE).
If you specify CONCURRENT with a MyISAM table that satisfies the condition for concurrent inserts (that is, it contains no free blocks in the middle), other threads can retrieve data from the table while LOAD DATA is executing. This option slows LOAD DATA slightly, even if no other thread is using the table at the same time.
With row-based replication, CONCURRENT is replicated regardless of MySQL version. With statement-based replication, CONCURRENT is not replicated prior to MySQL 5.5.1 (see Bug #34628). For more information, see Section 17.4.1.16, “Replication and LOAD DATA INFILE”.
The LOCAL keyword affects the expected location of the file and error handling, as described later. LOCAL works only if your server and your client both have been configured to permit it. For example, if mysqld was started with --local-infile=0, LOCAL does not work. See Section 6.1.6, “Security Issues with LOAD DATA LOCAL”.
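One way to check the server side of this configuration before attempting a LOCAL load is sketched below. The file path and the table name t1 are hypothetical. The client must permit LOCAL as well; for the mysql client this is typically controlled by its own --local-infile option.

```sql
-- Shows ON if the server permits LOAD DATA LOCAL, OFF otherwise.
SHOW VARIABLES LIKE 'local_infile';

-- Hypothetical client-side file and table, for illustration only.
LOAD DATA LOCAL INFILE '/home/user/data.txt' INTO TABLE t1;
```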
The LOCAL keyword affects where the file is expected to be found:
- If LOCAL is specified, the file is read by the client program on the client host and sent to the server. The file can be given as a full path name to specify its exact location. If given as a relative path name, the name is interpreted relative to the directory in which the client program was started.
  When using LOCAL with LOAD DATA, a copy of the file is created in the server's temporary directory. This is not the directory determined by the value of tmpdir or slave_load_tmpdir, but rather the operating system's temporary directory, and is not configurable in the MySQL Server. (Typically the system temporary directory is /tmp on Linux systems and C:\WINDOWS\TEMP on Windows.) Lack of sufficient space for the copy in this directory can cause the LOAD DATA LOCAL statement to fail.
- If LOCAL is not specified, the file must be located on the server host and is read directly by the server. The server uses the following rules to locate the file:
  - If the file name is an absolute path name, the server uses it as given.
  - If the file name is a relative path name with one or more leading components, the server searches for the file relative to the server's data directory.
  - If a file name with no leading components is given, the server looks for the file in the database directory of the default database.
In the non-LOCAL case, these rules mean that a file named as ./myfile.txt is read from the server's data directory, whereas the file named as myfile.txt is read from the database directory of the default database. For example, if db1 is the default database, the following LOAD DATA statement reads the file data.txt from the database directory for db1, even though the statement explicitly loads the file into a table in the db2 database:
LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;
For security reasons, when reading text files located on the server, the files must either reside in the database directory or be readable by all. Also, to use LOAD DATA INFILE on server files, you must have the FILE privilege. See Section 6.2.1, “Privileges Provided by MySQL”. For non-LOCAL load operations, if the secure_file_priv system variable is set to a nonempty directory name, the file to be loaded must be located in that directory.
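The checks described above can be sketched as follows. The directory /var/lib/mysql-files and the table t1 are assumptions made for illustration; substitute whatever value secure_file_priv reports on your server.

```sql
-- An empty value means no directory restriction is in effect.
SHOW VARIABLES LIKE 'secure_file_priv';

-- Confirm that the current account holds the FILE privilege.
SHOW GRANTS FOR CURRENT_USER();

-- Hypothetical path: assumes secure_file_priv is /var/lib/mysql-files.
LOAD DATA INFILE '/var/lib/mysql-files/data.txt' INTO TABLE t1;
```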
Using LOCAL is somewhat slower than letting the server access the files directly, because the contents of the file must be sent over the connection by the client to the server. On the other hand, you do not need the FILE privilege to load local files.
LOCAL also affects error handling:
- With LOAD DATA INFILE, data-interpretation and duplicate-key errors terminate the operation.
- With LOAD DATA LOCAL INFILE, data-interpretation and duplicate-key errors become warnings and the operation continues because the server has no way to stop transmission of the file in the middle of the operation. For duplicate-key errors, this is the same as if IGNORE is specified. IGNORE is explained further later in this section.
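Because a LOCAL load reports these problems as warnings rather than stopping, it can be useful to inspect the warnings immediately afterward. A minimal sketch, with a hypothetical file and table:

```sql
-- Hypothetical client-side file and table, for illustration only.
LOAD DATA LOCAL INFILE '/home/user/data.txt' INTO TABLE t1
  FIELDS TERMINATED BY ',';

-- List the warnings generated by the preceding statement.
SHOW WARNINGS;
```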
The REPLACE and IGNORE keywords control handling of input rows that duplicate existing rows on unique key values:
- If you specify REPLACE, input rows replace existing rows. In other words, an input row replaces any existing row that has the same value for a primary key or unique index. See Section 13.2.8, “REPLACE Syntax”.
- If you specify IGNORE, input rows that duplicate an existing row on a unique key value are skipped.
- If you do not specify either option, the behavior depends on whether the LOCAL keyword is specified. Without LOCAL, an error occurs when a duplicate key value is found, and the rest of the text file is ignored. With LOCAL, the default behavior is the same as if IGNORE is specified; this is because the server has no way to stop transmission of the file in the middle of the operation.
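The two keywords are placed between the file name and INTO TABLE. The statements below are a sketch only; the file /tmp/data.txt and the table t1 (assumed to have a primary key or unique index) are hypothetical.

```sql
-- Input rows overwrite existing rows that have the same unique key value.
LOAD DATA INFILE '/tmp/data.txt' REPLACE INTO TABLE t1
  FIELDS TERMINATED BY ',';

-- Input rows that duplicate an existing unique key value are skipped.
LOAD DATA INFILE '/tmp/data.txt' IGNORE INTO TABLE t1
  FIELDS TERMINATED BY ',';
```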
To ignore foreign key constraints during the load operation, issue a SET foreign_key_checks = 0 statement before executing LOAD DATA.
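A minimal sketch of that sequence follows, using a hypothetical file and a hypothetical table child_table. Re-enabling the checks afterward is an assumption about typical usage, not something this section requires, and rows loaded while checks were off are not validated retroactively.

```sql
SET foreign_key_checks = 0;

-- Hypothetical file and table names, for illustration only.
LOAD DATA INFILE '/tmp/child_rows.txt' INTO TABLE child_table;

-- Restore constraint checking for subsequent statements in this session.
SET foreign_key_checks = 1;
```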
If you use LOAD DATA INFILE on an empty MyISAM table, all nonunique indexes are created in a separate batch (as for REPAIR TABLE). Normally, this makes LOAD DATA INFILE much faster when you have many indexes. In some extreme cases, you can create the indexes even faster by turning them off with ALTER TABLE ... DISABLE KEYS before loading the file into the table and using ALTER TABLE ... ENABLE KEYS to re-create the indexes after loading the file. See Section 8.2.2.1, “Speed of INSERT Statements”.
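The DISABLE KEYS approach can be sketched as shown here. The table t1 is assumed to be a MyISAM table with nonunique indexes, and both the table name and the file path are hypothetical.

```sql
-- Stop updating nonunique indexes on the (hypothetical) MyISAM table.
ALTER TABLE t1 DISABLE KEYS;

LOAD DATA INFILE '/tmp/big_data.txt' INTO TABLE t1;

-- Rebuild the nonunique indexes in one pass after the load.
ALTER TABLE t1 ENABLE KEYS;
```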
For both the LOAD DATA INFILE and SELECT ... INTO OUTFILE statements, the syntax of the FIELDS and LINES clauses is the same. Both clauses are optional, but FIELDS must precede LINES if both are specified.
If you specify a FIELDS clause, each of its subclauses (TERMINATED BY, [OPTIONALLY] ENCLOSED BY, and ESCAPED BY) is also optional, except that you must specify at least one of them.
If you specify no FIELDS or LINES clause, the defaults are the same as if you had written this:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''
(Backslash is the MySQL escape character within strings in SQL statements, so to specify a literal backslash, you must specify two backslashes for the value to be interpreted as a single backslash. The escape sequences '\t' and '\n' specify tab and newline characters, respectively.)
In other words, the defaults cause LOAD DATA INFILE to act as follows when reading input:
- Look for line boundaries at newlines.
- Do not skip over any line prefix.
- Break lines into fields at tabs.
- Do not expect fields to be enclosed within any quoting characters.
- Interpret characters preceded by the escape character “\” as escape sequences. For example, “\t”, “\n”, and “\\” signify tab, newline, and backslash, respectively. See the discussion of FIELDS ESCAPED BY later for the full list of escape sequences.
Conversely, the defaults cause SELECT ... INTO OUTFILE to act as follows when writing output:
- Write tabs between fields.
- Do not enclose fields within any quoting characters.
- Use “\” to escape instances of tab, newline, or “\” that occur within field values.
- Write newlines at the ends of lines.
If you have generated the text file on a Windows system, you might have to use LINES TERMINATED BY '\r\n' to read the file properly, because Windows programs typically use two characters as a line terminator. Some programs, such as WordPad, might use \r as a line terminator when writing files. To read such files, use LINES TERMINATED BY '\r'.
If all the lines you want to read in have a common prefix that you want to ignore, you can use LINES STARTING BY 'prefix_string' to skip over the prefix, and anything before it. If a line does not include the prefix, the entire line is skipped. Suppose that you issue the following statement:
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test
  FIELDS TERMINATED BY ',' LINES STARTING BY 'xxx';
If the data file looks like this:
xxx"abc",1
something xxx"def",2
"ghi",3
The resulting rows will be ("abc",1) and ("def",2). The third row in the file is skipped because it does not contain the prefix.
The IGNORE number LINES option can be used to ignore lines at the start of the file. For example, you can use IGNORE 1 LINES to skip over an initial header line containing column names:
LOAD DATA INFILE '/tmp/test.txt' INTO TABLE test IGNORE 1 LINES;
When you use SELECT ... INTO OUTFILE in tandem with LOAD DATA INFILE to write data from a database into a file and then read the file back into the database later, the field- and line-handling options for both statements must match. Otherwise, LOAD DATA INFILE will not interpret the contents of the file properly. Suppose that you use SELECT ... INTO OUTFILE to write a file with fields delimited by commas:
SELECT * INTO OUTFILE 'data.txt' FIELDS TERMINATED BY ',' FROM table2;
To read the comma-delimited file back in, the correct statement would be:
LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY ',';
If instead you tried to read in the file with the statement shown following, it would not work because it instructs LOAD DATA INFILE to look for tabs between fields:
LOAD DATA INFILE 'data.txt' INTO TABLE table2 FIELDS TERMINATED BY '\t';
The likely result is that each input line would be interpreted as a single field.
LOAD DATA INFILE can be used to read files obtained from external sources. For example, many programs can export data in comma-separated values (CSV) format, such that lines have fields separated by commas and enclosed within double quotation marks, with an initial line of column names. If the lines in such a file are terminated by carriage return/newline pairs, the statement shown here illustrates the field- and line-handling options you would use to load the file:
LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' ENCLOSED BY '"'
  LINES TERMINATED BY '\r\n'
  IGNORE 1 LINES;
If the input values are not necessarily enclosed within quotation marks, use OPTIONALLY before the ENCLOSED BY keywords.
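As a sketch of that variant, the statement below assumes a hypothetical file /tmp/mixed.csv in which only some fields are quoted, newline-terminated lines, and a hypothetical table t1.

```sql
-- Hypothetical file and table names, for illustration only.
-- Fields may or may not be wrapped in double quotation marks.
LOAD DATA INFILE '/tmp/mixed.csv' INTO TABLE t1
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  IGNORE 1 LINES;
```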
Any of the field- or line-handling options can specify an empty string (''). If not empty, the FIELDS [OPTIONALLY] ENCLOSED BY and FIELDS ESCAPED BY values must be a single character. The FIELDS TERMINATED BY, LINES STARTING BY, and LINES TERMINATED BY values can be more than one character. For example, to write lines that are terminated by carriage return/linefeed pairs, or to read a file containing such lines, specify a LINES TERMINATED BY '\r\n' clause.
To read a file containing jokes that are separated by lines consisting of %%, you can do this:
CREATE TABLE jokes
  (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  joke TEXT NOT NULL);
LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes
  FIELDS TERMINATED BY ''
  LINES TERMINATED BY '\n%%\n' (joke);
FIELDS [OPTIONALLY] ENCLOSED BY controls quoting of fields. For output (SELECT ... INTO OUTFILE), if you omit the word OPTIONALLY, all fields are enclosed by the ENCLOSED BY character. An example of such output (using a comma as the field delimiter) is shown here:
"1","a string","100.20"
"2","a string containing a , comma","102.20"
"3","a string containing a \" quote","102.20"
"4","a string containing a \", quote and comma","102.20"
If you specify OPTIONALLY, the ENCLOSED BY character is used only to enclose values from columns that have a string data type (such as CHAR, BINARY, TEXT, or ENUM):
1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a \" quote",102.20
4,"a string containing a \", quote and comma",102.20
Note that occurrences of the ENCLOSED BY character within a field value are escaped by prefixing them with the ESCAPED BY character. Also note that if you specify an empty ESCAPED BY value, it is possible to inadvertently generate output that cannot be read properly by LOAD DATA INFILE. For example, the preceding output would appear as follows if the escape character is empty. Observe that the second field in the fourth line contains a comma following the quote, which (erroneously) appears to terminate the field:
1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a " quote",102.20
4,"a string containing a ", quote and comma",102.20
For input, the ENCLOSED BY character, if present, is stripped from the ends of field values. (This is true regardless of whether OPTIONALLY is specified; OPTIONALLY has no effect on input interpretation.) Occurrences of the ENCLOSED BY character preceded by the ESCAPED BY character are interpreted as part of the current field value.
If the field begins with the ENCLOSED BY character, instances of that character are recognized as terminating a field value only if followed by the field or line TERMINATED BY sequence. To avoid ambiguity, occurrences of the ENCLOSED BY character within a field value can be doubled and are interpreted as a single instance of the character. For example, if ENCLOSED BY '"' is specified, quotation marks are handled as shown here:
"The ""BIG"" boss"  -> The "BIG" boss
The "BIG" boss      -> The "BIG" boss
The ""BIG"" boss    -> The ""BIG"" boss
FIELDS ESCAPED BY controls how to read or write special characters:
- For input, if the FIELDS ESCAPED BY character is not empty, occurrences of that character are stripped and the following character is taken literally as part of a field value. Some two-character sequences are exceptions, where the first character is the escape character. These sequences are shown in the following table (using “\” for the escape character). The rules for NULL handling are described later in this section.
  Character    Escape Sequence