sqoop相关整理记录

生产背景:

在从mysql导入到hive中,遇到如下问题:

从hive导出到mysql中,遇到如下问题:

sqoop 缺点:

1  基于命令行的操作方式,易出错,且不安全。
2  数据传输和数据格式是紧耦合的,这使得connector无法支持所有的数据格式
3  用户名和密码暴漏出来
4  sqoop安装需要root权限

Sqoop优点:

1   高效可控的利用资源,任务并行度,超时时间。
2   数据类型映射与转化,可自动进行,用户也可自定义 .
3   支持多种主流数据库,MySQL,Oracle,SQL Server,DB2等等 。

Sqoop原理:

Sqoop的inport原理:

Sqoop的export原理:

验证sqoop的各种报错:

1 mysql字段太短
2 hive的空字段转换
3 分隔符错误
4  mysql的网络不在集群网络中
5  mysql停止服务
6 mysql utf8编码只是3个字节,可能是因为某些unicode字符转成utf8之后变成了4个字节,需要mysql支持utf8mb4
7  Sqoop调式信息
8 修改生成的Java类,重新打包。

Sqoop命令行说明

生产背景:

 在从mysql导入到hive,遇到如下问题:

       1) 源mysql和集群机器不在同一个网段中,导致执行导入命令,网络连接失败。

       2) 某些字符导入到hive中,出现报错终止。 

    2.1  sqoop使用的JDBC-connector 版本太低(更换版本)。

hive导出到mysql中,遇到如下问题:

 1)某些字符插入mysql,出现报错终止。

   1.1 可能mysql本身编码的限制,某些字符不支持,比如uft8utf8mb4

   1.2  sqoop使用的JDBC-connector 版本太低(更换版本)。

 

sqoop 缺点:

 1  基于命令行的操作方式,易出错,且不安全。

 2  数据传输和数据格式是紧耦合的,这使得connector无法支持所有的数据格式

 3  用户名和密码暴漏出来

 4  sqoop安装需要root权限

Sqoop优点:

 

1   高效可控的利用资源,任务并行度,超时时间。

2   数据类型映射与转化,可自动进行,用户也可自定义 .

3   支持多种主流数据库,MySQL,OracleSQL ServerDB2等等 

 

 

Sqoop原理:

 

Sqoopinport原理:

    Sqoopimport时,需要制定split-by参数。Sqoop根据不同的split-by参数值来进行切分,然后将切分出来的区域分配到不同map中。每个map中再处理数据库中获取的一行一行的值,写入到HDFS中。同时split-by根据不同的参数类型有不同的切分方法,如比较简单的int型,Sqoop会取最大和最小split-by字段值,然后根据传入的num-mappers来确定划分几个区域。 比如select max(split_by),min(split-by) from得到的max(split-by)min(split-by)分别为10001,而num-mappers2的话,则会分成两个区域(1,500)(501-100),同时也会分成2sql2map去进行导入操作,分别为select XXX from table where split-by>=1 and split-by<500select XXX from table where split-by>=501 and split-by<=1000。最后每个map各自获取各自SQL中的数据进行导入工作。

Sqoopexport原理:根据mysql表名称,生成一个以表名称命名的Java类,该类继承了sqoopRecord的,是一个只有MapMR,且自定义了输出字段。

   

sqoop export –connect jdbc:mysql://$url:3306/$3?characterEncoding=utf8 –username $username –password $password –table $1 –export-dir $2 –input-fields-terminated-by ‘|’ –null-non-string ‘0’ –null-string ‘0’;

验证sqoop的各种报错:

mysql表

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(20),

click_url    varchar(200)

)

hive 表

CREATE TABLE TRLOG

(

PLATFORM string,

USER_ID int,

CLICK_TIME string,

CLICK_URL string

)

row format delimited

fields terminated by ‘/t’;

sqoop list-databases –connect jdbc:mysql://192.168.119.129:3306/ –username li72 –password 123

1  mysql字段太短

 

sqoop export –connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 –username li72 –password 123 –table dm_trlog –export-dir /home/bigdata/hive/data/db1.db/trlog –input-fields-terminated-by ‘/t’ –null-non-string ‘0’ –null-string ‘0’;

 

Warning: $HADOOP_HOME is deprecated.

14/11/06 01:42:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:42:32 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:42:32 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:42:38 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.jar

14/11/06 01:42:38 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:42:43 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:42:44 INFO mapred.JobClient: Running job: job_201411060114_0001

14/11/06 01:42:45 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:43:15 INFO mapred.JobClient: Task Id : attempt_201411060114_0001_m_000000_0, Status : FAILED

java.io.IOException: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column ‘click_time’ at row 1

        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:192)

        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column ‘click_time’ at row 1

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4118)

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4052)

        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2503)

        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2664)

        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2794)

        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)

        at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)

        at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)

mysql字段太短了

drop table  dm_trlog;

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(200),

click_url    varchar(200)

)

 

2 hive的空字段转换 

由于Hive的NULL用/N来表示,字段用/001来分割,换行用/n来换行,导出分隔符一定要和hive表保持一致,如果为空可以指定转换为0,有些mysql数字字段不能插入/N

加上两个参数:–input-null-string ‘//N’ –input-null-non-string ‘//N’多加一个‘/’,是为转义

sqoop export –connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 –username li72 –password 123 –table dm_trlog –export-dir /home/bigdata/hive/data/db1.db/trlog –input-fields-terminated-by ‘/t’ –null-non-string ‘0’ –null-string ‘0’;

14/10/23 04:53:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/915d24128af59c9e517580e0f07411d4/dm_pc_play_kpi.jar 
14/10/23 04:53:47 INFO mapreduce.ExportJobBase: Beginning export of dm_pc_play_kpi 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 WARN snappy.LoadSnappy: Snappy native library is available 
14/10/23 04:53:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/10/23 04:53:48 INFO snappy.LoadSnappy: Snappy native library loaded 
14/10/23 04:53:48 INFO mapred.JobClient: Running job: job_201408301703_84117 
14/10/23 04:53:49 INFO mapred.JobClient: map 0% reduce 0% 
14/10/23 04:55:45 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_0, Status : FAILED 
java.io.IOException: Can’t export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: “N” 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
… 10 more 

14/10/23 04:55:53 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_1, Status : FAILED 
14/10/23 04:55:58 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_2, Status : FAILED 
java.io.IOException: Can’t export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: “N” 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
… 10 more 

 

3 分隔符错误

Hive中的分隔符是’/t’ 但是导出写成’|’

sqoop export –connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 –username li72 –password 123 –table dm_trlog –export-dir /home/bigdata/hive/data/db1.db/trlog –input-fields-terminated-by ‘|’ –null-non-string ‘0’ –null-string ‘0’;

 为了测试,特意把分隔符改成 “|”

14/11/06 01:50:19 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:50:19 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:50:19 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:50:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:50:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.jar

14/11/06 01:50:25 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:50:45 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:50:51 INFO mapred.JobClient: Running job: job_201411060114_0003

14/11/06 01:50:52 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:51:20 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000000_0, Status : FAILED

java.io.IOException: Can’t export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        … 10 more

 

14/11/06 01:51:21 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000001_0, Status : FAILED

java.io.IOException: Can’t export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        … 10 more

 

4  mysql的网络不在集群网络中

 

sqoop export –connect jdbc:mysql://192.168.119.1:3306/student?characterEncoding=utf8 –username li72 –password 123 –table dm_trlog –export-dir /home/bigdata/hive/data/db1.db/trlog –input-fields-terminated-by ‘/t’ –null-non-string ‘0’ –null-string ‘0’;

 Warning: $HADOOP_HOME is deprecated.

14/11/06 02:04:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 02:04:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 02:04:30 INFO tool.CodeGenTool: Beginning code generation

14/11/06 02:07:40 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection timed out

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        … 32 more

 

5  mysql停止服务

14/11/06 04:55:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.

14/11/06 04:55:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:

14/11/06 04:55:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@8a0d5d

14/11/06 04:55:25 INFO tool.CodeGenTool: Beginning code generation

14/11/06 04:55:26 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        … 32 more

14/11/06 04:55:27 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.CatalogQueryManager: Failed to list columns from query: SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA())   AND TABLE_NAME = ‘dm_trlog’ 

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.CatalogQueryManager.getColumnNames(CatalogQueryManager.java:147)

        at org.apache.sqoop.orm.ClassWriter.getColumnNames(ClassWriter.java:1222)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1074)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        … 28 more

14/11/06 04:55:27 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

6  mysql utf8编码只是3个字节,可能是因为某些unicode字符转成utf8之后变成了4个字节,需要mysql支持utf8mb4

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail. 
Please set $HCAT_HOME to the root of your HCatalog installation. 
14/11/09 07:00:33 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
14/11/09 07:00:33 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/11/09 07:00:33 INFO tool.CodeGenTool: Beginning code generation 
14/11/09 07:00:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1 
14/11/09 07:00:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_go_snger` AS t LIMIT 1 
14/11/09 07:00:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop/hadoop-1.1.2 
Note: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
14/11/09 07:00:35 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/211b9679d4ac771d3ff710cbbd1c7277/dm_go_snger.jar 
14/11/09 07:00:35 INFO mapreduce.ExportJobBase: Beginning export of dm_go_snger 
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1 
14/11/09 07:00:36 INFO input.FileInputFormat: Total input paths to process : 1 
14/11/09 07:00:36 WARN snappy.LoadSnappy: Snappy native library is available 
14/11/09 07:00:36 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/11/09 07:00:36 INFO snappy.LoadSnappy: Snappy native library loaded 
14/11/09 07:00:37 INFO mapred.JobClient: Running job: job_201408301703_121362 
14/11/09 07:00:38 INFO mapred.JobClient: map 0% reduce 0% 
14/11/09 07:00:52 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_0, Status : FAILED 
java.io.IOException: Can’t export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: ‘xF3x90x8Cx92xEFxBF…’ for column ‘singer’ at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
… 10 more 
Caused by: java.sql.SQLException: Incorrect string value: ‘xF3x90x8Cx92xEFxBF…’ for column ‘singer’ at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:00:59 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_1, Status : FAILED 
java.io.IOException: Can’t export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: ‘xF3x90x8Cx92xEFxBF…’ for column ‘singer’ at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
… 10 more 
Caused by: java.sql.SQLException: Incorrect string value: ‘xF3x90x8Cx92xEFxBF…’ for column ‘singer’ at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:01:07 INFO mapred.JobClient: Task Id : attempt_201408301703_121362_m_000000_2, Status : FAILED 
java.io.IOException: Can’t export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.io.IOException: java.sql.SQLException: Incorrect string value: ‘xF3x90x8Cx92xEFxBF…’ for column ‘singer’ at row 1 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:220) 
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.write(AsyncSqlRecordWriter.java:46) 
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:639) 
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:84) 
… 10 more 
Caused by: java.sql.SQLException: Incorrect string value: ‘xF3x90x8Cx92xEFxBF…’ for column ‘singer’ at row 1 
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1072) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3563) 
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3495) 
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1959) 
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2113) 
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2693) 
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2102) 
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1364) 
at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233) 

14/11/09 07:01:20 INFO mapred.JobClient: Job complete: job_201408301703_121362 
14/11/09 07:01:20 INFO mapred.JobClient: Counters: 8 
14/11/09 07:01:20 INFO mapred.JobClient: Job Counters 
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=31500 
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0 
14/11/09 07:01:20 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0 
14/11/09 07:01:20 INFO mapred.JobClient: Rack-local map tasks=3 
14/11/09 07:01:20 INFO mapred.JobClient: Launched map tasks=4 
14/11/09 07:01:20 INFO mapred.JobClient: Data-local map tasks=1 
14/11/09 07:01:20 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0 
14/11/09 07:01:20 INFO mapred.JobClient: Failed map tasks=1 
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 44.4052 seconds (0 bytes/sec) 
14/11/09 07:01:20 INFO mapreduce.ExportJobBase: Exported 0 records. 
14/11/09 07:01:20 ERROR tool.ExportTool: Error during export: Export job failed! 
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. 
Logging initialized using configuration in jar:file:/home/hadoop/hadoop/hive-0.10.0.20140629/lib/hive-common-0.10.0.jar!/hive-log4j.properties 
Hive history file=/tmp/hadoop/hive_job_log_hadoop_201411090701_1423933997.txt 

7  Sqoop调式信息

 增加关键字--verbose

sqoop export –connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 –username li72 –password 123 –verbose –table dm_trlog –export-dir /home/bigdata/hive/data/db1.db/trlog –input-fields-terminated-by ‘/t’ –null-non-string ‘0’ –null-string ‘0’;

8 修改生成的Java类,重新打包。

每次通过sqoop导入MySql的时,都会生成一个以MySql表命名的.java文件,然后打成JAR包,给sqoop提交给hadoop MR来解析Hive表中的数据。那可以根据报的错误,找到对应的行,改写该文件,编译,重新打包,sqoop可以通过 -jar-file –class-name 组合让我们指定运行自己的jar包中的某个class。来解析该hive表中的每行数据。脚本如下:一个完整的例子如下: 
./bin/sqoop export –connect “jdbc:mysql://192.168.119.129:3306/student?useUnicode=true&characterEncoding=utf-8” 
–username li72 –password 123 –table dm_trlog 
–export-dir /hive/warehouse/trlog –input-fields-terminated-by ‘/t’ 
–input-null-string ‘//N’ –input-null-non-string ‘//N’ 
–class-name com.li72.trlog 
–jar-file /tmp/sqoopTempjar/trlog.jar 
上面–jar-file 参数指定jar包的路径。–class-name 指定jar包中的class 
这样就可以解决所有解析异常了。 

Sqoop命令行说明

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
Common arguments:
   --connect <jdbc-uri> Specify JDBC connect
                                                string
   --connection-manager <class-name> Specify connection manager
                                                class name
   --connection-param-file <properties-file> Specify connection
                                                parameters file
   --driver <class-name> Manually specify JDBC
                                                driver class to use
   --hadoop-home <hdir> Override
                                                $HADOOP_MAPRED_HOME_ARG
   --hadoop-mapred-home <dir> Override
                                                $HADOOP_MAPRED_HOME_ARG
   --help Print usage instructions
-P Read password from console
   --password <password> Set authentication
                                                password
   --password-file <password-file> Set authentication
                                                password file path
   --relaxed-isolation Use read-uncommitted
                                                isolation for imports
   --skip-dist-cache Skip copying jars to
                                                distributed cache
   --username <username> Set authentication
                                                username
   --verbose Print more information
                                                while working
  
Import control arguments:
   --append Imports data
                                                              in append
                                                              mode
   --as-avrodatafile Imports data
                                                              to Avro data
                                                              files
   --as-sequencefile Imports data
                                                              to
                                                              SequenceFile
                                                              s
   --as-textfile Imports data
                                                              as plain
                                                              text
                                                              (default)
   --boundary-query <statement> Set boundary
                                                              query for
                                                              retrieving
                                                              max and min
                                                              value of the
                                                              primary key
   --columns <col,col,col...> Columns to
                                                              import from
                                                              table
   --compression-codec <codec> Compression
                                                              codec to use
                                                              for import
   --delete-target-dir Imports data
                                                              in delete
                                                              mode
   --direct Use direct
                                                              import fast
                                                              path
   --direct-split-size <n> Split the
                                                              input stream
                                                              every 'n'
                                                              bytes when
                                                              importing in
                                                              direct mode
-e,--query <statement> Import
                                                              results of
                                                              SQL
                                                              'statement'
   --fetch-size <n> Set number
                                                              'n' of rows
                                                              to fetch
                                                              from the
                                                              database
                                                              when more
                                                              rows are
                                                              needed
   --inline-lob-limit <n> Set the
                                                              maximum size
                                                              for an
                                                              inline LOB
-m,--num-mappers <n> Use 'n' map
                                                              tasks to
                                                              import in
                                                              parallel
   --mapreduce-job-name <name> Set name for
                                                              generated
                                                              mapreduce
                                                              job
   --split-by <column-name> Column of
                                                              the table
                                                              used to
                                                              split work
                                                              units
   --table <table-name> Table to
                                                              read
   --target-dir <dir> HDFS plain
                                                              table
                                                              destination
   --validate Validate the
                                                              copy using
                                                              the
                                                              configured
                                                              validator
   --validation-failurehandler <validation-failurehandler> Fully
                                                              qualified
                                                              class name
                                                              for
                                                              ValidationFa
                                                              ilureHandler
   --validation-threshold <validation-threshold> Fully
                                                              qualified
                                                              class name
                                                              for
                                                              ValidationTh
                                                              reshold
   --validator <validator> Fully
                                                              qualified
                                                              class name
                                                              for the
                                                              Validator
   --warehouse-dir <dir> HDFS parent
                                                              for table
                                                              destination
   --where <where clause> WHERE clause
                                                              to use
                                                              during
                                                              import
-z,--compress Enable
                                                              compression
  
Incremental import arguments:
   --check-column <column> Source column to check for incremental
                                  change
   --incremental <import-type> Define an incremental import of type
                                  'append' or 'lastmodified'
   --last-value <value> Last imported value in the incremental
                                  check column
  
Output line formatting arguments:
   --enclosed-by <char> Sets a required field enclosing
                                      character
   --escaped-by <char> Sets the escape character
   --fields-terminated-by <char> Sets the field separator character
   --lines-terminated-by <char> Sets the end-of-line character
   --mysql-delimiters Uses MySQL's default delimiter set:
                                      fields: , lines: /n escaped-by: /
                                      optionally-enclosed-by: '
   --optionally-enclosed-by <char> Sets a field enclosing character
  
Input parsing arguments:
   --input-enclosed-by <char> Sets a required field encloser
   --input-escaped-by <char> Sets the input escape
                                            character
   --input-fields-terminated-by <char> Sets the input field separator
   --input-lines-terminated-by <char> Sets the input end-of-line
                                            char
   --input-optionally-enclosed-by <char> Sets a field enclosing
                                            character
  
Hive arguments:
   --create-hive-table Fail if the target hive
                                               table exists
   --hive-database <database-name> Sets the database name to
                                               use when importing to hive
   --hive-delims-replacement <arg> Replace Hive record /0x01
                                               and row delimiters (/n/r)
                                               from imported string fields
                                               with user-defined string
   --hive-drop-import-delims Drop Hive record /0x01 and
                                               row delimiters (/n/r) from
                                               imported string fields
   --hive-home <dir> Override $HIVE_HOME
   --hive-import Import tables into Hive
                                               (Uses Hive's default
                                               delimiters if none are
                                               set.)
   --hive-overwrite Overwrite existing data in
                                               the Hive table
   --hive-partition-key <partition-key> Sets the partition key to
                                               use when importing to hive
   --hive-partition-value <partition-value> Sets the partition value to
                                               use when importing to hive
   --hive-table <table-name> Sets the table name to use
                                               when importing to hive
   --map-column-hive <arg> Override mapping for
                                               specific column to hive
                                               types.
  
HBase arguments:
   --column-family <family> Sets the target column family for the
                               import
   --hbase-bulkload Enables HBase bulk loading
   --hbase-create-table If specified, create missing HBase tables
   --hbase-row-key <col> Specifies which input column to use as the
                               row key
   --hbase-table <table> Import to <table> in HBase
  
HCatalog arguments:
   --hcatalog-database <arg> HCatalog database name
   --hcatalog-home <hdir> Override $HCAT_HOME
   --hcatalog-table <arg> HCatalog table name
   --hive-home <dir> Override $HIVE_HOME
   --hive-partition-key <partition-key> Sets the partition key to
                                               use when importing to hive
   --hive-partition-value <partition-value> Sets the partition value to
                                               use when importing to hive
   --map-column-hive <arg> Override mapping for
                                               specific column to hive
                                               types.
  
HCatalog import specific options:
   --create-hcatalog-table Create HCatalog before import
   --hcatalog-storage-stanza <arg> HCatalog storage stanza for table
                                      creation
  
Accumulo arguments:
   --accumulo-batch-size <size> Batch size in bytes
   --accumulo-column-family <family> Sets the target column family for
                                         the import
   --accumulo-create-table If specified, create missing
                                         Accumulo tables
   --accumulo-instance <instance> Accumulo instance name.
   --accumulo-max-latency <latency> Max write latency in milliseconds
   --accumulo-password <password> Accumulo password.
   --accumulo-row-key <col> Specifies which input column to
                                         use as the row key
   --accumulo-table <table> Import to <table> in Accumulo
   --accumulo-user <user> Accumulo user name.
   --accumulo-visibility <vis> Visibility token to be applied to
                                         all rows imported
   --accumulo-zookeepers <zookeepers> Comma-separated list of
                                         zookeepers (host:port)
  
Code generation arguments:
   --bindir <dir> Output directory for compiled
                                         objects
   --class-name <name> Sets the generated class name.
                                         This overrides --package-name.
                                         When combined with --jar-file,
                                         sets the input class.
   --input-null-non-string <null-str> Input null non-string
                                         representation
   --input-null-string <null-str> Input null string representation
   --jar-file <file> Disable code generation; use
                                         specified jar
   --map-column-java <arg> Override mapping for specific
                                         columns to java types
   --null-non-string <null-str> Null non-string representation
   --null-string <null-str> Null string representation
   --outdir <dir> Output directory for generated
                                         code
   --package-name <name> Put auto-generated classes in
                                         this package

原创文章,作者:Maggie-Hunter,如若转载,请注明出处:https://blog.ytso.com/191009.html

(0)
上一篇 2021年11月14日
下一篇 2021年11月14日

相关推荐

发表回复

登录后才能评论