If the managers that ship with Sqoop are not enough, you can create your own managers.
Create project
The project should have the structure shown below; both classes in this post live in the package sqoop.manager. Add the Sqoop JAR and the Commons Logging JAR to the build path (in my case: sqoop-1.4.3-SNAPSHOT.jar and commons-logging-1.1.1.jar).
Create new Manager Factory
Create a manager factory in which you define your custom managers. It has to extend com.cloudera.sqoop.manager.ManagerFactory. In my case it's CustomManagerFactory.

package sqoop.manager;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.sqoop.manager.Db2Manager;
import org.apache.sqoop.manager.DirectMySQLManager;
import org.apache.sqoop.manager.DirectPostgresqlManager;
import org.apache.sqoop.manager.HsqldbManager;
import org.apache.sqoop.manager.OracleManager;
import org.apache.sqoop.manager.PostgresqlManager;
import org.apache.sqoop.manager.SQLServerManager;

import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.manager.ConnManager;
import com.cloudera.sqoop.metastore.JobData;
import com.cloudera.sqoop.manager.ManagerFactory;

public class CustomManagerFactory extends ManagerFactory {

  public static final Log LOG = LogFactory.getLog(
      CustomManagerFactory.class.getName());

  // Returns the manager that matches the connect string scheme,
  // or null if this factory does not handle it.
  public ConnManager accept(JobData data) {
    SqoopOptions options = data.getSqoopOptions();

    String scheme = extractScheme(options);
    if (null == scheme) {
      // We don't know if this is a mysql://, hsql://, etc.
      // Can't do anything with this.
      LOG.warn("Null scheme associated with connect string.");
      return null;
    }

    LOG.debug("Trying with scheme: " + scheme);

    if (scheme.equals("jdbc:mysql:")) {
      if (options.isDirect()) {
        return new DirectMySQLManager(options);
      } else {
        // The custom manager from this project (same package, sqoop.manager).
        return new MysqlManager(options);
      }
    } else if (scheme.equals("jdbc:postgresql:")) {
      if (options.isDirect()) {
        return new DirectPostgresqlManager(options);
      } else {
        return new PostgresqlManager(options);
      }
    } else if (scheme.startsWith("jdbc:hsqldb:")) {
      return new HsqldbManager(options);
    } else if (scheme.startsWith("jdbc:oracle:")) {
      return new OracleManager(options);
    } else if (scheme.startsWith("jdbc:sqlserver:")) {
      return new SQLServerManager(options);
    } else if (scheme.startsWith("jdbc:db2:")) {
      return new Db2Manager(options);
    } else {
      return null;
    }
  }

  protected String extractScheme(SqoopOptions options) {
    String connectStr = options.getConnectString();

    // java.net.URL follows RFC-2396 literally, which does not allow a ':'
    // character in the scheme component (section 3.1). JDBC connect strings,
    // however, commonly have a multi-scheme addressing system. e.g.,
    // jdbc:mysql://...; so we cannot parse the scheme component via URL
    // objects. Instead, attempt to pull out the scheme as best as we can.

    // First, see if this is of the form [scheme://hostname-and-etc..]
    int schemeStopIdx = connectStr.indexOf("//");
    if (-1 == schemeStopIdx) {
      // If no hostname start marker ("//"), then look for the right-most ':'
      // character.
      schemeStopIdx = connectStr.lastIndexOf(':');
      if (-1 == schemeStopIdx) {
        // Warn that this is nonstandard. But we should be as permissive
        // as possible here and let the ConnectionManagers themselves throw
        // out the connect string if it doesn't make sense to them.
        LOG.warn("Could not determine scheme component of connect string");

        // Use the whole string.
        schemeStopIdx = connectStr.length();
      }
    }

    return connectStr.substring(0, schemeStopIdx);
  }
}
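To see which scheme the factory compares against for a given connect string, here is a minimal standalone sketch of the same extraction logic. It does not depend on Sqoop; the class name SchemeDemo and the sample connect strings (host names, database names) are made up for illustration.

```java
// Standalone sketch of the scheme extraction used by the factory above.
// SchemeDemo and the sample connect strings are hypothetical.
class SchemeDemo {

    /** Returns everything before "//", or before the right-most ':' if there is no "//". */
    static String extractScheme(String connectStr) {
        int stop = connectStr.indexOf("//");
        if (stop == -1) {
            stop = connectStr.lastIndexOf(':');
            if (stop == -1) {
                // No separator at all: use the whole string.
                stop = connectStr.length();
            }
        }
        return connectStr.substring(0, stop);
    }

    public static void main(String[] args) {
        String[] samples = {
            "jdbc:mysql://localhost/db",
            "jdbc:hsqldb:hsql://localhost/db",
            "jdbc:oracle:thin:@dbhost:1521:orcl",
        };
        for (String s : samples) {
            System.out.println(s + " -> " + extractScheme(s));
        }
    }
}
```

The Oracle sample has no "//", so the scheme becomes everything up to the last ':' (jdbc:oracle:thin:@dbhost:1521). That is why the factory uses startsWith rather than equals for such databases.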
Create custom Manager
Create a new manager that will work with the data. The manager should extend org.apache.sqoop.manager.InformationSchemaManager.

package sqoop.manager;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.sqoop.manager.*;

import com.cloudera.sqoop.SqoopOptions;

public class MysqlManager extends InformationSchemaManager {

  public static final Log LOG = LogFactory.getLog(MysqlManager.class.getName());

  // Passed to the superclass constructor as its first argument; left empty in this example.
  private static final String STRING = "";

  public MysqlManager(SqoopOptions opts) {
    super(STRING, opts);
  }

  @Override
  protected String getSchemaQuery() {
    // Stub: return your own query here.
    return null;
  }

  @Override
  protected String getListDatabasesQuery() {
    // Stub: return your own query here.
    return null;
  }
}
Configure Sqoop to work with an external manager factory
First of all you have to define SQOOP_CONF_DIR, for example '/home/cloudera/sqoop/conf'. After that, create a folder named 'managers.d' inside it. In this folder create a properties file, for example 'managers.property'. In this file you define your own manager factory: the fully qualified class name of the factory, followed by the path to the JAR that contains it. The structure of the file is shown below.
sqoop.manager.CustomManagerFactory=/home/cloudera/workspace/extention-conn-manager/manager.jar
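As far as I can tell, the file is read as ordinary Java properties, so the key is the factory class name and the value is the JAR path. The sketch below only illustrates how such a line splits into those two parts. It is not Sqoop's actual loading code, and the class name ManagersFileDemo is made up.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: shows how a managers.d entry splits into
// "factory class name" (key) and "JAR path" (value) under properties syntax.
class ManagersFileDemo {

    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String content =
            "sqoop.manager.CustomManagerFactory="
            + "/home/cloudera/workspace/extention-conn-manager/manager.jar\n";
        Properties props = parse(content);
        for (String className : props.stringPropertyNames()) {
            System.out.println("factory class: " + className);
            System.out.println("jar path:      " + props.getProperty(className));
        }
    }
}
```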