If the connection managers that ship with Sqoop are not enough, you can create your own.
Create project
The project should have the structure shown below. Add the Sqoop jar and the logging jar to the build path (in my case: sqoop-1.4.3-SNAPSHOT.jar and commons-logging-1.1.1.jar).
Create new Manager Factory
Create a manager factory in which you define your custom managers. It has to extend com.cloudera.sqoop.manager.ManagerFactory. In my case it is CustomManagerFactory.
package sqoop.manager;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.sqoop.manager.Db2Manager;
import org.apache.sqoop.manager.DirectMySQLManager;
import org.apache.sqoop.manager.DirectPostgresqlManager;
import org.apache.sqoop.manager.HsqldbManager;
import org.apache.sqoop.manager.OracleManager;
import org.apache.sqoop.manager.PostgresqlManager;
import org.apache.sqoop.manager.SQLServerManager;
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.manager.ConnManager;
import com.cloudera.sqoop.metastore.JobData;
import com.cloudera.sqoop.manager.ManagerFactory;

public class CustomManagerFactory extends ManagerFactory {

  public static final Log LOG = LogFactory.getLog(
      CustomManagerFactory.class.getName());

  public ConnManager accept(JobData data) {
    SqoopOptions options = data.getSqoopOptions();
    String scheme = extractScheme(options);
    if (null == scheme) {
      // We don't know if this is a mysql://, hsql://, etc.
      // Can't do anything with this.
      LOG.warn("Null scheme associated with connect string.");
      return null;
    }

    LOG.debug("Trying with scheme: " + scheme);

    if (scheme.equals("jdbc:mysql:")) {
      if (options.isDirect()) {
        return new DirectMySQLManager(options);
      } else {
        return new MysqlManager(options);
      }
    } else if (scheme.equals("jdbc:postgresql:")) {
      if (options.isDirect()) {
        return new DirectPostgresqlManager(options);
      } else {
        return new PostgresqlManager(options);
      }
    } else if (scheme.startsWith("jdbc:hsqldb:")) {
      return new HsqldbManager(options);
    } else if (scheme.startsWith("jdbc:oracle:")) {
      return new OracleManager(options);
    } else if (scheme.startsWith("jdbc:sqlserver:")) {
      return new SQLServerManager(options);
    } else if (scheme.startsWith("jdbc:db2:")) {
      return new Db2Manager(options);
    } else {
      return null;
    }
  }

  protected String extractScheme(SqoopOptions options) {
    String connectStr = options.getConnectString();

    // java.net.URL follows RFC-2396 literally, which does not allow a ':'
    // character in the scheme component (section 3.1). JDBC connect strings,
    // however, commonly have a multi-scheme addressing system. e.g.,
    // jdbc:mysql://...; so we cannot parse the scheme component via URL
    // objects. Instead, attempt to pull out the scheme as best as we can.

    // First, see if this is of the form [scheme://hostname-and-etc..]
    int schemeStopIdx = connectStr.indexOf("//");
    if (-1 == schemeStopIdx) {
      // If no hostname start marker ("//"), then look for the right-most ':'
      // character.
      schemeStopIdx = connectStr.lastIndexOf(':');
      if (-1 == schemeStopIdx) {
        // Warn that this is nonstandard. But we should be as permissive
        // as possible here and let the ConnectionManagers themselves throw
        // out the connect string if it doesn't make sense to them.
        LOG.warn("Could not determine scheme component of connect string");

        // Use the whole string.
        schemeStopIdx = connectStr.length();
      }
    }
    return connectStr.substring(0, schemeStopIdx);
  }
}
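To see what extractScheme hands to the checks in accept(), the following standalone sketch copies only the scheme-extraction logic into a hypothetical class, SchemeDemo, so that it runs without the Sqoop jar. The class name and the sample connect strings are assumptions made for illustration.

```java
// Standalone sketch of the scheme-extraction logic from CustomManagerFactory.
// SchemeDemo and the sample connect strings are illustrative assumptions.
class SchemeDemo {

  // Returns everything before "//"; if there is no "//", everything before
  // the last ':'; if there is neither, the whole string.
  static String extractScheme(String connectStr) {
    int schemeStopIdx = connectStr.indexOf("//");
    if (schemeStopIdx == -1) {
      schemeStopIdx = connectStr.lastIndexOf(':');
      if (schemeStopIdx == -1) {
        schemeStopIdx = connectStr.length();
      }
    }
    return connectStr.substring(0, schemeStopIdx);
  }

  public static void main(String[] args) {
    String[] samples = {
      "jdbc:mysql://localhost/db",
      "jdbc:oracle:thin:@dbhost:1521:orcl",
      "plainstring",
    };
    for (String s : samples) {
      System.out.println(s + " -> " + extractScheme(s));
    }
  }
}
```

For the MySQL sample the result is "jdbc:mysql:" with the trailing colon kept, which is why the factory compares against "jdbc:mysql:". The Oracle sample has no "//", so the string is cut at its last colon and the result is longer than the bare prefix; that is why those branches use startsWith rather than equals.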
Create custom Manager
Create a new manager that will work with the data. The manager should extend org.apache.sqoop.manager.InformationSchemaManager.
package sqoop.manager;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.sqoop.manager.*;
import com.cloudera.sqoop.SqoopOptions;

public class MysqlManager extends InformationSchemaManager {

  public static final Log LOG = LogFactory.getLog(MysqlManager.class.getName());

  // Driver class name passed to the parent constructor; left empty in this skeleton.
  private static final String STRING = "";

  public MysqlManager(SqoopOptions opts) {
    super(STRING, opts);
  }

  // Skeleton only: returns null until a real query is supplied.
  @Override
  protected String getSchemaQuery() {
    return null;
  }

  // Skeleton only: returns null until a real query is supplied.
  @Override
  protected String getListDatabasesQuery() {
    return null;
  }
}
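The manager skeleton returns null from both query methods. As a rough sketch of what filled-in methods could look like for MySQL, the block below uses a small hypothetical base class, QuerySource, in place of InformationSchemaManager so that it compiles without the Sqoop jar. The class names and the choice of the two SQL statements are assumptions made for illustration (both are ordinary MySQL statements); they are not taken from Sqoop's own code.

```java
// Hypothetical stand-in for the two abstract methods the manager overrides.
abstract class QuerySource {
  protected abstract String getSchemaQuery();
  protected abstract String getListDatabasesQuery();
}

// Illustrative only: one possible pair of MySQL queries.
class MysqlQuerySketch extends QuerySource {

  @Override
  protected String getSchemaQuery() {
    // MySQL: returns the name of the current default database.
    return "SELECT SCHEMA()";
  }

  @Override
  protected String getListDatabasesQuery() {
    // MySQL: lists the databases visible to the connected user.
    return "SHOW DATABASES";
  }

  public static void main(String[] args) {
    MysqlQuerySketch sketch = new MysqlQuerySketch();
    System.out.println(sketch.getSchemaQuery());
    System.out.println(sketch.getListDatabasesQuery());
  }
}
```

In the real manager the same two method bodies would go into MysqlManager, and a driver class name would replace the empty STRING constant.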
Configure Sqoop to work with an external manager factory
First of all, define SQOOP_CONF_DIR, for example '/home/cloudera/sqoop/conf'. Then create a folder named 'managers.d' inside it, and in that folder create a property file, for example 'managers.property'.
In this file you define your own manager factory. The structure of the file is shown below.
sqoop.manager.CustomManagerFactory=/home/cloudera/workspace/extention-conn-manager/manager.jar
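Each line in this file maps a factory class name to the jar that contains it. The sketch below only illustrates that key=value layout by parsing the example line with java.util.Properties. It is not Sqoop's own loader, and the class name ManagersFileDemo is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative parser for "factory.class.Name=/path/to/jar" lines.
class ManagersFileDemo {

  // Loads the given file contents; keys are class names, values are jar paths.
  static Properties parse(String fileContents) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(fileContents));
    } catch (IOException e) {
      // Not expected when reading from an in-memory string.
      throw new UncheckedIOException(e);
    }
    return props;
  }

  public static void main(String[] args) {
    String line = "sqoop.manager.CustomManagerFactory="
        + "/home/cloudera/workspace/extention-conn-manager/manager.jar";
    Properties props = parse(line);
    for (String factoryClass : props.stringPropertyNames()) {
      System.out.println("factory class: " + factoryClass);
      System.out.println("jar path: " + props.getProperty(factoryClass));
    }
  }
}
```

The key must be the fully qualified name of the factory class (package sqoop.manager plus CustomManagerFactory here), and the value must be the path to the jar built from the project.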
After the next start of Sqoop you should see your manager factory defined, as shown below.
Example Project
https://drive.google.com/#folders/0BynTyQpk3OoFNkg0dnJaWHFlVXc