Manual Installation¶
If you wish to install HATCHet directly from this repository, the steps are a bit more involved.
Note that the complexity of manual installation is largely because the compute-cn step (determination of
allele-specific copy numbers) of the HATCHet pipeline uses custom-written C++11 code that uses the
Gurobi optimizer. If you do not have a valid Gurobi license (though it is
easily available for users in academia), then the C++ parts of
HATCHet do not necessarily need to be compiled, and you can read the
Compiling HATCHet without the built-in Gurobi optimizer section of this page.
Compiling HATCHet with the built-in Gurobi optimizer¶
The core optimization module of HATCHet is written in C++11 and thus requires a modern C++ compiler (GCC >= 4.8.1, or Clang).
As long as you have a recent version of GCC or Clang installed, setuptools should automatically be able to download a
recent version of cmake and compile the Hatchet code into a working package.
The installation process can be broken down into the following steps:
Get Gurobi (v9.0.2)
The coordinate-method applied by HATCHet is based on several integer linear programming (ILP) formulations. Gurobi is a commercial ILP solver with two licensing options: (1) a single-host license where the license is tied to a single computer and (2) a network license for use in a compute cluster (using a license server in the cluster). Both options are freely and easily available for users in academia. Download Gurobi for your specific platform.
Set GUROBI_HOME environment variable
$ export GUROBI_HOME=/path/to/gurobi902
Set
GUROBI_HOMEto where you download Gurobi. HereXXXis the 3-digit version of gurobi.Build Gurobi
$ cd "${GUROBI_HOME}" $ cd linux64/src/build/ $ make $ cp libgurobi_c++.a ../../lib
Substitute
mac64forlinux64if using the Mac OSX platform.Create a new venv/conda environment for Hatchet
Hatchetis a Python 3 package. Unless you want to compile/install it in your default Python 3 environment, you will want to create either a new Conda environment for Python 3 and activate it:conda create --name hatchet python=3.8 conda activate hatchet
or use
virtualenvthroughpip:python3 -m pip virtualenv env source env/bin/activate
Install basic packages
It is highly recommended that you upgrade your
pipandsetuptoolsversions to the latest, using:pip install -U pip pip install -U setuptools
Build and install HATCHet
Execute the following commands from the root of HATCHet’s repository.
$ pip install .
NOTE: If you experience a failure of compilation with an error message like:
_undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'_.
you may need to set
CXXFLAGSto-pthreadbefore invoking the command:$ CXXFLAGS=-pthread pip install .
When the compilation process fails or when the environment has special requirements, you may have to manually specify the required paths to Gurobi by following the detailed intructions.
Install required utilities
For reading BAM files, read counting, allele counting, and SNP calling, you need to install SAMtools, BCFtools, and tabix as well as mosdepth. If you want to perform reference-based phasing, you must also install shapeit, picard, and bgzip. The easiest way to install these is via
conda, as all are available from thebiocondachannel (exceptshapeitwhich is available from channeldranew).
Compiling HATCHet without the built-in Gurobi optimizer¶
If you wish to use an alternate ILP optimizer, then you do not need a C++ compiler.
In this case, set the HATCHET_BUILD_NOEXT environment variable to 1 (using export HATCHET_BUILD_NOEXT=1),
set the environment variable HATCHET_COMPUTE_CN_SOLVER to a Pyomo-supported solver (see the
Using a different Pyomo-supported solver section of the README for more details)
and proceed directly to step (4) above.