Commit 74bc8cd3 authored by Sean Solari

Bug fixes for new factoring

parent 0e01e20b
...@@ -165,7 +165,7 @@ Run metagenomic reads against a succesfully built database. See :doc:`Tutorial 2 ...@@ -165,7 +165,7 @@ Run metagenomic reads against a succesfully built database. See :doc:`Tutorial 2
.. code-block:: console .. code-block:: console
$ expam run -db DB_NAME [args...] $ expam classify -db DB_NAME [args...]
.. option:: -d <file path>, --directory <file path> .. option:: -d <file path>, --directory <file path>
...@@ -226,7 +226,7 @@ Run metagenomic reads against a succesfully built database. See :doc:`Tutorial 2 ...@@ -226,7 +226,7 @@ Run metagenomic reads against a succesfully built database. See :doc:`Tutorial 2
.. code-block:: console .. code-block:: console
$ expam run ... --group #FF0000 sample_one sample_two $ expam classify ... --group #FF0000 sample_one sample_two
.. option:: --alpha <float> .. option:: --alpha <float>
...@@ -260,7 +260,7 @@ Example ...@@ -260,7 +260,7 @@ Example
.. code-block:: console .. code-block:: console
$ expam run -db DB_NAME -d /path/to/paired/reads --paired --out ~/paired_reads_analysis --taxonomy $ expam classify -db DB_NAME -d /path/to/paired/reads --paired --out ~/paired_reads_analysis --taxonomy
.. _download taxonomy: .. _download taxonomy:
...@@ -305,7 +305,7 @@ Translate phylogenetic classification output to NCBI taxonomy. ...@@ -305,7 +305,7 @@ Translate phylogenetic classification output to NCBI taxonomy.
Plotting results on phylotree Plotting results on phylotree
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Results are automatically visualised on top of a phylogenetic tree when during the :code:`expam run` command, Results are automatically visualised on top of a phylogenetic tree when during the :code:`expam classify` command,
but can also be done after classification using the :code:`phylotree` command. but can also be done after classification using the :code:`phylotree` command.
.. code-block:: .. code-block::
...@@ -393,7 +393,7 @@ Example ...@@ -393,7 +393,7 @@ Example
.. note:: .. note::
The :code:`expam_limit` context works the same for any command. :code:`expam build` The :code:`expam_limit` context works the same for any command. :code:`expam build`
can be replaced with :code:`expam run`, or any other command. can be replaced with :code:`expam classify`, or any other command.
The following is an example of the (tab-separated) log file output: The following is an example of the (tab-separated) log file output:
......
...@@ -3,27 +3,27 @@ ...@@ -3,27 +3,27 @@
A programmatic API to interact with phylogenetic trees, particularly those used in reference databases. A programmatic API to interact with phylogenetic trees, particularly those used in reference databases.
expam.tree.Location expam.tree.location.Location
------------------- ----------------------------
.. autoclass:: expam.tree.Location .. autoclass:: expam.tree.location.Location
.. autofunction:: expam.tree.Location.__init__ .. autofunction:: expam.tree.location.Location.__init__
expam.tree.Index expam.tree.tree.Index
---------------- ---------------------
.. autoclass:: expam.tree.Index .. autoclass:: expam.tree.tree.Index
.. autofunction:: expam.tree.Index.load_newick .. autofunction:: expam.tree.tree.Index.load_newick
.. autofunction:: expam.tree.Index.from_newick .. autofunction:: expam.tree.tree.Index.from_newick
Example loading an Index object from a Newick string. Example loading an Index object from a Newick string.
.. code-block:: python .. code-block:: python
>>> from expam.tree import Index >>> from expam.tree.tree import Index
>>> tree_string = "(B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);" >>> tree_string = "(B:6.0,(A:5.0,C:3.0,E:4.0):5.0,D:11.0);"
>>> leaves, index = Index.from_newick(tree_string) >>> leaves, index = Index.from_newick(tree_string)
* Initialising node pool... * Initialising node pool...
...@@ -42,13 +42,13 @@ expam.tree.Index ...@@ -42,13 +42,13 @@ expam.tree.Index
>>> index['A'].coordinate >>> index['A'].coordinate
[0, 0, 1, 0] [0, 0, 1, 0]
.. autofunction:: expam.tree.Index.resolve_polytomies .. autofunction:: expam.tree.tree.Index.resolve_polytomies
.. autofunction:: expam.tree.Index.coord .. autofunction:: expam.tree.tree.Index.coord
.. autofunction:: expam.tree.Index.to_newick .. autofunction:: expam.tree.tree.Index.to_newick
.. autofunction:: expam.tree.Index.yield_child_nodes .. autofunction:: expam.tree.tree.Index.yield_child_nodes
.. code-block:: python .. code-block:: python
...@@ -70,9 +70,9 @@ expam.tree.Index ...@@ -70,9 +70,9 @@ expam.tree.Index
Internal node (branch) names can start with 'p', but this may also be neglected. Internal node (branch) names can start with 'p', but this may also be neglected.
.. autofunction:: expam.tree.Index.yield_leaves .. autofunction:: expam.tree.tree.Index.yield_leaves
.. autofunction:: expam.tree.Index.get_child_nodes .. autofunction:: expam.tree.tree.Index.get_child_nodes
.. code-block:: python .. code-block:: python
...@@ -81,4 +81,4 @@ expam.tree.Index ...@@ -81,4 +81,4 @@ expam.tree.Index
>>> index.get_child_nodes('E') >>> index.get_child_nodes('E')
['E'] ['E']
.. autofunction:: expam.tree.Index.get_child_leaves .. autofunction:: expam.tree.tree.Index.get_child_leaves
...@@ -93,7 +93,7 @@ Phylogenetic classification results ...@@ -93,7 +93,7 @@ Phylogenetic classification results
.. code-block:: console .. code-block:: console
$ expam run -db my_database -d /path/to/sample_one.fq --out sample_one $ expam classify -db my_database -d /path/to/sample_one.fq --out sample_one
* In :code:`./sample_one`, there will be a :code:`phy` subdirectory containing three files: * In :code:`./sample_one`, there will be a :code:`phy` subdirectory containing three files:
...@@ -199,11 +199,11 @@ Taxonomic results ...@@ -199,11 +199,11 @@ Taxonomic results
.. code-block:: console .. code-block:: console
$ expam run -d /path/to/reads --out example --taxonomy $ expam classify -d /path/to/reads --out example --taxonomy
.. code-block:: console .. code-block:: console
$ expam run -d /path/to/reads --out example_one $ expam classify -d /path/to/reads --out example_one
$ expam to_taxonomy --out example_one $ expam to_taxonomy --out example_one
* Where before the results directory contained only a :code:`phy` subdirectory, it will now also contain a :code:`tax` folder. * Where before the results directory contained only a :code:`phy` subdirectory, it will now also contain a :code:`tax` folder.
......
Graphical output Graphical output
================ ================
* When a :code:`run` or :code:`to_taxonomy` command is executed, raw summary files are produced (as described in :doc:`Classification <classify>`) and a phylotree is also produced as a graphical depiction of the sample summary. * When a :code:`classify` or :code:`to_taxonomy` command is executed, raw summary files are produced (as described in :doc:`Classification <classify>`) and a phylotree is also produced as a graphical depiction of the sample summary.
* This graphical representation has some customisable features: * This graphical representation has some customisable features:
* Multiple samples can be plotted on the same tree, with different colours for different samples. * Multiple samples can be plotted on the same tree, with different colours for different samples.
...@@ -61,8 +61,8 @@ Example of grouping ...@@ -61,8 +61,8 @@ Example of grouping
.. code-block:: console .. code-block:: console
$ expam run ... --group a1 a2 a3 --group b1 b2 b3 $ expam classify ... --group a1 a2 a3 --group b1 b2 b3
$ expam run ... --group "#FF0000" a1 a2 a3 --group "#00FF00" b1 b2 b3 $ expam classify ... --group "#FF0000" a1 a2 a3 --group "#00FF00" b1 b2 b3
.. note:: .. note::
...@@ -82,7 +82,7 @@ Example of grouping ...@@ -82,7 +82,7 @@ Example of grouping
.. code-block:: console .. code-block:: console
$ expam run ... --paired --group a1_f a2_f --group b1_f b2_f $ expam classify ... --paired --group a1_f a2_f --group b1_f b2_f
Visual flags Visual flags
^^^^^^^^^^^^ ^^^^^^^^^^^^
...@@ -110,7 +110,7 @@ Example of colour list ...@@ -110,7 +110,7 @@ Example of colour list
.. code-block:: console .. code-block:: console
$ expam run ... --colour_list "#FF0000" "#00FF00" "#0000FF" $ expam classify ... --colour_list "#FF0000" "#00FF00" "#0000FF"
.. _itol integration: .. _itol integration:
...@@ -126,7 +126,7 @@ folder containing two files: ...@@ -126,7 +126,7 @@ folder containing two files:
* :code:`tree.nwk` - Newick format tree that can be inserted into iTOL. * :code:`tree.nwk` - Newick format tree that can be inserted into iTOL.
* :code:`style.txt` - An iTOL formatted text document that contains all the information needed for iTOL to style the tree. * :code:`style.txt` - An iTOL formatted text document that contains all the information needed for iTOL to style the tree.
For instance, say we previously ran :code:`expam run --out my_run -d /some/samples`, and For instance, say we previously ran :code:`expam classify --out my_run -d /some/samples`, and
now run :code:`expam phylotree --out my_run --itol`, the corresponding files now run :code:`expam phylotree --out my_run --itol`, the corresponding files
would be located at would be located at
......
...@@ -97,7 +97,7 @@ Running classifications ...@@ -97,7 +97,7 @@ Running classifications
* :code:`../expam/test/data/reads/` * :code:`../expam/test/data/reads/`
* We use the :code:`run` command to classify reads. * We use the :code:`classify` command to classify reads.
* These are paired reads, but for now we'll treat them as separate. * These are paired reads, but for now we'll treat them as separate.
* We supply the :code:`-o` or :code:`--out` flag to tell *expam* where to save classification results. * We supply the :code:`-o` or :code:`--out` flag to tell *expam* where to save classification results.
...@@ -105,7 +105,7 @@ Running classifications ...@@ -105,7 +105,7 @@ Running classifications
.. code-block:: console .. code-block:: console
$ expam run -db test -d /Users/seansolari/Documents/expam/test/data/reads/ --out test/results/unpaired_test $ expam classify -db test -d /Users/seansolari/Documents/expam/test/data/reads/ --out test/results/unpaired_test
Clearing old log files... Clearing old log files...
Results directory created at /Users/seansolari/Documents/Databases/test/results/unpaired_test. Results directory created at /Users/seansolari/Documents/Databases/test/results/unpaired_test.
Loading the map and phylogeny. Loading the map and phylogeny.
...@@ -202,7 +202,7 @@ Running paired data ...@@ -202,7 +202,7 @@ Running paired data
.. code-block:: console .. code-block:: console
$ expam run -db test -d /Users/seansolari/Documents/expam/test/data/reads/ --out test/results/paired_test --paired $ expam classify -db test -d /Users/seansolari/Documents/expam/test/data/reads/ --out test/results/paired_test --paired
Clearing old log files... Clearing old log files...
Results directory created at /Users/seansolari/Documents/Databases/test/results/paired_test. Results directory created at /Users/seansolari/Documents/Databases/test/results/paired_test.
Loading the map and phylogeny. Loading the map and phylogeny.
......
...@@ -99,13 +99,10 @@ def run_classifier( ...@@ -99,13 +99,10 @@ def run_classifier(
if taxonomy: if taxonomy:
# Attempt to update taxon ids. # Attempt to update taxon ids.
tax_obj: TaxonomyNCBI = TaxonomyNCBI(database_config) tax_obj: TaxonomyNCBI = TaxonomyNCBI(database_config)
tax_obj.accession_to_taxonomy(db_dir) tax_obj.accession_to_taxonomy()
tax_results_path = os.path.join(out_dir, output_config.tax) name_to_lineage, taxon_to_rank = tax_obj.load_taxonomy_map()
os.mkdir(output_config.tax) results.to_taxonomy(name_to_lineage, taxon_to_rank)
name_to_lineage, taxon_to_rank = tax_obj.load_taxonomy_map(db_dir)
results.to_taxonomy(name_to_lineage, taxon_to_rank, tax_results_path)
results.draw_results(itol_mode=itol_mode) results.draw_results(itol_mode=itol_mode)
finally: finally:
...@@ -469,7 +466,7 @@ class ClassificationResults: ...@@ -469,7 +466,7 @@ class ClassificationResults:
self.tax_id_hierarchy = {"1": set()} # Map from tax_id -> immediate children. self.tax_id_hierarchy = {"1": set()} # Map from tax_id -> immediate children.
self.tax_id_pool = ["1"] # Children must appear later than parent this list. self.tax_id_pool = ["1"] # Children must appear later than parent this list.
def to_taxonomy(self, name_to_lineage, taxon_to_rank, tax_dir): def to_taxonomy(self, name_to_lineage, taxon_to_rank):
col_names = ["c_perc", "c_cumul", "c_count", "s_perc", "s_cumul", "s_count", "rank", "scientific name"] col_names = ["c_perc", "c_cumul", "c_count", "s_perc", "s_cumul", "s_count", "rank", "scientific name"]
class_counts = pd.read_csv(self.results_config.phy_classified, sep="\t", index_col=0, header=0) class_counts = pd.read_csv(self.results_config.phy_classified, sep="\t", index_col=0, header=0)
...@@ -544,7 +541,7 @@ class ClassificationResults: ...@@ -544,7 +541,7 @@ class ClassificationResults:
cutoff = max(self.cutoff, (total_counts / 1e6) * self.cpm) cutoff = max(self.cutoff, (total_counts / 1e6) * self.cpm)
df = df[(df['c_cumul'] > cutoff) | (df['s_cumul'] > cutoff) | (df.index == 'unclassified')] df = df[(df['c_cumul'] > cutoff) | (df['s_cumul'] > cutoff) | (df.index == 'unclassified')]
df.to_csv(os.path.join(tax_dir, sample_name + ".csv"), sep="\t", header=True) df.to_csv(os.path.join(self.results_config.tax, sample_name + ".csv"), sep="\t", header=True)
# #
# Map raw read output to taxonomy. # Map raw read output to taxonomy.
......
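The hunks above drop the `tax_dir` argument from `to_taxonomy` and write to `self.results_config.tax` instead, so the output location is read from the results configuration rather than passed in by the caller. A minimal sketch of that pattern, under stated assumptions: `ResultsPaths`, `Results` and `write_summary` are hypothetical names for illustration and are not expam's actual classes, and the directory creation shown here is only one way it could be handled.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class ResultsPaths:
    """Hypothetical stand-in for a results configuration object."""
    base: str

    @property
    def tax(self) -> str:
        # Taxonomic summaries go in a 'tax' subdirectory of the results folder.
        return os.path.join(self.base, "tax")


class Results:
    """Hypothetical results holder that owns its configuration."""

    def __init__(self, config: ResultsPaths):
        self.results_config = config

    def write_summary(self, sample_name: str, text: str) -> str:
        # The destination is derived from the config, not supplied by the caller.
        os.makedirs(self.results_config.tax, exist_ok=True)
        path = os.path.join(self.results_config.tax, sample_name + ".csv")
        with open(path, "w") as handle:
            handle.write(text)
        return path


with tempfile.TemporaryDirectory() as tmp:
    out_path = Results(ResultsPaths(tmp)).write_summary(
        "sample_one", "rank\tscientific name\n"
    )
    exists = os.path.exists(out_path)
    relative = os.path.relpath(out_path, tmp)
```

Keeping the path on the configuration object means every caller resolves the same location, which is presumably why the argument could be removed from the call in `run_classifier`.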
...@@ -7,7 +7,7 @@ import shutil ...@@ -7,7 +7,7 @@ import shutil
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
import numpy as np import numpy as np
from expam.utils import die, ls, make_path_absolute from expam.utils import die, ls, make_path_absolute, parse_float, parse_int
ExpamOptions = namedtuple( ExpamOptions = namedtuple(
...@@ -166,42 +166,19 @@ class CommandGroup: ...@@ -166,42 +166,19 @@ class CommandGroup:
except AttributeError: except AttributeError:
raise AttributeError("Command %s not found!" % command) raise AttributeError("Command %s not found!" % command)
@staticmethod @classmethod
def parse_ints(*params): def parse_ints(cls, *params):
for param in params: if len(params) == 1:
INVALID_PARAM_MSG = ("Invalid parameter (%s), must be integer!" % str(param)) return parse_int(params[0])
else:
if param is not None: return (parse_int(param) for param in params)
try:
# Convert to float.
param = float(param)
except ValueError:
die(INVALID_PARAM_MSG)
# Convert to int and see if the value changes.
new_param = int(param)
if new_param != param:
die(INVALID_PARAM_MSG)
param = new_param
yield param
@staticmethod
def parse_floats(*params):
for param in params:
INVALID_PARAM_MSG = ("Invalid parameter (%s), must be integer!" % str(param))
if param is not None:
try:
param = float(param)
except ValueError:
die(INVALID_PARAM_MSG)
yield param @classmethod
def parse_floats(cls, *params):
if len(params) == 1:
return parse_float(params[0])
else:
return (parse_float(param) for param in params)
@staticmethod @staticmethod
def get_user_confirmation(msg): def get_user_confirmation(msg):
......
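In the refactored `CommandGroup.parse_ints` and `parse_floats` above, a single argument yields a plain value while several arguments yield a generator, so the number of arguments at the call site decides whether the result can be unpacked. A minimal sketch of that return-shape behaviour, assuming a simplified converter (`_to_int` is a hypothetical stand-in, not expam's `parse_int`):

```python
def _to_int(param):
    # Hypothetical, simplified converter: None passes through unchanged.
    return None if param is None else int(float(param))


def parse_ints(*params):
    # One argument: return the converted value directly.
    if len(params) == 1:
        return _to_int(params[0])
    # Several arguments: return a lazy generator, suitable for tuple unpacking.
    return (_to_int(param) for param in params)


single = parse_ints("31")
first, second = parse_ints("31", None)
```

Because the multi-argument form is a generator expression, each conversion (and any validation failure) happens only when the generator is consumed, for example during unpacking.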
...@@ -8,7 +8,6 @@ import traceback ...@@ -8,7 +8,6 @@ import traceback
import numpy as np import numpy as np
import pandas as pd import pandas as pd
from expam.tree import PHYLA_COLOURS from expam.tree import PHYLA_COLOURS
from expam.tree.location import Location from expam.tree.location import Location
......
...@@ -80,3 +80,31 @@ def is_hex(string): ...@@ -80,3 +80,31 @@ def is_hex(string):
return True return True
def parse_int(param):
INVALID_PARAM_MSG = ("Invalid parameter (%s), must be integer!" % str(param))
if param is not None:
try:
# Convert to float.
param = float(param)
except ValueError:
die(INVALID_PARAM_MSG)
# Convert to int and see if the value changes.
new_param = int(param)
if new_param != param:
die(INVALID_PARAM_MSG)
else:
new_param = None
return new_param
def parse_float(param):
INVALID_PARAM_MSG = ("Invalid parameter (%s), must be integer!" % str(param))
if param is not None:
try:
param = float(param)
except ValueError:
die(INVALID_PARAM_MSG)
return param
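The two helpers added to `expam.utils` accept `None` unchanged, convert through `float`, and (for integers) reject values that are not whole numbers. The sketch below reproduces that behaviour in a self-contained form. It makes two assumptions: it raises `ValueError` where the diff calls `die`, whose behaviour is not shown here, and it checks for whole numbers with `float.is_integer` rather than the diff's round-trip comparison. Note also that the committed `parse_float` reuses the "must be integer!" message, which looks like a copy-paste slip, so the sketch words the float message differently.

```python
def parse_int(param):
    """Return param as an int, or None if param is None."""
    if param is None:
        return None
    try:
        as_float = float(param)
    except (TypeError, ValueError):
        raise ValueError("Invalid parameter (%s), must be integer!" % str(param)) from None
    # Reject fractional values (and inf/nan) rather than truncating them.
    if not as_float.is_integer():
        raise ValueError("Invalid parameter (%s), must be integer!" % str(param))
    return int(as_float)


def parse_float(param):
    """Return param as a float, or None if param is None."""
    if param is None:
        return None
    try:
        return float(param)
    except (TypeError, ValueError):
        raise ValueError("Invalid parameter (%s), must be a number!" % str(param)) from None


whole = parse_int("3.0")
missing = parse_int(None)
ratio = parse_float("2.5")

try:
    parse_int("3.5")
    rejected = False
except ValueError:
    rejected = True
```

Here `"3.0"` is accepted as the integer 3, while `"3.5"` is rejected instead of being silently truncated.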