Using conda install within a python script

Solution 1

I was looking at the latest Conda Python API and noticed that there are actually only 2 public modules with “very long-term stability”:

  1. conda.cli.python_api
  2. conda.api

For your question, I would work with the first:

NOTE: run_command() below will always add a -y/--yes option (i.e. it will not ask for confirmation)

import conda.cli.python_api as Conda
import sys

###################################################################################################
# The below is roughly equivalent to:
#   conda install -y 'args-go-here' 'no-whitespace-splitting-occurs' 'square-brackets-optional'

(stdout_str, stderr_str, return_code_int) = Conda.run_command(
    Conda.Commands.INSTALL, # alternatively, you can just say "install"
                            # ...it's probably safer long-term to use the Commands class though
                            # Commands include:
                            #  CLEAN,CONFIG,CREATE,INFO,INSTALL,HELP,LIST,REMOVE,SEARCH,UPDATE,RUN
    [ 'args-go-here', 'no-whitespace-splitting-occurs', 'square-brackets-optional' ],
    use_exception_handler=True,  # Defaults to False, use that if you want to handle your own exceptions
    stdout=sys.stdout, # Defaults to being returned as a str (stdout_str)
    stderr=sys.stderr, # Also defaults to being returned as str (stderr_str)
    search_path=Conda.SEARCH_PATH  # this is the default; adding only for illustrative purposes
)
###################################################################################################


The nice thing about using the above is that it solves a problem that occurs (mentioned in the comments above) when using conda.cli.main():

...conda tried to interpret the command line arguments instead of the arguments of conda.cli.main(), so using conda.cli.main() like this might not work for some things.


The other question in the comments above was:

How [to install a package] when the channel is not the default?

import conda.cli.python_api as Conda
import sys

###################################################################################################
# Either:
#   conda install -y -c <CHANNEL> <PACKAGE>
# Or (>= conda 4.6)
#   conda install -y <CHANNEL>::<PACKAGE>

(stdout_str, stderr_str, return_code_int) = Conda.run_command(
    Conda.Commands.INSTALL,
    '-c', '<CHANNEL>',
    '<PACKAGE>',
    use_exception_handler=True, stdout=sys.stdout, stderr=sys.stderr
)
###################################################################################################
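The two channel spellings listed in the comment block above can be built by a small helper before being handed to run_command(). This is only a minimal sketch: install_args and install_from_channel are hypothetical names, not part of conda, and the second function assumes conda is importable in the running interpreter.

```python
def install_args(package, channel=None, use_prefix_syntax=False):
    """Build the argument list for installing one package.

    Hypothetical helper for illustration; not part of conda.
    """
    if channel is None:
        return [package]
    if use_prefix_syntax:
        # <CHANNEL>::<PACKAGE> form (conda >= 4.6, per the answer above)
        return [f"{channel}::{package}"]
    # -c <CHANNEL> <PACKAGE> form
    return ["-c", channel, package]


def install_from_channel(package, channel=None):
    # Imported lazily so install_args() stays usable without conda installed.
    import conda.cli.python_api as Conda

    return Conda.run_command(Conda.Commands.INSTALL, *install_args(package, channel))
```

For example, install_args("numpy", "conda-forge") produces the arguments for conda install -c conda-forge numpy.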

Solution 2

You can use conda.cli.main. For example, this installs numpy:

import conda.cli

conda.cli.main('conda', 'install',  '-y', 'numpy')

Use the -y argument to avoid interactive questions:

-y, --yes Do not ask for confirmation.

Solution 3

Having worked with conda from Python scripts for a while now, I think calling conda with the subprocess module works best overall. In Python 3.7+, you could do something like this:

import json
from subprocess import run


def conda_list(environment):
    proc = run(["conda", "list", "--json", "--name", environment],
               text=True, capture_output=True)
    return json.loads(proc.stdout)


def conda_install(environment, *packages):
    # --json makes conda print machine-readable output that json.loads() can parse
    proc = run(["conda", "install", "--json", "--quiet", "--name", environment, *packages],
               text=True, capture_output=True)
    return json.loads(proc.stdout)

As I pointed out in a comment, conda.cli.main() was not intended for external use. It parses sys.argv directly, so if you try to use it in your own script with your own command line arguments, they will get fed to conda.cli.main() as well.
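If conda.cli.main() has to be used anyway, one possible workaround is to hide the script's own arguments while the call runs. This is only a sketch based on the behaviour described above (that sys.argv is read directly). temporary_argv is a hypothetical helper, not part of conda, and it has not been checked against any particular conda version.

```python
import sys
from contextlib import contextmanager


@contextmanager
def temporary_argv(*argv):
    """Temporarily replace sys.argv and restore it afterwards."""
    saved = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        # Restore the script's own arguments even if the body raises.
        sys.argv = saved


# Intended use (assumption, untested against conda itself):
#
#     with temporary_argv("conda", "install", "-y", "numpy"):
#         conda.cli.main("conda", "install", "-y", "numpy")
```

The script's original arguments are back in sys.argv as soon as the with block exits.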

@YenForYang's answer suggesting conda.cli.python_api is better because this is a publicly documented API for calling conda commands. However, I have found that it still has rough edges. conda builds up internal state as it executes a command (e.g. caches). The way conda is usually used and usually tested is as a command line program. In that case, this internal state is discarded at the end of the conda command. With conda.cli.python_api, you can execute several conda commands within a single process. In this case, the internal state carries over and can sometimes lead to unexpected results (e.g. the cache becomes outdated as commands are performed). Of course, it should be possible for conda to handle this internal state directly. My point is just that using conda this way is not the main focus of the developers. If you want the most reliable method, use conda the way the developers intend it to be used -- as its own process.

conda is a fairly slow command, so I don't think one should worry about the performance impact of calling a subprocess. As I noted in another comment, pip is a similar tool to conda and explicitly states in its documentation that it should be called as a subprocess, not imported into Python.
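The subprocess functions above do not check whether the command succeeded. A slightly more defensive wrapper might look like the following sketch. run_json is a hypothetical name, and it only assumes that the command prints JSON on stdout, as conda does when given --json.

```python
import json
import subprocess


def run_json(command):
    """Run a command in its own process and parse its stdout as JSON.

    Raises subprocess.CalledProcessError if the exit code is non-zero.
    """
    proc = subprocess.run(command, text=True, capture_output=True, check=True)
    return json.loads(proc.stdout)
```

It could be called as run_json(["conda", "list", "--json", "--name", environment]). Because every call starts a fresh process, no internal state carries over between commands.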

Solution 4

I found that conda.cli.python_api and conda.api are limited in that neither offers a way to execute commands like this:

conda env export > requirements.txt

So instead I used subprocess with the flag shell=True to get the job done.

subprocess.run(f"conda env export --name {env} > {file_path_from_history}",shell=True)

where env is the name of the environment to export and file_path_from_history is the path of the output file (requirements.txt in the example above).
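The same redirection can be done without shell=True by handing subprocess an open file as stdout, which avoids interpolating the environment name into a shell string. A minimal sketch, where run_to_file is a hypothetical helper:

```python
import subprocess


def run_to_file(command, file_path):
    """Run a command and write its stdout to file_path, with no shell involved."""
    with open(file_path, "w") as out:
        # check=True raises CalledProcessError if the command fails
        subprocess.run(command, stdout=out, text=True, check=True)
```

For the export above, the call would be run_to_file(["conda", "env", "export", "--name", env], "requirements.txt").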

Author: slaw

Scientist, Data Wrangler, and Python Lover

Updated on June 06, 2022

Comments

  • slaw
    slaw almost 2 years

    According to this answer you can import pip from within a Python script and use it to install a module. Is it possible to do this with conda install?

    The conda documentation only shows examples from the command line but I'm looking for code that can be executed from within a Python script.

    Yes, I could execute shell commands from within the script but I am trying to avoid this as it is basically assuming that conda cannot be imported and its functions called.

    • blacksite
      blacksite over 7 years
      Why don't you try and report back? :)
    • slaw
      slaw over 7 years
      Report what back?
    • Corey Goldberg
      Corey Goldberg over 7 years
      you can always use subprocess
    • slaw
      slaw over 7 years
      Yes, and I can also just use a bash script too but I'd like to avoid executing shell commands from within Python as it seems hackish. It is possible to import pip from within the Python script and install packages and it looks like I can import conda as well. Just don't see how to use conda from within a Python script.
    • ws_e_c421
      ws_e_c421 almost 5 years
      While it is possible to import pip in Python, that is explicitly discouraged in pip's documentation. I don't see anything hackish about using the public facing API's that pip and conda recommend -- which are based on calling new processes with arguments and option flags. conda now has a Python API that mimics the command line (see @YenForYang's answer). This API did not exist when this question was first posted.
  • slaw
    slaw over 7 years
    This seems hackish as it is executing a shell command. I'd like to import conda and call its equivalent install function.
  • gather bar
    gather bar over 7 years
    Check the above link out which gives info on conda and python modules. Conda is an environment and a utility which is available in the OS and not inside python module. Is there any specific anaconda package that you have in mind?
  • slaw
    slaw over 7 years
    Please see the answer by @mike-muller as, in fact, you can access conda from within a Python script.
  • slaw
    slaw over 7 years
    Thanks! This did the trick and was exactly what I was looking for. I had seen this in the conda script itself but couldn't decipher how to use it. I was passing a list to it rather than a set of commands.
  • ws_e_c421
    ws_e_c421 about 6 years
    I found that when I tried to do this with a script that had command line arguments, conda tried to interpret the command line arguments instead of the arguments of conda.cli.main(), so using conda.cli.main() like this might not work for some things.
  • pceccon
    pceccon over 5 years
    How do you do this when the channel is not the default?
  • YenForYang
    YenForYang almost 5 years
    @ws_e_c421 See my answer below if you are looking for a solution to that issue (still).
  • YenForYang
    YenForYang almost 5 years
    @pceccon Simply add the arguments '-c', '<CHANNEL>'. See my answer below for an example.
  • ws_e_c421
    ws_e_c421 almost 5 years
    This should be the accepted answer. The originally accepted answer relies on an internal API with no guarantee of stability. Also, I think a complete answer should mention calling conda with the subprocess module. Calling conda from the subprocess module with the --json option and parsing the results is perfectly valid and likely even more stable. It also does not require conda to be installed in the conda environment. Avoiding subprocess for performance reasons is likely premature optimization.
  • salotz
    salotz over 4 years
    I would like to use your solution but I am running into the issue where it complains about conda init not being run. I think it actually reads the source of your bashrc (or whatever) and stops you from continuing. If there is a way on the command line to disable this (it's not documented AFAIK) then this is the best way for the reasons you list.
  • ws_e_c421
    ws_e_c421 over 4 years
    I usually run conda activate base in the environment before running Python scripts that interact with conda with subprocess. I am not sure what you mean by environment stuff. I use these scripts to interact with conda environments.
  • salotz
    salotz over 4 years
    I am using a remote invocation framework (invoke/fabric) which complicates things so that I can't just run conda activate beforehand. Re 'environment stuff': I am also trying to automate the creation, destruction, and activation of environments which is not a part of the conda.cli.python_api
  • ws_e_c421
    ws_e_c421 over 4 years
    That does make it tougher. I usually wrap my Python script in a bash script that uses conda activate when I need the conda environment. All conda activate does is set environment variables, so you could look at what it does and set them manually. An alternative is to wrap all the conda calls in conda run but it is still an alpha level feature. With conda run try to use as new a version of conda as possible for better support (one important improvement is not yet in a release of conda).
  • salotz
    salotz over 4 years
    Didn't know about conda run but I will definitely keep an eye out on that. Manually setting variables might work as a quick patch, but sounds really fragile for automating stuff.
  • salotz
    salotz over 4 years
    You shouldn't have to rely on admin permissions for anything with conda (unless it is distributed via your OS package manager of which there are none that I am aware of). Also goes against the request of the question.
  • Mahsa Seifikar
    Mahsa Seifikar over 3 years
    what is CHANNEL?
  • Rich Lysakowski PhD
    Rich Lysakowski PhD about 3 years
    This is the best short and clean example here. I find it easy to understand and extend. However, as @ws_e_c421 above points out, it may not be the most performant if it is executed multiple times in the same Python session, because the conda cache is not flushed until conda exits.
  • Mike Müller
    Mike Müller about 3 years
    @RichLysakowskiPhD Isn't caching a good thing in terms of performance? Or, are you concerned about using a lot of memory and therefore slowing things down?
  • ws_e_c421
    ws_e_c421 about 3 years
    Caching is good for performance but bad for behavior. I last looked closely at this two years ago when I posted my answer. At that time, conda would read and cache the package metadata at start up. If I tried to do multiple installs and removals in a single process, I would run into errors because the in memory cache of metadata was not updated by the new operations. My guess was that the developers only tested performing one package operation per process so invalid metadata was never a problem for their tests. The handling of metadata could have improved in the time since then.
  • ws_e_c421
    ws_e_c421 about 3 years
    Here is one example of the caching affecting multiple conda commands in one process. In this case the negative effect is fairly minor -- just that pip packages do not show up for conda list: github.com/conda/conda/issues/8730
  • Mike Müller
    Mike Müller about 3 years
    Good to know. So the subprocess approach seems best after all.
  • Rich Lysakowski PhD
    Rich Lysakowski PhD about 3 years
    This is a hack, but it works. Use it in limited doses, and understand the ramifications!
  • Rich Lysakowski PhD
    Rich Lysakowski PhD over 2 years
    I would like to programmatically export an environment to a requirements.txt file. But what is the correct value of 'file_path_from_history' supposed to be here? I get a "NameError: name 'file_path_from_history' is not defined" message when I run your command.