In the early days of QuantStart we posted an article on setting up an Algorithmic Trading Research Environment with Ubuntu Linux and Python. In 2013 when the article was first written, installing Python was not a trivial task. Problems with GCC compilers, cross dependencies between libraries and operating system intricacies all played a role in making the job of installing Python much harder than it needed to be. These days the problem is largely solved. In fact there are now so many options for installing Python that it is easy to get confused.
There are many different approaches you can take to installing Python, and there are plenty of contradictory opinions on the best approach. With that in mind it is better to choose the method based on how you intend to use the programming language. If you plan to use Python to explore algorithmic trading then this article will show you how to get an environment up and running in the simplest way. If you are familiar with programming and installing software then you might prefer to install the Official Python Distribution. There is an excellent tutorial on this method here.
Currently we recommend using the Anaconda Python distribution by Anaconda, Inc. (formerly Continuum Analytics). The main reasons for this are discussed below.
- Anaconda comes with everything you need to get started analysing your data.
To quote their website Anaconda is a Python and R distribution that aims to provide everything you need (python wise) for data science tasks.
- Anaconda comes with Conda.
Conda is a package manager that allows you to install, upgrade and uninstall all your Python libraries. It can install from pre-built conda packages and it can build from source code. Conda also allows you to create and manage your virtual environments.
- Anaconda works well with Jupyter Notebooks.
By using IPyKernel you can quickly and easily hook up your virtual environments to your notebooks.
When you install Anaconda you get immediate access to over 1500 Python libraries including NumPy, SciPy, Pandas, Beautiful Soup and Requests. As you will see in later tutorials you can even control the versions of these libraries by creating your own virtual environments. Some of the criticisms of Anaconda have been that it is bloated, that not all of the packages are relevant and that it takes up too much space. If you would prefer a more streamlined version, Miniconda gives you access to Python and the Conda package manager, but you will have to install all the libraries yourself. If you have limited disk space and feel this is a better option for you, there is a good tutorial on installing Miniconda here.
This post is part of a series on how to install the Anaconda Python distribution on different operating systems. In this article we will discuss how to install Anaconda3 version 2021.11 (Python 3.9) on Windows. This will require 3 GB of free space. Please ensure you have that much room available before you begin. Other posts in the series cover installation on other operating systems.
Installing Anaconda on Windows
Open up your web browser and head to https://www.anaconda.com/products/individual. The website should determine the correct download for your system.
Click the green download button. Your download should begin immediately. Once the download is complete click open file and you will be taken to the Anaconda setup wizard.
You will need to agree to the license agreement to continue the install.
Select how you wish to install Anaconda and click next. We recommend installing just for the current user.
You will then be asked to confirm your install location. We recommend leaving this as the default. As you can see the installation will take 3 GB of space.
After clicking next you will be asked whether Anaconda should take the place of your default Python or be added to your PATH. We recommend that you install it as the default version. If you are familiar with Python installations, however, you may prefer not to replace your default Python.
After clicking Install your installation should begin. Once completed you should see the following screen.
You should then see a screen providing you with a link to download PyCharm, a Python IDE. You can download this at any time if you wish but it is not essential. Clicking next brings you to the final screen where you can click finish to complete the setup. You will then be taken to an installation success page in your browser.
In order to check Python has been correctly installed go to the search bar and type Ana. This should bring up the following options.
Select the Anaconda Prompt. This will bring up the command line prompt or shell. Type the following into the prompt and press enter:
python
You should see a couple of lines of text telling you which version of Python you are running, followed by three chevrons (>>>). This is the Python prompt, indicating that you can enter Python commands. You are now in a Python console and can begin coding in Python. Next, type the following into the prompt:
import pandas as pd
After pressing enter you will see that nothing appears to have changed; you will simply be presented with a new line containing the Python prompt. To find the version of the Pandas library you have just imported, type the following and press enter:
pd.__version__
To exit Python you can simply type
exit()
You are now back in the command line prompt.
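The same check can be collected into a few lines of Python that report the interpreter version and the versions of some of the bundled libraries. This is a minimal sketch rather than part of the Anaconda tooling: the helper name installed_version is our own, and it relies on importlib.metadata, which is in the standard library from Python 3.8 onwards.

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onwards


def installed_version(package):
    """Return the installed version of a package as a string, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the interpreter version, then a few of the libraries mentioned earlier.
print("Python", sys.version.split()[0])
for name in ("pandas", "numpy", "requests"):
    print(name, installed_version(name) or "not installed")
```

Unlike pd.__version__, this does not import the library itself, so it also works for packages that take a while to load.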
Creating your first virtual environment
Once you have been using Python for a while or across multiple different projects you will quickly run into the issue of dependencies. A script you have written or a project you are working on may require you to use features that are available in the latest version of a Python library like Pandas, but you have other projects or scripts that use older versions. How do you manage and maintain your Python environment to allow you to run and work on both scripts or projects? The answer is to use a virtual environment.
A virtual environment is an isolated Python environment that has its own dependencies, or in other words, its own versions of libraries and packages. Virtual environments can be created for each of your projects so that you can use whatever versions of libraries are necessary for each one. With Anaconda you can also specify versions of Python when you create them.
One of the benefits of Anaconda is that it comes with the package manager Conda, which allows you to create virtual environments easily. Anaconda currently allows you to create virtual environments for Python 2.7, 3.5, 3.6, 3.7, 3.8 and 3.9. Most package versions can be found using conda or conda-forge or, as a last resort, you can use the Python package manager pip.
If you have used pip to install libraries within your conda environments they will come from a different source (PyPI rather than a conda channel), and you will not be able to upgrade them using the command
conda upgrade
If you prefer not to use Anaconda as your Python distribution and have installed Python directly from the Official Python Distribution, you can manage multiple Python versions with pyenv and manage virtual environments with pipenv or virtualenv. A good tutorial on this can be found here.
In the Anaconda prompt you will notice that there is a prefix in brackets before the directory information about the user. This appears as (base) and indicates that you are in the base Anaconda environment. Here you have access to all the packages that were downloaded and installed by Anaconda. To see which version of Python you are running, type the following into the prompt:
python --version
In this case the default version is Python 3.9.7.
We'll create a virtual environment with Python 3.8, install some basic packages and display five years of Apple data with only a few lines of code. Let's create the environment first. In the Anaconda prompt enter the following line:
conda create -n py3.8 python=3.8
The first part, "conda create -n", uses the package manager conda to create a new environment. The second part, "py3.8", is the name of the environment; this can be anything you want. The final part, "python=3.8", specifies that we want Python 3.8 to be our Python version for this environment. If you forget the names of your environments you can display a list of all the environments you have created at any time by typing the following into the Anaconda prompt:
conda env list
After you enter the conda create command, the prompt will provide you with a list of what will be downloaded and installed into your environment and ask you if you are happy to proceed. Once the installation is complete, type the following into the prompt to activate the environment.
conda activate py3.8
You will notice that the prompt prefix in brackets has changed to display (py3.8). We can now begin to add our dependencies.
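The prompt prefix is one way to see which environment is active; the same information is available from inside Python. The sketch below assumes the environment was activated with conda activate, which sets the CONDA_DEFAULT_ENV environment variable. The function name active_conda_env is hypothetical and chosen only for illustration.

```python
import os
import sys


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None outside conda."""
    # "conda activate" sets CONDA_DEFAULT_ENV to the environment name, e.g. "py3.8".
    return environ.get("CONDA_DEFAULT_ENV")


print("Active environment:", active_conda_env())
print("Interpreter location:", sys.prefix)
print("Python version: {}.{}".format(*sys.version_info[:2]))
```

Run inside the py3.8 environment, the interpreter location should point at a folder for that environment rather than at the base Anaconda installation.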
In order to view our stock data we need to install only three libraries: Pandas to analyse our data, Pandas-datareader to obtain it, and Matplotlib, which will allow us to plot our data using the Pandas plotting interface. In the prompt type the following:
conda install pandas pandas-datareader matplotlib
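Once the install has finished, a quick way to confirm that the three libraries are visible to the active environment is to ask Python whether it can find them. This is a small sketch using importlib.util.find_spec from the standard library; the names REQUIRED_MODULES and missing_modules are our own. Note that the check uses import names, so the package installed as pandas-datareader appears as pandas_datareader.

```python
from importlib.util import find_spec

# Import names, which can differ from install names:
# the package "pandas-datareader" is imported as "pandas_datareader".
REQUIRED_MODULES = ("pandas", "pandas_datareader", "matplotlib")


def missing_modules(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed in this environment:", ", ".join(missing))
else:
    print("All required libraries are available.")
```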
We will also be using datetime from the standard library (which is supplied with Python). Type the following into the prompt to open a Python console:
python
You should see the familiar three chevrons. We can now import our libraries into our namespace and begin to obtain and analyse our data.
>>> import matplotlib.pyplot as plt
>>> import pandas as pd
>>> import pandas_datareader.data as web
>>> from datetime import datetime as dt
This takes care of the libraries we need to import. Now we can begin to obtain our data. First we define our start and end dates. Then we create a DataFrame object and use the Pandas-Datareader library to obtain five years of daily OHLCV Apple data from Yahoo Finance. Pandas-Datareader allows you to download data from multiple sources including Quandl, AlphaVantage and IEX. A full list of data sources can be found here.
>>> start = dt(2016, 11, 1)
>>> end = dt(2021, 11, 1)
>>> aapl = web.DataReader("AAPL", "yahoo", start, end)
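The start and end dates above are typed in by hand. If you would rather derive the start date from the end date, a sketch along the following lines works with the standard library alone. The helper years_before is a hypothetical name, and falling back to 28 February when the end date is a leap day is one possible convention, not the only one.

```python
from datetime import datetime as dt


def years_before(end, years):
    """Return the date the given number of whole years before end."""
    try:
        return end.replace(year=end.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year, so use 28 February.
        return end.replace(year=end.year - years, day=28)


end = dt(2021, 11, 1)
start = years_before(end, 5)
print(start.date(), "to", end.date())
```

The resulting start and end values can be passed to web.DataReader in exactly the same way as the hand-written dates.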
We now have five years of Apple data stored as a DataFrame. We can display the first few rows using the Pandas command
>>> aapl.head()
Plotting our data is simple using Pandas. Just type the following lines:
>>> aapl.plot(y="Adj Close")
>>> plt.show()
Notice that the last line of code uses plt.show(). This command makes use of the matplotlib.pyplot module that we imported at the start. It allows us to display the graph directly. The graph of the Apple adjusted close price will open in a new window.
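If you want to try the inspection and plotting calls without a network connection, the sketch below builds a small stand-in DataFrame and runs the same steps on it. Everything here is illustrative: the prices are invented numbers rather than market data, make_sample_prices and the output file name are our own, and the "Agg" backend is selected so that the chart is written to a file instead of opening a window.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend: draw to a file, not a window

import matplotlib.pyplot as plt
import pandas as pd


def make_sample_prices(periods=10):
    """Build a small stand-in DataFrame; the values are invented, not market data."""
    index = pd.bdate_range("2021-10-18", periods=periods)  # business days only
    close = [100.0 + i for i in range(periods)]
    return pd.DataFrame({"Close": close, "Adj Close": close}, index=index)


prices = make_sample_prices()
preview = prices.head()  # the first five rows by default
print(preview)

# Same plotting call as in the article, saved to disk rather than shown.
ax = prices.plot(y="Adj Close")
ax.figure.savefig("sample_adj_close.png")
plt.close(ax.figure)
```

Saving the figure rather than calling plt.show() is also handy when a script runs somewhere without a display.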
And that's it! Using Pandas and Pandas-Datareader you can import multiple stocks from different data providers. You can perform simple tasks, from plotting the close price to building complex strategies, all using just three open source Python libraries. The only issue with this approach is that once you exit the Python console you will lose all your work. You can exit the Python console by typing
exit()
and then deactivate your virtual environment by typing
conda deactivate
In the next article we will be looking at how to use Jupyter Notebooks to build candlestick plots and moving averages.
There is a great conda cheat sheet available here. It is a really useful reference in case you need to quickly check a command.