image0

Python Basics

Many potential users are new to Python and Jupyter, so we begin with a simple notebook that focuses on how to use these tools. Please skip this notebook if you are already familiar with Python and Jupyter. Many tutorials already cover Python and its libraries, so we provide only a cursory look here. Refer to the links below for more detailed tutorials.

Concepts in this Notebook

  • Jupyter Notebook + Python Basics
  • What is an “object” or “class” in Python?
  • Vectorized Calculations in NumPy
  • Plotting

Notebook Setup

To begin, please run Kernel -> Restart & Clear Output from the menu at the top of the notebook. It is a good idea to do this before starting any notebook so that you begin from a fresh state.

Now, let’s do some basic Python sanity checking to make sure we can import the necessary libraries. Run the cell below: click the cell to activate it and then use the Run button in the toolbar above. Alternatively, run the cell by pressing <Shift-Enter>, that is, hold Shift and press Enter. The purpose of this cell is to load external functions and libraries.

If successful, you should see a set of logos appear below the cell. Which logos appear depends on the argument passed to the hv.extension() command at the bottom of the cell. If no logos appear and the cell throws an error, there is likely something wrong with your environment.

Troubleshooting:

  • Did you activate the correct conda environment before starting the Jupyter notebook?
  • If not using Anaconda, did you install all dependencies before starting the Jupyter notebook?
  • Is pyPRISM installed in your current environment or available on your PYTHONPATH?
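
The checks above can also be run from Python itself. The following is a minimal sketch, not part of pyPRISM: the helper name find_missing is hypothetical, and the list of required libraries is assumed from the imports in the setup cell of this notebook. It reports which libraries cannot be located without actually importing them.

```python
import importlib.util

def find_missing(module_names):
    """Return the names in module_names that cannot be located for import."""
    missing = []
    for name in module_names:
        # find_spec returns None when a top-level module cannot be found
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing

# Assumed list: the four libraries imported by this notebook's setup cell
required = ['pyPRISM', 'numpy', 'matplotlib', 'holoviews']

missing = find_missing(required)
if missing:
    print('Missing from this environment:', ', '.join(missing))
else:
    print('All required libraries were found.')
```

If a library is reported as missing, the most likely cause is that the notebook was started from a different environment than the one in which the library was installed.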

[Image: Holoviews + Bokeh logos]

[1]:
import pyPRISM
import numpy as np
import matplotlib.pyplot as plt
import holoviews as hv

hv.extension('bokeh')

All Variables are Global in Jupyter

Variables in Jupyter notebooks live globally “between” cells. The first time you run the first cell below, it will throw an error because the line that defines the ‘b’ variable is commented out. If you then run the second cell, which defines ‘b’, and rerun the first cell, it runs without error. This is an important concept to remember when using Jupyter notebooks because variables defined at the bottom of the notebook can affect cells at the top.

Tip:

When all else fails, use the Kernel -> Restart functionality from the top-level menu. This will clear all variables from memory.

[4]:
a = 1
print('Value of a =',a)

#b = 2
print('Value of b =',b)
Value of a = 1
Value of b = 3
[3]:
b = 3
print('Value of b =',b)
Value of b = 3
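
The same behavior can be reproduced outside of Jupyter. The sketch below is only an illustration, not how Jupyter is implemented: it assumes that a plain dictionary can stand in for the kernel's shared namespace, and the helper run_cell is a hypothetical name. Each "cell" is a string of source code executed against that one dictionary.

```python
# A plain dict stands in for the kernel's global namespace (an assumption
# made for illustration; Jupyter's internals are more involved).
shared_namespace = {}

first_cell = "a = 1\nresult = b"   # uses b, which this cell never defines
second_cell = "b = 3"              # defines b

def run_cell(source, namespace):
    """Run source in namespace; return None on success or the NameError text."""
    try:
        exec(source, namespace)
        return None
    except NameError as err:
        return str(err)

first_try = run_cell(first_cell, shared_namespace)    # fails: b does not exist yet
run_cell(second_cell, shared_namespace)               # b now lives in the namespace
second_try = run_cell(first_cell, shared_namespace)   # succeeds on the rerun

print('first run error :', first_try)
print('second run error:', second_try)
print('result =', shared_namespace['result'])
```

The order in which cells are executed, not their position in the notebook, determines which variables exist.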

What is an “object” or “class” in Python?

pyPRISM uses an object-oriented design to hold data and carry out the PRISM calculations. While a full description of what “object-oriented” means is far outside the scope of this tutorial, we’ll try to introduce the most basic features of the methodology.

A class in Python describes a data structure for holding functions and variables. Below we describe a simple class for defining a shape.

[5]:
class Square:
    def __init__(self,length,width):
        self.length = length
        self.width = width
    def area(self):
        return self.length*self.width

But how do we use this class? We create an object from the class. The __init__ function actually describes how the object is initially created. We can create many objects from the same class with different input parameters.

[6]:
square_object1 = Square(length=10,width=2.5)
print('square_object1 = ',square_object1)

square_object2 = Square(length=5,width=4)
print('square_object2 = ',square_object2)
square_object1 =  <__main__.Square object at 0x7f4cf467b2e8>
square_object2 =  <__main__.Square object at 0x7f4cf4628c18>

We access the variables, which are called attributes in Python, in these objects using the ‘dot’ operator. These are defined via the self variable shown in the __init__ function above.

[7]:
print('square_object1 (length by width) =',square_object1.length,'by',square_object1.width)
print('square_object2 (length by width) =',square_object2.length,'by',square_object2.width)
square_object1 (length by width) = 10 by 2.5
square_object2 (length by width) = 5 by 4

Functions that belong to an object (called methods in Python) are also accessed via the ‘dot’ operator, but now we have to call the function using parentheses.

[8]:
print('square_object1 area =',square_object1.area())
print('square_object2 area =',square_object2.area())
square_object1 area = 25.0
square_object2 area = 20
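
To see why the parentheses matter, the short sketch below repeats the Square class so that it runs on its own; the variable names shape, method_itself, and result are chosen here only for illustration. Without parentheses the dot operator returns the method itself, and with parentheses the method is executed.

```python
class Square:
    # Same definition as the class used earlier in this notebook
    def __init__(self, length, width):
        self.length = length
        self.width = width

    def area(self):
        return self.length * self.width

shape = Square(length=10, width=2.5)

method_itself = shape.area   # no parentheses: a reference to the method
result = shape.area()        # parentheses: the method runs and returns a value

print('callable(shape.area) =', callable(method_itself))
print('shape.area() =', result)
```

Forgetting the parentheses is a common mistake: printing shape.area shows a description of the method rather than a number.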

Using Efficient Vectorized Operations in NumPy

The NumPy library is one of the most ubiquitous libraries used in Python-driven data science. NumPy provides a number of functions and data structures that allow us to generate and manipulate data through a simple and efficient interface.

To start, we generate two NumPy arrays that we will plot later.

[9]:
x = np.arange(100)
print('x =',x)

print('')

y = np.random.random(100)
print('y =',y)
x = [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]

y = [0.96871466 0.13760296 0.92705958 0.24109572 0.50563029 0.80134237
 0.59722568 0.21639133 0.37703911 0.43609433 0.02655977 0.71988708
 0.70907182 0.12334527 0.73284856 0.95141623 0.71099059 0.03064434
 0.76687427 0.92077048 0.84735594 0.23879096 0.27032655 0.42357319
 0.11700788 0.80653507 0.46032536 0.05262332 0.232715   0.63107869
 0.75308625 0.99781775 0.47262713 0.0691762  0.9136122  0.8119291
 0.78867143 0.74982783 0.94198547 0.20783126 0.04089624 0.88651836
 0.64336257 0.84467139 0.59211992 0.5142557  0.17448646 0.76043449
 0.40729017 0.10215748 0.15373949 0.44002134 0.60214855 0.38047772
 0.29815424 0.80502636 0.12819524 0.97573272 0.7035504  0.21003149
 0.38919393 0.64316055 0.68222711 0.4787152  0.42001444 0.94344592
 0.02804441 0.50508219 0.47422093 0.65023172 0.30454903 0.20138134
 0.38643276 0.09272204 0.93451919 0.81901792 0.96167536 0.8331298
 0.25388386 0.52126561 0.01819315 0.73659048 0.93054841 0.8445565
 0.76504531 0.83054956 0.43768718 0.6461208  0.2125739  0.98487238
 0.02612495 0.77296086 0.25881863 0.68742048 0.94622851 0.51406883
 0.48448949 0.99641145 0.12647969 0.02502326]

Elements of a NumPy array are accessed with square brackets, x[i]. Note that Python indices always start at 0. The last element in the array can be accessed with the index i=-1.

[10]:
print('x[0] =',x[0])
print('y[0] =',y[0])

print('x[10] =',x[10])
print('y[10] =',y[10])

print('x[-1] =',x[-1])
print('y[-1] =',y[-1])
x[0] = 0
y[0] = 0.9687146645976468
x[10] = 10
y[10] = 0.026559771669054655
x[-1] = 99
y[-1] = 0.025023261763978955
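
As a small illustration of the indexing rules above, the sketch below rebuilds the x array so that it runs on its own. It shows that negative indices count backwards from the end of the array, and that an index beyond the end raises an error rather than wrapping around.

```python
import numpy as np

x = np.arange(100)

# Negative indices count backwards from the end of the array
last = x[-1]             # same element as x[len(x) - 1]
second_to_last = x[-2]   # same element as x[len(x) - 2]

print('x[-1] =', last)
print('x[-2] =', second_to_last)

# Indexing past the end raises an IndexError
try:
    x[100]
except IndexError as err:
    print('IndexError:', err)
```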

One of the primary features of NumPy is that it allows us to use vectorized operations. These are operations that are applied to the entire array ‘at once’ rather than having to apply them manually to each element of the array. The two approaches are compared below, where we increment all of the values in the y array by one. We point out that the second approach is not possible with pure Python lists.

[11]:
# slow iteration: update one element at a time
new_y1 = np.copy(y)
for i in range(len(y)):
    new_y1[i] = y[i] + 1.0

# fast vectorized operation: update every element at once
new_y2 = y + 1.0

print('y1 =',new_y1)
print('y2 =',new_y2)
y1 = [1.96871466 1.13760296 1.92705958 1.24109572 1.50563029 1.80134237
 1.59722568 1.21639133 1.37703911 1.43609433 1.02655977 1.71988708
 1.70907182 1.12334527 1.73284856 1.95141623 1.71099059 1.03064434
 1.76687427 1.92077048 1.84735594 1.23879096 1.27032655 1.42357319
 1.11700788 1.80653507 1.46032536 1.05262332 1.232715   1.63107869
 1.75308625 1.99781775 1.47262713 1.0691762  1.9136122  1.8119291
 1.78867143 1.74982783 1.94198547 1.20783126 1.04089624 1.88651836
 1.64336257 1.84467139 1.59211992 1.5142557  1.17448646 1.76043449
 1.40729017 1.10215748 1.15373949 1.44002134 1.60214855 1.38047772
 1.29815424 1.80502636 1.12819524 1.97573272 1.7035504  1.21003149
 1.38919393 1.64316055 1.68222711 1.4787152  1.42001444 1.94344592
 1.02804441 1.50508219 1.47422093 1.65023172 1.30454903 1.20138134
 1.38643276 1.09272204 1.93451919 1.81901792 1.96167536 1.8331298
 1.25388386 1.52126561 1.01819315 1.73659048 1.93054841 1.8445565
 1.76504531 1.83054956 1.43768718 1.6461208  1.2125739  1.98487238
 1.02612495 1.77296086 1.25881863 1.68742048 1.94622851 1.51406883
 1.48448949 1.99641145 1.12647969 1.02502326]
y2 = [1.96871466 1.13760296 1.92705958 1.24109572 1.50563029 1.80134237
 1.59722568 1.21639133 1.37703911 1.43609433 1.02655977 1.71988708
 1.70907182 1.12334527 1.73284856 1.95141623 1.71099059 1.03064434
 1.76687427 1.92077048 1.84735594 1.23879096 1.27032655 1.42357319
 1.11700788 1.80653507 1.46032536 1.05262332 1.232715   1.63107869
 1.75308625 1.99781775 1.47262713 1.0691762  1.9136122  1.8119291
 1.78867143 1.74982783 1.94198547 1.20783126 1.04089624 1.88651836
 1.64336257 1.84467139 1.59211992 1.5142557  1.17448646 1.76043449
 1.40729017 1.10215748 1.15373949 1.44002134 1.60214855 1.38047772
 1.29815424 1.80502636 1.12819524 1.97573272 1.7035504  1.21003149
 1.38919393 1.64316055 1.68222711 1.4787152  1.42001444 1.94344592
 1.02804441 1.50508219 1.47422093 1.65023172 1.30454903 1.20138134
 1.38643276 1.09272204 1.93451919 1.81901792 1.96167536 1.8331298
 1.25388386 1.52126561 1.01819315 1.73659048 1.93054841 1.8445565
 1.76504531 1.83054956 1.43768718 1.6461208  1.2125739  1.98487238
 1.02612495 1.77296086 1.25881863 1.68742048 1.94622851 1.51406883
 1.48448949 1.99641145 1.12647969 1.02502326]
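
To make the comparison with pure Python concrete, here is a minimal, self-contained sketch. The three-element data set and the variable names are chosen here for illustration only. It shows that adding a number to a plain list fails, that a pure Python version needs an explicit loop, and that the NumPy version gives the same values.

```python
import numpy as np

values = [0.5, 1.5, 2.5]    # a plain Python list
array = np.array(values)    # the same data as a NumPy array

# Adding a float to a list is not defined, so Python raises a TypeError
try:
    values + 1.0
    list_add_failed = False
except TypeError:
    list_add_failed = True

# Pure Python needs an explicit loop (here, a list comprehension)
incremented_list = [v + 1.0 for v in values]

# NumPy applies the addition to every element at once
incremented_array = array + 1.0

print('list + 1.0 raised TypeError:', list_add_failed)
print('list comprehension:', incremented_list)
print('vectorized:', incremented_array)
```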

Plotting

One of the most common plotting libraries in Python is Matplotlib. Below is a simple demonstration of how we can plot the x and y arrays that we created above. The syntax here should be familiar to those who have experience with MATLAB. (Note that the plot below will not appear in the static version of this tutorial.)

[12]:
plt.plot(x,y,'r')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
[Figure: line plot of y versus x, drawn in red]

A powerful wrapper library that works with Matplotlib and other plotting backends is Holoviews. When the hv.extension command at the top of the notebook is invoked with ‘matplotlib’, the plots produced by Holoviews will be static. When ‘bokeh’ is used, the backend will be the Bokeh library, which allows us to create live, interactive charts in the notebook. The downside to the Bokeh backend is that the plots do not show up when these notebooks are viewed on GitHub or when the notebooks are saved as HTML. We’ll use the Matplotlib backend by default so that the notebooks can be previewed on GitHub, but we encourage users to switch to the Bokeh backend if working with the notebooks directly.

[13]:
%opts Curve [width=600,height=400]
hv.Curve((x,y))

Summary

We hope this notebook gives a brief look at how Python, Jupyter, and some associated libraries work. As stated at the top of the notebook, it is not intended to be comprehensive, but rather a broad introduction to orient those new to Python and/or Jupyter.
