User Tools

Site Tools


labs:sqlalchemy

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
labs:sqlalchemy [2017/05/18 13:26]
brunnegi
labs:sqlalchemy [2020/08/31 21:03] (current)
Line 3: Line 3:
 In this lab you will get to know the basics of Python and some of its most useful libraries. After completing this lab, you will be awarded the title of SQL Alchemist. In this lab you will get to know the basics of Python and some of its most useful libraries. After completing this lab, you will be awarded the title of SQL Alchemist.
  
-===== Getting Started ===== 
  
-Ubuntu comes pre-installed with Python 2 and 3. It can also easily be installed using apt-get. However, these are system wide installations that are used by all Python programs. Since Python heavily relies on external libraries, and not all libraries are compatible with all versions of Python, it is often necessary to use different versions of Python for different projects. Thus, the use of Virtual Environments is highly encouraged. Every virtual environment has its own Python distributions and installed libraries. Thus, different projects can be cleanly separated. There are several tools to create and manage virtual environments.  +===== Lab Setup =====
-In this lab we will use [[https://​conda.io/​docs/​using/​envs.html|Anaconda]].+
  
-Follow ​the link to figure out how to create ​and activate ​a new virtual environment. Specify the Python version to be 3.6 and install the following libraries using pip: +  - Open terminal 
-  * sqlalchemy +  - Execute ​the script <​code>​ /​opt/​Uebungen/​DatabaseLab/​setup.sh </​code>​ 
-  ​* sqlite3 +  - Execute <​code>​ source ~/.bashrc </​code>​ 
-  ​* pandas +  - Navigate ​to the folder for this lab: <​code>​ cd /​home/​lab/​local-data/​DatabaseLab </​code>​ 
-  ​* numpy +  - Create ​and activate virtual environment ​ 
-  ​* matplotlib+  - <​code>​conda create --name labEnvironment python=3.6 sqlalchemy ​sqlite pandas numpy matplotlib 
 +</​code>​ 
 +  ​- <​code>​ source activate labEnvironment </​code>​ 
 +  ​- Start notebook server <​code>​ jupyter notebook PythonSQL_Lab.ipynb </​code>​ 
 +  ​- Copy the link with the token and paste it into the browser'​s address bar 
 +  ​- Open the Notebook
  
-The command you need to use is: 
  
-<​code>​ conda create --name labEnvironment python=3.6 sqlalchemy sqlite pandas numpy matplotlib + 
-</code>+===== Verifying Setup ===== 
 + 
 +Ubuntu comes with pre-installed distributions of Python 2 and 3. It can also easily be installed using apt-get. However, these are system wide installations that are used by all Python programs. Since Python heavily relies on external libraries, and not all libraries are compatible with all versions of Python, it is often necessary to use different versions of Python for different projects. Thus, the use of Virtual Environments is highly encouraged. Every virtual environment has its own Python distributions and installed libraries. This way, different projects can be cleanly separated. There are several tools to create and manage virtual environments. ​ 
 +In this lab we will use [[https://​conda.io/​docs/​using/​envs.html|Anaconda]].
  
 When programming with Python, one has the choice to use IDEs (Integrated Development Enviroments),​ text editors, an interactive python interpreter or more advanced tools like Jupyter Notebooks. In this lab, we will use the latter. ​ When programming with Python, one has the choice to use IDEs (Integrated Development Enviroments),​ text editors, an interactive python interpreter or more advanced tools like Jupyter Notebooks. In this lab, we will use the latter. ​
Line 24: Line 29:
 Let's verify that everything is working. ​ Let's verify that everything is working. ​
  
-  - Open a terminal window and activate ​your virtual environment ​<​code>​ source activate labEnvironment </​code>​ +  - Open a terminal window and make sure that your virtual environment ​is activated 
-  - List all installed packages and verify that sqlalchemy, sqlite3, pandas, numpy and matplotlib are instaleld ​<​code>​conda list</​code>​+  - List all installed packages and verify that sqlalchemy, sqlite3, pandas, numpy and matplotlib are installed ​<​code>​conda list</​code>​
   - Check that the Python version of the virtual environment is indeed 3.6 <​code>​python -V</​code>​   - Check that the Python version of the virtual environment is indeed 3.6 <​code>​python -V</​code>​
   - Open the interactive Python interpreter <​code>​ipython</​code>​   - Open the interactive Python interpreter <​code>​ipython</​code>​
Line 31: Line 36:
  
 If everything is set up correctly, you should see the print statement'​s output in the terminal. If everything is set up correctly, you should see the print statement'​s output in the terminal.
- 
-  - Open a text editor 
-  - Add a print statement <​code>​print("​Something Else"​)</​code>​ 
-  - Save the file as print.py 
-  - Inside the terminal, navigate to the location where you saved print.py and execute the script <​code>​python print.py</​code>​ 
- 
-If everything is set up correctly, you should see the print statement'​s output in the terminal. 
- 
 Now that all is set up, you are ready for the next step.  Now that all is set up, you are ready for the next step. 
  
-===== Python Quickstart ​===== +===== SQLAlchemy Lab ===== 
-Python is easy to learn, especially if you already know some programming. There are no pointers, you do not need to specify explicit types, no need to compile, etc. Many beginners start learning programming with Python for these and other reasons.  +As mentioned earlier, there are many ways to code Python. ​In this lab you will be introduced ​to Jupyter Notebooks. A powerful ​and convenient method ​for programming with Python. 
- +We have already ​opened ​the notebook ​in the browser.  
-In order to help you getting started, we will go through some of the basics together.  +For the remainder ​of this lab, you will be working in the Jupyter NotebookJust follow ​the instructions thereHave fun!
- +
-Files of Python code are called //​scripts//,​ since Python is a scripting language. The file extension is //.py//. In this Lab we will mostly use a standard text editor to edit our Python script. ​As with other programming languages, there exist IDEs (Integrated Development Environments) such as Eclipse for Java and C++. There are also many more advanced text editors such as Sublime Text that use Extensions ​to make Python programming easier and even allow the execution of scripts directly from the editor. Furthermore,​ there exists Jupyter Notebook, which basically provides a web interface for your Python programming,​ where you can write code and documentation and also browse your data in a convenient way. For many applications,​ this is the way to go. However, we are just getting started with Python ​and will thus stick to the basics +
- +
-With a barebones editor such as gedit, we need to open a terminal/​command line and use following command to run a Python script:  +
- +
-<​code>​python myFirstScript.py</​code>​ +
- +
-Very often when programming,​ we get confused by all the different data types, or we just want to try if an algorithm actually works. This involves a lot of trial and error. It can be annoying to re-run a Python script every time we change a little bit of code just to see another runtime error. During ​this prototyping/​trial-and-error stage, it is often convenient to use an //​interactive Python shell//.  +
-In the Linux command line, type the following command to open the Python shell:  +
- +
-<​code>​ipython</​code>​ +
- +
-Now, you can just type Python code. The cool thing is that whenever you make a mistake, i.e., there is a runtime error, ​you will see that something went wrong, but the interactive shell will stay open and all the code you have typed up to that point is still in memory. Thus, all variables are still assigned and you can just try again.  +
- +
-Lets create a list with some numbers:  +
- +
-<​code>​myList = [1,​2,​3,​4]</​code>​ +
- +
-Access the first element of the list:  +
- +
-<​code>​myList[0]</​code>​ +
- +
-Now, try to access a non-existing element: +
- +
-<​code>​myList[5]</​code>​ +
- +
-You will see an IndexError. However, you can now just continue using your list without having to start all over.  +
- +
-<​code>​myList[2]</​code>​ +
- +
-Python'​s core has some useful functions, for example the //print()// method. We can print a string as follows:  +
- +
-<​code>​print("​blabla"​) or print('​blabla'​)</​code>​ +
- +
-We can also concatenate strings and numbers and print them together: +
- +
-<​code>​numDogs=5 +
-print("​I have ", numDogs, " dogs."​)</​code>​ +
- +
-Python does not use parantheses {} to delimit code blocks, but instead relies on indendation. The standard is to use //one tab// or //four white spaces//. The two indendation methods should not be mixed. Let's create a simple if statement. You have to press enter after every line. After the last line, you need to press enter twice in order to execute the code.  +
- +
-<​code>​numNeighbourDogs = 10  +
-numMyDogs = 2 +
-if (numMyDogs > numNeighbourDogs):​ +
-    print("​Muahaha I win!"​) +
-else: +
-    print("​Sh*t,​ I need to get more dogs!"​) +
-</​code>​ +
- +
-When you press enter after ''​if (numMyDogs > numNeighbourDogs):'',​ iPython will create the correct indent itself. However, after ''​print("​Muahaha I win!"​)'',​ you need to press backspace four times to delete the indent, since ''​if''​ and ''​else''​ need to be on the same indendation-level. +
- +
-Now, let's try to use different indendation-depths after ''​if''​ and ''​else'':​ +
- +
-<​code>​numNeighbourDogs = 10  +
-numMyDogs = 2 +
-if (numMyDogs > numNeighbourDogs):​ +
-  print("​Muahaha I win!") #two white-spaces indentation +
-else: +
-     ​print("​Sh*t,​ I need to get more dogs!"​) #five white-spaces indentation +
-</​code>​ +
- +
-As you see, the code still runs without errors. Now, let's try to put ''​if''​ and ''​else''​ themselves onto different indentation levels. +
- +
-<​code>​numNeighbourDogs = 10  +
-numMyDogs = 2 +
-if (numMyDogs > numNeighbourDogs):​ +
-    print("​Muahaha I win!"​) +
- else: # 1 white-space indendation +
-    print("​Sh*t,​ I need to get more dogs!"​) +
-</​code>​ +
- +
-This time we get an ''​IndentationError''​.  +
- +
-Standard Python has several different data structures. The most important ones are: ''​lists'',​ ''​tubles''​ and ''​dictionaries''​. A ''​list''​ is basically an array. A list can itself contain more lists. A list can contain different types of data types, i.e., strings ​and integers. Lists are defined and accessed using square brackets ''​[ ]''​. Let's create a new list to show these concepts: +
- +
-<​code>​ myList = [1, 2, ["​this",​ "is also", "a list"​],​ "​three"​]</​code>​ +
- +
-You can access all list elements normally using square brackets. If you want to access an element of the list at index 2, you just use another pair of square brackets:  +
- +
-<​code>​ myList[2][0] </​code>​ +
- +
-We can also assign new values to list element. Lists also support functions such as ''​append'',​ ''​count'',​ ''​sort'',​ etc. Use the dir() command to get a list of member functions of the List object. This works for any object in Python. Sometimes this is faster than Googling or looking through the documentation +
- +
-<​code>​ dir(myList) </​code>​ +
- +
-The next datatype are tuples. Tuples are basically immutable one-dimensional arrays.  +
- +
-<​code>​myTuple = (1,2,3)  +
-myTuple[0] +
-myTuple[0] = 2 # Results in TypeError because a tuple is immutable +
-</​code>​ +
- +
-Dictionaries are associative array, i.e., hash tables with key-value pairs. Again, data types can be mixed. +
- +
-<​code>​dogAges = {"​Fluffy":​ 12, "​Lessie":​ 5, "​Pluto":​ 10}  +
-dogAges["​Fluffy"​] # Use Key to access Value </​code>​ +
- +
-We have already ​seen an ''​if''​ statement. Now we will see how easy it is to iterate through lists using a ''​for''​ loop. +
- +
-<​code>​myList = [1, 2, 3, 4, 5, 6, 7, 8, 9] +
-for number in myList: +
-    print("​The current number is: ", number) +
-</​code>​ +
- +
-In Python we can of course also define functions. We are now going to write a small function that takes two values and prints ​the bigger one. If they are the same, we print both.  +
- +
-<​code>​ def compare_and_print(a,​ b): +
-    if (a>b): +
-        print(a) +
-    elif (a==b): +
-        print(a, " ", b) +
-    else: +
-        print(b) +
-</​code>​ +
- +
-Try calling the function to make sure it is doing what it should.  +
- +
-Like in Java and C++, we can also define classes and instantiate them as objects. Let's stick with the dog theme and create a class Dog. We overwrite the __init__ function (basically the constructor) such that we can pass arguments to constructor and assign member variables at creation time. ''​self''​ refers to the class itself and is used to access member variables inside the class itself.  +
- +
-<​code>​class Dog: +
-    def __init__(self,​ name, breed, age): +
-        self.name=name +
-        self.breed=breed +
-        self.age=age +
-         +
-    def bark(self):​ +
-        if (self.name=="​Snoop"​):​ +
-            print("​I ain't no dawg, dawg!"​) +
-        else: +
-            print("​WUFF WUFF!"​) +
-         +
-    def bite(self):​ +
-        print("​*BITE*"​) +
-</​code>​ +
- +
-We can now instantiate a new Dog and call its methods and get its member variables, as well as assign new values. ​For example, we can change ​the name of our dog.  +
- +
-<​code>​doggy = Dog("​Snoop",​ "​Rapper",​ 45) +
-doggy.name +
-doggy.bark() +
-doggy.bite() +
-doggy.name="​Pluto"​ +
-</​code>​ +
- +
-So far we have only used Python'​s built-in functionalities. However, most if its power comes from external libraries/​packages. In order to include such libraries wie use the ''​import''​ statement. Let's use the NumPy package for some simple scientific computations. With NumPy we can do many of the things that Matlab can do.  +
- +
-<​code>​import numpy as np  +
-</​code>​ +
- +
-In NumPy, the main object is the ''​homogeneous multidimensional array''​. In contrast to Python lists, all the elements in a NumPy array have to be of the same type. The most common way to create NumPy arrays is to use the ''​numpy.array()''​ function and pass a Python list to it.  +
- +
-<​code>​myArray = np.array([4,​3,​2]) +
-myArray[0] +
-</​code>​ +
- +
-Let's look at some of the functions that can be called on NumPy arrays: +
- +
-<​code>​dir(myArray) +
-myArray.max() +
-myArray.min() +
-myArray.mean() +
-myArray.sort() +
-myArray.sum() +
-myArray.transpose() +
-# ... and many more </​code>​ +
- +
-Instead of calling these methods on the Numpy array object, we can also use the Numpy functions and pass the array as an argument. For example: ''​np.max(myArray)''​. +
- +
-We can also instantiate arrays of certain lengths and automatically fill them with zeros, ones or random numbers: +
- +
-<​code>​a = np.zeros(10) +
-b = np.ones(10) +
-c = np.random.rand(10) +
-</​code>​ +
- +
-To instantiate matrices, i.e., multidimensional arrays, we just specify the number of elements for each dimension:  +
- +
-<​code>​ m = np.random.rand(10,​10) +
-</​code>​ +
- +
-Again, we can perform all kinds of operations on this matrix. For examplewe can compute its eigenvalues and eigenvectors:​ +
- +
-<​code>​eigvals,​ eigvecs = np.linalg.eig(m) +
-</​code>​ +
- +
-The function returns an array with all eigenvalues (we assigned it to the variable ''​eigvals''​) and an array of eigenvectors,​ which we assigned to ''​eigvecs''​. +
- +
-Now, let's plot the eigenvalues in the complex plane. First, we need to get the real and imaginary values of each eigenvalue +
-<​code>​eigvals_real = eigvals.real +
-eigvals_imag = eigvals.imag +
-</​code>​ +
- +
-Now we need a plotting library. The most popular one is called Matplotlib, and works very similar to Matlabs plotting functionality (hence the name).  +
- +
-<​code>​import matplotlib.pyplot as plt +
-</​code>​ +
- +
-Now, we use the ''​scatter''​ command to create a 2D plot. The first argument corresponds to the x-axis, and the second argument to the y-axis.  +
- +
-<​code>​plt.scatter(eigvals_real,​ eigvals_imag) +
-plt.show() +
-</​code>​ +
- +
- +
-Finally, let's make our plot look a little nicer. We want the origin to be in the middle and the maximum value of the axes should be a little bigger than the largest eigenvalue. We will also add axes labels and a title. We will also add a grid. +
- +
-<​code>​fig = plt.scatter(eigvals_real,​ eigvals_imag) +
-x_abs_lim=np.max(np.abs(eigvals_real))+1 +
-y_abs_lim=np.max(np.abs(eigvals_imag))+1 +
-plt.xlim(-x_abs_lim,​ x_abs_lim) +
-plt.ylim(-y_abs_lim,​ y_abs_lim) +
-plt.xlabel("​Real"​) +
-plt.ylabel("​Imag"​) +
-plt.title("​Eigenvalues of a Random Matrix"​) +
-plt.grid() +
-plt.show() +
-</​code>​ +
- +
- +
-This was only a short overview of Pythons capabilities. There are many good tutorials online for those who want to dive deeper. However, learning by doing is probably the best way to learn Python. Also, do not be afraid to Google. Most often, this is by far the easiest and fastest way to get something done, especially in a programming language you are not yet familiar with.  +
- +
-===== SQL Alchemy ===== +
- +
-For the rest of this lab you will be working ​with the files in the ''​local-data''​ folder in your home directoryOpen a terminal and navigate to that directory. Inside you find following files: +
- +
-  - lab.py : Python file where you need to add your own code +
-  - ti2db.sqlite : A database of movies containing several tables. +
-  - romances.csv : A list of the best romance movies of all time. +
- +
-In addition to those files, we suggest you to open an interactive Python shell. It is often helpful for debugging, especially when one is unsure about the correct syntax. Furthermore,​ you will need to use Google to complete the tasks.  +
- +
-Open lab.py in a text editor and start filling in the missing code. For simplicity'​s sake, all the code is in one single file. Thus, the entire script will be executed every time you run it. You can comment out parts of the code you have already finished, e.g., some SQL queries, if you do not want to execute them every time. All the incomplete code is currently commented out, such that the script can be run without errors.+
  
-Before you start, create a copy of ti2db.sqlite,​ just in case it gets messed up. <​code>​cp ti2db.sqlite ti2db_backup.sqlite </​code>​ 
  
-You can have a look at the ti2db.sqlite database in the terminal: ​ <​code>​sqlite3 ti2db.sqlite</​code>​ 
-This opens a connection to the sqlite database and you can now enter standard SQLite commands. ​ 
  
-Most answers ​to the exercises ​can be found at [[http://​docs.sqlalchemy.org/​en/​latest/​orm/​tutorial.html]].+===== Python Experts Listen Up! ===== 
 +If you are already familiar with Python, you can skip the Tutorial in the Jupyter Notebook and scroll down to the actual lab exercises!
labs/sqlalchemy.1495106816.txt.gz · Last modified: 2020/08/31 21:03 (external edit)