{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## What will we cover in this tutorial?\n", "\n", "- setting up Python on your computer\n", "- Introduction to Python\n", "- Introduction to plotting with Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What you need for this Lab\n", "\n", "- For the labs we'll be using x2go/vnc and you'll be working on the labs collaboratively\n", "- For today, any way to run a Jupyter Notebook will suffice (we'll cover what this is in a bit)\n", "- You will not need a linux laptop. As long as you can make Jupyter Notebooks work and use x2go, this is all you need. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## About Python\n", "\n", "Wikipedia:\n", "\n", "*Python is a widely used high-level, __general-purpose__, interpreted, dynamic programming language. Its design philosophy emphasizes __code readability__, and its syntax allows programmers to express concepts in __fewer lines__ of code than would be possible in languages such as C++ or Java. The language provides constructs intended to enable __clear programs__ on both a small and large scale.\n", "Python supports __multiple programming paradigms__, including object-oriented, imperative and functional programming or procedural styles. It features a __dynamic type system__ and __automatic memory management__ and has a large and __comprehensive standard library__.*" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Advantages:\n", "* many nice helper functions built in\n", "* many nice and helpful modules for science (numerics, physics, plotting, ...) 
and beyond\n", "* can be used as a glue language\n", "* (mostly) platform independent\n", "* easy to learn and use (compared to C++/Java/most other languages)\n", "* doesn't enforce a strong paradigm (object oriented, procedural, functional)\n", "* many more\n", "\n", "### Disadvantages:\n", "* it's not very fast (can often be helped) and sometimes memory inefficient\n", "* no low level programming \n", "* not built for multi-threading\n", "* too easy (learning almost any other programming language may seem hard)\n", "* some more (may depend on personal preference)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Tutorials\n", "\n", "- https://realpython.com/ \n", "- http://www-static.etp.physik.uni-muenchen.de/kurs/Computing/python/\n", " homegrown (non-interactive) python course from Günter Dudek in German, includes exercises at the end of chapters, some with solutions\n", "- https://www.datacamp.com/courses/intro-to-python-for-data-science\n", " introduction to python for data science, slow paced (interactive) tutorial that checks your results\n", "- https://wiki.python.org/moin/BeginnersGuide/Programmers lists plenty of other options" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Etiquette\n", "\n", "* Use meaningful and explanatory variable and function names (n_samples instead of n or ns, plancks_law vs B, ...)\n", "* Use comments for code and functions\n", "* When you're done developing your code, remove statements that no longer serve a purpose (especially prints or one statement cells)\n", "* Especially if you're about to send your notebook to someone else, but also when you're done with your notebook for the day: restart the Kernel and run everything again (Kernel->Restart & Run All) to make sure everything still works as it should!\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 
What's the best way to learn Python?\n", "\n", "- write code and try out things\n", "- use the help function/documentation\n", "- google error messages and try to understand *what* is happening\n", "- when using stackoverflow: it's a great resource, but once you've found the solution, read the entire explanation (and not just the code snippet you need).\n", "- talk to experienced Python programmers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Different ways to use python\n", "\n", "- for compiled programming languages like C++ and Fortran, there is only one way: write code and compile it\n", "- for Python there are various ways: interpreter, __scripts__, jupyter __notebooks__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Command Line Interpreter\n", "\n", "Statements are entered and executed line by line.\n", "\n", "Use case: quickly checking something simple" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "### standard python:\n", "\n", "```\n", "(base) [serenity:~ paech]$ python\n", "Python 3.7.3 (default, Mar 27 2019, 16:54:48)\n", "[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin\n", "Type \"help\", \"copyright\", \"credits\" or \"license\" for more information.\n", ">>>\n", "```\n", "You can enter python commands line by line. You also have a history and can search the history just like on your linux shell." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### ipython (strongly recommended):\n", "\n", "```\n", "(base) [serenity:~ paech]$ ipython\n", "Python 3.7.3 (default, Mar 27 2019, 16:54:48)\n", "Type 'copyright', 'credits' or 'license' for more information\n", "IPython 7.6.1 -- An enhanced Interactive Python. Type '?' for help.\n", "\n", "In [1]:\n", "```\n", "\n", "Pro: Offers tab completion." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Scripts\n", "\n", "Scripts (= files containing python code) are executed from the command line:\n", "```\n", "[serenity:~ paech]$ python my_example.py\n", "```\n", "You can edit the script with any text editor or IDE (Integrated Development Environment) of your choice that's available. \n", "\n", "Use case: Almost anything. Creating big projects, running production" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Text editors/IDEs:\n", "\n", "- emacs\n", "- atom\n", "- sublime (non-free)\n", "- eclipse\n", "- vim (very steep learning curve)\n", "- many, many more" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Jupyter Notebooks\n", "- Easiest way to describe them is to see them in action\n", "- Jupyter Notebooks are already installed on the machines you'll use during the lab class. \n", "- Also install Jupyter on your computer at home. See http://jupyter.readthedocs.org/en/latest/install.html on how to do that (including python if you haven't installed it).\n", "- Easiest way to get python and jupyter: download and install anaconda https://www.anaconda.com/products/individual\n", "\n", "Use case: small projects, data exploration, prototyping, reproducible research" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Let's get started" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Under the hood\n", "\n", "If you're using python, it's good to understand what an object and a method of an object is. This is not different from other programming languages. 
\n", "\n", "### Object oriented programming\n", "\n", "- A very well established programming paradigm\n", "- Objects contain data (often called attributes) and functions that operate on that data (methods). \n", "- Class vs. Object:\n", " * Class: definitions of available data and methods\n", " * Object: instance of a class that is filled with actual data \n", "- Everything in Python is an object (variables, functions, modules, ...)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### References \n", "\n", "- if you assign a variable name in python, i.e.\n", "```\n", "x = 3\n", "```\n", "then the variable really contains a *reference* to the object (a link or address, if you will)\n", "- more than one variable name can point to the same object\n", "- it's like having boxes with tools (methods) and hardware (data) - you create it once and remember where you put it, then you or anyone else that knows the location (variable names) can access and use it." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Python3 vs. Python2\n", "\n", "Python2 reached EOL (end of life) on Jan 1, 2020. A lot of the big python libraries (matplotlib, astropy, ...) have already ceased support for Python2. However, there\n", "still is a lot of legacy code around, so you should be able to work in either version.\n", "\n", "Unfortunately, backward compatibility was broken and you cannot just run Python2 code in a Python3 environment. Luckily, the differences are not very large in the everyday life of a physicist or astronomer. The most important differences are:\n", "\n", "- the '/' operator. \n", " * In Python2, if both operands were integers, the result was rounded down to an integer (i.e. 1/2=0). \n", " * In Python3 this now yields a float (i.e. 
1/2=0.5) \n", " This one is the nastiest one, because it will likely not cause a runtime error but just produce strange results.\n", "- print is now a function (and not a statement anymore), i.e. you need parentheses '()' around the arguments\n", "- the range function in Python3 now does what xrange did in Python2 (though the real life consequences of this are very limited)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Basic Syntax\n", "\n", "- statements end with the end of a line (not ';'), but statements can span several lines (more later)\n", "- instead of parentheses and braces, indentation (spaces or tabs) is used to group code (no curly or other brackets are used for control structures, functions and classes)\n", "- comments are indicated with '#'\n", "\n", "## jupyter notebook basics\n", "- Check out Help->Keyboard Shortcuts for helpful keyboard shortcuts\n", "- to place content in a cell, click on it and then enter your code\n", "- to execute the code in a cell use Shift+Return (i.e. pressing return while the shift key is down)\n", "- cells are code cells by default. But you can also define Markdown cells to format text. Double click those cells to edit. 
Shift+return to typeset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Hello World')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Intro Header\n", "\n", "this is a first intro" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Built in data types" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "my_int = 1 # int (integers)\n", "my_float = 1.2 # float\n", "my_float_scientific = 5.0e24\n", "my_string = 'string' # string \"string\"\n", "my_boolean = True # boolean\n", "my_complex = 4+1j # complex" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Typing\n", "\n", "In Python variables are not typed, only values have types ('dynamic typing').\n", "\n", "This means a variable can hold different data types in different places of the code." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = 1\n", "a = 1.7\n", "a = 'egg' # this works just fine, would create problems in Fortran, C/C++, Java..." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In C++/Fortran, all variables are statically typed. 
\n", "That's why you may hear the statement \"Python is not a typed language\" which is inaccurate.\n", "\n", "Python also checks what you do with values: the following statement will raise a TypeError" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "1 + 'egg'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Operators\n", "\n", "All the standard mathematical operations are defined" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1 + 3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1 - 3" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "3 * 5" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "5/6 # division operator " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "6 % 5 # modulus operator " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "4**2 # exponent" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# note: to define a float in scientific notation use\n", "a = 1e24 # do not use 10**24 (that is an int)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "But be careful with division: there is a difference between Python2 and Python3\n", "[the following may not work with your default installation]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%python2\n", "print(1/2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%python3\n", "print(1/2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## 
Comparison Operators" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "\"1 == 2 ?\", 1 == 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "\"1 != 2 ?\", 1 != 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "\"1 > 2 ?\", 1 > 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "\"1 < 2 ?\", 1 < 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "\"1 >= 2 ?\", 1 >= 2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "\"1 <= 2 ?\", 1 <= 2" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Comparison operators can be chained:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x = 2\n", "1 < x < 4" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 5\n", "1 < x < 4" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Logical Operators" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x = False\n", "y = True" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "not x, not y # returns the opposite" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x and y # and returns True if x and y are true, 
otherwise False" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "x or y # if either x or y is True returns true, otherwise False" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## More Complex Python Data Types" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Lists\n", "\n", "- Are very simple, yet powerful data types in Python\n", "- They can hold a list of literally anything in Python (values, functions, objects)\n", "- A list can contain different data types" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "Create an empty list" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "a = []\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Append values" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "a.append(1)\n", "a.append(3)\n", "a.append(5)\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Create a range of numbers" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = list(range(10))\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Access the n-th element - indexing starts at 0\n", "[depending on the programming language, indexing starts at 0 or 1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"1st element a[0]: {a[0]}\")\n", "print(f\"5th element a[4]: {a[4]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Access the last element (or second to last, ...)" ] }, { "cell_type": "code", "execution_count": null, "metadata": 
{}, "outputs": [], "source": [ "print(f\"Last (a[-1]=a[9]): {a[-1]}\")\n", "print(f\"Second to last (a[-2]=a[8]): {a[-2]}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Slicing - accessing more than one element\n", "\n", "lower limit is inclusive, upper limit is exclusive, i.e. [low, high[\n", "\n", "\n", "syntax: list[low:high:step]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"print 2nd through 4th element (a[1:4]):\", a[1:4]) # prints all elements with index in [1,4[" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Accessing every second element" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"every second element: (a[::2])\", a[::2])\n", "print(\"every second element - starting at index 1: (a[1::2])\", a[1::2])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Invert a list" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[::-1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Lists of lists work too:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = [[1, 2], [3, 4]]\n", "print(\"first element of the second list:\", b[1][0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Creating lists with the range function\n", "\n", "- Range creates integer numbers in a given range\n", "- syntax: range(low, high, step) or range(high) where low and high are lower and upper bound [low, high[\n", "- useful for loops etc." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "a = list(range(10))\n", "print(a)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b = list(range(6, 12, 2))\n", "print(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = list(range(10, 1, -1))\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- In Python2 range creates a list\n", "- In Python3 range creates an object that will give you the integers one at a time. To create a list, you need to turn this object into a list first." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%python2\n", "print(range(10))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%python3\n", "print(range(10))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Creating Lists with List Comprehensions\n", "\n", "- List comprehensions are very useful\n", "- They are usually faster than an equivalent for loop that appends to a list" 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[x**2 for x in range(5)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "You can also have two for loops, either in a single list comprehension or in nested ones:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[i*j for i in range(5) for j in range(5)]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[[i*j for i in range(5)] for j in range(5)]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Caution with Lists\n", "\n", "- Lists are \"mutable\" data types: a list can change after it is created (unlike values like 3, 'zzz', etc.)\n", "- This can lead to unexpected behaviour if you are not aware of it" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "a = [1, 'a', 5] # lists - they can contain mixed data types\n", "b = a\n", "a[0] = 'modifying list a'\n", "b[-1] = 'modifying list b'\n", "\n", "print('a:', a)\n", "print('b:', b)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This is because ```b``` only holds a reference to the same list ```[1, 'a', 5]```, as does ```a```." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Copying Lists\n", "\n", "If you want independent lists, you need to explicitly make a copy of the list:\n", "\n", "[Important: this only works for lists, and __not__ for lists of lists or __numpy arrays__]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = [1, 'a', 5]\n", "c = a[:]\n", "a[0] = 'modifying list a'\n", "c[-1] = 'modifying list c'\n", "print('a:', a)\n", "print('c:', c)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = [1, 'a', 5]\n", "d = list(a)\n", "a[0] = 'modifying list a'\n", "d[-1] = 'modifying list d'\n", "print('a:', a)\n", "print('d:', d)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Tuples\n", "- Are almost like lists, but once they're created, cannot be modified\n", "- Often they are returned when a function returns more than one value" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "t = (1, 'two', '3')\n", "print(t)\n", "print(t[1])\n", "print(t[:2])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "But they cannot be modified" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t[0] = 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Dictionaries\n", "\n", "- Hold values that are associated to a key (like a dictionary)\n", "- Other names: associative arrays or hash arrays\n", "- They are very fast, only a little bit slower than lists\n", "- Until Python3.6, dictionaries were unordered" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d = {} # curly 
braces indicate dictionaries\n", "d[5] = 'five'\n", "d['one'] = 1\n", "d['list'] = [1, 2, 3]\n", "d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To retrieve an item from a dictionary" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(d.get(5))\n", "print(d.get('non existent key')) # this will return None as default" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "You can list keys or values or a list of (key,value) pairs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.keys()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.values()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d.items()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for key, value in d.items():\n", "    print(key, \":\", value)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## The len() function\n", "- a lot of data types have a certain number of elements\n", "- the ```len()``` function tells you how many \n", "- works for lists, dictionaries, strings, ..." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "len(range(10))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "len('abcdefghij')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "len({'a':1, 'b':2})" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If you have elements that contain elements, len() will give you the len of the top level one:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = [1, [2,3,4,5,6]]\n", "len(a)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Sets\n", "\n", "Python also has an implementation of mathematical sets." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s1 = set(range(5))\n", "s2 = {4,5,6,7,8}\n", "s1, s2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s1.union(s2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "s1.intersection(s2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Formatting\n", "- Multiple ways to format a python string \n", "- You will find them all in legacy code \n", "- See https://realpython.com/python-string-formatting/ for a more detailed discussion\n", "- See https://pyformat.info/ for a more complete reference.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Python3.6 and above: f-strings\n", "- f-strings offer a very simple way to print variables and format strings\n", "- For more details see 
https://realpython.com/python-string-formatting/#3-string-interpolation-f-strings-python-36" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "field = 'redshift'\n", "value = 13423/349383\n", "print(field,value)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Plot for {field} = {value}\")\n", "print(\"Plot for\", field, \"=\", value)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- Though if printing numbers, you usually need to format them \n", "- In science you must only print the significant digits!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(f\"Plot for {field} = {value:.5f}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## How to format up to Python3.6 (Python 2 through 3.5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Plot for {}'.format(field))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Value for {} = {}'.format(field, value)) " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Value for {} = {:.3f}'.format(field, value)) " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Even better because it increases code readability of the string" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Plot for {field} = {value:.3f}'.format(field=field, value=value))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Old ways you should be able to read, but __not__ use yourself" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You should not write the 
following old fashioned code anymore:\n", "\n", "[But you will see it a lot]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "variable = 'redshift'\n", "value = 0.5\n", "mapping = {'variable': variable, 'value': value}\n", "print('Plot for %s' % variable) # %s indicates a string \n", "print('Plot for %s = %.1f' % (variable, value)) # %s indicates a string \n", "# an easier to maintain version is the following\n", "print('Plot for %(variable)s = %(value).1f' % mapping) # %s indicates a string " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Neither should you use the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Plot for'+variable+' = '+str(value)) # Old fashioned\n", "print('Plot for', variable,'=', str(value)) # Old fashioned" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Control structures\n", "- same as in other languages: for, while, if-else\n", "- no brackets, so you __have__ to use indentation, either the same number of whitespaces or tabs\n", "- this forced indentation may seems strange in the beginning, but feels natural very soon and helps you write cleaner code" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## If statement - making decisions" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "# the if statement\n", "x = 0\n", "\n", "if x < 0:\n", " x = 0\n", " print('Negative changed to zero')\n", "elif x == 0:\n", " print('Zero')\n", "elif x > 1:\n", " print(x)\n", "else: # you should always have this in case you forgot to cover a case\n", " raise Exception('Whoops, should not have gotten here - something went wrong')\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { 
"slide_type": "slide" } }, "source": [ "## For loop - definite iteration\n", "- loop through a list/sequence of values or multiple values at the same time\n", "- this works differently for C++ and Fortran" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(5):\n", " print(i)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in 'Hello':\n", " print(i)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in [[2,3],[3,4]]:\n", " print(i)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "We can have more than one loop variable at a time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_dict = {'ham': 1, 'spam': False, 'eggs': [1,2,3,4]}\n", "for key, value in my_dict.items():\n", " print(f'{key}:\\t{value}')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Or if we have two lists we want to loop over at the same time using zip" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = range(5)\n", "b = range(5,10)\n", "for i,j in zip(a,b):\n", " c = i+j\n", " print(f\"{i} + {j} = {c}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Sometimes you'd also like to simultaneously want to iterate over the index and corresponding value:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = ['a','b','c','d']\n", "for i in enumerate(a):\n", " print(i)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Terminate a loop prematurely\n", "- ```break``` terminates loop immediately and continues with code below the loop\n", "- ```continue``` terminates the __current__ loop immediately and continues with the 
next iteration" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(10):\n", "    j = i**2\n", "    if j == 4:\n", "        break\n", "    print(i, j)\n", "print('end')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for i in range(5):\n", "    j = i**2\n", "    if j == 4:\n", "        continue\n", "    print(i, j)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## While loop - indefinite iteration\n", "The loop will go on as long as the condition specified in the ```while``` statement evaluates to true." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = 5\n", "while a > 0:\n", "    a -= 1 # this is equivalent to a = a - 1\n", "    print(a, a>0)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Control Structures can be combined\n", "- Example is from https://rosettacode.org/wiki/Sieve_of_Eratosthenes#Python - simple algorithm to find prime numbers\n", "- rosettacode.org is a very useful resource" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def eratosthenes(n):\n", "    multiples = set()\n", "    primes = []\n", "    for i in range(2, n+1):\n", "        if i not in multiples:\n", "            primes.append(i)\n", "            multiples.update(range(i*i, n+1, i))\n", "    return primes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "eratosthenes(20)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Functions\n", "\n", "- Python functions are very flexible and straightforward to define. \n", "- Any variable, function or class can be an input or output value in Python."
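,
"\n",
"\n",
"As a minimal sketch of the last point (the names ```square```, ```apply_twice``` and ```make_multiplier``` are made up for this illustration), a function can be passed to another function, and a function can also be returned as a result:\n",
"\n",
"```python\n",
"def square(x):\n",
"    return x**2\n",
"\n",
"def apply_twice(func, value):\n",
"    # func is itself a function that is handed in as an argument\n",
"    return func(func(value))\n",
"\n",
"def make_multiplier(factor):\n",
"    # build a new function and hand it back as the return value\n",
"    def multiply(x):\n",
"        return factor * x\n",
"    return multiply\n",
"\n",
"triple = make_multiplier(3)\n",
"print(apply_twice(square, 3))\n",
"print(triple(5))\n",
"```\n",
"\n",
"Here ```apply_twice``` receives a function as input, while ```make_multiplier``` builds and returns a new function."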
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A simple example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def my_function(x):\n", "    constant = 10\n", "    x = constant*x\n", "    print(x)\n", "    z = x + x**2 + x**3\n", "    return z" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Functions perform tasks and can also return results." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "b = my_function(1) # you can also call my_function(1) without storing the result\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Namespace separation \n", "Functions have their own namespace, i.e. the set of variables and functions they can use and modify." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = 1\n", "print(my_function(1))\n", "print(f\"x still has the same value as before: {x}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "But the variable ```constant``` can only be seen by the function, so the following raises a ```NameError```:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(constant)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- But if you define a variable (or another function) outside a function, then the function can \"see\" it.\n", "- Therefore: Always be careful with typos - a mistyped name may silently pick up a variable defined outside the function!\n", "- You should __only make use__ of this feature if you use a function or variable that never changes\n", "- Other use cases will create trouble once your code gets more complex" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "variable_outside = 1000\n", "def my_function2(x):\n", " print(variable_outside)\n", " x = x/variable_outside\n", " z = x + x**2 + 
x**3\n", " return z" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "my_function2(4)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Function Documentation\n", "Add documentation to your functions that explains what the function does and what its inputs and outputs are" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def polynomial(x):\n", "    \"\"\"Calculates a polynomial of order 3.\n", "\n", "    Parameters\n", "    ----------\n", "    x : float or int\n", "\n", "    Returns\n", "    -------\n", "    z : the value of the polynomial\n", "    \"\"\"\n", "    constant = 10\n", "    x = constant*x\n", "    z = x + x**2 + x**3\n", "    return z" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can access the documentation of a function (or any object) by using the ```help``` function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(polynomial)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "This also works for other functions you have already encountered:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(print)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "You can have multiple inputs and outputs" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "def g(x, y):\n", "    z1 = x**2\n", "    z2 = y**3\n", "    return z1, z2\n", "\n", "a, b = g(1,3)\n", "print(a)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "If you forget to return something, then the default return value is ```None```" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ 
"def g(x, y):\n", " z1 = x**2\n", " z2 = y**3\n", "\n", "print(g(1, 3))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "And you can have optional arguments that come with a default value (a.k.a. keyword arguments)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def g(x, y, print_output=False):\n", " z1 = x**2\n", " z2 = y**3\n", " if print_output:\n", " print(f\"x**2 = {z1}\")\n", " print(f\"y**3 = {z2}\")\n", " return z1, z2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g(1, 3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "g(1, 3, print_output=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Modules\n", "Python modules can be thought of as libraries. If there is some code you use in several places, for example mathematical functions and constants, you put those in a file or directory structure and load them into python before using them. 
\n", "\n", "Frequently used modules:\n", "* math - mathematical functions and constants\n", "* argparse - parsing command line options\n", "* os (especially the os.path sub-module) - operating system interaction\n", "\n", "Frequently used 3rd party modules in physics, astronomy and/or data science:\n", "* numpy/scipy (http://www.scipy.org) data and numerics package\n", "* matplotlib (http://matplotlib.org/, http://matplotlib.org/gallery.html) plotting\n", "* astropy (http://www.astropy.org/) astronomy-related tools\n", "* scikit-learn (http://scikit-learn.org/stable/) machine learning" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Importing Modules\n", "- In order to use a module, you have to import it first with an ```import``` statement" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import math" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- You can rename a module when importing\n", "- Some modules have a commonly used short name (such as ```np``` for numpy) - you should use those" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- But __don't do__ imports like the following (or only with a really, really good reason)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "import math as my_secret_module # unusual alias: not good style, only use with a really good reason" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from math import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- You can also import subpackages of modules" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "import scipy.spatial \n", "from scipy import spatial" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Using functions/objects from modules" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "math.sqrt(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.rand()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Numpy and Scipy (Basics)\n", "- Numpy adds support for large, n-dimensional arrays and matrices, including high-level math functions to create and operate on these arrays. \n", "- Scipy adds optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.\n", "\n", "For a more detailed quickstart see https://docs.scipy.org/doc/numpy-dev/user/quickstart.html" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Numpy Arrays\n", "- are used a lot, since under the hood they're implemented in C and hence very fast\n", "- they're a bit like python lists, but have a fixed data type\n", "- like lists, they are also 'mutable'" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "import numpy as np # import numpy if not done already" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Numpy Array Creation " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.arange(10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.array(range(10))" ] }, { "cell_type": "code", "execution_count": null, 
"metadata": {}, "outputs": [], "source": [ "np.zeros(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.ones(5, dtype=int)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.zeros((3,5))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Accessing information about the shape, size and type of an array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(50)\n", "b = np.arange(50).reshape(10,5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dimensions of the array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Total number of elements in the array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a.size " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.size" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "\"Length\" of the array, i.e. 
the size of its first dimension" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(a)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Data type" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a.dtype" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Accessing Elements of a Numpy array\n", "\n", "Element access is very similar to lists (indexing also starts at 0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b[1, 2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[-1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b[-1, -1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Slicing \n", "\n", "Slicing is also very similar to lists" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[0:4]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b[0, 0:4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```[:]``` means \"all elements\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b[:, 1]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[::10]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a[::-1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Numpy arrays are also mutable objects!" 
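,
"\n",
"\n",
"Plain assignment and basic slicing do not copy the data of an array. A minimal sketch of the difference (the names ```alias```, ```view``` and ```independent``` are chosen only for this illustration), using ```np.shares_memory``` to check whether two arrays use the same data:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"a = np.arange(10)\n",
"alias = a               # second name for the same array object\n",
"view = a[:]             # new array object that shares the data of a\n",
"independent = a.copy()  # new array with its own data\n",
"\n",
"a[0] = 99\n",
"print(alias[0], view[0], independent[0])\n",
"print(np.shares_memory(a, view), np.shares_memory(a, independent))\n",
"```\n",
"\n",
"Only the explicit copy keeps the old value, since it is the only one that does not share memory with ```a```."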
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(10)\n", "b = a\n", "a[0] = 99\n", "print(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(10)\n", "b = a[:] # this would create a copy for a list, but for arrays it only creates a view\n", "a[0] = 99\n", "print(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(10)\n", "b = a.copy()\n", "a[0] = 99\n", "print(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(50)\n", "b = a.reshape(10,5)\n", "a[0] = 99\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Mathematical operations\n", "- Element by element (unless you use specific linear algebra functions from numpy.linalg)\n", "- quite fast" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(10)*100\n", "b = np.arange(10) " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a, b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a + b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a * b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a + 2" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Plotting with matplotlib\n", "- visualize your data\n", "- the most common library used in science is matplotlib\n", "- you can find example plots at http://matplotlib.org/gallery.html\n", "- there are many other ways in Python to visualize data, some are built on top of matplotlib, some aren't\n", "- it's not 
that important which library you use; what is important is that your plots are \"good\" in a scientific way" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "import matplotlib.pyplot as plt # you need this to plot in scripts or notebooks\n", "# the following line magic will display the plots inside your notebook [not needed for scripts]\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "x = np.linspace(0,2,20) # for the interval [0,2] get a numpy array of 20 linearly spaced numbers\n", "y = x**2\n", "\n", "plt.plot(x, y, label='$y=x^2$') ; # default is solid line, other options are --, -., :\n", "plt.plot(x, y+1, 'o', label='$y=x^2+1$') ; # but we can also use symbols, other options are +,^,*,. and others\n", "plt.plot(x, y+2, '--', label='$y=x^2+2$', color='red') ; # or specify a color\n", "plt.xlabel('x') ;\n", "plt.ylabel('y') ;\n", "plt.legend(loc='best');\n", "plt.title('Different functions')\n", "plt.savefig('my_plot.png') # no need to specify a format, matplotlib will guess it from the extension " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Adjust font sizes, line width/marker size, and order of plotting\n", "- The minimum requirement for a scientific plot is __readability__\n", "- The defaults of many plotting libraries make \"pretty\" plots that are often hard to read\n", "- font sizes should be large enough (take into account that plots will be smaller in a paper or lab report)\n", "- lines and markers should be thick/large enough that you can tell them apart (also when shrunk for a paper)\n", "- if possible, ordering in the legend should correspond to line ordering in plot\n", "- see https://matplotlib.org/tutorials/introductory/customizing.html for more information" ] }, { "cell_type": "code", 
"execution_count": null, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "x = np.linspace(0,2,20) # for the interval [0,2] get a numpy array of 100 linearly spaced numbers\n", "y = x**2\n", "\n", "plt.figure(figsize=(8,6))\n", "plt.plot(x, y+2, '--', linewidth=2, color='red', label='$y=x^2+2$') ; # or specify a color\n", "plt.plot(x, y+1, 'o', label='$y=x^2+1$') ; # but we can also use symbols, other options are +,^,*,. and others\n", "plt.plot(x, y, linewidth=2, label='$y=x^2$') ; # default is solid line, other options are --, -., :\n", "plt.xlabel('x', fontsize=20) ;\n", "plt.ylabel('y', fontsize=20) ;\n", "plt.xticks(fontsize=20)\n", "plt.yticks(fontsize=20)\n", "plt.legend(loc='best', prop={'size': 20});\n", "plt.savefig('my_plot.png') # no need to specify a format, matplotlib will guess it from the extension " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Plotting without python\n", "\n", "Sometimes you may want to take a quick look at some results that have been written to a file. Or visualize a function. Instead of using python (from the command line or notebook), a quick and easy alternative is gnuplot. For an introductions see http://www.usm.uni-muenchen.de/people/puls/lessons/intro_general/gnuplot/gnuplot_for_beginners.pdf" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Input, Output and Shell commands\n", "Some simple input and output examples." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "x = np.linspace(0,2,20) # for the interval [0,2] get a numpy array of 20 linearly spaced numbers\n", "y = x**2\n", "filename = 'test.txt'\n", "f = open(filename,'w') # open a file for writing\n", "f.write('# x y\\n') # write a header\n", "for i_x, i_y in zip(x,y):\n", "    f.write( '{} {}\\n'.format( i_x, i_y ) )\n", "f.close() # close the file again" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true }, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "a = np.arange(50).reshape(10,5)\n", "np.savetxt('test_array.txt', a)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "!cat test_array.txt" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true }, "slideshow": { "slide_type": "slide" } }, "source": [ "Let's see if we wrote everything as expected" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "!cat test.txt # this is an IPython feature and won't work in a plain Python script" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "And let's read this back in" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "filename = 'test.txt'\n", "f = open(filename)\n", "lines = f.readlines()\n", "f.close()\n", "print(lines)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "Each line will be a string in a Python list" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "How do we get numbers? 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "x = np.zeros(len(lines))\n", "y = np.zeros_like(x)\n", "for i, line in enumerate(lines):\n", " if line.startswith('#'):\n", " continue\n", " data = line.rstrip() # remove carriage returns, i.e. line ends\n", " data = data.split('#', 1)[0] # split the string at the first occuarance of # and keep only the first element\n", " x_str, y_str = data.split() # split the remaining string at the whitespace\n", " x[i] = float(x_str)\n", " y[i] = float(y_str)\n", "print(x)\n", "print(y)\n", "x = x[1:]\n", "y = y[1:]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Simple data files can be read in using numpy " ] }, { "cell_type": "code", "execution_count": null, "metadata": { "jupyter": { "outputs_hidden": true } }, "outputs": [], "source": [ "data = np.loadtxt(filename)\n", "print(data)\n", "print(data.shape)\n", "x = data[:, 0]\n", "y = data[:, 1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## More numpy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Built in functions that operate on entire arrays and are hence quite fast" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(10)**2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.sin(a) # trigonometric functions: sin, cos, tan, arccos, arcsin, arctan, arctan2 -- see documentation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.sum(a) # sum all elements" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.max(a) # find the largest element in the array -- or the minimum with min" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": 
[], "source": [ "np.argmax(a) # find the index of the largest element" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "This also works for multidimensional arrays, and some functions can also be applied along individual axes (e.g. rows or columns)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a = np.arange(50).reshape(10,5)\n", "a" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.max(a)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.max(a, axis=0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.max(a, axis=1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Linear Algebra:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = [1, 2, 3]\n", "y = [4, 5, 6]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.dot(x, y)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.cross(x, y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There are also tensor products and the Einstein summation convention: ```tensordot``` and ```einsum```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Advanced but practical ways to index an array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "x = np.arange(10)\n", "y = x**2\n", "print(x)\n", "print(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we now would like to get all the x values for which y is smaller than 20, we can do the following:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cond = ( y < 20 
)\n", "cond" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(x[cond])\n", "print(y[cond])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For other fancy ways to index arrays see https://docs.scipy.org/doc/numpy-dev/user/quickstart.html#fancy-indexing-and-index-tricks" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Timing your code\n", "In jupyter notebooks:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "# in a notebook we can time a single statement with the %timeit magic\n", "n = 10000\n", "%timeit list(range(n))\n", "%timeit np.arange(n) # this is much faster - most of this is done in C" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "In jupyter notebooks or scripts:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# in a regular python script, we can use timeit manually\n", "import timeit\n", "\n", "# use numpy functions only\n", "start = timeit.default_timer()\n", "a = np.arange(n)\n", "y = a**2\n", "stop = timeit.default_timer()\n", "print(\"time: \", (stop-start))\n", "\n", "# define numpy array and then a loop\n", "start = timeit.default_timer()\n", "a = np.arange(n)\n", "for i in range(n):\n", "    a[i] = a[i]**2\n", "stop = timeit.default_timer()\n", "print(\"time: \", (stop-start))\n", "\n", "# use a list comprehension\n", "start = timeit.default_timer()\n", "a = np.array([i**2 for i in range(n)])\n", "stop = timeit.default_timer()\n", "print(\"time: \", (stop-start))\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "But always keep in mind the quotes from D. 
Knuth:\n", "\n", " The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming. \n", "\n", " Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true } }, "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernel_info": { "name": "python3" }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "nteract": { "version": "0.15.0" } }, "nbformat": 4, "nbformat_minor": 4 }