By JuTT BaDshaH
2. Introduction to Python Programming
Objectives
In this chapter, you’ll:
- Continue using IPython interactive mode to enter code snippets and see their results immediately.
- Write simple Python statements and scripts.
- Create variables to store data for later use.
- Become familiar with built-in data types.
- Use arithmetic operators and comparison operators, and understand their precedence.
- Use single-, double- and triple-quoted strings.
- Use built-in function print to display text.
- Use built-in function input to prompt the user to enter data at the keyboard and get that data for use in the program.
- Convert text to integer values with built-in function int.
- Use comparison operators and the if statement to decide whether to execute a statement or group of statements.
- Learn about objects and Python’s dynamic typing.
- Use built-in function type to get an object’s type.
Outline
2.1 INTRODUCTION
2.2 VARIABLES AND ASSIGNMENT STATEMENTS
In [3]: y = 3
You can now use the values of x and y in expressions:
In [4]: x + y
Out[4]: 10
Calculations in Assignment Statements
The following statement adds the values of variables x and y and assigns the result to the variable total, which we then display:
In [5]: total = x + y
In [6]: total
Out[6]: 10
The = symbol is not an operator. The right side of the = symbol always executes first, then the result is assigned to the variable on the symbol’s left side.
Python Style
The Style Guide for Python Code helps you write code that conforms to Python’s coding conventions. The style guide recommends inserting one space on each side of the assignment symbol = and binary operators like + to make programs more readable.
Variable Names
A variable name, such as x, is an identifier. Each identifier may consist of letters, digits and underscores (_) but may not begin with a digit. Python is case sensitive, so number and Number are different identifiers because one begins with a lowercase letter and the other begins with an uppercase letter.
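As a quick check of these naming rules, the following minimal sketch uses the string method isidentifier and the standard library's keyword module. The candidate names are made up for illustration only.

```python
import keyword

# str.isidentifier reports whether a string follows Python's identifier rules
valid_name = 'total_2'.isidentifier()        # letters, digits and underscores
starts_with_digit = '2total'.isidentifier()  # identifiers may not begin with a digit

# Python is case sensitive, so these are two different identifiers
number = 7
Number = 42
names_differ = number != Number

# keywords such as True are reserved and cannot be used as variable names
true_is_keyword = keyword.iskeyword('True')

print(valid_name, starts_with_digit, names_differ, true_is_keyword)
```

Running it should show that only the name beginning with a digit is rejected, and that number and Number hold separate values.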
Types
2.3 ARITHMETIC
Multiplication (*)
Python uses the asterisk (*) multiplication operator:
In [1]: 7 * 4
Out[1]: 28
Exponentiation (**)
The exponentiation (**) operator raises one value to the power of another:
In [2]: 2 ** 10
Out[2]: 1024
To calculate the square root, you can use the exponent 1/2 (that is, 0.5):
In [3]: 9 ** (1 / 2)
Out[3]: 3.0
True Division (/) vs. Floor Division (//)
True division (/) divides a numerator by a denominator and yields a floating-point number with a decimal point, as in:
In [4]: 7 / 4
Out[4]: 1.75
Floor division (//) divides a numerator by a denominator, yielding the highest integer that’s not greater than the result. For positive results, this amounts to truncating (discarding) the fractional part:
In [5]: 7 // 4
Out[5]: 1
In [6]: 3 // 5
Out[6]: 0
In [7]: 14 // 7
Out[7]: 2
In true division, -13 divided by 4 gives -3.25:
In [8]: -13 / 4
Out[8]: -3.25
Floor division gives the closest integer that’s not greater than -3.25, which is -4:
In [9]: -13 // 4
Out[9]: -4
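Floor division and truncation agree for positive results but differ for negative ones, which is easy to overlook. The following sketch compares the two using the standard library's math module. The variable names are ours, chosen for illustration.

```python
import math

true_quotient = -13 / 4        # true division yields a float
floor_quotient = -13 // 4      # floor division rounds toward negative infinity
truncated = int(-13 / 4)       # int discards the fraction, rounding toward zero

# math.floor and math.trunc make the two rounding directions explicit
floored = math.floor(true_quotient)
chopped = math.trunc(true_quotient)

print(true_quotient, floor_quotient, truncated, floored, chopped)
```

For a negative quotient, // matches math.floor rather than math.trunc, so the two approaches differ by one.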
Exceptions and Tracebacks
Dividing by zero with / or // is not allowed and results in an exception—a sign that a problem occurred:
In [10]: 123 / 0
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-10-cd759d3fcf39> in <module>()
----> 1 123 / 0
ZeroDivisionError: division by zero
Python reports an exception with a traceback. This traceback indicates that an exception of type ZeroDivisionError occurred; most exception names end with Error. In interactive mode, the snippet number that caused the exception is specified by the 10 in the line
<ipython-input-10-cd759d3fcf39> in <module>()
The line that begins with ----> shows the code that caused the exception. Sometimes snippets have more than one line of code; the 1 to the right of ----> indicates that line 1 within the snippet caused the exception. The last line shows the exception that occurred, followed by a colon (:) and an error message with more information about the exception:
ZeroDivisionError: division by zero
The “Files and Exceptions” chapter discusses exceptions in detail.
An exception also occurs if you try to use a variable that you have not yet created. The following snippet tries to add 7 to the undefined variable z, resulting in a NameError:
In [11]: z + 7
NameError                                 Traceback (most recent call last)
<ipython-input-11-f2cdbf4fe75d> in <module>()
----> 1 z + 7
NameError: name 'z' is not defined
Remainder Operator
Python’s remainder operator (%) yields the remainder after the left operand is divided by the right operand:
In [12]: 17 % 5
Out[12]: 2
In this case, 17 divided by 5 yields a quotient of 3 and a remainder of 2. This operator is most commonly used with integers, but also can be used with other numeric types:
In [13]: 7.5 % 3.5
Out[13]: 0.5
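The quotient and remainder are tied together: multiplying the floor-division quotient by the divisor and adding the remainder gives back the original value. The sketch below checks this and also shows the built-in divmod function, which returns both results at once. The even-number test at the end is a common use we add for illustration.

```python
# divmod returns the floor-division quotient and the remainder together
quotient, remainder = divmod(17, 5)

# the quotient and remainder reconstruct the original left operand
reconstructed = quotient * 5 + remainder

# the remainder operator also works with floating-point operands
float_remainder = 7.5 % 3.5

# a common use: a value is even when dividing by 2 leaves no remainder
ten_is_even = 10 % 2 == 0

print(quotient, remainder, reconstructed, float_remainder, ten_is_even)
```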
Straight-Line Form
Algebraic notations such as a fraction with a written over b generally are not acceptable to compilers or interpreters. For this reason, algebraic expressions must be typed in straight-line form using Python’s operators. The fraction above must be written as a / b (or a // b for floor division) so that all operators and operands appear in a horizontal straight line.
Grouping Expressions with Parentheses
Parentheses group Python expressions, as they do in algebraic expressions. For example, the following code multiplies 10 times the quantity 5 + 3:
In [14]: 10 * (5 + 3)
Out[14]: 80
Without these parentheses, the result is different:
In [15]: 10 * 5 + 3
Out[15]: 53
The parentheses are redundant (unnecessary) if removing them yields the same result.
Operator Precedence Rules
Python applies the operators in arithmetic expressions according to the following rules of operator precedence. These are generally the same as those in algebra:
- Expressions in parentheses evaluate first, so parentheses may force the order of evaluation to occur in any sequence you desire. Parentheses have the highest level of precedence. In expressions with nested parentheses, such as (a / (b - c)), the expression in the innermost parentheses (that is, b - c) evaluates first.
- Exponentiation operations evaluate next. If an expression contains several exponentiation operations, Python applies them from right to left.
- Multiplication, division and modulus operations evaluate next. If an expression contains several multiplication, true-division, floor-division and modulus operations, Python applies them from left to right. Multiplication, division and modulus are “on the same level of precedence.”
- Addition and subtraction operations evaluate last. If an expression contains several addition and subtraction operations, Python applies them from left to right. Addition and subtraction also have the same level of precedence.
For the complete list of operators and their precedence (in lowest-to-highest order), see https://docs.python.org/3/reference/expressions.html#operator-precedence
Operator Grouping
When we say that Python applies certain operators from left to right, we are referring to the operators’ grouping. For example, in the expression
a + b + c
the addition operators (+) group from left to right as if we parenthesized the expression as (a + b) + c. All Python operators of the same precedence group left-to-right except for the exponentiation operator (**), which groups right-to-left.
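To see that grouping affects results, the following sketch evaluates the same operands with and without explicit parentheses. The numbers are arbitrary values we chose so that the two groupings give visibly different answers.

```python
# exponentiation groups right-to-left: 2 ** 3 ** 2 means 2 ** (3 ** 2)
right_to_left = 2 ** 3 ** 2
forced_left = (2 ** 3) ** 2

# division groups left-to-right: 100 / 10 / 5 means (100 / 10) / 5
left_to_right = 100 / 10 / 5
forced_right = 100 / (10 / 5)

print(right_to_left, forced_left, left_to_right, forced_right)
```

When in doubt about how an expression groups, adding parentheses removes the ambiguity for both Python and the reader.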
Redundant Parentheses
You can use redundant parentheses to group subexpressions to make the expression clearer. For example, the second-degree polynomial
y = a * x ** 2 + b * x + c
can be parenthesized, for clarity, as
y = (a * (x ** 2)) + (b * x) + c
Breaking a complex expression into a sequence of statements with shorter, simpler expressions also can promote clarity.
Operand Types
Each arithmetic operator may be used with integers and floating-point numbers. If both operands are integers, the result is an integer, except for the true-division (/) operator, which always yields a floating-point number. If both operands are floating-point numbers, the result is a floating-point number. Expressions containing an integer and a floating-point number are mixed-type expressions; these always produce floating-point results.
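These operand-type rules can be checked directly with the built-in type function. The following is a small sketch with operand values picked only for illustration.

```python
int_sum = 7 + 4            # both operands are integers, so the result is an int
true_quotient = 8 / 4      # true division always yields a float, even for 8 / 4
floor_of_ints = 7 // 2     # floor division of two integers stays an int
floor_of_mixed = 7.0 // 2  # a mixed-type expression yields a float
mixed_product = 7 * 2.5    # int times float gives a float

print(type(int_sum).__name__, type(true_quotient).__name__,
      type(floor_of_ints).__name__, type(floor_of_mixed).__name__,
      type(mixed_product).__name__)
```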
2.4 FUNCTION PRINT AND AN INTRO TO SINGLE- AND DOUBLE-QUOTED STRINGS
The built-in print function displays its argument(s) as a line of text:
In [1]: print('Welcome to Python!')
Welcome to Python!
In this case, the argument 'Welcome to Python!' is a string—a sequence of characters enclosed in single quotes ('). Unlike when you evaluate expressions in interactive mode, the text that print displays here is not preceded by Out[1]. Also, print does not display a string’s quotes, though we’ll soon show how to display quotes in strings.
You also may enclose a string in double quotes ("), as in:
In [2]: print("Welcome to Python!")
Welcome to Python!
Python programmers generally prefer single quotes. When print completes its task, it positions the screen cursor at the beginning of the next line.
Printing a Comma-Separated List of Items
The print function can receive a comma-separated list of arguments, as in:
In [3]: print('Welcome', 'to', 'Python!')
Welcome to Python!
It displays each argument separated from the next by a space, producing the same output as in the two preceding snippets. Here we showed a comma-separated list of strings, but the values can be of any type. We’ll show in the next chapter how to prevent automatic spacing between values or use a different separator than space.
Printing Many Lines of Text with One Statement
When a backslash (\) appears in a string, it’s known as the escape character. The backslash and the character immediately following it form an escape sequence. For example, \n represents the newline character escape sequence, which tells print to move the output cursor to the next line. The following snippet uses three newline characters to create several lines of output:
In [4]: print('Welcome\nto\n\nPython!')
Welcome
to

Python!
Other Escape Sequences
The following table shows some common escape sequences.
| Escape sequence | Description |
|---|---|
| \n | Insert a newline character in a string. When the string is displayed, for each newline, move the screen cursor to the beginning of the next line. |
| \t | Insert a horizontal tab. When the string is displayed, for each tab, move the screen cursor to the next tab stop. |
| \\ | Insert a backslash character in a string. |
| \" | Insert a double quote character in a string. |
| \' | Insert a single quote character in a string. |
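Each escape sequence in the table is two characters in the source code but a single character in the resulting string. The sketch below illustrates that with a few sample strings of our own.

```python
newline_length = len('\n')            # one character, although typed as two

two_lines = 'Welcome\nto Python!'     # an embedded newline splits the text into two lines
line_list = two_lines.splitlines()

tabbed = 'name\tvalue'                # \t inserts a horizontal tab
path_like = 'C:\\temp'                # \\ stores a single backslash
quoted = 'It\'s "quoted"'             # \' allows a single quote in a single-quoted string

print(two_lines)
print(tabbed)
print(path_like)
print(quoted)
```

Printing the strings shows the characters the escape sequences stand for, not the backslashes used to type them.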
Ignoring a Line Break in a Long String
You may also split a long string (or a long statement) over several lines by using the \ continuation character as the last character on a line to ignore the line break:
In [5]: print('this is a longer string, so we \
   ...: split it over two lines')
this is a longer string, so we split it over two lines
The interpreter reassembles the string’s parts into a single string with no line break. Though the backslash character in the preceding snippet is inside a string, it’s not the escape character because another character does not follow it.
Printing the Value of an Expression
Calculations can be performed in print statements:
Sum is 10
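A call along the following lines would produce the output above. This is a minimal sketch: the operands 7 and 3 are assumed values, chosen only because they add up to 10.

```python
# print evaluates the expression first, then displays the result after the string
print('Sum is', 7 + 3)

# the same result, with the calculation stored in a variable first
total = 7 + 3
print('Sum is', total)
```

Both calls display the same line, because print receives the already-computed value in each case.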
2.5 TRIPLE-QUOTED STRINGS
Earlier, we introduced strings delimited by a pair of single quotes (') or a pair of double quotes ("). Triple-quoted strings begin and end with three double quotes (""") or three single quotes ('''). The Style Guide for Python Code recommends three double quotes ("""). Use these to create:
- multiline strings,
- strings containing single or double quotes and
- docstrings, which are the recommended way to document the purposes of certain program components.
Including Quotes in Strings
In a string delimited by single quotes, you may include double-quote characters:
In [1]: print('Display "hi" in quotes')
Display "hi" in quotes
but not single quotes:
In [2]: print('Display 'hi' in quotes')
File "<ipython-input-2-19bf596ccf72>", line 1
    print('Display 'hi' in quotes')
                    ^
SyntaxError: invalid syntax
unless you use the \' escape sequence:
In [3]: print('Display \'hi\' in quotes')
Display 'hi' in quotes
Snippet [2] displayed a syntax error due to a single quote inside a single-quoted string. IPython displays information about the line of code that caused the syntax error and points to the error with a ^ symbol. It also displays the message SyntaxError: invalid syntax.
A string delimited by double quotes may include single quote characters:
In [4]: print("Display the name O'Brien")
Display the name O'Brien
but not double quotes, unless you use the \" escape sequence:
In [5]: print("Display \"hi\" in quotes")
Display "hi" in quotes
To avoid using \' and \" inside strings, you can enclose such strings in triple quotes:
In [6]: print("""Display "hi" and 'bye' in quotes""")
Display "hi" and 'bye' in quotes
Multiline Strings
The following snippet assigns a multiline triple-quoted string to triple_quoted_string:
In [7]: triple_quoted_string = """This is a triple-quoted
   ...: string that spans two lines"""
IPython knows that the string is incomplete because we did not type the closing """ before we pressed Enter. So, IPython displays a continuation prompt ...: at which you can input the multiline string’s next line. This continues until you enter the ending """ and press Enter. The following displays triple_quoted_string:
In [8]: print(triple_quoted_string)
This is a triple-quoted
string that spans two lines
Python stores multiline strings with embedded newline characters. When we evaluate triple_quoted_string rather than printing it, IPython displays it in single quotes with a \n character where you pressed Enter in snippet [7]. The quotes IPython displays indicate that triple_quoted_string is a string; they’re not part of the string’s contents:
In [9]: triple_quoted_string
Out[9]: 'This is a triple-quoted\nstring that spans two lines'
2.6 GETTING INPUT FROM THE USER
The built-in input function requests and obtains user input:
In [1]: name = input("What's your name? ")
What's your name? Paul
In [2]: name
Out[2]: 'Paul'
In [3]: print(name)
Paul
The snippet executes as follows:
- First, input displays its string argument—a prompt—to tell the user what to type and waits for the user to respond. We typed Paul and pressed Enter. We use bold text to distinguish the user’s input from the prompt text that input displays.
- Function input then returns those characters as a string that the program can use. Here we assigned that string to the variable name.
Snippet [2] shows name’s value. Evaluating name displays its value in single quotes as 'Paul' because it’s a string. Printing name (in snippet [3]) displays the string without the quotes. If you enter quotes, they’re part of the string, as in:
In [4]: name = input("What's your name? ")
What's your name? 'Paul'
In [5]: name
Out[5]: "'Paul'"
Function input Always Returns a String
Consider the following snippets that attempt to read two numbers and add them:
In [7]: value1 = input('Enter first number: ')
Enter first number: 7
In [8]: value2 = input('Enter second number: ')
Enter second number: 3
In [9]: value1 + value2
Out[9]: '73'
Rather than adding the integers 7 and 3 to produce 10, Python “adds” the string values '7' and '3', producing the string '73'. This is known as string concatenation. It creates a new string containing the left operand’s value followed by the right operand’s value.
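The difference between concatenation and addition can be reproduced without typing at a prompt. In this sketch, the strings '7' and '3' stand in for what input would return. That substitution is our assumption, made so the code runs unattended.

```python
# input always returns strings, so these stand in for the user's responses
value1 = '7'
value2 = '3'

concatenated = value1 + value2        # + on two strings joins them
summed = int(value1) + int(value2)    # converting first gives integer addition

print(concatenated, summed)
```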
Getting an Integer from the User
If you need an integer, convert the string to an integer using the built-in int function:
In [10]: value = input('Enter an integer: ')
Enter an integer: 7
In [11]: value = int(value)
In [12]: value
Out[12]: 7
We could have combined the code in snippets [10] and [11]:
In [13]: another_value = int(input('Enter another integer: '))
Enter another integer: 13
In [14]: another_value
Out[14]: 13
Variables value and another_value now contain integers. Adding them produces an integer result (rather than concatenating them):
In [15]: value + another_value
Out[15]: 20
If the string passed to int cannot be converted to an integer, a ValueError occurs:
Enter another integer: hello
ValueError: invalid literal for int() with base 10: 'hello'
Function int also can convert a floating-point value to an integer:
Out[17]: 10
To convert strings to floating-point numbers, use the built-in float function.
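The conversions described in this section can be summarized in one short sketch. The sample values are ours, and the try statement used to capture the ValueError is a feature that is only discussed in detail in a later chapter, so treat that part as a preview.

```python
# int discards the fractional part of a floating-point value (rounding toward zero)
from_positive_float = int(10.5)
from_negative_float = int(-10.5)

# float converts a numeric string to a floating-point number
from_string = float('3.75')

# a string that is not a valid integer raises a ValueError
try:
    int('hello')
except ValueError as error:
    message = str(error)

print(from_positive_float, from_negative_float, from_string)
print(message)
```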
2.7 DECISION MAKING: THE IF STATEMENT AND COMPARISON OPERATORS
A condition is a Boolean expression with the value True or False. The following determines whether 7 is greater than 4 and whether 7 is less than 4:
In [1]: 7 > 4
Out[1]: True
In [2]: 7 < 4
Out[2]: False
True and False are Python keywords. Using a keyword as an identifier causes a SyntaxError. True and False are each capitalized.
Operators >, <, >= and <= all have the same precedence. Operators == and != both have the same precedence, which is lower than that of >, <, >= and <=. A syntax error occurs when any of the operators ==, !=, >= and <= contains spaces between its pair of symbols:
In [3]: 7 > = 4
File "<ipython-input-3-5c6e2897f3b3>", line 1
7 > = 4
^
SyntaxError: invalid syntax
Another syntax error occurs if you reverse the symbols in the operators !=, >= and <= (by writing them as =!, => and =<).
Making Decisions with the if Statement: Introducing Scripts
We now present a simple version of the if statement, which uses a condition to decide whether to execute a statement (or a group of statements). Here we’ll read two integers from the user and compare them using six consecutive if statements, one for each comparison operator. If the condition in a given if statement is True, the corresponding print statement executes; otherwise, it’s skipped.
IPython interactive mode is helpful for executing brief code snippets and seeing immediate results. When you have many statements to execute as a group, you typically write them as a script stored in a file with the .py (short for Python) extension, such as fig02_01.py for this example’s script. Scripts are also called programs. For instructions on locating and executing the scripts in this article, see Chapter 1’s IPython Test-Drive.
Each time you execute this script, three of the six conditions are True. To show this, we execute the script three times: once with the first integer less than the second, once with the same value for both integers and once with the first integer greater than the second. The three sample executions appear after the script. Each time we present a script like the one below, we introduce it before the figure, then explain the script’s code after the figure. We show line numbers for your convenience; these are not part of Python. IDEs enable you to choose whether to display line numbers. To run this example, change to this chapter’s ch02 examples folder, then enter:
ipython fig02_01.py
or, if you’re in IPython already, you can use the command:
run fig02_01.py
 1 # fig02_01.py
 2 """Comparing integers using if statements and comparison operators."""
 3
 4 print('Enter two integers, and I will tell you',
 5       'the relationships they satisfy.')
 6
 7 # read first integer
 8 number1 = int(input('Enter first integer: '))
 9
10 # read second integer
11 number2 = int(input('Enter second integer: '))
12
13 if number1 == number2:
14     print(number1, 'is equal to', number2)
15
16 if number1 != number2:
17     print(number1, 'is not equal to', number2)
18
19 if number1 < number2:
20     print(number1, 'is less than', number2)
21
22 if number1 > number2:
23     print(number1, 'is greater than', number2)
24
25 if number1 <= number2:
26     print(number1, 'is less than or equal to', number2)
27
28 if number1 >= number2:
29     print(number1, 'is greater than or equal to', number2)
Enter two integers, and I will tell you the relationships they satisfy.
Enter first integer: 37
Enter second integer: 42
37 is not equal to 42
37 is less than 42
37 is less than or equal to 42

Enter two integers, and I will tell you the relationships they satisfy.
Enter first integer: 7
Enter second integer: 7
7 is equal to 7
7 is less than or equal to 7
7 is greater than or equal to 7

Enter two integers, and I will tell you the relationships they satisfy.
Enter first integer: 54
Enter second integer: 17
54 is not equal to 17
54 is greater than 17
54 is greater than or equal to 17
Comments
Line 1 begins with the hash character (#), which indicates that the rest of the line is a comment:
# fig02_01.py
For easy reference, we begin each script with a comment indicating the script’s file name. A comment also can begin to the right of the code on a given line and continue until the end of that line.
Docstrings
The Style Guide for Python Code states that each script should start with a docstring that explains the script’s purpose, such as the one in line 2:
"""Comparing integers using if statements and comparison operators."""
For more complex scripts, the docstring often spans many lines. In later chapters, you’ll use docstrings to describe script components you define, such as new functions and new types called classes. We’ll also discuss how to access docstrings with the IPython help mechanism.
Blank Lines
Line 3 is a blank line. You use blank lines and space characters to make code easier to read. Together, blank lines, space characters and tab characters are known as white space. Python ignores most white space—you’ll see that some indentation is required.
Splitting a Lengthy Statement Across Lines
Lines 4–5
print('Enter two integers, and I will tell you',
      'the relationships they satisfy.')
display instructions to the user. These are too long to fit on one line, so we broke them into two strings. Recall that you can display several values by passing to print a comma-separated list; print separates each value from the next with a space.
Typically, you write statements on one line. You may spread a lengthy statement over several lines with the \ continuation character. Python also allows you to split long code lines in parentheses without using continuation characters (as in lines 4–5). This is the preferred way to break long code lines according to the Style Guide for Python Code. Always choose breaking points that make sense, such as after a comma in the preceding call to print or before an operator in a lengthy expression.
Reading Integer Values from the User
Next, lines 8 and 11 use the built-in input and int functions to prompt for and read two integer values from the user.
if Statements
The if statement in lines 13–14
if number1 == number2:
    print(number1, 'is equal to', number2)
uses the == comparison operator to determine whether the values of variables number1 and number2 are equal. If so, the condition is True, and line 14 displays a line of text indicating that the values are equal. If any of the remaining if statements’ conditions are True (lines 16, 19, 22, 25 and 28), the corresponding print displays a line of text.
Each if statement consists of the keyword if, the condition to test, and a colon (:) followed by an indented body called a suite. Each suite must contain one or more statements. Forgetting the colon (:) after the condition is a common syntax error.
Suite Indentation
Python requires you to indent the statements in suites. The Style Guide for Python Code recommends four-space indents; we use that convention throughout this article.
You’ll see in the next chapter that incorrect indentation can cause errors.
Confusing == and =
Using the assignment symbol (=) instead of the equality operator (==) in an if statement’s condition is a common syntax error. To help avoid this, read == as “is equal to” and = as “is assigned.” You’ll see in the next chapter that using == in place of = in an assignment statement can lead to subtle problems.
Chaining Comparisons
You can chain comparisons to check whether a value is in a range. The following comparison determines whether x is in the range 1 through 5, inclusive:
In [2]: 1 <= x <= 5
Out[2]: True
In [3]: x = 10
In [4]: 1 <= x <= 5
Out[4]: False
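A chained comparison behaves like two comparisons joined with the and operator, which is introduced later. The sketch below shows the equivalence for one value inside the range and one outside it. The value 3 is an assumed in-range example.

```python
x = 3                                 # assumed sample value inside the range
chained_inside = 1 <= x <= 5
expanded_inside = 1 <= x and x <= 5   # equivalent form using and

x = 10                                # a value outside the range
chained_outside = 1 <= x <= 5
expanded_outside = 1 <= x and x <= 5

print(chained_inside, expanded_inside, chained_outside, expanded_outside)
```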
2.8 OBJECTS AND DYNAMIC TYPING
Values such as 7 (an integer), 4.1 (a floating-point number) and 'dog' are all objects. Every object has a type and a value:
In [1]: type(7)
Out[1]: int
In [2]: type(4.1)
Out[2]: float
In [3]: type('dog')
Out[3]: str
An object’s value is the data stored in the object. The snippets above show objects of built-in types int (for integers), float (for floating-point numbers) and str (for strings).
Variables Refer to Objects
Assigning an object to a variable binds (associates) that variable’s name to the object. As you’ve seen, you can then use the variable in your code to access the object’s value:
In [4]: x = 7
In [5]: x + 10
Out[5]: 17
In [6]: x
Out[6]: 7
After snippet [4]’s assignment, the variable x refers to the integer object containing 7. As shown in snippet [6], snippet [5] does not change x’s value. You can change x as follows:
In [7]: x = x + 10
In [8]: x
Out[8]: 17
Dynamic Typing
Python uses dynamic typing: it determines the type of the object a variable refers to while executing your code. We can show this by rebinding the variable x to different objects and checking their types:
In [9]: type(x)
Out[9]: int
In [10]: x = 4.1
In [11]: type(x)
Out[11]: float
In [12]: x = 'dog'
In [13]: type(x)
Out[13]: str
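The idea that a variable is a name bound to an object, rather than a fixed-type container, can be illustrated as follows. This is a sketch under our own choice of names. The second variable y is hypothetical and is there only to show that rebinding x leaves the original object untouched.

```python
x = 7
y = x                     # y is bound to the same integer object as x
same_object = x is y

x = x + 10                # creates a new integer object and rebinds x to it
y_after_rebinding = y     # y still refers to the original object containing 7

# rebinding x to objects of other types changes what type(x) reports
int_name = type(x).__name__
x = 4.1
float_name = type(x).__name__
x = 'dog'
str_name = type(x).__name__

print(same_object, y_after_rebinding, int_name, float_name, str_name)
```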
Garbage Collection
Python creates objects in memory and removes them from memory as necessary. After snippet [10], the variable x now refers to a float object. The integer object from snippet [7] is no longer bound to a variable. As we’ll discuss in a later chapter, Python automatically removes such objects from memory. This process—called garbage collection—helps ensure that memory is available for new objects you create.
2.9 INTRO TO DATA SCIENCE: BASIC DESCRIPTIVE STATISTICS
In data science, you’ll often use statistics to describe and summarize your data. Here, we begin by introducing several such descriptive statistics, including:
- minimum—the smallest value in a collection of values.
- maximum—the largest value in a collection of values.
- range—the range of values from the minimum to the maximum.
- count—the number of values in a collection.
- sum—the total of the values in a collection.
We’ll look at determining the count and sum in the next chapter. Measures of dispersion (also called measures of variability), such as range, help determine how spread out values are. Other measures of dispersion that we’ll present in later chapters include variance and standard deviation.
Determining the Minimum of Three Values
First, let’s show how to determine the minimum of three values manually. The following script prompts for and inputs three values, uses if statements to determine the minimum value, then displays it.
 2 """Find the minimum of three values."""
 3
 4 number1 = int(input('Enter first integer: '))
 5 number2 = int(input('Enter second integer: '))
 6 number3 = int(input('Enter third integer: '))
 7
 8 minimum = number1
 9
10 if number2 < minimum:
11     minimum = number2
12
13 if number3 < minimum:
14     minimum = number3
15
16 print('Minimum value is', minimum)
Enter first integer: 12
Enter second integer: 27
Enter third integer: 36
Minimum value is 12

Enter first integer: 27
Enter second integer: 12
Enter third integer: 36
Minimum value is 12

Enter first integer: 36
Enter second integer: 27
Enter third integer: 12
Minimum value is 12
After inputting the three values, we process one value at a time:
- First, we assume that number1 contains the smallest value, so line 8 assigns it to the variable minimum. Of course, it’s possible that number2 or number3 contains the actual smallest value, so we still must compare each of these with minimum.
- The first if statement (lines 10–11) then tests number2 < minimum, and if this condition is True, assigns number2 to minimum.
- The second if statement (lines 13–14) then tests number3 < minimum, and if this condition is True, assigns number3 to minimum.
Now, minimum contains the smallest value, so we display it. We executed the script three times to show that it always finds the smallest value regardless of whether the user enters it first, second or third.
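The same comparison logic can be exercised without keyboard input by wrapping it in a function, a feature covered in a later chapter. The function name minimum_of_three is hypothetical, and the three calls pass 12, 27 and 36 in different orders as assumed sample values.

```python
def minimum_of_three(number1, number2, number3):
    """Return the smallest of three values using the script's if-statement logic."""
    minimum = number1          # assume the first value is the smallest

    if number2 < minimum:      # replace it if the second value is smaller
        minimum = number2

    if number3 < minimum:      # replace it if the third value is smaller
        minimum = number3

    return minimum


# the smallest value appears first, second and third, respectively
smallest_first = minimum_of_three(12, 27, 36)
smallest_second = minimum_of_three(27, 12, 36)
smallest_third = minimum_of_three(36, 27, 12)

print(smallest_first, smallest_second, smallest_third)
```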
Determining the Minimum and Maximum with Built-In Functions min and max
Python has many built-in functions for performing common tasks. Built-in functions min and max calculate the minimum and maximum, respectively, of a collection of values:
In [1]: min(36, 27, 12)
Out[1]: 12
In [2]: max(36, 27, 12)
Out[2]: 36
The functions min and max can receive any number of arguments.
Determining the Range of a Collection of Values
The range of values is simply the minimum through the maximum value. In this case, the range is 12 through 36. Much data science is devoted to getting to know your data. Descriptive statistics is a crucial part of that, but you also have to understand how to interpret the statistics. For example, if you have 100 numbers with a range of 12 through 36, those numbers could be distributed evenly over that range. At the opposite extreme, you could have clumping with 99 values of 12 and one 36, or one 12 and 99 values of 36.
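To make the point about interpretation concrete, the sketch below builds two collections of 100 values that share the range 12 through 36 but are distributed very differently. It uses lists and the built-in functions len and sum, which are introduced in later chapters. The variable names and the data are our own.

```python
# 100 values spread evenly: each integer from 12 through 36 appears four times
evenly_spread = list(range(12, 37)) * 4

# 100 values clumped at the low end: 99 values of 12 and a single 36
clumped = [12] * 99 + [36]

# both collections have the same minimum and maximum
spread_range = (min(evenly_spread), max(evenly_spread))
clumped_range = (min(clumped), max(clumped))

# but their averages differ considerably
spread_mean = sum(evenly_spread) / len(evenly_spread)
clumped_mean = sum(clumped) / len(clumped)

print(spread_range, clumped_range, spread_mean, clumped_mean)
```

The identical ranges hide the fact that one collection centers on 24 while the other sits almost entirely at 12, which is why the range alone is not enough to describe the data.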
Functional-Style Programming: Reduction
Throughout this article, we introduce various functional-style programming capabilities. These enable you to write code that can be more concise, clearer and easier to debug (that is, to find and correct errors). The min and max functions are examples of a functional-style programming concept called reduction. They reduce a collection of values to a single value. Other reductions you’ll see include the sum, average, variance and standard deviation of a collection of values. You’ll also see how to define custom reductions.
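As a preview of what a reduction looks like in code, the following sketch uses reduce from the standard library's functools module to collapse a list into a single value. It is one possible illustration, not necessarily the approach taken in later chapters. The lambda expressions and the sample list are our own.

```python
from functools import reduce

values = [36, 27, 12]

# a custom reduction that keeps the smaller of each pair of values
smallest = reduce(lambda a, b: a if a < b else b, values)

# a custom reduction that accumulates a running total
total = reduce(lambda a, b: a + b, values)

# the built-in reductions give the same results
matches_builtins = smallest == min(values) and total == sum(values)

print(smallest, total, matches_builtins)
```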
Upcoming Intro to Data Science Sections
In the next two chapters, we’ll continue our discussion of basic descriptive statistics with measures of central tendency, including mean, median and mode, and measures of dispersion, including variance and standard deviation.
2.10 WRAP-UP
This chapter continued our discussion of arithmetic. You used variables to store values for later use. We introduced Python’s arithmetic operators and showed that you must write all expressions in straight-line form. You used the built-in function print to display data. We created single-, double- and triple-quoted strings. You used triple-quoted strings to create multiline strings and to embed single or double quotes in strings. You used the input function to prompt for and get input from the user at the keyboard.
We used the functions int and float to convert strings to numeric values. We presented Python’s comparison operators. Then, you used them in a script that read two integers from the user and compared their values using a series of if statements. We discussed Python’s dynamic typing and used the built-in function type to display an object’s type. Finally, we introduced the basic descriptive statistics minimum and maximum and used them to calculate the range of a collection of values. In the next chapter, we present Python’s control statements.
