4. Functions
Objectives
In this chapter, you’ll
- Create custom functions.
- Import and use Python Standard Library modules, such as random and math, to reuse code and avoid “reinventing the wheel.”
- Pass data between functions.
- Generate a range of random numbers.
- See simulation techniques using random-number generation.
- Seed the random number generator to ensure reproducibility.
- Pack values into a tuple and unpack values from a tuple.
- Return multiple values from a function via a tuple.
- Understand how an identifier’s scope determines where in your program you can use it.
- Create functions with default parameter values.
- Call functions with keyword arguments.
- Create functions that can receive any number of arguments.
- Use methods of an object.
- Write and use a recursive function.
Outline
4.1 Introduction
4.2 Defining Functions
4.3 Functions with Multiple Parameters
4.4 Random-Number Generation
4.5 Case Study: A Game of Chance
4.6 Python Standard Library
4.7 math Module Functions
4.8 Using IPython Tab Completion for Discovery
4.9 Default Parameter Values
4.10 Keyword Arguments
4.11 Arbitrary Argument Lists
4.12 Methods: Functions That Belong to Objects
4.13 Scope Rules
4.14 import: A Deeper Look
4.15 Passing Arguments to Functions: A Deeper Look
4.16 Recursion
4.17 Functional-Style Programming
4.18 Intro to Data Science: Measures of Dispersion
4.19 Wrap-Up
4.1 INTRODUCTION
In this chapter, we continue our discussion of Python fundamentals with custom functions and related topics. We’ll use the Python Standard Library’s random module and random-number generation to simulate rolling a six-sided die. We’ll combine custom functions and random-number generation in a script that implements the dice game craps. In that example, we’ll also introduce Python’s tuple sequence type and use tuples to return more than one value from a function. We’ll discuss seeding the random-number generator to ensure reproducibility.
You’ll import the Python Standard Library’s math module, then use it to learn about IPython tab completion, which speeds your coding and discovery processes. You’ll create functions with default parameter values, call functions with keyword arguments and define functions with arbitrary argument lists. We’ll demonstrate calling methods of objects. We’ll also discuss how an identifier’s scope determines where in your program you can use it.
We’ll take a deeper look at importing modules. You’ll see that arguments are passed by reference to functions. We’ll also demonstrate a recursive function and begin presenting Python’s functional-style programming capabilities. In the Intro to Data Science section, we’ll continue our discussion of descriptive statistics by introducing measures of dispersion—variance and standard deviation—and calculating them with functions from the Python Standard Library’s statistics module.
4.2 DEFINING FUNCTIONS
You’ve called many built-in functions (int, float, print, input, type, sum, len, min and max) and a few functions from the statistics module (mean, median and mode). Each performed a single, well-defined task. You’ll often define and call custom functions. The following session defines a square function that calculates the square of its argument. Then it calls the function twice—once to square the int value 7 (producing the int value 49) and once to square the float value 2.5 (producing the float value 6.25):
In [1]: def square(number):
   ...:     """Calculate the square of number."""
   ...:     return number ** 2
   ...:
In [2]: square(7)
Out[2]: 49
In [3]: square(2.5)
Out[3]: 6.25
The statements defining the function in the first snippet are written only once, but may be called “to do their job” from many points throughout a program and as often as you like. Calling square with a nonnumeric argument like 'hello' causes a TypeError because the exponentiation operator (**) works only with numeric values.
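As a quick check of the behavior described above, the following sketch redefines square in a plain script and confirms both results, along with the TypeError for a string argument. The try/except wrapper and the variable names are illustration devices only and are not part of the session shown above.

```python
def square(number):
    """Calculate the square of number."""
    return number ** 2

int_result = square(7)      # an int argument produces an int
float_result = square(2.5)  # a float argument produces a float

# A string argument cannot be used with ** and an int exponent.
try:
    square('hello')
    raised_type_error = False
except TypeError:
    raised_type_error = True

print(int_result, float_result, raised_type_error)
```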
Defining a Custom Function
A function definition (like square in snippet [1]) begins with the def keyword, followed by the function name (square), a set of parentheses and a colon (:). Like variable identifiers, by convention function names should begin with a lowercase letter and in multiword names underscores should separate each word.
The required parentheses contain the function’s parameter list—a comma-separated list of parameters representing the data that the function needs to perform its task. Function square has only one parameter named number—the value to be squared. If the parentheses are empty, the function does not use parameters to perform its task. The indented lines after the colon (:) are the function’s block, which consists of an optional docstring followed by the statements that perform the function’s task. We’ll soon point out the difference between a function’s block and a control statement’s suite.
Specifying a Custom Function’s Docstring
The Style Guide for Python Code says that the first line in a function’s block should be a docstring that briefly explains the function’s purpose:
"""Calculate the square of number."""
To provide more detail, you can use a multiline docstring—the style guide recommends starting with a brief explanation, followed by a blank line and the additional details.
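A multiline docstring following that recommendation might look like the sketch below. The extra detail sentence is our own wording, and reading the text back through the function’s __doc__ attribute is just one convenient way to inspect it outside IPython.

```python
def square(number):
    """Calculate the square of number.

    The argument may be an int or a float. For those two types,
    the result has the same type as the argument.
    """
    return number ** 2

# The docstring is stored with the function object.
summary_line = square.__doc__.splitlines()[0]
print(summary_line)
```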
Returning a Result to a Function’s Caller
When a function finishes executing, it returns control to its caller—that is, the line of code that called the function. In square’s block, the return statement:
return number ** 2
first squares number, then terminates the function and gives the result back to the caller. In this example, the first caller is in snippet [2], so IPython displays the result in Out[2]. The second caller is in snippet [3], so IPython displays the result in Out[3].
Function calls also can be embedded in expressions. The following code calls square first, then print displays the result:
In [4]: print('The square of 7 is', square(7))
The square of 7 is 49
There are two other ways to return control from a function to its caller:
- Executing a return statement without an expression terminates the function and implicitly returns the value None to the caller. The Python documentation states that None represents the absence of a value. None evaluates to False in conditions.
- When there’s no return statement in a function, it implicitly returns the value None after executing the last statement in the function’s block.
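The two cases in the list above can be checked with a pair of tiny functions. This is a minimal sketch; the names bare_return and no_return are hypothetical and chosen only for this illustration.

```python
def bare_return():
    """Terminate with a return statement that has no expression."""
    return

def no_return():
    """Finish without executing any return statement."""
    total = 1 + 1  # an ordinary statement; nothing is returned explicitly

first = bare_return()
second = no_return()

print(first, second)   # both calls give None
if not first:          # None evaluates to False in conditions
    print('first is treated as False')
```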
Local Variables
Though we did not define variables in square’s block, it is possible to do so. A function’s parameters and variables defined in its block are all local variables—they can be used only inside the function and exist only while the function is executing. Trying to access a local variable outside its function’s block causes a NameError, indicating that the variable is not defined.
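To illustrate, here is a sketch of a variant of square that stores its result in a local variable before returning it. The function name, the variable name squared_value and the try/except wrapper are assumptions made for this example.

```python
def square_with_local(number):
    """Square number, storing the result in a local variable first."""
    squared_value = number ** 2  # exists only while this call executes
    return squared_value

value = square_with_local(7)

# Outside the function, the local variable's name is not defined.
try:
    print(squared_value)
    name_error_raised = False
except NameError:
    name_error_raised = True

print(value, name_error_raised)
```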
Accessing a Function’s Docstring via IPython’s Help Mechanism
IPython can help you learn about the modules and functions you intend to use in your code, as well as IPython itself. For example, to view a function’s docstring to learn how to use the function, type the function’s name followed by a question mark (?):
In [5]: square?
Signature: square(number)
Docstring: Calculate the square of number.
File:      ~/Documents/examples/ch04/<ipython-input-1-7268c8ff93a9>
Type:      function
For our square function, the information displayed includes:
- The function’s name and parameter list—known as its signature.
- The function’s docstring.
- The name of the file containing the function’s definition. For a function in an interactive session, this line shows information for the snippet that defined the function—the 1 in "<ipython-input-1-7268c8ff93a9>" means snippet [1].
- The type of the item for which you accessed IPython’s help mechanism—in this case, a function.
If the function’s source code is accessible from IPython—such as a function defined in the current session or imported into the session from a .py file—you can use ?? to display the function’s full source-code definition:
In [6]: square??
Signature: square(number)
Source:
def square(number):
    """Calculate the square of number."""
    return number ** 2
File:      ~/Documents/examples/ch04/<ipython-input-1-7268c8ff93a9>
Type:      function
If the source code is not accessible from IPython, ?? simply shows the docstring.
If the docstring fits in the window, IPython displays the next In [] prompt. If a docstring is too long to fit, IPython indicates that there’s more by displaying a colon (:) at the bottom of the window—press the Space key to display the next screen. You can navigate backwards and forwards through the docstring with the up and down arrow keys, respectively. IPython displays (END) at the end of the docstring. Press q (for “quit”) at any : or the (END) prompt to return to the next In [] prompt. To get a sense of IPython’s features, type ? at any In [] prompt, press Enter, then read the help documentation overview.
4.3 FUNCTIONS WITH MULTIPLE PARAMETERS
Let’s define a maximum function that determines and returns the largest of three values—the following session calls the function three times with integers, floating-point numbers and strings, respectively.
In [1]: def maximum(value1, value2, value3):
   ...:     """Return the maximum of three values."""
   ...:     max_value = value1
   ...:     if value2 > max_value:
   ...:         max_value = value2
   ...:     if value3 > max_value:
   ...:         max_value = value3
   ...:     return max_value
   ...:
In [2]: maximum(12, 27, 36)
Out[2]: 36
In [3]: maximum(12.3, 45.6, 9.7)
Out[3]: 45.6
In [4]: maximum('yellow', 'red', 'orange')
Out[4]: 'yellow'
We did not place blank lines above and below the if statements, because pressing return on a blank line in interactive mode completes the function’s definition.
You also may call maximum with mixed types, such as ints and floats:
In [5]: maximum(13.5, -3, 7)
Out[5]: 13.5
The call maximum(13.5, 'hello', 7) results in TypeError because strings and numbers cannot be compared to one another with the greater-than (>) operator.
Function maximum’s Definition
Function maximum specifies three parameters in a comma-separated list. Snippet [2]’s arguments 12, 27 and 36 are assigned to the parameters value1, value2 and value3, respectively.
To determine the largest value, we process one value at a time:
- Initially, we assume that value1 contains the largest value, so we assign it to the local variable max_value. Of course, it’s possible that value2 or value3 contains the actual largest value, so we still must compare each of these with max_value.
- The first if statement then tests value2 > max_value, and if this condition is True assigns value2 to max_value.
- The second if statement then tests value3 > max_value, and if this condition is True assigns value3 to max_value.
Now, max_value contains the largest value, so we return it. When control returns to the caller, the parameters value1, value2 and value3 and the variable max_value in the function’s block—which are all local variables—no longer exist.
Python’s Built-In max and min Functions
For many common tasks, the capabilities you need already exist in Python. For example, built-in max and min functions know how to determine the largest and smallest of their two or more arguments, respectively:
In [6]: max('yellow', 'red', 'orange', 'blue', 'green')
Out[6]: 'yellow'
In [7]: min(15, 9, 27, 14)
Out[7]: 9
Each of these functions also can receive an iterable argument, such as a list or a string. Using built-in functions or functions from the Python Standard Library’s modules rather than writing your own can reduce development time and increase program reliability, portability and performance. For a list of Python’s built-in functions and modules, see
https://docs.python.org/3/library/index.html
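The iterable form of max and min mentioned above can be sketched as follows. The sample list and variable names are our own; a string is treated as a sequence of characters, so min returns its smallest character.

```python
values = [12, 27, 36]

largest = max(values)           # one iterable argument: a list
smallest_char = min('yellow')   # one iterable argument: a string

# Separate arguments and a single iterable give the same answer.
same_result = max(12, 27, 36) == max(values)

print(largest, smallest_char, same_result)
```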
4.4 RANDOM-NUMBER GENERATION
We now take a brief diversion into a popular type of programming application—simulation and game playing. You can introduce the element of chance via the Python Standard Library’s random module.
Rolling a Six-Sided Die
Let’s produce 10 random integers in the range 1–6 to simulate rolling a six-sided die:
In [1]: import random
In [2]: for roll in range(10):
   ...:     print(random.randrange(1, 7), end=' ')
   ...:
4 2 5 5 4 6 4 6 1 5
First, we import random so we can use the module’s capabilities. The randrange function generates an integer from the first argument value up to, but not including, the second argument value. Let’s use the up arrow key to recall the for statement, then press Enter to re-execute it. Notice that different values are displayed:
In [3]: for roll in range(10):
   ...:     print(random.randrange(1, 7), end=' ')
   ...:
4 5 4 5 1 4 1 4 6 5
Sometimes, you may want to guarantee reproducibility of a random sequence—for debugging, for example. At the end of this section, we’ll use the random module’s seed function to do this.
Rolling a Six-Sided Die 6,000,000 Times
If randrange truly produces integers at random, every number in its range has an equal probability (or chance or likelihood) of being returned each time we call it. To show that the die faces 1–6 occur with equal likelihood, the following script simulates 6,000,000 die rolls. When you run the script, each die face should occur approximately 1,000,000 times, as in the sample output.
"""Roll a six-sided die 6,000,000 times."""
import random

# face frequency counters
frequency1 = 0
frequency2 = 0
frequency3 = 0
frequency4 = 0
frequency5 = 0
frequency6 = 0

# 6,000,000 die rolls
for roll in range(6_000_000):  # note underscore separators
    face = random.randrange(1, 7)

    # increment appropriate face counter
    if face == 1:
        frequency1 += 1
    elif face == 2:
        frequency2 += 1
    elif face == 3:
        frequency3 += 1
    elif face == 4:
        frequency4 += 1
    elif face == 5:
        frequency5 += 1
    elif face == 6:
        frequency6 += 1

print(f'Face{"Frequency":>13}')
print(f'{1:>4}{frequency1:>13}')
print(f'{2:>4}{frequency2:>13}')
print(f'{3:>4}{frequency3:>13}')
print(f'{4:>4}{frequency4:>13}')
print(f'{5:>4}{frequency5:>13}')
print(f'{6:>4}{frequency6:>13}')
Face    Frequency
   1       998686
   2      1001481
   3       999900
   4      1000453
   5       999953
   6       999527
The script uses nested control statements (an if…elif statement nested in the for statement) to determine the number of times each die face appears. The for statement iterates 6,000,000 times. We used Python’s underscore (_) digit separator to make the value 6000000 more readable. The expression range(6,000,000) would be incorrect. Commas separate arguments in function calls, so Python would treat range(6,000,000) as a call to range with the three arguments 6, 0 and 0.
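The difference between the two spellings can be checked directly. In the sketch below, the variable names and the try/except wrapper are our own additions. The ValueError is a detail not stated above: range rejects a third argument (the step) of 0.

```python
# The underscore separators do not change the value.
rolls = 6_000_000
same_value = (rolls == 6000000)

# With commas, Python sees three separate arguments: 6, 000 and 000.
# A step of 0 is not allowed, so range raises a ValueError.
try:
    range(6,000,000)
    range_failed = False
except ValueError:
    range_failed = True

print(same_value, range_failed)
```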
For each die roll, the script adds 1 to the appropriate counter variable. Run the program, and observe the results. This program might take a few seconds to complete execution. As you’ll see, each execution produces different results. Note that we did not provide an else clause in the if…elif statement.
Seeding the Random-Number Generator for Reproducibility
Function randrange actually generates pseudorandom numbers, based on an internal calculation that begins with a numeric value known as a seed. Repeatedly calling randrange produces a sequence of numbers that appear to be random, because each time you start a new interactive session or execute a script that uses the random module’s functions, Python internally uses a different seed value. When you’re debugging logic errors in programs that use randomly generated data, it can be helpful to use the same sequence of random numbers until you’ve eliminated the logic errors, before testing the program with other values. To do this, you can use the random module’s seed function to seed the random-number generator yourself—this forces randrange to begin calculating its pseudorandom number sequence from the seed you specify. In the following session, snippets [5] and [8] produce the same results, because snippets [4] and [7] use the same seed (32):
In [4]: random.seed(32)
In [5]: for roll in range(10):
   ...:     print(random.randrange(1, 7), end=' ')
   ...:
1 2 2 3 6 2 4 1 6 1
In [6]: for roll in range(10):
   ...:     print(random.randrange(1, 7), end=' ')
   ...:
1 3 5 3 1 5 6 4 3 5
In [7]: random.seed(32)
In [8]: for roll in range(10):
   ...:     print(random.randrange(1, 7), end=' ')
   ...:
1 2 2 3 6 2 4 1 6 1
Snippet [6] generates different values because it simply continues the pseudorandom number sequence that began in snippet [5].
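The same experiment can be written as a script that compares the sequences instead of printing them. This is a sketch under assumptions: roll_faces is a hypothetical helper, and it collects the faces in a tuple so two runs can be compared with ==. The particular faces produced for a given seed may vary across Python versions, so the code checks only that reseeding repeats the sequence.

```python
import random

def roll_faces(count):
    """Return a tuple of count simulated die rolls (hypothetical helper)."""
    faces = ()
    for roll in range(count):
        faces += (random.randrange(1, 7),)  # append one face to the tuple
    return faces

random.seed(32)
first = roll_faces(10)
continued = roll_faces(10)  # continues the same pseudorandom sequence

random.seed(32)
repeated = roll_faces(10)   # restarts the sequence from seed 32

print(first == repeated)
```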
4.5 CASE STUDY: A GAME OF CHANCE
In this section, we simulate the popular dice game known as “craps.” Here is the requirements statement:
You roll two six-sided dice, each with faces containing one, two, three, four, five and six spots, respectively. When the dice come to rest, the sum of the spots on the two upward faces is calculated. If the sum is 7 or 11 on the first roll, you win. If the sum is 2, 3 or 12 on the first roll (called “craps”), you lose (i.e., the “house” wins). If the sum is 4, 5, 6, 8, 9 or 10 on the first roll, that sum becomes your “point.” To win, you must continue rolling the dice until you “make your point” (i.e., roll that same point value).
You lose by rolling a 7 before making your point. The following script simulates the game and shows several sample executions, illustrating winning on the first roll, losing on the first roll, winning on a subsequent roll and losing on a subsequent roll.
 2  """Simulating the dice game Craps."""
 3  import random
 4
 5  def roll_dice():
 6      """Roll two dice and return their face values as a tuple."""
 7      die1 = random.randrange(1, 7)
 8      die2 = random.randrange(1, 7)
 9      return (die1, die2)  # pack die face values into a tuple
10
11  def display_dice(dice):
12      """Display one roll of the two dice."""
13      die1, die2 = dice  # unpack the tuple into variables die1 and die2
14      print(f'Player rolled {die1} + {die2} = {sum(dice)}')
15
16  die_values = roll_dice()  # first roll
17  display_dice(die_values)
18
19  # determine game status and point, based on first roll
20  sum_of_dice = sum(die_values)
21
22  if sum_of_dice in (7, 11):  # win
23      game_status = 'WON'
24  elif sum_of_dice in (2, 3, 12):  # lose
25      game_status = 'LOST'
26  else:  # remember point
27      game_status = 'CONTINUE'
28      my_point = sum_of_dice
29      print('Point is', my_point)
30
31  # continue rolling until player wins or loses
32  while game_status == 'CONTINUE':
33      die_values = roll_dice()
34      display_dice(die_values)
35      sum_of_dice = sum(die_values)
36
37      if sum_of_dice == my_point:  # win by making point
38          game_status = 'WON'
39      elif sum_of_dice == 7:  # lose by rolling 7
40          game_status = 'LOST'
41
42  # display "wins" or "loses" message
43  if game_status == 'WON':
44      print('Player wins')
45  else:
46      print('Player loses')
Player wins

Player loses

Point is 9
Player rolled 4 + 4 = 8
Player rolled 2 + 3 = 5
Player rolled 5 + 4 = 9
Player wins

Point is 6
Player rolled 1 + 6 = 7
Player loses
Function roll_dice—Returning Multiple Values Via a Tuple
Function roll_dice (lines 5–9) simulates rolling two dice on each roll. The function is defined once, then called from several places in the program (lines 16 and 33). The empty parameter list indicates that roll_dice does not require arguments to perform its task.
The built-in and custom functions you’ve called so far each return one value. Sometimes it’s useful to return more than one value, as in roll_dice, which returns both die values (line 9) as a tuple—an immutable (that is, unmodifiable) sequence of values. To create a tuple, separate its values with commas, as in line 9:
(die1, die2)
This is known as packing a tuple. The parentheses are optional, but we recommend using them for clarity. We discuss tuples in depth in the next chapter.
Function display_dice
To use a tuple’s values, you can assign them to a comma-separated list of variables, which unpacks the tuple. To display each roll of the dice, the function display_dice (defined in lines 11–14 and called in lines 17 and 34) unpacks the tuple argument it receives (line 13). The number of variables to the left of = must match the number of elements in the tuple; otherwise, a ValueError occurs. Line 14 prints a formatted string containing both die values and their sum. We calculate the sum of the dice by passing the tuple to the built-in sum function—like a list, a tuple is a sequence.
Note that functions roll_dice and display_dice each begin their blocks with a docstring that states what the function does. Also, both functions contain local variables die1 and die2. These variables do not “collide,” because they belong to different functions’ blocks. Each local variable is accessible only in the block that defined it.
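Packing, unpacking and the ValueError for a mismatched number of variables can be sketched in a few lines. The fixed die values and the names first, second and third are placeholders chosen for this illustration.

```python
dice = (3, 4)        # pack two values into a tuple
die1, die2 = dice    # unpack into two variables
total = sum(dice)    # a tuple is a sequence, so sum accepts it

# Three variables cannot receive a two-element tuple.
try:
    first, second, third = dice
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

print(f'Player rolled {die1} + {die2} = {total}')
print(mismatch_detected)
```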
First Roll
When the script begins executing, lines 16–17 roll the dice and display the results. Line 20 calculates the sum of the dice for use in lines 22–29. You can win or lose on the first roll or any subsequent roll. The variable game_status keeps track of the win/loss status.
The in operator in line 22
sum_of_dice in (7, 11)
tests whether the tuple (7, 11) contains sum_of_dice’s value. If this condition is True, you rolled a 7 or an 11. In this case, you won on the first roll, so the script sets game_status to 'WON'. The operator’s right operand can be any iterable. There’s also a not in operator to determine whether a value is not in an iterable. The preceding concise condition is equivalent to
(sum_of_dice == 7) or (sum_of_dice == 11)
Similarly, the condition in line 24
sum_of_dice in (2, 3, 12)
tests whether the tuple (2, 3, 12) contains sum_of_dice’s value. If so, you lost on the first roll, so the script sets game_status to 'LOST'.
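As a small sanity check of the equivalence described above, the following sketch compares the concise in condition with the expanded or condition for every possible sum of two dice, and confirms that not in gives the opposite answer. The variable names are our own.

```python
all_agree = True

for sum_of_dice in range(2, 13):  # every possible sum of two dice
    concise = sum_of_dice in (7, 11)
    expanded = (sum_of_dice == 7) or (sum_of_dice == 11)
    negated = sum_of_dice not in (7, 11)

    # concise and expanded must match; negated must be the opposite
    if concise != expanded or negated == concise:
        all_agree = False

print(all_agree)
```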
For any other sum of the dice (4, 5, 6, 8, 9 or 10):
- line 27 sets game_status to 'CONTINUE' so you can continue rolling
- line 28 stores the sum of the dice in my_point to keep track of what you must roll to win and
- line 29 displays my_point.
Subsequent Rolls
If game_status is equal to 'CONTINUE' (line 32), you did not win or lose, so the while statement’s suite (lines 33–40) executes. Each loop iteration calls roll_dice, displays the die values and calculates their sum. If sum_of_dice is equal to my_point (line 37) or 7 (line 39), the script sets game_status to 'WON' or 'LOST', respectively, and the loop terminates. Otherwise, the while loop continues executing with the next roll.
Displaying the Final Results
When the loop terminates, the script proceeds to the if…else statement (lines 43–46), which prints 'Player wins' if game_status is 'WON', or 'Player loses' otherwise.
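One way to experiment further is to wrap the game logic in a function and play many games. The sketch below is not the script above: play_craps is a hypothetical wrapper that returns the final status instead of printing each roll, and the seed and game count are arbitrary choices. Over many games the win fraction should land a little under one half, in line with the well-known theoretical figure of roughly 49.3%.

```python
import random

def roll_dice():
    """Roll two dice and return their face values as a tuple."""
    die1 = random.randrange(1, 7)
    die2 = random.randrange(1, 7)
    return (die1, die2)

def play_craps():
    """Play one game and return 'WON' or 'LOST' (hypothetical wrapper)."""
    sum_of_dice = sum(roll_dice())

    if sum_of_dice in (7, 11):       # win on first roll
        return 'WON'
    if sum_of_dice in (2, 3, 12):    # lose on first roll
        return 'LOST'

    my_point = sum_of_dice           # remember point
    while True:                      # roll until the game is decided
        sum_of_dice = sum(roll_dice())
        if sum_of_dice == my_point:  # win by making point
            return 'WON'
        if sum_of_dice == 7:         # lose by rolling 7
            return 'LOST'

random.seed(32)  # arbitrary seed, for reproducibility
games = 10_000
wins = 0
for game in range(games):
    if play_craps() == 'WON':
        wins += 1

print(f'Won {wins} of {games} games')
```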
4.6 PYTHON STANDARD LIBRARY
Typically, you write Python programs by combining functions and classes (that is, custom types) that you create with preexisting functions and classes defined in modules, such as those in the Python Standard Library and other libraries. A key programming goal is to avoid “reinventing the wheel.”
A module is a file that groups related functions, data and classes. The type Decimal from the Python Standard Library’s decimal module is actually a class. We introduced classes briefly in Chapter 1 and discuss them in detail in the “Object-Oriented Programming” chapter. A package groups related modules. In this book, you’ll work with many preexisting modules and packages, and you’ll create your own modules—in fact, every Python source-code (.py) file you create is a module. Creating packages is beyond this book’s scope. They’re typically used to organize a large library’s functionality into smaller subsets that are easier to maintain and can be imported separately for convenience. For example, the matplotlib visualization library that we use in Section 5.17 has extensive functionality (its documentation is over 2300 pages), so we’ll import only the subsets we need in our examples (pyplot and animation).
The Python Standard Library is provided with the core Python language. Its packages and modules contain capabilities for a wide variety of everyday programming tasks.
You can see a complete list of the standard library modules at
https://docs.python.org/3/library/
You’ve already used capabilities from the decimal, statistics and random modules. In the next section, you’ll use mathematics capabilities from the math module. You’ll see many other Python Standard Library modules throughout the book’s examples, including many of those in the following list:
- collections—Data structures beyond lists, tuples, dictionaries and sets.
- Cryptography modules—Encrypting data for secure transmission.
- csv—Processing comma-separated value files (like those in Excel).
- datetime—Date and time manipulations. Also modules time and calendar.
- decimal—Fixed-point and floating-point arithmetic, including monetary calculations.
- doctest—Embed validation tests and expected results in docstrings for simple unit testing.
- math—Common math constants and operations.
- os—Interacting with the operating system.
- profile, pstats, timeit—Performance analysis.
- random—Pseudorandom numbers.
- re—Regular expressions for pattern matching.
- sqlite3—SQLite relational database access.
- statistics—Mathematical statistics functions such as mean, median, mode and variance.
- string—String processing.
- sys—Command-line argument processing; standard input, standard output and standard error streams.
- gettext and locale—Internationalization and localization modules.
- json—JavaScript Object Notation (JSON) processing used with web services and NoSQL document databases.
- tkinter—Graphical user interfaces (GUIs) and canvas-based graphics.
- turtle—Turtle graphics.
- webbrowser—For conveniently displaying web pages in Python apps.
4.7 MATH MODULE FUNCTIONS
The math module defines functions for performing various common mathematical calculations. Recall from the previous chapter that an import statement of the following form enables you to use a module’s definitions via the module’s name and a dot (.):
In [1]: import math
For example, the following snippet calculates the square root of 900 by calling the math module’s sqrt function, which returns its result as a float value:
In [2]: math.sqrt(900)
Out[2]: 30.0
Similarly, the following snippet calculates the absolute value of -10 by calling the math module’s fabs function, which returns its result as a float value:
In [3]: math.fabs(-10)
Out[3]: 10.0
The math module provides many more functions. You can view the complete list at
https://docs.python.org/3/library/math.html
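To give a flavor of what the module offers, the following sketch calls a handful of math functions. The selection is our own; the argument values are arbitrary.

```python
import math

root = math.sqrt(900)         # square root, returned as a float
magnitude = math.fabs(-10)    # absolute value, returned as a float
rounded_up = math.ceil(9.2)   # smallest integer not less than 9.2
rounded_down = math.floor(9.2)  # largest integer not greater than 9.2
power = math.pow(2.0, 7.0)    # 2.0 raised to the power 7.0, as a float

print(root, magnitude, rounded_up, rounded_down, power)
```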
4.8 USING IPYTHON TAB COMPLETION FOR DISCOVERY
You can view a module’s documentation in IPython interactive mode via tab completion—a discovery feature that speeds your coding and discovery processes. After you type a portion of an identifier and press Tab, IPython completes the identifier for you or provides a list of identifiers that begin with what you’ve typed so far. This may vary based on your operating system platform and what you have imported into your IPython session:
In [2]: ma<Tab>
map %macro %%markdown
math %magic %matplotlib
max() %man
You can scroll through the identifiers with the up and down arrow keys. As you do, IPython highlights an identifier and shows it to the right of the In [] prompt.
Viewing Identifiers in a Module
To view a list of identifiers defined in a module, type the module’s name and a dot (.), then press Tab:
In [3]: math.<Tab>
        acos()   atan()   copysign()  e       expm1()
        acosh()  atan2()  cos()       erf()   fabs()
        asin()   atanh()  cosh()      erfc()  factorial() >
        asinh()  ceil()   degrees()   exp()   floor()
If there are more identifiers to display than are currently shown, IPython displays the > symbol (on some platforms) at the right edge, in this case to the right of factorial(). You can use the up and down arrow keys to scroll through the list. In the list of identifiers:
- Those followed by parentheses are functions (or methods, as you’ll see later).
- Single-word identifiers (such as Employee) that begin with an uppercase letter and multi-word identifiers in which each word begins with an uppercase letter (such as CommissionEmployee) represent class names (there are none in the preceding list). This naming convention, which the Style Guide for Python Code recommends, is known as CamelCase because the uppercase letters stand out like a camel’s humps.
- Lowercase identifiers without parentheses, such as pi (not shown in the preceding list) and e, are variables. The identifier pi evaluates to 3.141592653589793, and the identifier e evaluates to 2.718281828459045. In the math module, pi and e represent the mathematical constants π and e, respectively.
Python does not have constants, although many objects in Python are immutable (nonmodifiable). So even though pi and e are real-world constants, you must not assign new values to them, because that would change their values. To help distinguish constants from other variables, the style guide recommends naming your custom constants with all capital letters.
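The naming convention can be illustrated with a short sketch that combines math.pi with a custom constant. RADIUS_IN_METERS and its value are hypothetical; nothing in Python prevents reassigning it, so the capital letters serve only as a signal to readers of the code.

```python
import math

# Custom "constant": all capitals signal that it should not be reassigned.
RADIUS_IN_METERS = 2.0  # hypothetical value for illustration

area = math.pi * RADIUS_IN_METERS ** 2
print(f'{area:.2f}')
```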
Using the Currently Highlighted Function
As you navigate through the identifiers, if you wish to use a currently highlighted function, simply start typing its arguments in parentheses. IPython then hides the auto-completion list. If you need more information about the currently highlighted item, you can view its docstring by typing a question mark (?) following the name and pressing Enter to view the help documentation. The following shows the fabs function’s docstring:
In [4]: math.fabs?
Docstring:
fabs(x)
Return the absolute value of the float x.
Type: builtin_function_or_method
The builtin_function_or_method shown above indicates that fabs is part of a Python Standard Library module. Such modules are considered to be built into Python. In this case, fabs is a built-in function from the math module.
4.9 DEFAULT PARAMETER VALUES
When defining a function, you can specify that a parameter has a default parameter value. When calling the function, if you omit the argument for a parameter with a default parameter value, the default value for that parameter is automatically passed.
Let’s define a function rectangle_area with default parameter values:
In [1]: def rectangle_area(length=2, width=3):
   ...:     """Return a rectangle's area."""
   ...:     return length * width
   ...:
You specify a default parameter value by following a parameter’s name with an = and a value—in this case, the default parameter values are 2 and 3 for length and width, respectively. Any parameters with default parameter values must appear in the parameter list to the right of parameters that do not have defaults.
The following call to rectangle_area has no arguments, so IPython uses both default parameter values as if you had called rectangle_area(2, 3):
In [2]: rectangle_area()
Out[2]: 6
The following call to rectangle_area has only one argument. Arguments are assigned to parameters from left to right, so 10 is used as the length. The interpreter passes the default parameter value 3 for the width as if you had called rectangle_area(10, 3):
In [3]: rectangle_area(10)
Out[3]: 30
The following call to rectangle_area has arguments for both length and width, so IPython ignores the default parameter values:
In [4]: rectangle_area(10, 5)
Out[4]: 50
4.10 KEYWORD ARGUMENTS
When calling functions, you can use keyword arguments to pass arguments in any order. To demonstrate keyword arguments, we redefine the rectangle_area function—this time without default parameter values:
In [1]: def rectangle_area(length, width):
   ...:     """Return a rectangle's area."""
   ...:     return length * width
   ...:
Each keyword argument in a call has the form parametername=value. The following call shows that the order of keyword arguments does not matter—they do not need to match the corresponding parameters’ positions in the function definition:
In [2]: rectangle_area(width=5, length=10)
Out[2]: 50
In each function call, you must place keyword arguments after a function’s positional arguments—that is, any arguments for which you do not specify the parameter name. Such arguments are assigned to the function’s parameters left-to-right, based on the argument’s positions in the argument list. Keyword arguments are also helpful for improving the readability of function calls, especially for functions with many arguments.
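Keyword arguments also combine naturally with default parameter values. The sketch below assumes a rectangle_area with defaults of 2 and 3 and shows a call that mixes a positional argument with a keyword argument, plus a call that overrides only width. The variable names are our own.

```python
def rectangle_area(length=2, width=3):
    """Return a rectangle's area."""
    return length * width

# Keyword arguments in any order
area_keywords = rectangle_area(width=5, length=10)

# Positional argument first, then a keyword argument
area_mixed = rectangle_area(10, width=5)

# Only width is supplied; length falls back to its default of 2
area_width_only = rectangle_area(width=5)

print(area_keywords, area_mixed, area_width_only)
```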
4.11 ARBITRARY ARGUMENT LISTS
Functions with arbitrary argument lists, such as built-in functions min and max, can receive any number of arguments. Consider the following min call:
min(88, 75, 96, 55, 83)
The function’s documentation states that min has two required parameters (named arg1 and arg2) and an optional third parameter of the form *args, indicating that the function can receive any number of additional arguments. The * before the parameter name tells Python to pack any remaining arguments into a tuple that’s passed to the args parameter. In the call above, parameter arg1 receives 88, parameter arg2 receives 75 and parameter args receives the tuple (96, 55, 83).
Defining a Function with an Arbitrary Argument List
Let’s define an average function that can receive any number of arguments:
In [1]: def average(*args):
   ...:     return sum(args) / len(args)
   ...:
The parameter name args is used by convention, but you may use any identifier. If the function has multiple parameters, the *args parameter must come after all the positional parameters; any parameter listed after *args can be passed only as a keyword argument.
Now, let’s call average several times with arbitrary argument lists of different lengths:
In [2]: average(5, 10)
Out[2]: 7.5
In [3]: average(5, 10, 15)
Out[3]: 10.0
In [4]: average(5, 10, 15, 20)
Out[4]: 12.5
To calculate the average, divide the sum of the args tuple’s elements (returned by built-in function sum) by the tuple’s number of elements (returned by built-in function len). Note in our average definition that if the length of args is 0, a ZeroDivisionError occurs. In the next chapter, you’ll see how to access a tuple’s elements without unpacking them.
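One way to rule out the empty call, sketched here as an alternative rather than as the definition used in this chapter, is to follow min's pattern and require one ordinary parameter before the *args-style parameter. The parameter names first and rest are our own.

```python
def average(first, *rest):
    """Return the average of one or more numbers."""
    values = (first, *rest)  # repack everything into a single tuple
    return sum(values) / len(values)

two = average(5, 10)   # first=5, rest=(10,)
one = average(7)       # first=7, rest=()

# Calling average() with no arguments now fails with a TypeError
# (a missing required argument) instead of a ZeroDivisionError.
try:
    average()
except TypeError:
    empty_call_rejected = True
```

With this signature the error is reported at the call, before any arithmetic is attempted.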
Passing an Iterable’s Individual Elements as Function Arguments
You can unpack a tuple’s, list’s or other iterable’s elements to pass them as individual function arguments. The * operator, when applied to an iterable argument in a function call, unpacks its elements. The following code creates a five-element grades list, then uses the expression *grades to unpack its elements as average’s arguments:
In [5]: grades = [88, 75, 96, 55, 83]
In [6]: average(*grades)
Out[6]: 79.4
The call shown above is equivalent to average(88, 75, 96, 55, 83).
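Unpacking is not limited to lists. The following sketch, which repeats the average definition so that it runs on its own, unpacks a list, a range, and a list together with a tuple. The names more, from_list, from_range and combined are illustrative only.

```python
def average(*args):
    return sum(args) / len(args)

grades = [88, 75, 96, 55, 83]
more = (90, 100)

from_list = average(*grades)          # same as average(88, 75, 96, 55, 83)
from_range = average(*range(1, 6))    # same as average(1, 2, 3, 4, 5)
combined = average(*grades, *more)    # several iterables in one call

print(from_list, from_range)
```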
4.12 METHODS: FUNCTIONS THAT BELONG TO OBJECTS
A method is simply a function that you call on an object using the form
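object_name.method_name(arguments)
For example, the following session creates the string variable s, then calls the lower and upper methods on it. Each method returns a new string, and s itself is unchanged:
In [1]: s = 'Hello'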
In [2]: s.lower() # call lower method on string object s
Out[2]: 'hello'
In [3]: s.upper()
Out[3]: 'HELLO'
In [4]: s
Out[4]: 'Hello'
The Python Standard Library reference at
https://docs.python.org/3/library/index.html
describes the methods of built-in types and the types in the Python Standard Library. In the “Object-Oriented Programming” chapter, you’ll create custom types called classes and define custom methods that you can call on objects of those classes.
4.13 SCOPE RULES
Each identifier has a scope that determines where you can use it in your program. For that portion of the program, the identifier is said to be “in scope.”
Local Scope
A local variable’s identifier has local scope. It’s “in scope” only from its definition to the end of the function’s block. It “goes out of scope” when the function returns to its caller. So, a local variable can be used only inside the function that defines it.
Global Scope
Identifiers defined outside any function (or class) have global scope—these may include functions, variables and classes. Variables with global scope are known as global variables. Identifiers with global scope can be used in a .py file or interactive session anywhere after they’re defined.
Accessing a Global Variable from a Function
You can access a global variable’s value inside a function:
In [1]: x = 7
In [2]: def access_global():
   ...:     print('x printed from access_global:', x)
   ...:
In [3]: access_global()
x printed from access_global: 7
However, by default, you cannot modify a global variable in a function—when you first assign a value to a variable in a function’s block, Python creates a new local variable:
In [4]: def try_to_modify_global():
   ...:     x = 3.5
   ...:     print('x printed from try_to_modify_global:', x)
   ...:
In [5]: try_to_modify_global()
x printed from try_to_modify_global: 3.5
In [6]: x
Out[6]: 7
In function try_to_modify_global’s block, the local x shadows the global x, making it inaccessible in the scope of the function’s block. Snippet [6] shows that global variable x still exists and has its original value (7) after function try_to_modify_global executes.
To modify a global variable in a function’s block, you must use a global statement to declare that the variable is defined in the global scope:
In [7]: def modify_global():
   ...:     global x
   ...:     x = 'hello'
   ...:     print('x printed from modify_global:', x)
   ...:
In [8]: modify_global()
x printed from modify_global: hello
In [9]: x
Out[9]: 'hello'
Blocks vs. Suites
You’ve now defined function blocks and control statement suites. When you create a variable in a block, it’s local to that block. However, when you create a variable in a control statement’s suite, the variable’s scope depends on where the control statement is defined:
- If the control statement is in the global scope, then any variables defined in the control statement have global scope.
- If the control statement is in a function’s block, then any variables defined in the control statement have local scope.
We’ll continue our scope discussion in the “Object-Oriented Programming” chapter when we introduce custom classes.
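The two cases above can be sketched as follows. The names suite_var, show_scope, last and last_is_global are hypothetical and exist only for this illustration.

```python
if True:                      # control statement at global scope
    suite_var = 'global'      # so suite_var has global scope

def show_scope():
    for i in range(3):        # control statement inside a function's block
        last = i              # i and last are local to show_scope
    return last               # still in scope after the loop's suite ends

result = show_scope()

# last was created inside the function, so it never becomes a global name.
last_is_global = 'last' in globals()

print(suite_var, result, last_is_global)
```

Note that the loop's suite does not create a scope of its own: last remains usable after the for statement finishes, because it belongs to the enclosing function's block.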
Shadowing Functions
In the preceding chapters, when summing values, we stored the sum in a variable named total. The reason we did this is that sum is a built-in function. If you define a variable named sum, it shadows the built-in function, making it inaccessible in your code. When you execute the following assignment, Python binds the identifier sum to the int object containing 15. At this point, the identifier sum no longer references the built-in function. So, when you try to use sum as a function, a TypeError occurs:
In [10]: sum = 15
In [11]: sum
Out[11]: 15
In [12]: sum([10, 5])
-------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-1237d97a65fb> in <module>()
----> 1 sum([10, 5])

TypeError: 'int' object is not callable
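A shadowed built-in is hidden, not destroyed. The sketch below keeps the shadowing inside a function (demo_shadowing, a name we made up) and reaches the original function through the standard library's builtins module. In an interactive session, executing del sum removes the shadowing global variable and has the same restoring effect.

```python
import builtins

def demo_shadowing():
    sum = 15                      # local variable shadows built-in sum here
    try:
        sum([10, 5])              # fails: sum now refers to an int
    except TypeError as error:
        message = str(error)
    total = builtins.sum([10, 5])  # the built-in is still reachable by module
    return message, total

message, total = demo_shadowing()
print(message)
print(total)
```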
Statements at Global Scope
In the scripts you’ve seen so far, we’ve written some statements outside functions at the global scope and some statements inside function blocks. Script statements at global scope execute as soon as they’re encountered by the interpreter, whereas statements in a block execute only when the function is called.
4.14 IMPORT: A DEEPER LOOK
You’ve imported modules (such as math and random) with a statement like:
import module_name
then accessed their features via each module’s name and a dot (.). Also, you’ve imported a specific identifier from a module (such as the decimal module’s Decimal type) with a statement like:
from module_name import identifier
then used that identifier without having to precede it with the module name and a dot (.).
Importing Multiple Identifiers from a Module
Using the from…import statement, you can import a comma-separated list of identifiers from a module, then use them in your code without having to precede them with the module name and a dot (.):
In [1]: from math import ceil, floor
In [2]: ceil(10.3)
Out[2]: 11
In [3]: floor(10.7)
Out[3]: 10
Trying to use a function that’s not imported causes a NameError, indicating that the name is not defined.
Caution: Avoid Wildcard Imports
You can import all identifiers defined in a module with a wildcard import of the form
from modulename import *
This makes all of the module’s identifiers available for use in your code. Importing a module’s identifiers with a wildcard import can lead to subtle errors—it’s considered a dangerous practice that you should avoid. Consider the following snippets:
In [4]: e = 'hello'
In [5]: from math import *
In [6]: e
Out[6]: 2.718281828459045
Initially, we assign the string 'hello' to a variable named e. After executing snippet [5] though, the variable e is replaced, possibly by accident, with the math module’s constant e, representing the mathematical floating-point value e.
Binding Names for Modules and Module Identifiers
Sometimes it’s helpful to import a module and use an abbreviation for it to simplify your code. The import statement’s as clause allows you to specify the name used to reference the module’s identifiers. For example, in Section 3.14 we could have imported the statistics module and accessed its mean function as follows:
In [7]: import statistics as stats
In [8]: grades = [85, 93, 45, 87, 93]
In [9]: stats.mean(grades)
Out[9]: 80.6
As you’ll see in later chapters, import as is frequently used to import Python libraries with convenient abbreviations, like stats for the statistics module. As another example, we’ll use the numpy module which typically is imported with
import numpy as np
Library documentation often mentions popular shorthand names.
Typically, when importing a module, you should use import or import as statements, then access the module through the module name or the abbreviation following the as keyword, respectively. This ensures that you do not accidentally import an identifier that conflicts with one in your code.
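To make the contrast with a wildcard import concrete, here is a small sketch in which a variable named e and the math module's constant e coexist, because the constant is reached only through the module name. The variable names are illustrative.

```python
import math
import statistics as stats

e = 'hello'                   # our own variable named e

own_e = e                     # still the string we assigned
math_e = math.e               # the constant, accessed through the module name
mean_grade = stats.mean([85, 93, 45, 87, 93])

print(own_e, math_e, mean_grade)
```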
4.15 PASSING ARGUMENTS TO FUNCTIONS: A DEEPER LOOK
Let’s take a closer look at how arguments are passed to functions. In many programming languages, there are two ways to pass arguments—pass-by-value and pass-by-reference (sometimes called call-by-value and call-by-reference, respectively):
- With pass-by-value, the called function receives a copy of the argument’s value and works exclusively with that copy. Changes to the function’s copy do not affect the original variable’s value in the caller.
- With pass-by-reference, the called function can access the argument’s value in the caller directly and modify the value if it’s mutable.
Python arguments are always passed by reference. Some people call this pass-by-object-reference, because “everything in Python is an object.” When a function call provides an argument, Python copies the argument object’s reference—not the object itself—into the corresponding parameter. This is important for performance. Functions often manipulate large objects—frequently copying them would consume large amounts of computer memory and significantly slow program performance.
Memory Addresses, References and “Pointers”
You interact with an object via a reference, which behind the scenes is that object’s address (or location) in the computer’s memory—sometimes called a “pointer” in other languages. After an assignment like
x = 7
the variable x does not actually contain the value 7. Rather, it contains a reference to an object containing 7 stored elsewhere in memory. You might say that x “points to” (that is, references) the object containing 7, as in the diagram below:
[Figure: the variable x refers to (“points to”) an object containing the value 7.]
Built-In Function id and Object Identities
Let’s consider how we pass arguments to functions. First, let’s create the integer variable x mentioned above—shortly we’ll use x as a function argument:
In [1]: x = 7
Now x refers to (or “points to”) the integer object containing 7. No two separate objects can reside at the same address in memory, so every object in memory has a unique address. Though we can’t see an object’s address, we can use the built-in id function to obtain a unique int value which identifies only that object while it remains in memory (you’ll likely get a different value when you run this on your computer):
In [2]: id(x)
Out[2]: 4350477840
The integer result of calling id is known as the object’s identity. No two objects in memory can have the same identity. We’ll use object identities to demonstrate that objects are passed by reference.
Passing an Object to a Function
Let’s define a cube function that displays its parameter’s identity, then returns the parameter’s value cubed:
In [3]: def cube(number):
   ...:     print('id(number):', id(number))
   ...:     return number ** 3
   ...:
Next, let’s call cube with the argument x, which refers to the integer object containing 7:
In [4]: cube(x)
id(number): 4350477840
Out[4]: 343
The identity displayed for cube’s parameter number—4350477840—is the same as that displayed for x previously. Since every object has a unique identity, both the argument x and the parameter number refer to the same object while cube executes.
So when function cube uses its parameter number in its calculation, it gets the value of number from the original object in the caller.
Testing Object Identities with the is Operator
You also can prove that the argument and the parameter refer to the same object with Python’s is operator, which returns True if its two operands have the same identity:
In [5]: def cube(number):
   ...:     print('number is x:', number is x)  # x is a global variable
   ...:     return number ** 3
   ...:
In [6]: cube(x)
number is x: True
Out[6]: 343
Immutable Objects as Arguments
When a function receives as an argument a reference to an immutable (unmodifiable) object—such as an int, float, string or tuple—even though you have direct access to the original object in the caller, you cannot modify the original immutable object’s value. To prove this, first let’s have cube display id(number) before and after assigning a new object to the parameter number via an augmented assignment:
In [7]: def cube(number):
   ...:     print('id(number) before modifying number:', id(number))
   ...:     number **= 3
   ...:     print('id(number) after modifying number:', id(number))
   ...:     return number
   ...:
In [8]: cube(x)
id(number) before modifying number: 4350477840
id(number) after modifying number: 4396653744
Out[8]: 343
When we call cube(x), the first print statement shows that id(number) initially is the same as id(x) in snippet [2]. Numeric values are immutable, so the statement
number **= 3
actually creates a new object containing the cubed value, then assigns that object’s reference to parameter number. Recall that if there are no more references to the original object, it will be garbage collected. Function cube’s second print statement shows the new object’s identity. Object identities must be unique, so number must refer to a different object. To show that x was not modified, we display its value and identity again:
In [9]: print(f'x = {x}; id(x) = {id(x)}')
x = 7; id(x) = 4350477840
Mutable Objects as Arguments
In the next chapter, we’ll show that when a reference to a mutable object like a list is passed to a function, the function can modify the original object in the caller.
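As a brief preview, and only as a sketch (the next chapter treats lists properly), the following code contrasts modifying a list through a parameter with merely rebinding the parameter. The function names append_grade and replace_grades are hypothetical.

```python
def append_grade(grades, grade):
    """Modify the caller's list object in place."""
    grades.append(grade)

def replace_grades(grades):
    """Rebind the local parameter only; the caller's list is untouched."""
    grades = []
    return grades

scores = [88, 75]
identity_before = id(scores)

append_grade(scores, 96)      # the caller sees the appended element
replace_grades(scores)        # the caller sees no change

print(scores)
print(id(scores) == identity_before)
```

The identity check shows that scores refers to the same list object throughout; only its contents changed.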
4.16 RECURSION
Let’s write a program to perform a famous mathematical calculation. Consider the factorial of a positive integer n, which is written n! and pronounced “n factorial.” This is the product
n · (n – 1) · (n – 2) ··· 1
with 1! equal to 1 and 0! defined to be 1. For example, 5! is the product 5 · 4 · 3 · 2 · 1, which is equal to 120.
Iterative Factorial Approach
You can calculate 5! iteratively with a for statement, as in:
In [1]: factorial = 1
In [2]: for number in range(5, 0, -1):
   ...:     factorial *= number
   ...:
In [3]: factorial
Out[3]: 120
Recursive Problem Solving
Recursive problem-solving approaches have several elements in common. When you call a recursive function to solve a problem, it’s actually capable of solving only the simplest case(s), or base case(s). If you call the function with a base case, it immediately returns a result. If you call the function with a more complex problem, it typically divides the problem into two pieces: one that the function knows how to do and one that it does not know how to do. To make recursion feasible, this latter piece must be a slightly simpler or smaller version of the original problem. Because this new problem resembles the original problem, the function calls a fresh copy of itself to work on the smaller problem. This is referred to as a recursive call and is also called the recursion step. This concept of separating the problem into two smaller portions is a form of the divide-and-conquer approach introduced earlier in the book.
The recursion step executes while the original function call is still active (i.e., it has not finished executing). It can result in many more recursive calls as the function divides each new subproblem into two conceptual pieces. For the recursion to eventually terminate, each time the function calls itself with a simpler version of the original problem, the sequence of smaller and smaller problems must converge on a base case. When the function recognizes the base case, it returns a result to the previous copy of the function. A sequence of returns ensues until the original function call returns the final result to the caller.
Recursive Factorial Approach
You can arrive at a recursive factorial representation by observing that n! can be written as:
n! = n · (n – 1)!
For example, 5! is equal to 5 · 4!, as in:
5! = 5 · (4 · 3 · 2 · 1)
5! = 5 · (4!)
Visualizing Recursion
The evaluation of 5! would proceed as shown below. The left column shows how the succession of recursive calls proceeds until 1! (the base case) is evaluated to be 1, which terminates the recursion. The right column shows from bottom to top the values returned from each recursive call to its caller until the final value is calculated and returned.
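[Figure: recursive evaluation of 5!. Left: the succession of recursive calls down to the base case 1!. Right: the values returned from each recursive call to its caller.]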
Implementing a Recursive Factorial Function
The following session uses recursion to calculate and display the factorials of the integers 0 through 10:
In [4]: def factorial(number):
   ...:     """Return factorial of number."""
   ...:     if number <= 1:
   ...:         return 1
   ...:     return number * factorial(number - 1)  # recursive call
   ...:
In [5]: for i in range(11):
   ...:     print(f'{i}! = {factorial(i)}')
   ...:
0! = 1
1! = 1
2! = 2
3! = 6
4! = 24
5! = 120
6! = 720
7! = 5040
8! = 40320
9! = 362880
10! = 3628800
Snippet [4]’s recursive function factorial first determines whether the terminating condition number <= 1 is True. If this condition is True (the base case), factorial returns 1 and no further recursion is necessary. If number is greater than 1, the second return statement expresses the problem as the product of number and a recursive call to factorial that evaluates factorial(number - 1). This is a slightly smaller problem than the original calculation, factorial(number). Note that function factorial must receive a nonnegative argument. We do not test for this case.
The loop in snippet [5] calls the factorial function for the values from 0 through 10. The output shows that factorial values grow quickly. Python does not limit the size of an integer, unlike many other programming languages.
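As a quick check on integer size, the following sketch compares the recursive factorial from snippet [4] with the math module's factorial function for an argument whose result no longer fits in a 64-bit integer. The argument 30 is an arbitrary choice for the illustration.

```python
import math

def factorial(number):
    """Return factorial of number."""
    if number <= 1:
        return 1
    return number * factorial(number - 1)  # recursive call

big = factorial(30)

matches_library = big == math.factorial(30)
exceeds_64_bits = big > 2 ** 64

print(big)
print(matches_library, exceeds_64_bits)
```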
Indirect Recursion
A recursive function may call another function, which may, in turn, make a call back to the recursive function. This is known as an indirect recursive call or indirect recursion. For example, function A calls function B, which makes a call back to function A. This is still recursion because the second call to function A is made while the first call to function A is active. That is, the first call to function A has not yet finished executing (because it is waiting on function B to return a result to it) and has not returned to function A’s original caller.
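A classic small illustration of indirect recursion, offered here as a sketch rather than a practical technique, is a pair of functions that decide whether a nonnegative integer is even or odd by calling each other. The names is_even and is_odd are ours.

```python
def is_even(n):
    """Return True if nonnegative integer n is even."""
    if n == 0:                 # base case
        return True
    return is_odd(n - 1)       # indirect recursion via is_odd

def is_odd(n):
    """Return True if nonnegative integer n is odd."""
    if n == 0:                 # base case
        return False
    return is_even(n - 1)      # calls back into is_even

print(is_even(10), is_odd(7), is_even(3))
```

Each call reduces n by 1, so the chain of alternating calls converges on the base case n == 0.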
Stack Overflow and Infinite Recursion
Of course, the amount of memory in a computer is finite, so only a certain amount of memory can be used to store activation records on the function-call stack. If more recursive function calls occur than can have their activation records stored on the stack, a fatal error known as stack overflow occurs. This typically is the result of infinite recursion, which can be caused by omitting the base case or writing the recursion step incorrectly so that it does not converge on the base case. This error is analogous to the problem of an infinite loop in an iterative (nonrecursive) solution.
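In the standard CPython interpreter, runaway recursion is normally stopped before the stack itself is exhausted: once the call depth exceeds a configurable limit, Python raises a RecursionError. The sketch below omits the base case on purpose; the function name countdown_forever is hypothetical.

```python
import sys

def countdown_forever(n):
    """Recursive function with no base case (deliberately incorrect)."""
    return countdown_forever(n - 1)

limit = sys.getrecursionlimit()   # the current maximum call depth

try:
    countdown_forever(5)
except RecursionError:
    caught = True                  # the interpreter stopped the recursion

print(limit, caught)
```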
4.17 FUNCTIONAL-STYLE PROGRAMMING
Like other popular languages, such as Java and C#, Python is not a purely functional language. Rather, it offers “functional-style” features that help you write code which is less likely to contain errors, more concise and easier to read, debug and modify.
Functional-style programs also can be easier to parallelize to get better performance on today’s multicore processors. The chart below lists most of Python’s key functional-style programming capabilities and shows in parentheses the chapters in which we initially cover many of them.
[Chart: Python’s key functional-style programming capabilities, with the chapters in which many of them are first covered.]
We cover most of these features throughout the book, many with code examples and others from a literacy perspective. You’ve already used list, string and built-in function range iterators with the for statement, and several reductions (functions sum, len, min and max). We discuss declarative programming, immutability and internal iteration below.
What vs. How
As the tasks you perform get more complicated, your code can become harder to read, debug and modify, and more likely to contain errors. Specifying how the code works can become complex.
Functional-style programming lets you simply say what you want to do. It hides many details of how to perform each task. Typically, library code handles the how for you. As you’ll see, this can eliminate many errors.
Consider the for statement in many other programming languages. Typically, you must specify all the details of counter-controlled iteration: a control variable, its initial value, how to increment it and a loop-continuation condition that uses the control variable to determine whether to continue iterating. This style of iteration is known as external iteration and is error-prone. For example, you might provide an incorrect initializer, increment or loop-continuation condition. External iteration mutates (that is, modifies) the control variable, and the for statement’s suite often mutates other variables as well. Every time you modify variables you could introduce errors.
Functional-style programming emphasizes immutability. That is, it avoids operations that modify variables’ values. We’ll say more in the next chapter.
Python’s for statement and range function hide most counter-controlled iteration details. You specify what values range should produce and the variable that should receive each value as it’s produced. Function range knows how to produce those values. Similarly, the for statement knows how to get each value from range and how to stop iterating when there are no more values. Specifying what, but not how, is an important aspect of internal iteration, a key functional-style programming concept.
The Python built-in functions sum, min and max each use internal iteration. To total the elements of the list grades, you simply declare what you want to do, that is, sum(grades). Function sum knows how to iterate through the list and add each element to the running total. Stating what you want done rather than programming how to do it is known as declarative programming.
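The difference between how and what can be seen side by side in the following sketch. The grades values are reused from Section 4.11, and the external-iteration version is written with a while statement so that every counter-control detail is visible.

```python
grades = [88, 75, 96, 55, 83]

# How: external iteration. We manage the control variable, its initial
# value, its increment and the loop-continuation condition ourselves.
total = 0
index = 0
while index < len(grades):
    total += grades[index]
    index += 1

# What: declarative. We state the result we want; sum handles the iteration.
declared_total = sum(grades)

print(total, declared_total)
```

Each of the four hand-written details in the first version is a place where an off-by-one or similar mistake could creep in; the second version has none of them.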
Pure Functions
In a pure functional programming language, you focus on writing pure functions. A pure function’s result depends only on the argument(s) you pass to it. Also, given a particular argument (or arguments), a pure function always produces the same result.
For example, built-in function sum’s return value depends only on the iterable you pass to it. Given a list [1, 2, 3], sum always returns 6 no matter how many times you call it. Also, a pure function does not have side effects. For example, even if you pass a mutable list to a pure function, the list will contain the same values before and after the function call. When you call the pure function sum, it does not modify its argument.
In [1]: values = [1, 2, 3]
In [2]: sum(values)
Out[2]: 6
In [3]: sum(values) # same call always returns same result
Out[3]: 6
In [4]: values
Out[4]: [1, 2, 3]
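For contrast, the sketch below pairs a pure function with an impure one that has a side effect on its argument. Both function names, pure_total and impure_total, are hypothetical and serve only to illustrate the distinction.

```python
def pure_total(values):
    """Pure: the result depends only on values, and nothing is modified."""
    return sum(values)

def impure_total(values):
    """Impure: appends to the caller's list as a side effect."""
    values.append(0)
    return sum(values)

data = [1, 2, 3]

first = pure_total(data)
after_pure = list(data)        # copy of data's contents after the pure call

second = impure_total(data)    # same total, but data has been changed

print(first, after_pure)
print(second, data)
```

Both functions return the same total here, but only the pure one leaves the caller's list as it found it.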
In the next chapter, we’ll continue using functional-style programming concepts. Also, you’ll see that functions are objects that you can pass to other functions as data.
4.18 INTRO TO DATA SCIENCE: MEASURES OF DISPERSION
In our discussion of descriptive statistics, we’ve considered the measures of central tendency—mean, median and mode. These help us categorize typical values in a group—such as the mean height of your classmates or the most frequently purchased car brand (the mode) in a given country.
When we’re talking about a group, the entire group is called the population. Sometimes a population is quite large, such as the people likely to vote in the next U.S. presidential election, which is a number in excess of 100,000,000 people. For practical reasons, the polling organizations trying to predict who will become the next president work with carefully selected small subsets of the population known as samples. Many of the polls in the 2016 election had sample sizes of about 1000 people.
In this section, we continue discussing basic descriptive statistics. We introduce measures of dispersion (also called measures of variability) that help you understand how spread out the values are. For example, in a class of students, there may be a bunch of students whose height is close to the average, with smaller numbers of students who are considerably shorter or taller.
For our purposes, we’ll calculate each measure of dispersion both by hand and with functions from the module statistics, using the following population of 10 six-sided die rolls:
1, 3, 4, 2, 6, 5, 3, 4, 5, 2
Variance
To determine the variance, we begin with the mean of these values, 3.5. You obtain this result by dividing the sum of the face values, 35, by the number of rolls, 10.
For simplicity, we’re calculating the population variance. There is a subtle difference between the population variance and the sample variance. Instead of dividing by n (the number of die rolls in our example), sample variance divides by n – 1. The difference is pronounced for small samples and becomes insignificant as the sample size increases. The statistics module provides the functions pvariance and variance to calculate the population variance and sample variance, respectively. Similarly, the statistics module provides the functions pstdev and stdev to calculate the population standard deviation and sample standard deviation, respectively.
Next, we subtract the mean from every die value (this produces some negative results):
-2.5, -0.5, 0.5, -1.5, 2.5, 1.5, -0.5, 0.5, 1.5, -1.5
Then, we square each of these results (yielding only positives):
6.25, 0.25, 0.25, 2.25, 6.25, 2.25, 0.25, 0.25, 2.25, 2.25
Finally, we calculate the mean of these squares, which is 2.25 (22.5 / 10)—this is the population variance. Squaring the difference between each die value and the mean of all die values emphasizes outliers—the values that are farthest from the mean. As we get deeper into data analytics, sometimes we’ll want to pay careful attention to outliers, and sometimes we’ll want to ignore them. The following code uses the statistics module’s pvariance function to confirm our manual result:
In [1]: import statistics
In [2]: statistics.pvariance([1, 3, 4, 2, 6, 5, 3, 4, 5, 2])
Out[2]: 2.25
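The manual steps described above can be sketched in code as follows. The variable names are ours; the loop mirrors the by-hand procedure of subtracting the mean, squaring and then averaging the squares.

```python
import statistics

rolls = [1, 3, 4, 2, 6, 5, 3, 4, 5, 2]

mean = sum(rolls) / len(rolls)           # 35 / 10

sum_of_squares = 0
for roll in rolls:
    difference = roll - mean             # may be negative
    sum_of_squares += difference ** 2    # squaring yields only positives

population_variance = sum_of_squares / len(rolls)

print(mean, sum_of_squares, population_variance)
print(population_variance == statistics.pvariance(rolls))
```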
Standard Deviation
The standard deviation is the square root of the variance (in this case, 1.5), which tones down the effect of the outliers. The smaller the variance and standard deviation are, the closer the data values are to the mean and the less overall dispersion (that is, spread) there is between the values and the mean. The following code calculates the population standard deviation with the statistics module’s pstdev function, confirming our manual result:
In [3]: statistics.pstdev([1, 3, 4, 2, 6, 5, 3, 4, 5, 2])
Out[3]: 1.5
Passing the pvariance function’s result to the math module’s sqrt function confirms our result of 1.5:
In [4]: import math
In [5]: math.sqrt(statistics.pvariance([1, 3, 4, 2, 6, 5, 3, 4, 5, 2]))
Out[5]: 1.5
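To see the population and sample measures side by side, the following sketch calls all four statistics functions on the same die rolls. Treating these ten rolls as a sample is purely for illustration. With n = 10, the sample variance divides the same sum of squares, 22.5, by 9 rather than 10.

```python
import statistics

rolls = [1, 3, 4, 2, 6, 5, 3, 4, 5, 2]

population_variance = statistics.pvariance(rolls)   # divides by n
sample_variance = statistics.variance(rolls)        # divides by n - 1

population_stdev = statistics.pstdev(rolls)
sample_stdev = statistics.stdev(rolls)

print(population_variance, sample_variance)
print(population_stdev, sample_stdev)
```

The sample measures come out slightly larger than the population measures, as expected when dividing by the smaller value n – 1.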
Advantage of Population Standard Deviation vs. Population Variance
Suppose you’ve recorded the March Fahrenheit temperatures in your area. You might have 31 numbers such as 19, 32, 28 and 35. The units for these numbers are degrees.
When you square your temperatures to calculate the population variance, the units of the population variance become “degrees squared.” When you take the square root of the population variance to calculate the population standard deviation, the units once again become degrees, which are the same units as your temperatures.
4.19 WRAP-UP
In this chapter, we created custom functions. We imported capabilities from the random and math modules. We introduced random-number generation and used it to simulate rolling a six-sided die. We packed multiple values into tuples to return more than one value from a function. We also unpacked a tuple to access its values. We discussed using the Python Standard Library’s modules to avoid “reinventing the wheel.”
We created functions with default parameter values and called functions with keyword arguments. We also defined functions with arbitrary argument lists. We called methods of objects. We discussed how an identifier’s scope determines where in your program you can use it.
We presented more about importing modules. You saw that arguments are passed by reference to functions, and how the function-call stack and stack frames support the function-call-and-return mechanism. We also presented a recursive function and began introducing Python’s functional-style programming capabilities. We’ve introduced basic list and tuple capabilities over the last two chapters; in the next chapter, we’ll discuss them in detail.
Finally, we continued our discussion of descriptive statistics by introducing measures of dispersion—variance and standard deviation—and calculating them with functions from the Python Standard Library’s statistics module. For some types of problems, it’s useful to have functions call themselves. A recursive function calls itself, either directly or indirectly through another function.