Functions Chapter 4 Python Programming Course By JuTT BaDshaH



    4. Functions

    Objectives

    In this chapter, you’ll

    • Create custom functions.
    • Import and use Python Standard Library modules, such as random and math, to reuse code and avoid “reinventing the wheel.”
    • Pass data between functions.
    • Generate a range of random numbers.
    • See simulation techniques using random-number generation.
    • Seed the random number generator to ensure reproducibility.
    • Pack values into a tuple and unpack values from a tuple.
    • Return multiple values from a function via a tuple.
    • Understand how an identifier’s scope determines where in your program you can use it.
    • Create functions with default parameter values.
    • Call functions with keyword arguments.
    • Create functions that can receive any number of arguments.
    • Use methods of an object.
    • Write and use a recursive function.

    Outline

    4.1 Introduction
    4.2 Defining Functions
    4.3 Functions with Multiple Parameters
    4.4 Random-Number Generation
    4.5 Case Study: A Game of Chance
    4.6 Python Standard Library
    4.7 math Module Functions
    4.8 Using IPython Tab Completion for Discovery
    4.9 Default Parameter Values
    4.10 Keyword Arguments
    4.11 Arbitrary Argument Lists
    4.12 Methods: Functions That Belong to Objects
    4.13 Scope Rules
    4.14 import: A Deeper Look
    4.15 Passing Arguments to Functions: A Deeper Look
    4.16 Recursion
    4.17 Functional-Style Programming
    4.18 Intro to Data Science: Measures of Dispersion
    4.19 Wrap-Up

    4.1 INTRODUCTION

    In this chapter, we continue our discussion of Python fundamentals with custom functions and related topics. We’ll use the Python Standard Library’s random module and random-number generation to simulate rolling a six-sided die. We’ll combine custom functions and random-number generation in a script that implements the dice game craps. In that example, we’ll also introduce Python’s tuple sequence type and use tuples to return more than one value from a function. We’ll discuss seeding the random-number generator to ensure reproducibility.
    You’ll import the Python Standard Library’s math module, then use it to learn about IPython tab completion, which speeds your coding and discovery processes. You’ll create functions with default parameter values, call functions with keyword arguments and define functions with arbitrary argument lists. We’ll demonstrate calling methods of objects. We’ll also discuss how an identifier’s scope determines where in your program you can use it.
    We’ll take a deeper look at importing modules. You’ll see that arguments are passed by reference to functions. We’ll also demonstrate a recursive function and begin presenting Python’s functional-style programming capabilities. In the Intro to Data Science section, we’ll continue our discussion of descriptive statistics by introducing measures of dispersion (variance and standard deviation) and calculating them with functions from the Python Standard Library’s statistics module.

    4.2 DEFINING FUNCTIONS

    You’ve called many built-in functions (int, float, print, input, type, sum, len, min and max) and a few functions from the statistics module (mean, median and mode). Each performed a single, well-defined task. You’ll often define and call custom functions. The following session defines a square function that calculates the square of its argument. Then it calls the function twice: once to square the int value 7 (producing the int value 49) and once to square the float value 2.5 (producing the float value 6.25):

    In [1]: def square(number):
       ...:     """Calculate the square of number."""
       ...:     return number ** 2
       ...:
    In [2]: square(7)
    Out[2]: 49
    In [3]: square(2.5)
    Out[3]: 6.25

    The statements defining the function in the first snippet are written only once, but the function may be called “to do its job” from many points throughout a program and as often as you like. Calling square with a non-numeric argument like 'hello' causes a TypeError because the exponentiation operator (**) works only with numeric values.

    Defining a Custom Function

    A function definition (like square in snippet [1]) begins with the def keyword, followed by the function name (square), a set of parentheses and a colon (:). Like variable identifiers, by convention function names should begin with a lowercase letter and in multiword names underscores should separate each word.
    The required parentheses contain the function’s parameter list, a comma-separated list of parameters representing the data that the function needs to perform its task. Function square has only one parameter named number, the value to be squared. If the parentheses are empty, the function does not use parameters to perform its task. The indented lines after the colon (:) are the function’s block, which consists of an optional docstring followed by the statements that perform the function’s task. We’ll soon point out the difference between a function’s block and a control statement’s suite.

    Specifying a Custom Function’s Docstring

    The Style Guide for Python Code says that the first line in a function’s block should be a docstring that briefly explains the function’s purpose:

    """Calculate the square of number."""

    To provide more detail, you can use a multiline docstring—the style guide recommends starting with a brief explanation, followed by a blank line and the additional details.
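    As a minimal sketch of the multiline form, the following version of square keeps the one-line summary, adds a blank line and then gives more detail. The wording of the extra detail is our own illustration, not text from the style guide.

```python
def square(number):
    """Calculate the square of number.

    The argument may be any value that supports the ** operator,
    such as an int or a float. The result is whatever
    number ** 2 produces.
    """
    return number ** 2


# The docstring is stored in the function's __doc__ attribute.
print(square.__doc__)
print(square(7))
```

    The first line still works as a brief summary on its own, which is the part that tools tend to show first.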

    Returning a Result to a Function’s Caller

    When a function finishes executing, it returns control to its caller—that is, the line of code that called the function. In square’s block, the return statement:

    return number ** 2

    first squares number, then terminates the function and gives the result back to the caller. In this example, the first caller is in snippet [2], so IPython displays the result in Out[2]. The second caller is in snippet [3], so IPython displays the result in Out[3].
    Function calls also can be embedded in expressions. The following code calls square first, then print displays the result:

    In [4]: print('The square of 7 is', square(7))
    The square of 7 is 49

    There are two other ways to return control from a function to its caller:

    • Executing a return statement without an expression terminates the function and implicitly returns the value None to the caller. The Python documentation states that None represents the absence of a value. None evaluates to False in conditions.
    • When there’s no return statement in a function, it implicitly returns the value None after executing the last statement in the function’s block.
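    Both cases can be observed in a short sketch. The function print_if_positive is a hypothetical example introduced only for this illustration.

```python
def print_if_positive(number):
    """Display number only if it is positive (hypothetical example)."""
    if number <= 0:
        return  # return without an expression: the caller receives None
    print(number)
    # no return statement is reached here: the function also returns None


result1 = print_if_positive(-5)  # displays nothing
result2 = print_if_positive(5)   # displays 5

print(result1 is None)  # True
print(result2 is None)  # True
```

    Comparing with is None is the usual way to test for None explicitly, rather than relying on it evaluating to False in a condition.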

    Local Variables

    Though we did not define variables in square’s block, it is possible to do so. A function’s parameters and variables defined in its block are all local variables—they can be used only inside the function and exist only while the function is executing. Trying to access a local variable outside its function’s block causes a NameError, indicating that the variable is not defined.
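    The following sketch rewrites square with a local variable and then tries to use that variable outside the function. The names squared_value and name_error_message are assumptions chosen for this illustration.

```python
def square(number):
    """Calculate the square of number using a local variable."""
    squared_value = number ** 2  # exists only while square executes
    return squared_value


print(square(7))

name_error_message = None
try:
    print(squared_value)  # not defined outside square's block
except NameError as error:
    name_error_message = str(error)
    print('NameError:', name_error_message)
```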

    Accessing a Function’s Docstring via IPython’s Help Mechanism

    IPython can help you learn about the modules and functions you intend to use in your code, as well as IPython itself. For example, to view a function’s docstring to learn how to use the function, type the function’s name followed by a question mark (?):

    In [5]: square?
    Signature: square(number)
    Docstring: Calculate the square of number.
    File: ~/Documents/examples/ch04/<ipython-input-1-7268c8ff93a9>
    Type: function

    For our square function, the information displayed includes:

    • The function’s name and parameter list—known as its signature.
    • The function’s docstring.
    • The name of the file containing the function’s definition. For a function in an interactive session, this line shows information for the snippet that defined the function: the 1 in "<ipython-input-1-7268c8ff93a9>" means snippet [1].
    • The type of the item for which you accessed IPython’s help mechanism—in this case, a function.

    If the function’s source code is accessible from IPython (such as a function defined in the current session or imported into the session from a .py file), you can use ?? to display the function’s full source-code definition:

    In [6]: square??
    Signature: square(number)
    Source:
    def square(number):
        """Calculate the square of number."""
        return number ** 2
    File: ~/Documents/examples/ch04/<ipython-input-1-7268c8ff93a9>
    Type: function

    If the source code is not accessible from IPython, ?? simply shows the docstring.
    If the docstring fits in the window, IPython displays the next In [] prompt. If a docstring is too long to fit, IPython indicates that there’s more by displaying a colon (:) at the bottom of the window—press the Space key to display the next screen. You can navigate backwards and forwards through the docstring with the up and down arrow keys, respectively. IPython displays (END) at the end of the docstring. Press q (for “quit”) at any : or the (END) prompt to return to the next In [] prompt. To get a sense of IPython’s features, type ? at any In [] prompt, press Enter, then read the help documentation overview.

    4.3 FUNCTIONS WITH MULTIPLE PARAMETERS

    Let’s define a maximum function that determines and returns the largest of three values. The following session calls the function three times with integers, floating-point numbers and strings, respectively.

    In [1]: def maximum(value1, value2, value3):
       ...:     """Return the maximum of three values."""
       ...:     max_value = value1
       ...:     if value2 > max_value:
       ...:         max_value = value2
       ...:     if value3 > max_value:
       ...:         max_value = value3
       ...:     return max_value
       ...:
    In [2]: maximum(12, 27, 36)
    Out[2]: 36
    In [3]: maximum(12.3, 45.6, 9.7)
    Out[3]: 45.6
    In [4]: maximum('yellow', 'red', 'orange')
    Out[4]: 'yellow'

    We did not place blank lines above and below the if statements, because pressing return on a blank line in interactive mode completes the function’s definition.
    You also may call maximum with mixed types, such as ints and floats:

    In [5]: maximum(13.5, -3, 7)
    Out[5]: 13.5

    The call maximum(13.5, 'hello', 7) results in a TypeError because strings and numbers cannot be compared to one another with the greater-than (>) operator.
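    The mixed-type behavior described above can be checked in a script. This is a sketch that repeats the session’s maximum function and catches the exception; the variable error_message is an assumption added for the illustration.

```python
def maximum(value1, value2, value3):
    """Return the maximum of three values."""
    max_value = value1
    if value2 > max_value:
        max_value = value2
    if value3 > max_value:
        max_value = value3
    return max_value


print(maximum(13.5, -3, 7))  # ints and floats can be compared

error_message = None
try:
    maximum(13.5, 'hello', 7)  # 'hello' > 13.5 is not supported
except TypeError as error:
    error_message = str(error)
    print('TypeError:', error_message)
```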

    Function maximum’s Definition

    Function maximum specifies three parameters in a comma-separated list. Snippet [2]’s arguments 12, 27 and 36 are assigned to the parameters value1, value2 and value3, respectively.
    To determine the largest value, we process one value at a time:

    • Initially, we assume that value1 contains the largest value, so we assign it to the local variable max_value. Of course, it’s possible that value2 or value3 contains the actual largest value, so we still must compare each of these with max_value.
    • The first if statement then tests value2 > max_value, and if this condition is True assigns value2 to max_value.
    • The second if statement then tests value3 > max_value, and if this condition is True assigns value3 to max_value.

    Now, max_value contains the largest value, so we return it. When control returns to the caller, the parameters value1, value2 and value3 and the variable max_value in the function’s block—which are all local variables—no longer exist.

    Python’s Built-In max and min Functions

    For many common tasks, the capabilities you need already exist in Python. For example, built-in max and min functions know how to determine the largest and smallest of their two or more arguments, respectively:

    In [6]: max('yellow', 'red', 'orange', 'blue', 'green')
    Out[6]: 'yellow'
    In [7]: min(15, 9, 27, 14)
    Out[7]: 9

    Each of these functions also can receive an iterable argument, such as a list or a string. Using built-in functions or functions from the Python Standard Library’s modules rather than writing your own can reduce development time and increase program reliability, portability and performance. For a list of Python’s built-in functions and modules, see

    https://docs.python.org/3/library/index.html
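    As a brief sketch of the iterable form mentioned above, max and min can each receive a single list or string instead of separate arguments. The list name numbers is an assumption for this example.

```python
numbers = [15, 9, 27, 14]

print(max(numbers))   # 27, the largest element of the list
print(min(numbers))   # 9, the smallest element of the list

# For a string, the individual characters are compared.
print(max('yellow'))  # y
print(min('yellow'))  # e
```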

    4.4 RANDOM-NUMBER GENERATION

    We now take a brief diversion into a popular type of programming application—simulation and game playing. You can introduce the element of chance via the Python Standard Library’s random module.

    Rolling a Six-Sided Die

    Let’s produce 10 random integers in the range 1–6 to simulate rolling a six-sided die:

    In [1]: import random
    In [2]: for roll in range(10):
       ...:     print(random.randrange(1, 7), end=' ')
       ...:
    4 2 5 5 4 6 4 6 1 5

    First, we import random so we can use the module’s capabilities. The randrange function generates an integer from the first argument value up to, but not including, the second argument value. Let’s use the up arrow key to recall the for statement, then press Enter to re-execute it. Notice that different values are displayed:

    In [3]: for roll in range(10):
       ...:     print(random.randrange(1, 7), end=' ')
       ...:
    4 5 4 5 1 4 1 4 6 5

    Sometimes, you may want to guarantee reproducibility of a random sequence—for debugging, for example. At the end of this section, we’ll use the random module’s seed function to do this.

    Rolling a Six-Sided Die 6,000,000 Times

    If randrange truly produces integers at random, every number in its range has an equal probability (or chance or likelihood) of being returned each time we call it. To show that the die faces 1–6 occur with equal likelihood, the following script simulates 6,000,000 die rolls. When you run the script, each die face should occur approximately 1,000,000 times, as in the sample output.

    # fig04_01.py
    """Roll a six-sided die 6,000,000 times."""
    import random

    # face frequency counters
    frequency1 = 0
    frequency2 = 0
    frequency3 = 0
    frequency4 = 0
    frequency5 = 0
    frequency6 = 0

    # 6,000,000 die rolls
    for roll in range(6_000_000):  # note underscore separators
        face = random.randrange(1, 7)

        # increment appropriate face counter
        if face == 1:
            frequency1 += 1
        elif face == 2:
            frequency2 += 1
        elif face == 3:
            frequency3 += 1
        elif face == 4:
            frequency4 += 1
        elif face == 5:
            frequency5 += 1
        elif face == 6:
            frequency6 += 1

    print(f'Face{"Frequency":>13}')
    print(f'{1:>4}{frequency1:>13}')
    print(f'{2:>4}{frequency2:>13}')
    print(f'{3:>4}{frequency3:>13}')
    print(f'{4:>4}{frequency4:>13}')
    print(f'{5:>4}{frequency5:>13}')
    print(f'{6:>4}{frequency6:>13}')

    Face    Frequency
       1       998686
       2      1001481
       3       999900
       4      1000453
       5       999953
       6       999527

    The script uses nested control statements (an if…elif statement nested in the for statement) to determine the number of times each die face appears. The for statement iterates 6,000,000 times. We used Python’s underscore (_) digit separator to make the value 6000000 more readable. The expression range(6,000,000) would be incorrect. Commas separate arguments in function calls, so Python would treat range(6,000,000) as a call to range with the three arguments 6, 0 and 0, which fails with a ValueError because range does not accept a step of 0.
    For each die roll, the script adds 1 to the appropriate counter variable. Run the program, and observe the results. This program might take a few seconds to complete execution. As you’ll see, each execution produces different results. Note that we did not provide an else clause in the if…elif statement.
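    The point about digit separators and commas can be checked directly. In this sketch, range(6, 0, 0) spells out the three arguments that the comma-separated form would pass; the variable step_error is an assumption added for the illustration.

```python
# Underscores in a numeric literal are ignored by Python.
print(6_000_000 == 6000000)  # True
print(len(range(6_000_000)))  # 6000000 iterations

step_error = None
try:
    range(6, 0, 0)  # start 6, stop 0, step 0
except ValueError as error:
    step_error = str(error)
    print('ValueError:', step_error)
```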

    Seeding the Random-Number Generator for Reproducibility

    Function randrange actually generates pseudorandom numbers, based on an internal calculation that begins with a numeric value known as a seed. Repeatedly calling randrange produces a sequence of numbers that appear to be random, because each time you start a new interactive session or execute a script that uses the random module’s functions, Python internally uses a different seed value. When you’re debugging logic errors in programs that use randomly generated data, it can be helpful to use the same sequence of random numbers until you’ve eliminated the logic errors, before testing the program with other values. To do this, you can use the random module’s seed function to seed the random-number generator yourself. This forces randrange to begin calculating its pseudorandom number sequence from the seed you specify. In the following session, snippets [5] and [8] produce the same results, because snippets [4] and [7] use the same seed (32):

    In [4]: random.seed(32)
    In [5]: for roll in range(10):
       ...:     print(random.randrange(1, 7), end=' ')
       ...:
    1 2 2 3 6 2 4 1 6 1
    In [6]: for roll in range(10):
       ...:     print(random.randrange(1, 7), end=' ')
       ...:
    1 3 5 3 1 5 6 4 3 5
    In [7]: random.seed(32)
    In [8]: for roll in range(10):
       ...:     print(random.randrange(1, 7), end=' ')
       ...:
    1 2 2 3 6 2 4 1 6 1

    Snippet [6] generates different values because it simply continues the pseudorandom number sequence that began in snippet [5].
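    The same experiment can be written as a script that compares the sequences instead of displaying them. This is a sketch only: roll_ten_times is a hypothetical helper, and the specific values produced for a given seed may depend on your Python version.

```python
import random


def roll_ten_times():
    """Return a list of 10 simulated die rolls (hypothetical helper)."""
    rolls = []
    for roll in range(10):
        rolls.append(random.randrange(1, 7))
    return rolls


random.seed(32)
first = roll_ten_times()      # starts the sequence for seed 32
continued = roll_ten_times()  # continues that same sequence

random.seed(32)
repeated = roll_ten_times()   # restarts the sequence for seed 32

print(first == repeated)  # True: same seed, same sequence
print(first)
print(continued)
```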

    4.5 CASE STUDY: A GAME OF CHANCE

    In this section, we simulate the popular dice game known as “craps.” Here is the requirements statement:
    You roll two six-sided dice, each with faces containing one, two, three, four, five and six spots, respectively. When the dice come to rest, the sum of the spots on the two upward faces is calculated. If the sum is 7 or 11 on the first roll, you win. If the sum is 2, 3 or 12 on the first roll (called “craps”), you lose (i.e., the “house” wins). If the sum is 4, 5, 6, 8, 9 or 10 on the first roll, that sum becomes your “point.” To win, you must continue rolling the dice until you “make your point” (i.e., roll that same point value).
    You lose by rolling a 7 before making your point. The following script simulates the game and shows several sample executions, illustrating winning on the first roll, losing on the first roll, winning on a subsequent roll and losing on a subsequent roll.

     1 # fig04_02.py
     2 """Simulating the dice game Craps."""
     3 import random
     4
     5 def roll_dice():
     6     """Roll two dice and return their face values as a tuple."""
     7     die1 = random.randrange(1, 7)
     8     die2 = random.randrange(1, 7)
     9     return (die1, die2)  # pack die face values into a tuple
    10
    11 def display_dice(dice):
    12     """Display one roll of the two dice."""
    13     die1, die2 = dice  # unpack the tuple into variables die1 and die2
    14     print(f'Player rolled {die1} + {die2} = {sum(dice)}')
    15
    16 die_values = roll_dice()  # first roll
    17 display_dice(die_values)
    18
    19 # determine game status and point, based on first roll
    20 sum_of_dice = sum(die_values)
    21
    22 if sum_of_dice in (7, 11):  # win
    23     game_status = 'WON'
    24 elif sum_of_dice in (2, 3, 12):  # lose
    25     game_status = 'LOST'
    26 else:  # remember point
    27     game_status = 'CONTINUE'
    28     my_point = sum_of_dice
    29     print('Point is', my_point)
    30
    31 # continue rolling until player wins or loses
    32 while game_status == 'CONTINUE':
    33     die_values = roll_dice()
    34     display_dice(die_values)
    35     sum_of_dice = sum(die_values)
    36
    37     if sum_of_dice == my_point:  # win by making point
    38         game_status = 'WON'
    39     elif sum_of_dice == 7:  # lose by rolling 7
    40         game_status = 'LOST'
    41
    42 # display "wins" or "loses" message
    43 if game_status == 'WON':
    44     print('Player wins')
    45 else:
    46     print('Player loses')


    Player rolled 2 + 5 = 7
    Player wins


    Player rolled 1 + 2 = 3
    Player loses


    Player rolled 5 + 4 = 9
    Point is 9
    Player rolled 4 + 4 = 8
    Player rolled 2 + 3 = 5
    Player rolled 5 + 4 = 9
    Player wins


    Player rolled 1 + 5 = 6
    Point is 6
    Player rolled 1 + 6 = 7
    Player loses

    Function roll_dice—Returning Multiple Values Via a Tuple

    Function roll_dice (lines 5–9) simulates rolling two dice on each roll. The function is defined once, then called from several places in the program (lines 16 and 33). The empty parameter list indicates that roll_dice does not require arguments to perform its task.
    The built-in and custom functions you’ve called so far each return one value. Sometimes it’s useful to return more than one value, as in roll_dice, which returns both die values (line 9) as a tuple, an immutable (that is, unmodifiable) sequence of values. To create a tuple, separate its values with commas, as in line 9:

    (die1, die2)

    This is known as packing a tuple. The parentheses are optional, but we recommend using them for clarity. We discuss tuples in depth in the next chapter.

    Function display_dice

    To use a tuple’s values, you can assign them to a comma-separated list of variables, which unpacks the tuple. To display each roll of the dice, the function display_dice (defined in lines 11–14 and called in lines 17 and 34) unpacks the tuple argument it receives (line 13). The number of variables to the left of = must match the number of elements in the tuple; otherwise, a ValueError occurs. Line 14 prints a formatted string containing both die values and their sum. We calculate the sum of the dice by passing the tuple to the built-in sum function; like a list, a tuple is a sequence.
    Note that functions roll_dice and display_dice each begin their blocks with a docstring that states what the function does. Also, both functions contain local variables die1 and die2. These variables do not “collide,” because they belong to different functions’ blocks. Each local variable is accessible only in the block that defined it.
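    Packing, unpacking and the mismatch error can be seen together in a small sketch. The function min_and_max and the variable unpack_error are hypothetical names introduced for this illustration; they are not part of the craps script.

```python
def min_and_max(value1, value2, value3):
    """Return the smallest and largest of three values as a tuple."""
    smallest_value = min(value1, value2, value3)
    largest_value = max(value1, value2, value3)
    return (smallest_value, largest_value)  # pack two values into a tuple


# Unpack the returned tuple into two variables.
smallest, largest = min_and_max(12, 27, 36)
print(smallest, largest)  # 12 36

unpack_error = None
try:
    # Three variables on the left, but the tuple has only two elements.
    first, second, third = min_and_max(12, 27, 36)
except ValueError as error:
    unpack_error = str(error)
    print('ValueError:', unpack_error)
```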

    First Roll

    When the script begins executing, lines 16–17 roll the dice and display the results. Line 20 calculates the sum of the dice for use in lines 22–29. You can win or lose on the first roll or any subsequent roll. The variable game_status keeps track of the win/loss status.
    The in operator in line 22

    sum_of_dice in (7, 11)

    tests whether the tuple (7, 11) contains sum_of_dice’s value. If this condition is True, you rolled a 7 or an 11. In this case, you won on the first roll, so the script sets game_status to 'WON'. The operator’s right operand can be any iterable. There’s also a not in operator to determine whether a value is not in an iterable. The preceding concise condition is equivalent to

    (sum_of_dice == 7) or (sum_of_dice == 11)

    Similarly, the condition in line 24

    sum_of_dice in (2, 3, 12)

    tests whether the tuple (2, 3, 12) contains sum_of_dice’s value. If so, you lost on the first roll, so the script sets game_status to 'LOST'.
    For any other sum of the dice (4, 5, 6, 8, 9 or 10):

    • line 27 sets game_status to 'CONTINUE' so you can continue rolling
    • line 28 stores the sum of the dice in my_point to keep track of what you must roll to win and
    • line 29 displays my_point.
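    The first-roll logic can be isolated in a function so that every possible sum is easy to check. This is a sketch: first_roll_status is a hypothetical helper that mirrors the conditions in lines 22–27 and is not part of the original script.

```python
def first_roll_status(sum_of_dice):
    """Classify the first roll as 'WON', 'LOST' or 'CONTINUE'."""
    if sum_of_dice in (7, 11):
        return 'WON'
    elif sum_of_dice in (2, 3, 12):
        return 'LOST'
    else:
        return 'CONTINUE'


for total in range(2, 13):  # every sum two dice can produce
    # The in test agrees with the longer or-based condition.
    assert (total in (7, 11)) == ((total == 7) or (total == 11))
    print(total, first_roll_status(total))

print(5 not in (7, 11))  # True: not in tests for absence
```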

    Subsequent Rolls

    If game_status is equal to 'CONTINUE' (line 32), you did not win or lose, so the while statement’s suite (lines 33–40) executes. Each loop iteration calls roll_dice, displays the die values and calculates their sum. If sum_of_dice is equal to my_point (line 37) or 7 (line 39), the script sets game_status to 'WON' or 'LOST', respectively, and the loop terminates. Otherwise, the while loop continues executing with the next roll.

    Displaying the Final Results

    When the loop terminates, the script proceeds to the if…else statement (lines 43–46), which prints 'Player wins' if game_status is 'WON', or 'Player loses' otherwise.

    4.6 PYTHON STANDARD LIBRARY

    Typically, you write Python programs by combining functions and classes (that is, custom types) that you create with preexisting functions and classes defined in modules, such as those in the Python Standard Library and other libraries. A key programming goal is to avoid “reinventing the wheel.”
    A module is a file that groups related functions, data and classes. The type Decimal from the Python Standard Library’s decimal module is actually a class. We introduced classes briefly in Chapter 1 and discuss them in detail in the “Object-Oriented Programming” chapter. A package groups related modules. In this book, you’ll work with many preexisting modules and packages, and you’ll create your own modules. In fact, every Python source-code (.py) file you create is a module. Creating packages is beyond this book’s scope. They’re typically used to organize a large library’s functionality into smaller subsets that are easier to maintain and can be imported separately for convenience. For example, the matplotlib visualization library that we use in Section 5.17 has extensive functionality (its documentation is over 2300 pages), so we’ll import only the subsets we need in our examples (pyplot and animation).

    The Python Standard Library is provided with the core Python language. Its packages and modules contain capabilities for a wide variety of everyday programming tasks.
    You can see a complete list of the standard library modules at

    https://docs.python.org/3/library/

    You’ve already used capabilities from the decimal, statistics and random modules. In the next section, you’ll use mathematics capabilities from the math module. You’ll see many other Python Standard Library modules throughout the book’s examples, including many of those listed below:

    Some popular Python Standard Library modules

    • collections: Data structures beyond lists, tuples, dictionaries and sets.
    • Cryptography modules: Encrypting data for secure transmission.
    • csv: Processing comma-separated value files (like those in Excel).
    • datetime: Date and time manipulations. Also modules time and calendar.
    • decimal: Fixed-point and floating-point arithmetic, including monetary calculations.
    • doctest: Embed validation tests and expected results in docstrings for simple unit testing.
    • gettext and locale: Internationalization and localization modules.
    • json: JavaScript Object Notation (JSON) processing used with web services and NoSQL document databases.
    • math: Common math constants and operations.
    • os: Interacting with the operating system.
    • profile, pstats, timeit: Performance analysis.
    • random: Pseudorandom numbers.
    • re: Regular expressions for pattern matching.
    • sqlite3: SQLite relational database access.
    • statistics: Mathematical statistics functions such as mean, median, mode and variance.
    • string: String processing.
    • sys: Command-line argument processing; standard input, standard output and standard error streams.
    • tkinter: Graphical user interfaces (GUIs) and canvas-based graphics.
    • turtle: Turtle graphics.
    • webbrowser: For conveniently displaying web pages in Python apps.

    4.7 MATH MODULE FUNCTIONS

    The math module defines functions for performing various common mathematical calculations. Recall from the previous chapter that an import statement of the following form enables you to use a module’s definitions via the module’s name and a dot (.):

    In [1]: import math

    For example, the following snippet calculates the square root of 900 by calling the math module’s sqrt function, which returns its result as a float value:

    In [2]: math.sqrt(900)
    Out[2]: 30.0

    Similarly, the following snippet calculates the absolute value of -10 by calling the math module’s fabs function, which returns its result as a float value:

    In [3]: math.fabs(-10)
    Out[3]: 10.0

    The math module provides many more functions. You can view the complete list at

    https://docs.python.org/3/library/math.html
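    As a small sample of what the module offers, the following sketch calls a few math functions. The selection is ours and is not meant to be a complete summary; see the documentation above for the full list.

```python
import math

print(math.ceil(9.2))       # 10, the smallest integer not less than 9.2
print(math.floor(9.2))      # 9, the largest integer not greater than 9.2
print(math.sqrt(900.0))     # 30.0
print(math.fabs(-10))       # 10.0
print(math.pow(2.0, 7.0))   # 128.0, 2.0 raised to the power 7.0
print(math.factorial(5))    # 120
```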


    4.8 USING IPYTHON TAB COMPLETION FOR DISCOVERY

    You can view a module’s documentation in IPython interactive mode via tab completion—a discovery feature that speeds your coding and discovery processes. After you type a portion of an identifier and press Tab, IPython completes the identifier for you or provides a list of identifiers that begin with what you’ve typed so far. This may vary based on your operating system platform and what you have imported into your IPython session:

    In [1]: import math
    In [2]: ma<Tab>
              map %macro %%markdown
              math %magic %matplotlib
              max() %man

    You can scroll through the identifiers with the up and down arrow keys. As you do, IPython highlights an identifier and shows it to the right of the In [] prompt.

    Viewing Identifiers in a Module

    To view a list of identifiers defined in a module, type the module’s name and a dot (.), then press Tab:

    In [3]: math.<Tab>
       acos()   atan()   copysign()  e       expm1()
       acosh()  atan2()  cos()       erf()   fabs()
       asin()   atanh()  cosh()      erfc()  factorial()  >
       asinh()  ceil()   degrees()   exp()   floor()

    If there are more identifiers to display than are currently shown, IPython displays the > symbol (on some platforms) at the right edge, in this case to the right of factorial(). You can use the up and down arrow keys to scroll through the list. In the list of identifiers:

    • Those followed by parentheses are functions (or methods, as you’ll see later).
    • Single-word identifiers (such as Employee) that begin with an uppercase letter and multiword identifiers in which each word begins with an uppercase letter (such as CommissionEmployee) represent class names (there are none in the preceding list). This naming convention, which the Style Guide for Python Code recommends, is known as CamelCase because the uppercase letters stand out like a camel’s humps.
    • Lowercase identifiers without parentheses, such as pi (not shown in the preceding list) and e, are variables. The identifier pi evaluates to 3.141592653589793, and the identifier e evaluates to 2.718281828459045. In the math module, pi and e represent the mathematical constants π and e, respectively.

    Python does not have constants, although many objects in Python are immutable (nonmodifiable). So even though pi and e are real-world constants, you must not assign new values to them, because that would change their values. To help distinguish constants from other variables, the style guide recommends naming your custom constants with all capital letters.
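    The naming convention can be sketched as follows. SPEED_OF_LIGHT and circle_area are hypothetical names for this illustration; nothing in Python prevents reassigning an all-capitals name, so the capitals only signal intent.

```python
import math

# All capitals mark this as a constant by convention only.
SPEED_OF_LIGHT = 299_792_458  # meters per second


def circle_area(radius):
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2


print(math.pi)         # 3.141592653589793
print(math.e)          # 2.718281828459045
print(circle_area(1))  # the same value as math.pi
print(SPEED_OF_LIGHT)
```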

    Using the Currently Highlighted Function

    As you navigate through the identifiers, if you wish to use a currently highlighted function, simply start typing its arguments in parentheses. IPython then hides the auto-completion list. If you need more information about the currently highlighted item, you can view its docstring by typing a question mark (?) following the name and pressing Enter to view the help documentation. The following shows the fabs function’s docstring:

    In [4]: math.fabs?
    Docstring:
    fabs(x)
    Return the absolute value of the float x.
    Type: builtin_function_or_method

    The builtin_function_or_method shown above indicates that fabs is part of a Python Standard Library module. Such modules are considered to be built into Python. In this case, fabs is a built-in function from the math module.

    4.9 DEFAULT PARAMETER VALUES

    When defining a function, you can specify that a parameter has a default parameter value. When calling the function, if you omit the argument for a parameter with a default parameter value, the default value for that parameter is automatically passed.
    Let’s define a function rectangle_area with default parameter values:

    In [1]: def rectangle_area(length=2, width=3):
       ...:     """Return a rectangle's area."""
       ...:     return length * width
       ...:

    You specify a default parameter value by following a parameter’s name with an = and a value—in this case, the default parameter values are 2 and 3 for length and width, respectively. Any parameters with default parameter values must appear in the parameter list to the right of parameters that do not have defaults.
    The following call to rectangle_area has no arguments, so Python uses both default parameter values as if you had called rectangle_area(2, 3):

    In [2]: rectangle_area()
    Out[2]: 6

    The following call to rectangle_area has only one argument. Arguments are assigned to parameters from left to right, so 10 is used as the length. The interpreter passes the default parameter value 3 for the width as if you had called rectangle_area(10, 3):

    In [3]: rectangle_area(10)
    Out[3]: 30

    The following call to rectangle_area has arguments for both length and width, so Python ignores the default parameter values:

    In [4]: rectangle_area(10, 5)
    Out[4]: 50
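
    As a minimal sketch of the rules above, the following script-style listing defines a hypothetical box_volume function (the name and its parameters are ours, not the chapter’s) with one required parameter followed by two parameters that have defaults:

```python
def box_volume(length, width=1, height=1):
    """Return a box's volume; width and height default to 1."""
    return length * width * height


# Arguments fill parameters from left to right; omitted ones use defaults.
print(box_volume(10))        # same as box_volume(10, 1, 1)
print(box_volume(10, 5))     # same as box_volume(10, 5, 1)
print(box_volume(10, 5, 2))  # no defaults used

# The default values are stored on the function object.
print(box_volume.__defaults__)
```

    Writing the parameter list as (length=1, width, height) instead would be rejected with a SyntaxError, because a parameter without a default would then follow one that has a default.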

    4.10 KEYWORD ARGUMENTS

    When calling functions, you can use keyword arguments to pass arguments in any order. To demonstrate keyword arguments, we redefine the rectangle_area function—this time without default parameter values:

    In [1]: def rectangle_area(length, width):
       ...:     """Return a rectangle's area."""
       ...:     return length * width
       ...:

    Each keyword argument in a call has the form parametername=value. The following call shows that the order of keyword arguments does not matter—they do not need to match the corresponding parameters’ positions in the function definition:

    In [2]: rectangle_area(width=5, length=10)
    Out[2]: 50

    In each function call, you must place keyword arguments after a function’s positional arguments, that is, any arguments for which you do not specify the parameter name.
    Such arguments are assigned to the function’s parameters left-to-right, based on the arguments’ positions in the argument list. Keyword arguments are also helpful for improving the readability of function calls, especially for functions with many arguments.
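
    The following sketch, written as a script rather than an IPython session, mixes a positional argument with a keyword argument and shows one way such a call can go wrong. It reuses the rectangle_area definition from this section:

```python
def rectangle_area(length, width):
    """Return a rectangle's area."""
    return length * width


# Positional argument first, keyword argument after it.
area = rectangle_area(10, width=5)
print(area)  # 50

# Supplying the same parameter both positionally and by keyword is an error.
try:
    rectangle_area(10, length=5)
except TypeError as error:
    print('TypeError:', error)
```

    A call such as rectangle_area(width=5, 10) does not even compile: Python reports a SyntaxError because a positional argument follows a keyword argument.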

    4.11 ARBITRARY ARGUMENT LISTS

    Functions with arbitrary argument lists, such as built-in functions min and max, can receive any number of arguments. Consider the following min call:

    min(88, 75, 96, 55, 83)

    The function’s documentation states that min has two required parameters (named arg1 and arg2) and an optional third parameter of the form *args, indicating that the function can receive any number of additional arguments. The * before the parameter name tells Python to pack any remaining arguments into a tuple that’s passed to the args parameter. In the call above, parameter arg1 receives 88, parameter arg2 receives 75 and parameter args receives the tuple (96, 55, 83).

    Defining a Function with an Arbitrary Argument List

    Let’s define an average function that can receive any number of arguments:

    In [1]: def average(*args):
       ...:     return sum(args) / len(args)
       ...:

    The parameter name args is used by convention, but you may use any identifier. If the function has multiple parameters, the *args parameter must be the rightmost parameter.
    Now, let’s call average several times with arbitrary argument lists of different lengths:

    In [2]: average(5, 10)
    Out[2]: 7.5
    In [3]: average(5, 10, 15)
    Out[3]: 10.0
    In [4]: average(5, 10, 15, 20)
    Out[4]: 12.5

    To calculate the average, divide the sum of the args tuple’s elements (returned by built-in function sum) by the tuple’s number of elements (returned by built-in function len). Note in our average definition that if the length of args is 0, a ZeroDivisionError occurs. In the next chapter, you’ll see how to access a tuple’s elements without unpacking them.
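
    One possible way to avoid the ZeroDivisionError is to test for an empty args tuple before dividing. The sketch below returns None in that case; that return value is our own design choice for illustration, and raising a more descriptive exception would be equally reasonable:

```python
def average(*args):
    """Return the average of the arguments, or None if there are none."""
    if not args:  # an empty tuple is treated as False
        return None
    return sum(args) / len(args)


print(average())           # None
print(average(5, 10))      # 7.5
print(average(5, 10, 15))  # 10.0
```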

    Passing an Iterable’s Individual Elements as Function Arguments

    You can unpack a tuple’s, list’s or other iterable’s elements to pass them as individual function arguments. The * operator, when applied to an iterable argument in a function call, unpacks its elements. The following code creates a five-element grades list, then uses the expression *grades to unpack its elements as average’s arguments:

    In [5]: grades = [88, 75, 96, 55, 83]
    In [6]: average(*grades)
    Out[6]: 79.4

    The call shown above is equivalent to average(88, 75, 96, 55, 83).
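
    Unpacked iterables can also be combined with one another and with ordinary positional arguments in a single call. The following sketch illustrates this; the extra_credit tuple is a hypothetical addition of ours:

```python
def average(*args):
    """Return the average of one or more numbers."""
    return sum(args) / len(args)


grades = [88, 75, 96, 55, 83]
extra_credit = (100, 90)  # hypothetical extra scores for illustration

print(average(*grades))                 # 79.4
print(average(*grades, *extra_credit))  # seven arguments in total
print(average(70, *extra_credit))       # same as average(70, 100, 90)
```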

    4.12 METHODS: FUNCTIONS THAT BELONG TO OBJECTS

    A method is simply a function that you call on an object using the form

    object_name.method_name(arguments)

    For example, the following session creates the string variable s and assigns it the string object 'Hello'. Then the session calls the object’s lower and upper methods, which produce new strings containing all-lowercase and all-uppercase versions of the original string, leaving s unchanged:

    In [1]: s = 'Hello'
    In [2]: s.lower() # call lower method on string object s
    Out[2]: 'hello'
    In [3]: s.upper()
    Out[3]: 'HELLO'
    In [4]: s
    Out[4]: 'Hello'

    The Python Standard Library reference at

    https://docs.python.org/3/library/index.html

    describes the methods of built-in types and the types in the Python Standard Library. In the “Object-Oriented Programming” chapter, you’ll create custom types called classes and define custom methods that you can call on objects of those classes.

    4.13 SCOPE RULES

    Each identifier has a scope that determines where you can use it in your program. For that portion of the program, the identifier is said to be “in scope.”

    Local Scope

    A local variable’s identifier has local scope. It’s “in scope” only from its definition to the end of the function’s block. It “goes out of scope” when the function returns to its caller. So, a local variable can be used only inside the function that defines it.

    Global Scope

    Identifiers defined outside any function (or class) have global scope—these may include functions, variables and classes. Variables with global scope are known as global variables. Identifiers with global scope can be used in a .py file or interactive session anywhere after they’re defined.

    Accessing a Global Variable from a Function

    You can access a global variable’s value inside a function:

    In [1]: x = 7
    In [2]: def access_global():
       ...:     print('x printed from access_global:', x)
       ...:
    In [3]: access_global()
    x printed from access_global: 7

    However, by default, you cannot modify a global variable in a function—when you first assign a value to a variable in a function’s block, Python creates a new local variable:

    In [4]: def try_to_modify_global():
       ...:     x = 3.5
       ...:     print('x printed from try_to_modify_global:', x)
       ...:
    In [5]: try_to_modify_global()
    x printed from try_to_modify_global: 3.5
    In [6]: x
    Out[6]: 7

    In function try_to_modify_global’s block, the local x shadows the global x, making it inaccessible in the scope of the function’s block. Snippet [6] shows that global variable x still exists and has its original value (7) after function try_to_modify_global executes.
    To modify a global variable in a function’s block, you must use a global statement to declare that the variable is defined in the global scope:

    In [7]: def modify_global():
       ...:     global x
       ...:     x = 'hello'
       ...:     print('x printed from modify_global:', x)
       ...:
    In [8]: modify_global()
    x printed from modify_global: hello
    In [9]: x
    Out[9]: 'hello'
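
    The same behavior can be reproduced in a script. The sketch below is a condensed version of the preceding snippets; the function name shadow_global and the three result variables are ours, chosen only for illustration:

```python
x = 7  # global variable


def shadow_global():
    x = 3.5  # creates a new local x; the global x is untouched
    return x


def modify_global():
    global x  # assignments to x now target the global variable
    x = 'hello'
    return x


local_result = shadow_global()
value_after_shadowing = x  # still 7
modify_global()
value_after_modifying = x  # now 'hello'

print(local_result, value_after_shadowing, value_after_modifying)
```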

    Blocks vs. Suites

    You’ve now defined function blocks and control statement suites. When you create a variable in a block, it’s local to that block. However, when you create a variable in a control statement’s suite, the variable’s scope depends on where the control statement is defined:

    • If the control statement is in the global scope, then any variables defined in the control statement have global scope.
    • If the control statement is in a function’s block, then any variables defined in the control statement have local scope.
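
    The two cases in the list above can be sketched as follows. The names used here (square, last_square and result) are hypothetical and serve only to show where each variable ends up:

```python
for number in range(3):
    square = number ** 2  # created in a suite at global scope

# Both names still exist after the loop ends, because a suite
# does not introduce a new scope.
print(number, square)  # 2 4


def last_square(limit):
    for value in range(limit):
        result = value ** 2  # created in a suite inside a function
    return result  # still usable after the loop, but only in this function


print(last_square(4))         # 9
print('result' in globals())  # False: result was local to last_square
```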

    We’ll continue our scope discussion in the “Object-Oriented Programming” chapter when we introduce custom classes.

    Shadowing Functions

    In the preceding chapters, when summing values, we stored the sum in a variable named total. The reason we did this is that sum is a built-in function. If you define a variable named sum, it shadows the built-in function, making it inaccessible in your code. When you execute the following assignment, Python binds the identifier sum to the int object containing 15. At this point, the identifier sum no longer references the built-in function. So, when you try to use sum as a function, a TypeError occurs:

    In [10]: sum = 10 + 5
    In [11]: sum
    Out[11]: 15
    In [12]: sum([10, 5])
    -------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-12-1237d97a65fb> in <module>()
    ----> 1 sum([10, 5])

    TypeError: 'int' object is not callable

    Statements at Global Scope

    In the scripts you’ve seen so far, we’ve written some statements outside functions at the global scope and some statements inside function blocks. Script statements at global scope execute as soon as they’re encountered by the interpreter, whereas statements in a block execute only when the function is called.

    4.14 IMPORT: A DEEPER LOOK

    You’ve imported modules (such as math and random) with a statement like:

    import module_name

    then accessed their features via each module’s name and a dot (.). Also, you’ve imported a specific identifier from a module (such as the decimal module’s Decimal type) with a statement like:

    from module_name import identifier

    then used that identifier without having to precede it with the module name and a dot (.).

    Importing Multiple Identifiers from a Module

    Using the from…import statement, you can import a comma-separated list of identifiers from a module, then use them in your code without having to precede them with the module name and a dot (.):

    In [1]: from math import ceil, floor
    In [2]: ceil(10.3)
    Out[2]: 11
    In [3]: floor(10.7)
    Out[3]: 10

    Trying to use a function that’s not imported causes a NameError, indicating that the name is not defined.

    Caution: Avoid Wildcard Imports

    You can import all identifiers defined in a module with a wildcard import of the form

    from modulename import *

    This makes all of the module’s identifiers available for use in your code. Importing a module’s identifiers with a wildcard import can lead to subtle errors—it’s considered a dangerous practice that you should avoid. Consider the following snippets:

    In [4]: e = 'hello'
    In [5]: from math import *
    In [6]: e
    Out[6]: 2.718281828459045

    Initially, we assign the string 'hello' to a variable named e. After executing snippet [5] though, the variable e is replaced, possibly by accident, with the math module’s constant e, representing the mathematical floating-point value e.

    Binding Names for Modules and Module Identifiers

    Sometimes it’s helpful to import a module and use an abbreviation for it to simplify your code. The import statement’s as clause allows you to specify the name used to reference the module’s identifiers. For example, in Section 3.14 we could have imported the statistics module and accessed its mean function as follows:

    In [7]: import statistics as stats
    In [8]: grades = [85, 93, 45, 87, 93]
    In [9]: stats.mean(grades)
    Out[9]: 80.6

    As you’ll see in later chapters, import as is frequently used to import Python libraries with convenient abbreviations, like stats for the statistics module. As another example, we’ll use the numpy module which typically is imported with

    import numpy as np

    Library documentation often mentions popular shorthand names.
    Typically, when importing a module, you should use import or import as statements, then access the module through the module name or the abbreviation following the as keyword, respectively. This ensures that you do not accidentally import an identifier that conflicts with one in your code.
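
    The following sketch contrasts these import styles in one script. It also shows that the as clause works with from…import to bind a single identifier to a different name. The alias round_down is our own invention for illustration:

```python
import math
import statistics as stats
from math import floor as round_down  # 'round_down' is our own alias

e = 'hello'  # our own variable named e

grades = [85, 93, 45, 87, 93]
print(stats.mean(grades))  # 80.6
print(round_down(10.7))    # 10
print(e, math.e)           # both values remain available
```

    Because the math module’s constant is accessed as math.e, it cannot silently replace the variable e, which is the problem the wildcard import caused earlier in this section.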

    4.15 PASSING ARGUMENTS TO FUNCTIONS: A DEEPER LOOK

    Let’s take a closer look at how arguments are passed to functions. In many programming languages, there are two ways to pass arguments: pass-by-value and pass-by-reference (sometimes called call-by-value and call-by-reference, respectively):

    • With pass-by-value, the called function receives a copy of the argument’s value and works exclusively with that copy. Changes to the function’s copy do not affect the original variable’s value in the caller.
    • With pass-by-reference, the called function can access the argument’s value in the caller directly and modify the value if it’s mutable.

    Python arguments are always passed by reference. Some people call this pass-by-object-reference, because “everything in Python is an object.” When a function call provides an argument, Python copies the argument object’s reference, not the object itself, into the corresponding parameter. This is important for performance. Functions often manipulate large objects, and frequently copying them would consume large amounts of computer memory and significantly slow program performance.

    Memory Addresses, References and “Pointers”

    You interact with an object via a reference, which behind the scenes is that object’s address (or location) in the computer’s memory—sometimes called a “pointer” in other languages. After an assignment like

    x = 7

    the variable x does not actually contain the value 7. Rather, it contains a reference to an object containing 7 stored elsewhere in memory. You might say that x “points to” (that is, references) the object containing 7, as in the diagram below:

    [Diagram: the variable x referring to an object that contains 7]

    Built-In Function id and Object Identities

    Let’s consider how we pass arguments to functions. First, let’s create the integer variable x mentioned above—shortly we’ll use x as a function argument:

    In [1]: x = 7

    Now x refers to (or “points to”) the integer object containing 7. No two separate objects can reside at the same address in memory, so every object in memory has a unique address. Though we can’t see an object’s address, we can use the built-in id function to obtain a unique int value which identifies only that object while it remains in memory (you’ll likely get a different value when you run this on your computer):

    In [2]: id(x)
    Out[2]: 4350477840

    The integer result of calling id is known as the object’s identity. No two objects in memory can have the same identity. We’ll use object identities to demonstrate that objects are passed by reference.

    Passing an Object to a Function

    Let’s define a cube function that displays its parameter’s identity, then returns the parameter’s value cubed:

    In [3]: def cube(number):
       ...:     print('id(number):', id(number))
       ...:     return number ** 3
       ...:

    Next, let’s call cube with the argument x, which refers to the integer object containing 7:

    In [4]: cube(x)
    id(number): 4350477840
    Out[4]: 343

    The identity displayed for cube’s parameter number—4350477840—is the same as that displayed for x previously. Since every object has a unique identity, both the argument x and the parameter number refer to the same object while cube executes.
    So when function cube uses its parameter number in its calculation, it gets the value of number from the original object in the caller.

    Testing Object Identities with the is Operator

    You also can prove that the argument and the parameter refer to the same object with Python’s is operator, which returns True if its two operands have the same identity:

    In [5]: def cube(number):
       ...:     print('number is x:', number is x) # x is a global variable
       ...:     return number ** 3
       ...:
    In [6]: cube(x)
    number is x: True
    Out[6]: 343

    Immutable Objects as Arguments

    When a function receives as an argument a reference to an immutable (unmodifiable) object—such as an int, float, string or tuple—even though you have direct access to the original object in the caller, you cannot modify the original immutable object’s value. To prove this, first let’s have cube display id(number) before and after assigning a new object to the parameter number via an augmented assignment:

    In [7]: def cube(number):
       ...:     print('id(number) before modifying number:', id(number))
       ...:     number **= 3
       ...:     print('id(number) after modifying number:', id(number))
       ...:     return number
       ...:
    In [8]: cube(x)
    id(number) before modifying number: 4350477840
    id(number) after modifying number: 4396653744
    Out[8]: 343

    When we call cube(x), the first print statement shows that id(number) initially is the same as id(x) in snippet [2]. Numeric values are immutable, so the statement

    number **= 3

    actually creates a new object containing the cubed value, then assigns that object’s reference to parameter number. Recall that if there are no more references to the original object, it will be garbage collected. Function cube’s second print statement shows the new object’s identity. Object identities must be unique, so number must refer to a different object. To show that x was not modified, we display its value and identity again:

    In [9]: print(f'x = {x}; id(x) = {id(x)}')
    x = 7; id(x) = 4350477840
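
    The identity values shown above differ from machine to machine, so a script can compare identities rather than display them. The following sketch is a variation on the cube function that returns the identities instead of printing them; that change is ours, made only so the results can be compared:

```python
def cube(number):
    """Return number cubed and the parameter's identity before and after."""
    id_before = id(number)
    number **= 3  # binds number to a new int object
    return number, id_before, id(number)


x = 7
result, id_before, id_after = cube(x)

print(result)              # 343
print(id_before == id(x))  # True: the parameter started out as x's object
print(id_after == id(x))   # False: number was rebound to a different object
print(x)                   # 7: the caller's variable is unchanged
```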

    Mutable Objects as Arguments

    In the next chapter, we’ll show that when a reference to a mutable object like a list is passed to a function, the function can modify the original object in the caller.
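
    As a brief preview, the sketch below passes a list to a hypothetical append_bonus function (our own example, not one from the next chapter). The list method append modifies the list object itself, so the change is visible in the caller:

```python
def append_bonus(scores):
    """Append 100 to the list that scores refers to."""
    scores.append(100)  # modifies the caller's list object in place


grades = [88, 75]
id_before = id(grades)
append_bonus(grades)

print(grades)                   # [88, 75, 100]
print(id(grades) == id_before)  # True: same object, new contents
```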

    4.16 RECURSION

    Let’s write a program to perform a famous mathematical calculation. Consider the factorial of a positive integer n, which is written n! and pronounced “n factorial.” This is the product

    n · (n – 1) · (n – 2) ··· 1

    with 1! equal to 1 and 0! defined to be 1. For example, 5! is the product 5 · 4 · 3 · 2 · 1, which is equal to 120.

    Iterative Factorial Approach

    You can calculate 5! iteratively with a for statement, as in:

    In [1]: factorial = 1
    In [2]: for number in range(5, 0, -1):
       ...:     factorial *= number
       ...:
    In [3]: factorial
    Out[3]: 120

    Recursive Problem Solving

    Recursive problem-solving approaches have several elements in common. When you call a recursive function to solve a problem, it’s actually capable of solving only the simplest case(s), or base case(s). If you call the function with a base case, it immediately returns a result. If you call the function with a more complex problem, it typically divides the problem into two pieces: one that the function knows how to do and one that it does not know how to do. To make recursion feasible, this latter piece must be a slightly simpler or smaller version of the original problem. Because this new problem resembles the original problem, the function calls a fresh copy of itself to work on the smaller problem. This is referred to as a recursive call and is also called the recursion step. This concept of separating the problem into two smaller portions is a form of the divide-and-conquer approach introduced earlier in the book.
    The recursion step executes while the original function call is still active (i.e., it has not finished executing). It can result in many more recursive calls as the function divides each new subproblem into two conceptual pieces. For the recursion to eventually terminate, each time the function calls itself with a simpler version of the original problem, the sequence of smaller and smaller problems must converge on a base case. When the function recognizes the base case, it returns a result to the previous copy of the function. A sequence of returns ensues until the original function call returns the final result to the caller.

    Recursive Factorial Approach

    You can arrive at a recursive factorial representation by observing that n! can be written as:

    n! = n · (n – 1)!

    For example, 5! is equal to 5 · 4!, as in:

    5! = 5 · 4 · 3 · 2 · 1
    5! = 5 · (4 · 3 · 2 · 1)
    5! = 5 · (4!)

    Visualizing Recursion

    The evaluation of 5! would proceed as shown below. The left column shows how the succession of recursive calls proceeds until 1! (the base case) is evaluated to be 1, which terminates the recursion. The right column shows from bottom to top the values returned from each recursive call to its caller until the final value is calculated and returned.

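    [Figure: (a) the sequence of recursive calls, 5! = 5 · 4!, 4! = 4 · 3!, 3! = 3 · 2!, 2! = 2 · 1!, with 1! as the base case; (b) the values returned from each recursive call, from bottom to top: 1, 2, 6, 24 and finally 120]
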

    Implementing a Recursive Factorial Function

    The following session uses recursion to calculate and display the factorials of the integers 0 through 10:

    In [4]: def factorial(number):
       ...:     """Return factorial of number."""
       ...:     if number <= 1:
       ...:         return 1
       ...:     return number * factorial(number - 1)  # recursive call
       ...:
    In [5]: for i in range(11):
       ...:     print(f'{i}! = {factorial(i)}')
       ...:
    0! = 1
    1! = 1
    2! = 2
    3! = 6
    4! = 24
    5! = 120
    6! = 720
    7! = 5040
    8! = 40320
    9! = 362880
    10! = 3628800

    Snippet [4]’s recursive function factorial first determines whether the terminating condition number <= 1 is True. If this condition is True (the base case), factorial returns 1 and no further recursion is necessary. If number is greater than 1, the second return statement expresses the problem as the product of number and a recursive call to factorial that evaluates factorial(number - 1). This is a slightly smaller problem than the original calculation, factorial(number). Note that function factorial must receive a nonnegative argument. We do not test for this case.
    The loop in snippet [5] calls the factorial function for the values from 0 through 10. The output shows that factorial values grow quickly. Python does not limit the size of an integer, unlike many other programming languages.
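
    If you wanted factorial to reject invalid arguments, one possible approach is sketched below. The choice of ValueError and the wording of its message are our assumptions. The sketch also checks the result against the math module’s factorial function:

```python
import math


def factorial(number):
    """Return factorial of a nonnegative integer number."""
    if not isinstance(number, int) or number < 0:
        raise ValueError('number must be a nonnegative integer')
    if number <= 1:
        return 1  # base case
    return number * factorial(number - 1)  # recursion step


print(factorial(5))                         # 120
print(factorial(20) == math.factorial(20))  # True

try:
    factorial(-3)
except ValueError as error:
    print('ValueError:', error)
```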

    Indirect Recursion

    A recursive function may call another function, which may, in turn, make a call back to the recursive function. This is known as an indirect recursive call or indirect recursion. For example, function A calls function B, which makes a call back to function A. This is still recursion because the second call to function A is made while the first call to function A is active. That is, the first call to function A has not yet finished executing (because it is waiting on function B to return a result to it) and has not returned to function A’s original caller.
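
    A small sketch of indirect recursion follows, using a hypothetical pair of functions, is_even and is_odd, that call each other until the argument reaches 0. Both assume a nonnegative integer argument. This is an illustration of the call pattern only, not a practical way to test parity:

```python
def is_even(number):
    """Return True if the nonnegative integer number is even."""
    if number == 0:
        return True  # base case
    return is_odd(number - 1)  # indirect recursion through is_odd


def is_odd(number):
    """Return True if the nonnegative integer number is odd."""
    if number == 0:
        return False  # base case
    return is_even(number - 1)  # calls back into is_even


print(is_even(10))  # True
print(is_odd(7))    # True
print(is_even(7))   # False
```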

    Stack Overflow and Infinite Recursion

    Of course, the amount of memory in a computer is finite, so only a certain amount of memory can be used to store activation records on the function-call stack. If more recursive function calls occur than can have their activation records stored on the stack, a fatal error known as stack overflow occurs. This typically is the result of infinite recursion, which can be caused by omitting the base case or writing the recursion step incorrectly so that it does not converge on the base case. This error is analogous to the problem of an infinite loop in an iterative (nonrecursive) solution.
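
    In CPython, the standard Python interpreter, runaway recursion is normally stopped before the underlying stack overflows: once the call depth exceeds the limit reported by sys.getrecursionlimit(), Python raises a RecursionError. The sketch below demonstrates this with a hypothetical function named runaway that has no base case:

```python
import sys


def runaway(number):
    """Recursive function with no base case."""
    return runaway(number + 1)  # never converges on a base case


print(sys.getrecursionlimit())  # the current limit on call depth

caught = False
try:
    runaway(0)
except RecursionError as error:
    caught = True
    print('RecursionError:', error)
```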

    4.17 FUNCTIONAL-STYLE PROGRAMMING

    Like other popular languages, such as Java and C#, Python is not a purely functional language. Rather, it offers “functional-style” features that help you write code which is less likely to contain errors, more concise and easier to read, debug and modify.
    Functional-style programs also can be easier to parallelize to get better performance on today’s multi-core processors. The chart below lists most of Python’s key functional-style programming capabilities and shows in parentheses the chapters in which we initially cover many of them.

    We cover most of these features throughout the book, many with code examples and others from a literacy perspective. You’ve already used list, string and built-in function range iterators with the for statement, and several reductions (functions sum, len, min and max). We discuss declarative programming, immutability and internal iteration below.

    What vs. How

    As the tasks you perform get more complicated, your code can become harder to read, debug and modify, and more likely to contain errors. Specifying how the code works can become complex.
    Functional-style programming lets you simply say what you want to do. It hides many details of how to perform each task. Typically, library code handles the how for you. As you’ll see, this can eliminate many errors.
    Consider the for statement in many other programming languages. Typically, you must specify all the details of counter-controlled iteration: a control variable, its initial value, how to increment it and a loop-continuation condition that uses the control variable to determine whether to continue iterating. This style of iteration is known as external iteration and is error-prone. For example, you might provide an incorrect initializer, increment or loop-continuation condition. External iteration mutates (that is, modifies) the control variable, and the for statement’s suite often mutates other variables as well. Every time you modify variables you could introduce errors.
    Functional-style programming emphasizes immutability. That is, it avoids operations that modify variables’ values. We’ll say more in the next chapter.
    Python’s for statement and range function hide most counter-controlled iteration details. You specify what values range should produce and the variable that should receive each value as it’s produced. Function range knows how to produce those values. Similarly, the for statement knows how to get each value from range and how to stop iterating when there are no more values. Specifying what, but not how, is an important aspect of internal iteration, a key functional-style programming concept.
    The Python built-in functions sum, min and max each use internal iteration. To total the elements of the list grades, you simply declare what you want to do, that is, sum(grades). Function sum knows how to iterate through the list and add each element to the running total. Stating what you want done rather than programming how to do it is known as declarative programming.
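
    To make the contrast concrete, the following sketch totals the same list twice: first with external iteration, written as a while loop that manages its own control variable, then declaratively with sum. The grades values are sample data chosen for illustration:

```python
grades = [85, 93, 45, 87, 93]

# External iteration: we manage the index, the bound and the running total.
total = 0
index = 0
while index < len(grades):
    total += grades[index]
    index += 1

# Internal iteration: we state what we want; sum handles the how.
declarative_total = sum(grades)

print(total, declarative_total)  # 403 403
```

    The first version has three places where a mistake could hide (the initial index, the loop-continuation condition and the increment). The second has none of them.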

    Pure Functions

    In a pure functional programming language, you focus on writing pure functions. A pure function’s result depends only on the argument(s) you pass to it. Also, given a particular argument (or arguments), a pure function always produces the same result.
    For example, built-in function sum’s return value depends only on the iterable you pass to it. Given a list [1, 2, 3], sum always returns 6 no matter how many times you call it. Also, a pure function does not have side effects. For example, even if you pass a mutable list to a pure function, the list will contain the same values before and after the function call. When you call the pure function sum, it does not modify its argument.

    In [1]: values = [1, 2, 3]
    In [2]: sum(values)
    Out[2]: 6
    In [3]: sum(values) # same call always returns same result
    Out[3]: 6
    In [4]: values
    Out[4]: [1, 2, 3]
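
    For contrast, the sketch below pairs a pure function with one that has a side effect. Both function names are hypothetical. The first uses a list comprehension, a feature not yet covered at this point in the chapter, to build a new list:

```python
def doubled(values):
    """Pure: build and return a new list, leaving values unchanged."""
    return [value * 2 for value in values]


def double_in_place(values):
    """Not pure: modifies the caller's list (a side effect)."""
    for index in range(len(values)):
        values[index] *= 2


numbers = [1, 2, 3]
print(doubled(numbers))  # [2, 4, 6]
print(numbers)           # [1, 2, 3]

double_in_place(numbers)
print(numbers)           # [2, 4, 6]
```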

    In the next chapter, we’ll continue using functional-style programming concepts. Also, you’ll see that functions are objects that you can pass to other functions as data.

    4.18 INTRO TO DATA SCIENCE: MEASURES OF DISPERSION

    In our discussion of descriptive statistics, we’ve considered the measures of central tendency—mean, median and mode. These help us categorize typical values in a group—such as the mean height of your classmates or the most frequently purchased car brand (the mode) in a given country.
    When we’re talking about a group, the entire group is called the population. Sometimes a population is quite large, such as the people likely to vote in the next U.S. presidential election, which is a number in excess of 100,000,000 people. For practical reasons, the polling organizations trying to predict who will become the next president work with carefully selected small subsets of the population known as samples. Many of the polls in the 2016 election had sample sizes of about 1000 people.
    In this section, we continue discussing basic descriptive statistics. We introduce measures of dispersion (also called measures of variability) that help you understand how spread out the values are. For example, in a class of students, there may be a bunch of students whose height is close to the average, with smaller numbers of students who are considerably shorter or taller.
    For our purposes, we’ll calculate each measure of dispersion both by hand and with functions from the module statistics, using the following population of 10 six-sided die rolls:

    1, 3, 4, 2, 6, 5, 3, 4, 5, 2

    Variance

    To determine the variance, we begin with the mean of these values, 3.5. You obtain this result by dividing the sum of the face values, 35, by the number of rolls, 10.
    For simplicity, we’re calculating the population variance. There is a subtle difference between the population variance and the sample variance. Instead of dividing by n (the number of die rolls in our example), sample variance divides by n – 1. The difference is pronounced for small samples and becomes insignificant as the sample size increases. The statistics module provides the functions pvariance and variance to calculate the population variance and sample variance, respectively. Similarly, the statistics module provides the functions pstdev and stdev to calculate the population standard deviation and sample standard deviation, respectively.
    Next, we subtract the mean from every die value (this produces some negative results):

    -2.5, -0.5, 0.5, -1.5, 2.5, 1.5, -0.5, 0.5, 1.5, -1.5

    Then, we square each of these results (yielding only positives):

    6.25, 0.25, 0.25, 2.25, 6.25, 2.25, 0.25, 0.25, 2.25, 2.25

    Finally, we calculate the mean of these squares, which is 2.25 (22.5 / 10)—this is the population variance. Squaring the difference between each die value and the mean of all die values emphasizes outliers—the values that are farthest from the mean. As we get deeper into data analytics, sometimes we’ll want to pay careful attention to outliers, and sometimes we’ll want to ignore them. The following code uses the statistics module’s pvariance function to confirm our manual result:

    In [1]: import statistics
    In [2]: statistics.pvariance([1, 3, 4, 2, 6, 5, 3, 4, 5, 2])
    Out[2]: 2.25
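
    The manual steps above can also be written out in Python. The following sketch computes the population variance (dividing by n) and, for comparison, the sample variance (dividing by n – 1), then checks both against the statistics module. The variable names are ours, and the list comprehension uses syntax that is covered later in the book:

```python
import statistics

rolls = [1, 3, 4, 2, 6, 5, 3, 4, 5, 2]

mean = sum(rolls) / len(rolls)  # 3.5
squared_differences = [(roll - mean) ** 2 for roll in rolls]

# Population variance divides by n; sample variance divides by n - 1.
population_variance = sum(squared_differences) / len(rolls)
sample_variance = sum(squared_differences) / (len(rolls) - 1)

print(population_variance)  # 2.25
print(sample_variance)      # 2.5
print(population_variance == statistics.pvariance(rolls))  # True
print(sample_variance == statistics.variance(rolls))       # True
```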

    Standard Deviation

    The standard deviation is the square root of the variance (in this case, 1.5), which tones down the effect of the outliers. The smaller the variance and standard deviation are, the closer the data values are to the mean and the less overall dispersion (that is, spread) there is between the values and the mean. The following code calculates the population standard deviation with the statistics module’s pstdev function, confirming our manual result:

    In [3]: statistics.pstdev([1, 3, 4, 2, 6, 5, 3, 4, 5, 2])
    Out[3]: 1.5

    Passing the pvariance function’s result to the math module’s sqrt function confirms our result of 1.5:

    In [4]: import math
    In [5]: math.sqrt(statistics.pvariance([1, 3, 4, 2, 6, 5, 3, 4, 5, 2]))
    Out[5]: 1.5
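
    Earlier in this section we noted that the statistics module also provides variance and stdev for the sample variance and sample standard deviation. The following sketch compares the population and sample versions on the same ten die rolls; the variable names are ours, chosen for illustration.

```python
import statistics

rolls = [1, 3, 4, 2, 6, 5, 3, 4, 5, 2]

# Population versions divide the sum of squared differences by n
population_var = statistics.pvariance(rolls)
population_sd = statistics.pstdev(rolls)

# Sample versions divide by n - 1, so they are slightly larger
sample_var = statistics.variance(rolls)
sample_sd = statistics.stdev(rolls)

print('population:', population_var, population_sd)
print('sample:    ', sample_var, sample_sd)
```

    With only ten values, the two variances differ noticeably (22.5 / 10 versus 22.5 / 9). For much larger samples the gap shrinks.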

    Advantage of Population Standard Deviation vs. Population Variance

    Suppose you’ve recorded the March Fahrenheit temperatures in your area. You might have 31 numbers such as 19, 32, 28 and 35. The units for these numbers are degrees.
    When you square your temperatures to calculate the population variance, the units of the population variance become “degrees squared.” When you take the square root of the population variance to calculate the population standard deviation, the units once again become degrees, which are the same units as your temperatures.
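
    As a small sketch of the units point, the following code uses only the four sample temperatures listed above (not a full month of data). The variable names are ours. The variance comes out in degrees squared, and taking its square root returns to degrees.

```python
import math
import statistics

# The four sample March temperatures from the text, in degrees Fahrenheit
temperatures = [19, 32, 28, 35]

# Population variance: units are "degrees squared"
variance = statistics.pvariance(temperatures)

# Population standard deviation: units are degrees again
std_dev = statistics.pstdev(temperatures)

print(variance)
print(std_dev)
print(math.isclose(std_dev, math.sqrt(variance)))
```

    Because the standard deviation is in the same units as the data, it can be compared directly with the temperatures themselves.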

    4.19 WRAP-UP

    In this chapter, we created custom functions. We imported capabilities from the random and math modules. We introduced random-number generation and used it to simulate rolling a six-sided die. We packed multiple values into tuples to return more than one value from a function. We also unpacked a tuple to access its values. We discussed using the Python Standard Library’s modules to avoid “reinventing the wheel.”
    We created functions with default parameter values and called functions with keyword arguments. We also defined functions with arbitrary argument lists. We called methods of objects. We discussed how an identifier’s scope determines where in your program you can use it.
    We presented more about importing modules. You saw that arguments are passed by reference to functions, and how the function-call stack and stack frames support the function-call-and-return mechanism. We also presented a recursive function and began introducing Python’s functional-style programming capabilities. We’ve introduced basic list and tuple capabilities over the last two chapters; in the next chapter, we’ll discuss them in detail.
    Finally, we continued our discussion of descriptive statistics by introducing measures of dispersion (variance and standard deviation) and calculating them with functions from the Python Standard Library’s statistics module.
