Iteration Patterns – map, filter, and reduce

You’ve learned three ways to iterate:

  1. Using a while loop:
     i = 0
     while i < len(lst):
         // do something with lst[i]
    
  2. Using a for loop with range:
     for i in range(len(lst)):
         // do something with lst[i]
    
  3. Using a for loop without range:
     for item in lst:
         // do something with item
    

This page discusses three uses for iteration:

Mapping

To apply a transformation to every item in a list, and collect the transformed elements. This operation is called map.

For example, mapping the operation “add ten” across every item in the list [2, 4, 6] produces the list [12, 14, 16].

Filter

To narrow the list down to only those items that meet a certain criterion. This operation is called filter or select, or (if the criterion is expressed as which items not to include) reject.

For example, filtering or selecting the list [2, 4, 6, 8] against the criterion “is greater than 5” produces the list [6, 8]. Rejecting the items of [2, 4, 6, 8] that match the criterion “is greater than 5” produces the list [2, 4].

Reduce

Combining the items in a list into a single value. This operation is called reduce or fold.

For example, using addition to reduce [2, 4, 6] produces the result $2 + 4 + 6 = 12$. Using multiplication produces $2 \times 4 \times 6 = 24$.

Note 1: If you want to show off, you can also call this a catamorphism.

Note 2: In general, it can matter whether you reduce left-to-right or right-to-left – whether you use a left fold or a right fold. With addition, and with multiplication of integers, it doesn’t: $(2 + 4) + 6 = 2 + (4 + 6)$. Reduce and fold generally go left-to-right, unless otherwise specified.

There are other reasons to iterate, but these three are so common that they have names. (In fact, more than one, as you can see.)

Map

Mapping applies a modification to each element in a list.

There’s two ways to do this:

  1. modify the original list
  2. return a new list.

These correspond to chop and middle from first reading journal.

A map that modifies its argument

This function modifies its argument:

def add_ten(ns):
    """Add ten to each of the numbers in a list.
    This function modifies its argument."""
    i = 0
    while i < len(ns):
        ns[i] = ns[i] + 10
        i = i + 1

lst = [2, 4, 6]
print('add_ten returns', add_ten(lst))  # `add_one` is a fruitless function. It returns None.
print('lst =', lst)                     # lst has been modified.
add_ten returns None
lst = [12, 14, 16]

Here’s an implementation of the same specification, that uses for and range:

def add_ten(ns):
    for i in range(len(ns)):
        ns[i] = ns[i] + 10

lst = [2, 4, 6]
print('add_ten returns', add_ten(lst))  # `add_one` is a fruitless function. It returns None.
print('lst =', lst)                     # lst has been modified.
add_ten returns None
lst = [12, 14, 16]

A third implementation, with for and enumerate:

def add_ten(ns):
    for i, n in enumerate(ns):
        ns[i] = n + 10

lst = [2, 4, 6]
print('add_ten returns', add_ten(lst))  # `add_one` is a fruitless function. It returns None.
print('lst =', lst)                     # lst has been modified.
add_ten returns None
lst = [12, 14, 16]

A map that creates a new list

add_ten, above, modifies its argument. It’s like chop from the reading journal, or like lst.sort() from the Python standard library.

added_ten, below, constructs a new list. It’s like middle from the reading journal, or like sorted(lst) from the Python standard library.

Code that constructs a new list generally uses an accumulator to collect the new values into, and includes these stages:

These stages are labelled below.

def added_ten(ns):
    """Add ten to each of the numbers in the list of numbers `xs`.
    This function returns a new list."""
    result = []                # initialize the accumulator
    for n in ns:
        result.append(n + 10)  # modify the accumulator
    return result              # return the accumulated value

lst = [2, 4, 6]
print('added_ten returns', added_ten(lst))  # `add_one` returns a new list
print('lst =', lst)                         # lst is unchanged
added_ten returns [12, 14, 16]
lst = [2, 4, 6]

The map function

Mapping is such a common operation that the Python library provides a standard function for it.

This function takes another function as its argument.

First, let’s see what looks like to separate the code that adds ten out from the code that iterates over the items in the list. (We have factored increment_by_ten out from added_ten.)

def increment_by_ten(n):
    return n + 10

def added_ten(ns):
    result = []                             # initialize the accumulator
    for n in range(len(ns)):
        result.append(increment_by_ten(n))  # modify the accumulator
    return result                           # return the accumulated value

lst = [2, 4, 6]
print('added_ten returns', added_ten(lst))
added_ten returns [10, 11, 12]

Now, we’ll replace the accumulator and the for loop by map. (We’re using the definition of increment_by_ten from above.)

def added_ten(ns):
    return list(map(increment_by_ten, ns))

print('added_ten returns', added_ten(lst))  # `add_one` returns a new list
added_ten returns [12, 14]

(You may recall that range returns a list-like value, that we apply the list function to it in order to turn it into an actual list so that we can see its values. Similarly, map returns a list-like value. We apply the list function to it in order to see its values.)

for and map: a connection

The for statement replaces a number of different statements: that initialize, test, use, and increment the loop variable. With for, Python handles this for us automatically.

Similarly, the map function replaces a number of different statements: that initialize, modify, and return the accumulator variable.

List Comprehensions

You may be familiar with “set builder” notation from math:

These are two different ways of writing: the set of numbers ${n + 10}$, for $n$ from ${2, 4, 6}$. (This set is equal to ${12, 14, 16}$.)

Python provides set comprehensions (we’ll learn about Python sets later). The following set comprehension is equal to the Python set {12, 14, 16}:

{n + 10 for n in {2, 4, 6}}

The Python list list comprehension is similar, but produces a list instead of a set:

[n + 10 for n in {2, 4, 6}]

Here’s an implementation of map using a list comprehension:

def added_ten(ns):
    return [n + 10 for n in ns]

print(added_ten([10, 20, 30, 40]))
[20, 30, 40, 50]

Filter

Filtering returns a list that contains only some items from the original list.

We will only work with implementations that create a new list. See this Stack Overflow question for a discussion of some of the pitfalls with trying to delete items from a list from inside a for loop.

Note: you will need to implement your own is_even function that returns True or False

def evens(ns):
    result = []              # initialize the accumulator
    for n in ns:
        if is_even(n):
            result.append(n)  # modify the accumulator
    return result             # return the accumulated value

evens([1, 2, 3, 4])
[2, 4]

Like mapping and map, Python provides a built-in function for filtering.

def evens(ns):
    return list(filter(is_even, ns))

evens([1, 2, 3, 4])
[2, 4]

There is also a list comprehension form of filter.

Compare:

def evens(ns):
    return [n for n in ns if is_even(n)]

evens([1, 2, 3, 4])
[2, 4, 6]

Reduce

The reduce pattern combines the items in a list into a single value.

def my_sum(ns):
    a = 0          # initialize accumulator
    for n in ns:
        a = a + n  # modify the accumulator
    return a       # return the accumulated value

my_sum([2, 4, 6])
12

Unlike mapping and filtering, reduction does not have a list comprehension equivalent.

Also, you have to import the reduce function from the functools module.

Like map and filter, reduce takes a function argument. We’ll create an add helper function to pass to it.

import functools

def add(a, b):
    return a + b

def my_sum(ns):
    return functools.reduce(add, ns)

my_sum([2, 4, 6])
12

Instead of writing our implementation of add, we could use the operator module to get a Python function that acts like the + operator.

import functools
import operator

def my_sum(ns):
    return functools.reduce(operator.add, ns)

my_sum([2, 4, 6])
12