Day 10

Today

For Next time

Reading Journal Debrief

Bad Kangaroos

In this Reading Journal we looked at a very common error that can be very frustrating to debug:

class Kangaroo:
    def __init__(self, name, contents=[]):	# Warning: mutable default argument bug!
        self.pouch_contents = contents

The core “gotcha” here is that default argument values are evaluated once when the function is defined, not every time it is called.

In this case, it means that the new list is created one time, and each new Kangaroo instance (that did not provide contents to override the default) has a reference to the same list.

This would be true of any mutable object as a default argument, not just list (e.g. dict, your custom classes). Here is another implementation, showing a Python idiom commonly used to avoid this bug:

class Kangaroo:
    def __init__(self, name, contents=None):
        if contents == None:
            contents = []
        self.pouch_contents = contents

Here we’ve used None (immutable) as a stand-in for the case when no contents have been passed and we should create an empty list. The new list is created inside the body of the __init__ method when it is called, so each Kangaroo instance gets its own empty list vs a shared reference.

General advice: avoid using mutable objects as default values of function arguments.


Agenda: sorting objects

For this Reading Journal, you had to implement a print_agenda method that displays Event objects in chronological order. The key challenge here was sorting the Events: Python provides many techniques for sorting, but they don’t know how to sensibly sort your custom objects without some help from you.

Let’s imagine that you have a list of Events. The first design choice to make is whether you should maintain the list in sorted order at all times, or sort it only when required (e.g. whenever print_agenda is called). There are pros and cons to each style, depending on application:

Always sorted

In this case the burden falls on your add_event method to ensure that you insert new Events in chronological order (perhaps using your is_after method). It takes a little more time to add an Event, but you afterwards you can rely on the list being sorted at no extra cost.

Sort on demand

With this approach you have a simpler insertion process, but you must sort the list every time you need it to be in order. This may be more expensive depending on whether adding new Events or printing the Agenda is more common.

Python knows how to sort built-in types (e.g. integers), but not your custom classes. Let’s say you have a Time class:

class Time:
    def __init__(self, hour=0, minute=0, second=0):
        self.hour = hour
        self.minute = minute
        self.second = second

    def to_int(self):
        return 60*(60*self.hour + self.minute) + self.second

Sorting key function

Python’s list.sort method and sorted function both take an optional key parameter, which expects a function that takes the object to be sorted and returns a key used to sort it. We could write such a function for Events:

def event_start(event):
    """Given Event object containing a start Time, return time as sortable key"""
    return event.start.to_int()

and then use it to sort a list of Events:

for event in sorted(some_events, key=event_start):
    ... do something

Anonymous lambda functions

If you just want to sort Events, it can be a bit clunky to write the event_start function solely to pass as a key parameter. You’ll often see people use an anonymous lambda function (a function defined on the fly that is not bound to a name) for this purpose. Equivalently:

for event in sorted(some_events, key=lambda e: e.start.to_int())
    ... do something

Implement comparison operator(s)

If you implement the less-than __lt__ method for your classes (so that time1 < time2 works), then Python now knows how to sort your custom objects.

# inside class Time:
    def __lt__(self, other):
        return self.to_int() < other.to_int()

Agenda: __str__ vs __repr__

Several of you ran into interesting behavior when trying to print your custom classes using your __str__ method:

>>> times = [Time(6,20), Time(2,45), Time(4)]
>>> print(times[0])
06:20:00
>>> print(times)
[<__main__.Time object at 0x105b1d0b8>, <__main__.Time object at 0x105b1d0f0>, <__main__.Time object at 0x105b1d128>]

This arises because Python has two ways of converting an object into a string that are used differently. Read about each: __str__ and __repr__.

When printing collections like lists, the __repr__ method is used for the contained objects. As an example of why, let’s imagine our TicTacToe class has a pretty __str__ method that gives:

X| |O
-----
 |X|O
-----
O|X|

If we used that for the __repr__, and printed a list of two TicTacToe objects, it would be a mess!:

[X| |O
-----
 |X|O
-----
O|X| , X| |O
-----
 |X|O
-----
O|X| ]

vs

[<__main__.TicTacToe object at 0x105b1d160>, <__main__.TicTacToe object at 0x105b1d168>]

When writing __repr__ methods, a great goal is for the result to be a valid Python expression that could recreate the object (e.g. Time(06,20,00) not 06:20:00).

If a class does not define a __str__ method, the __repr__ method is used instead (if it exists).


Exercise: rewrite MP2 using classes

Now that we’ve learned about writing custom classes, let’s circle back and revise your work from MP2 to use what you’ve learned over the past two reading journals.

As you should recall, in MP2 we used the built-in Python data type list as a means to represent a nested function. For instance, the function would be represented as as the following list.

["prod", ["sin_pi", ["x"]], ["cos_pi", ["x"]]]

While we were able to get by with this representation, it was not the most elegant way to represent a nested function. We can use classes to provide a much more versatile and clean implementation of nested functions. Further, there were other operations that our program did (e.g., remapping intervals, generating random functions), that could be improved using classes.

Propose a new design

With two of your classmates, grab some whiteboard (or chalkboard) space. Come up with an object-oriented design (i.e. some classes and the methods that they will provide) to re-implement the second mini-project. Be prepared to answer the following questions regarding your design.

We’ll take some time to hear from various groups to get a sense of the design choices that you all are making.

Revise your design (optional)

If you’d like, you can change your design based on what you’ve heard from your peers.

Implement your design

Start an implementation of your design. You probably won’t be able to get through all of it during class, but try to complete a few methods and provide method stubs (e.g., methods that just contain a pass statement) for all of your classes.

Additional stuff you might work on during this time

We’re also available for conversation about issues that arose using classes in the last reading journals and geometry.py

Diversion: git branch

To allow you to revise MP2 without affecting your work on GitHub, we can use a new branch.

In your local MP2 repo, create a new branch to work on and checkout that new branch:

$ git checkout -b ClassRewrite

You can then do your work and make commits to the new branch without affecting the master branch

You can also (optionally) push your new branch and its work to GitHub:

$ git push -u origin ClassRewrite

If you want to switch back to the master branch:

$ git checkout master

You’ll see your work disappear from the local directory but it’s not gone! Just git checkout ClassRewrite to see it again.

You can also merge your work on a branch back into the master branch (but please don’t do that for MP2

To see this visually, work through the first three levels of “Learn Git Branching