Transforming Code into Beautiful, Idiomatic Python
Notes from Raymond Hettinger’s talk at PyCon US 2013 (video, slides).
The code examples and direct quotes are all from Raymond’s talk. I’ve reproduced them here for my own edification and in the hope that others will find them as handy as I have!
Looping over a range of numbers

    for i in [0, 1, 2, 3, 4, 5]:
        print i**2

    for i in range(6):
        print i**2

Better

    for i in xrange(6):
        print i**2

xrange creates an iterator over the range, producing the values one at a time instead of building the whole list up front. This approach is much more memory efficient than range. In Python 3, xrange was renamed to range.
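A minimal sketch of the same idea, assuming Python 3, where range is already lazy. The size comparison with sys.getsizeof is only a rough illustration and is not from the talk.

```python
import sys

# In Python 3, range() is lazy: it stores only start, stop and step.
lazy = range(10 ** 6)
# Turning it into a list materialises every value, as Python 2's range() did.
eager = list(lazy)

# The lazy object stays small no matter how long the range is.
print(sys.getsizeof(lazy) < sys.getsizeof(eager))

squares = [i ** 2 for i in range(6)]
print(squares)
```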
Looping over a collection

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)):
        print colors[i]

Better

    for color in colors:
        print color

Looping backwards

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)-1, -1, -1):
        print colors[i]

Better

    for color in reversed(colors):
        print color

Looping over a collection and indices

    colors = ['red', 'green', 'blue', 'yellow']

    for i in range(len(colors)):
        print i, '--->', colors[i]

Better

    for i, color in enumerate(colors):
        print i, '--->', color

It’s fast and beautiful and saves you from tracking the individual indices and incrementing them.
Whenever you find yourself manipulating indices [in a collection], you’re probably doing it wrong.
Looping over two collections

    n = min(len(names), len(colors))
    for i in range(n):
        print names[i], '--->', colors[i]

    for name, color in zip(names, colors):
        print name, '--->', color

Better

    from itertools import izip

    for name, color in izip(names, colors):
        print name, '--->', color

zip creates a new list in memory, so it takes more memory; izip is more efficient than zip. Note: in Python 3, izip was renamed to zip and promoted to a builtin, replacing the old zip.
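A short sketch of the Python 3 behaviour described in the note, where the builtin zip is already lazy. The two sample lists are assumptions made for this illustration.

```python
# Sample data, chosen only for this illustration.
names = ['raymond', 'rachel', 'matthew']
colors = ['red', 'green', 'blue', 'yellow']

# zip() in Python 3 yields pairs lazily and stops at the shorter input,
# so 'yellow' is never paired with anything.
pairs = list(zip(names, colors))
for name, color in pairs:
    print(name, '--->', color)
```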
Looping in sorted order

    colors = ['red', 'green', 'blue', 'yellow']

    # Forward sorted order
    for color in sorted(colors):
        print color

    # Backwards sorted order
    for color in sorted(colors, reverse=True):
        print color

Custom Sort Order

    colors = ['red', 'green', 'blue', 'yellow']

    def compare_length(c1, c2):
        if len(c1) < len(c2): return -1
        if len(c1) > len(c2): return 1
        return 0

    print sorted(colors, cmp=compare_length)

Better

    print sorted(colors, key=len)

The original is slow and unpleasant to write. Also, sorted no longer accepts a comparison function (the cmp argument) in Python 3.
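A minimal sketch for Python 3, where key functions are the only option. If an old comparison function has to be kept, functools.cmp_to_key can wrap it; that adapter is not mentioned in the notes and is shown here only as one possible migration path.

```python
from functools import cmp_to_key

colors = ['red', 'green', 'blue', 'yellow']

def compare_length(c1, c2):
    """Old-style comparison: negative, zero or positive."""
    return len(c1) - len(c2)

# Preferred: a key function is called once per element.
by_key = sorted(colors, key=len)

# Fallback: adapt a legacy comparison function into a key.
by_cmp = sorted(colors, key=cmp_to_key(compare_length))

print(by_key)
```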
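Call a function until a sentinel value

    from functools import partial

    blocks = []
    for block in iter(partial(f.read, 32), ''):
        blocks.append(block)

In this form iter takes two arguments. The first is a function that gets called over and over again, and the second is a sentinel value that ends the loop when the function returns it.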
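A runnable sketch of the two-argument form of iter, assuming Python 3. The io.StringIO object and its 70-character payload are stand-ins for a real open file and are not from the talk.

```python
import io
from functools import partial

# io.StringIO stands in for an open text file; the payload is made up.
f = io.StringIO('x' * 70)

blocks = []
# partial(f.read, 32) is a zero-argument callable; iter() keeps calling it
# until it returns the sentinel '' (which read() returns at end of file).
for block in iter(partial(f.read, 32), ''):
    blocks.append(block)

print([len(b) for b in blocks])
```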
Distinguishing multiple exit points in loops

    def find(seq, target):
        found = False
        for i, value in enumerate(seq):
            if value == target:
                found = True
                break
        if not found:
            return -1
        return i

Better

    def find(seq, target):
        for i, value in enumerate(seq):
            if value == target:
                break
        else:
            return -1
        return i

Inside of every for loop is an else.
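The else clause of a for loop runs only when the loop finishes without hitting break. A quick check of that behaviour, assuming Python 3, with made-up sample lists:

```python
def find(seq, target):
    """Return the index of target in seq, or -1 if it is absent."""
    for i, value in enumerate(seq):
        if value == target:
            break
    else:
        # Reached only when the loop ended without break, including
        # the case where seq is empty and the body never ran.
        return -1
    return i

print(find(['red', 'green', 'blue'], 'green'))
print(find(['red', 'green', 'blue'], 'pink'))
```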
Looping over dictionary keys

    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    for k in d:
        print k

    for k in d.keys():
        if k.startswith('r'):
            del d[k]

When should you use the second and not the first? When you’re mutating the dictionary.
If you mutate something while you’re iterating over it, you’re living in a state of sin and deserve whatever happens to you.
d.keys() makes a copy of all the keys and stores them in a list, so you can then modify the dictionary. Note: in Python 3, to mutate a dictionary while looping over its keys you have to write list(d.keys()) explicitly, because d.keys() returns a “dictionary view” (an iterable that provides a dynamic view of the dictionary’s keys). See documentation.
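A small sketch of the Python 3 behaviour described in the note. The second dictionary, e, is a made-up example used to show what happens when the keys are not copied first.

```python
d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

# list(d) snapshots the keys, so deleting inside the loop is safe.
for k in list(d):
    if k.startswith('r'):
        del d[k]
print(d)

# Iterating over the live view while deleting raises RuntimeError.
e = {'rachel': 'green', 'raymond': 'red'}
error = None
try:
    for k in e.keys():
        del e[k]
except RuntimeError as exc:
    error = str(exc)
print(error)
```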
Looping over dictionary keys and values

    # Not very fast, has to re-hash every key and do a lookup
    for k in d:
        print k, '--->', d[k]

    # Makes a big huge list
    for k, v in d.items():
        print k, '--->', v

Better

    for k, v in d.iteritems():
        print k, '--->', v

iteritems() is better as it returns an iterator. Note: in Python 3 there is no iteritems(), and items() behaves much like iteritems() did. See documentation.
Counting with dictionaries

    # Simple, basic way to count. A good start for beginners.
    d = {}
    for color in colors:
        if color not in d:
            d[color] = 0
        d[color] += 1

    # {'blue': 1, 'green': 2, 'red': 3}

Better

    d = {}
    for color in colors:
        d[color] = d.get(color, 0) + 1

    # Slightly more modern but has several caveats, better for advanced users
    # who understand the intricacies
    import collections

    d = collections.defaultdict(int)
    for color in colors:
        d[color] += 1

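A runnable comparison of the counting approaches, assuming Python 3. The colors list is an assumption picked to reproduce the counts shown in the comment above, and collections.Counter is an extra standard-library option that these notes do not cover.

```python
from collections import Counter, defaultdict

# Assumed sample input that yields the counts quoted in the notes.
colors = ['red', 'green', 'red', 'blue', 'green', 'red']

# dict.get supplies 0 for colors not seen yet.
with_get = {}
for color in colors:
    with_get[color] = with_get.get(color, 0) + 1

# defaultdict(int) creates the missing entry as 0 on first access.
with_default = defaultdict(int)
for color in colors:
    with_default[color] += 1

# Counter does the whole loop in one call.
with_counter = Counter(colors)

print(with_get)
```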
Grouping with dictionaries

    d = {}
    for name in names:
        key = len(name)
        d.setdefault(key, []).append(name)

Better

    import collections

    d = collections.defaultdict(list)
    for name in names:
        key = len(name)
        d[key].append(name)

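A runnable sketch of the grouping idiom, assuming Python 3 and a made-up list of names. It also shows one defaultdict caveat worth knowing: simply looking up a missing key inserts it.

```python
from collections import defaultdict

# Sample names, chosen only for this illustration.
names = ['raymond', 'rachel', 'matthew', 'roger', 'betty']

groups = defaultdict(list)
for name in names:
    groups[len(name)].append(name)
print(dict(groups))

# Caveat: a plain lookup of a missing key creates an empty entry.
before = 99 in groups
groups[99]
after = 99 in groups
print(before, after)
```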
Is a dictionary popitem() atomic?

    d = {'matthew': 'blue', 'rachel': 'green', 'raymond': 'red'}

    while d:
        key, value = d.popitem()
        print key, '-->', value

popitem is atomic so you don’t have to put locks around it to use it in threads.
Linking dictionaries

    import argparse
    import os

    defaults = {'color': 'red', 'user': 'guest'}
    parser = argparse.ArgumentParser()
    parser.add_argument('-u', '--user')
    parser.add_argument('-c', '--color')
    namespace = parser.parse_args([])
    command_line_args = {k: v for k, v in vars(namespace).items() if v}

    # The common approach below allows you to use defaults at first, then override them
    # with environment variables and then finally override them with command line arguments.
    # It copies data like crazy, unfortunately.
    d = defaults.copy()
    d.update(os.environ)
    d.update(command_line_args)

Better

    d = ChainMap(command_line_args, os.environ, defaults)

ChainMap was introduced in Python 3.3, in the collections module. Fast and beautiful.
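A self-contained sketch of the lookup order, assuming Python 3.3 or later. The environment and command_line_args dictionaries are hypothetical stand-ins for os.environ and the parsed arguments, so the example runs the same way everywhere.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
environment = {'user': 'alice'}        # hypothetical stand-in for os.environ
command_line_args = {'color': 'blue'}  # hypothetical parsed arguments

# Lookups try each mapping from left to right and return the first hit.
d = ChainMap(command_line_args, environment, defaults)
print(d['color'], d['user'])

# Nothing is copied: a later change to an underlying dict shows through.
defaults['shell'] = 'bash'
print(d['shell'])
```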
Unpacking sequences

    p = 'Raymond', 'Hettinger', 0x30, 'python@example.com'

    # A common approach / habit from other languages
    fname = p[0]
    lname = p[1]
    age = p[2]
    email = p[3]

Better

    fname, lname, age, email = p

The second approach uses tuple unpacking and is faster and more readable.
Updating multiple state variables

    def fibonacci(n):
        x = 0
        y = 1
        for i in range(n):
            print x
            t = y
            y = x + y
            x = t

Better

    def fibonacci(n):
        x, y = 0, 1
        for i in range(n):
            print x
            x, y = y, x + y

Problems with first approach
x and y are state, and state should be updated all at once; between the lines of a piecemeal update the state is mismatched, which is a common source of bugs
ordering matters
it’s too low level
The second approach is more high-level, doesn’t risk getting the order wrong and is fast.
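A small sketch of why ordering matters, assuming Python 3. Both functions return a list instead of printing, and the name broken_fibonacci is made up for this illustration.

```python
def fibonacci(n):
    """Return the first n Fibonacci numbers."""
    result = []
    x, y = 0, 1
    for _ in range(n):
        result.append(x)
        # The whole right-hand side is evaluated before either name is rebound.
        x, y = y, x + y
    return result

def broken_fibonacci(n):
    """Same loop with the two updates done one at a time, in the wrong order."""
    result = []
    x, y = 0, 1
    for _ in range(n):
        result.append(x)
        x = y
        y = x + y  # bug: x has already been overwritten
    return result

print(fibonacci(8))
print(broken_fibonacci(5))
```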
Simultaneous state updates

    tmp_x = x + dx * t
    tmp_y = y + dy * t
    # NOTE: The "influence" function here is just an example function, what it does
    # is not important. The important part is how to manage updating multiple
    # variables at once.
    tmp_dx = influence(m, x, y, dx, dy, partial='x')
    tmp_dy = influence(m, x, y, dx, dy, partial='y')
    x = tmp_x
    y = tmp_y
    dx = tmp_dx
    dy = tmp_dy

Better

    # NOTE: The "influence" function here is just an example function, what it does
    # is not important. The important part is how to manage updating multiple
    # variables at once.
    x, y, dx, dy = (x + dx * t,
                    y + dy * t,
                    influence(m, x, y, dx, dy, partial='x'),
                    influence(m, x, y, dx, dy, partial='y'))

Efficiency
A fundamental rule of optimization
Don’t cause data to move around unnecessarily
It takes only a little care to avoid O(n**2) behavior instead of linear behavior
Basically, just don’t move data around unnecessarily.
Updating sequences

    # More efficient with collections.deque
    # (here names is a collections.deque rather than a list)
    del names[0]
    names.popleft()
    names.appendleft('mark')

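A runnable sketch of the deque operations, assuming Python 3 and a made-up list of names. Removing or inserting at the left end of a list shifts every other element, while a deque does both in constant time.

```python
from collections import deque

# Sample names, chosen only for this illustration.
names = deque(['raymond', 'rachel', 'matthew', 'roger'])

del names[0]              # drops 'raymond'
first = names.popleft()   # removes and returns 'rachel'
names.appendleft('mark')  # constant time, unlike list.insert(0, ...)

print(first, list(names))
```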
Decorators and Context Managers
Helps separate business logic from administrative logic
Clean, beautiful tools for factoring code and improving code reuse
Good naming is essential.
Remember the Spider-Man rule: with great power comes great responsibility!
Using decorators to factor-out administrative logic

    import urllib

    # Mixes business / administrative logic and is not reusable
    def web_lookup(url, saved={}):
        if url in saved:
            return saved[url]
        page = urllib.urlopen(url).read()
        saved[url] = page
        return page

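One way to factor the caching out is a decorator. The sketch below assumes Python 3; the decorator name cache and the fake page body are illustrative choices rather than the code from the talk, and the network call is replaced by a stub so the example runs offline. For real code, functools.lru_cache (Python 3.2+) covers the same ground.

```python
from functools import wraps

def cache(func):
    """Remember results of func, keyed by its single hashable argument."""
    saved = {}

    @wraps(func)
    def wrapper(arg):
        if arg not in saved:
            saved[arg] = func(arg)
        return saved[arg]
    return wrapper

calls = []  # records every real lookup, to show that the cache works

@cache
def web_lookup(url):
    # Stand-in for the real network fetch (hypothetical page content).
    calls.append(url)
    return 'page for ' + url

page = web_lookup('http://example.com')
page_again = web_lookup('http://example.com')
print(len(calls))
```

The business logic (fetching a page) now contains no bookkeeping, and the same decorator can be reused on any other single-argument function.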
On a small reusable helper for ignoring exceptions: Stick that in your utils directory and you too can ignore exceptions
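The helper that quote refers to is not reproduced in these notes. A minimal sketch of such a context manager, assuming Python 3; the name ignored and its signature are choices made for this illustration. Python 3.4 and later ship contextlib.suppress, which behaves the same way.

```python
from contextlib import contextmanager

@contextmanager
def ignored(*exceptions):
    """Swallow the listed exception types raised inside the with block."""
    try:
        yield
    except exceptions:
        pass

settings = {}
with ignored(KeyError):
    settings['missing']  # raises KeyError, which is silently dropped

reached = True  # execution continues after the with block
print(reached)
```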
Factor-out temporary contexts

    import sys

    # Temporarily redirect standard out to a file and then return it to normal
    with open('help.txt', 'w') as f:
        oldstdout = sys.stdout
        sys.stdout = f
        try:
            help(pow)
        finally:
            sys.stdout = oldstdout

Better

    with open('help.txt', 'w') as f:
        with redirect_stdout(f):
            help(pow)

redirect_stdout was proposed for Python 3.4 (bug report) and ships there as contextlib.redirect_stdout.
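A short sketch using contextlib.redirect_stdout, assuming Python 3.4 or later. An in-memory io.StringIO buffer stands in for the help.txt file so that the example leaves nothing on disk.

```python
import io
from contextlib import redirect_stdout

buffer = io.StringIO()  # in-memory stand-in for the help.txt file

with redirect_stdout(buffer):
    print('captured')

# Outside the block, printing goes to the previous stdout again.
print('back on the regular stdout')

captured = buffer.getvalue()
```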