(Python) Transforming and Reducing Data at the Same Time

1 min read · Mar 17, 2021

Let’s say you need to apply a reduction function like sum() or min() to transformed data, for example to calculate a sum of squares. A good way to do this is to pass a generator expression as the function's argument, so the data is transformed and reduced in a single step.

You would do something like this:

nums = [1, 2, 3, 4, 5]
s = sum(x * x for x in nums)

The example above shows a syntactic nicety of generator expressions: when one is supplied as the single argument to a function, you do not need a second pair of parentheses. For instance, these two statements are equivalent:

s = sum((x * x for x in nums))   # pass generator-expr as argument
s = sum(x * x for x in nums)   # more elegant syntax
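One caveat worth noting: the shorthand applies only when the generator expression is the sole argument. The following is a minimal sketch that uses sum()'s optional start value to show the difference; the start value of 100 is arbitrary and chosen only for illustration.

```python
nums = [1, 2, 3, 4, 5]

# Sole argument: the call's own parentheses are enough.
total = sum(x * x for x in nums)

# With a second argument (here sum()'s start value), the generator
# expression needs its own pair of parentheses.
total_with_start = sum((x * x for x in nums), 100)

print(total)             # 55
print(total_with_start)  # 155

# Writing sum(x * x for x in nums, 100) is rejected with a SyntaxError.
```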

If you didn’t use a generator expression, you would do something like this:

nums = [1, 2, 3, 4, 5]
s = sum([x * x for x in nums])

This works, but it introduces an extra step and creates an extra list. If nums were huge, you would build a large temporary data structure only to use it once and discard it. The generator version transforms the data one item at a time and is therefore much more memory-efficient.
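To get a rough feel for the difference, here is a small sketch that compares the size of the two objects. The input of one million numbers is a hypothetical stand-in for "huge", and sys.getsizeof() measures only the container itself, not the integers it refers to, so treat the figures as a lower bound for the list.

```python
import sys

nums = range(1_000_000)  # hypothetical large input

squares_list = [x * x for x in nums]  # builds every result up front
squares_gen = (x * x for x in nums)   # computes nothing until iterated

# Size of the container objects only (not the integers they refer to).
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(list_bytes, gen_bytes)

# Both routes produce the same total.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)
```

The exact byte counts depend on the Python version and platform, but the generator object stays small no matter how long the input is.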

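Some reduction functions, such as min() and max(), accept a key argument that can be a better fit in situations where you might otherwise reach for a generator expression. With a generator you get back only the transformed value; with key you get back the original item: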

# original: returns 20
min_shares = min(s['shares'] for s in portfolio)

# alternative: returns {'name': 'AOL', 'shares': 20}
min_shares = min(portfolio, key=lambda s: s['shares'])
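The snippet above assumes that portfolio is a list of dictionaries. A runnable sketch with made-up sample data might look like this; only the AOL entry is taken from the comments above, and the other holdings are hypothetical.

```python
# Hypothetical sample data; only the AOL entry comes from the original comments.
portfolio = [
    {'name': 'GOOG', 'shares': 50},
    {'name': 'AOL', 'shares': 20},
    {'name': 'YHOO', 'shares': 75},
]

# Generator expression: reduces the transformed values, so you get a number.
fewest_shares = min(s['shares'] for s in portfolio)

# key argument: compares by shares but returns the whole original record.
smallest_holding = min(portfolio, key=lambda s: s['shares'])

print(fewest_shares)     # 20
print(smallest_holding)  # {'name': 'AOL', 'shares': 20}
```

Which form to use depends on whether you need just the smallest value or the record it belongs to.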

Written by Nicholas An
