(Python) Transforming and Reducing Data at the Same Time

Nicholas An
Mar 17, 2021

Let’s say you need to run a reduction function like sum() or min(), but you first have to transform the data, for example to calculate the sum of squares. A good way to do this is to pass a generator expression as the function's argument, so the transformation and the reduction happen in a single step.

You would do something like this:

nums = [1, 2, 3, 4, 5]
s = sum(x * x for x in nums)

The above shows a syntactic aspect of generator expressions: when one is supplied as the single argument to a function, you do not need a second pair of parentheses. For instance, these two statements are equivalent:

s = sum((x * x for x in nums))  # pass generator-expr as argument
s = sum(x * x for x in nums)    # more elegant syntax
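One caveat is worth a quick sketch: the shorter form only works when the generator expression is the sole argument. The example below passes a second argument (the optional start value of sum(), chosen here purely for illustration), and in that case the parentheses around the generator expression are required.

```python
nums = [1, 2, 3, 4, 5]

# Sole argument: the extra parentheses are optional.
a = sum(x * x for x in nums)
b = sum((x * x for x in nums))

# With a second argument (sum's optional start value), the generator
# expression must be parenthesized; writing it without them is a SyntaxError.
c = sum((x * x for x in nums), 100)

print(a, b, c)  # 55 55 155
```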

If you didn’t use a generator expression, you would do something like this:

nums = [1, 2, 3, 4, 5]
s = sum([x * x for x in nums])

This works, but it introduces an extra step and builds an extra list. If nums were huge, you would end up creating a large temporary data structure that is used once and then discarded. The generator version transforms the data one item at a time, so it never holds all the squared values in memory at once and is much more memory-efficient.
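To get a rough feel for the difference, here is a minimal sketch that compares the size of the two objects with sys.getsizeof(). The size n is an arbitrary value picked for illustration, the exact byte counts depend on the Python version and platform, and getsizeof() reports only the container itself, not the integers it refers to.

```python
import sys

n = 1_000_000  # arbitrary size, chosen only for illustration

squares_list = [x * x for x in range(n)]  # materializes every element up front
squares_gen = (x * x for x in range(n))   # produces elements on demand

list_bytes = sys.getsizeof(squares_list)  # grows with n
gen_bytes = sys.getsizeof(squares_gen)    # small, does not depend on n

print(list_bytes, gen_bytes)

# Both approaches reduce to the same total.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)
```

Note that the generator can be consumed only once; after sum() has run over it, it is exhausted.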

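Some reduction functions, such as min() and max(), accept a key argument that can be useful in situations where you might otherwise reach for a generator expression. The key version returns the whole matching item instead of just the transformed value. For example, with portfolio being a list of dicts that each have a 'name' and a 'shares' entry: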

# original: returns 20
min_shares = min(s['shares'] for s in portfolio)
# alternative: returns {'name': 'AOL', 'shares': 20}
min_shares = min(portfolio, key=lambda s: s['shares'])
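The snippet above does not define portfolio, so here is a runnable sketch of the same comparison. The sample data is hypothetical: only the AOL entry is implied by the comments above, and the other entries and the two result names are made up for illustration.

```python
# Hypothetical sample data; only the AOL entry comes from the text above.
portfolio = [
    {'name': 'GOOG', 'shares': 50},
    {'name': 'YHOO', 'shares': 75},
    {'name': 'AOL', 'shares': 20},
    {'name': 'SCOX', 'shares': 65},
]

# Generator expression: reduces over the transformed values only.
fewest_shares = min(s['shares'] for s in portfolio)

# key argument: compares by 'shares' but returns the original dict.
smallest_holding = min(portfolio, key=lambda s: s['shares'])

print(fewest_shares)     # 20
print(smallest_holding)  # {'name': 'AOL', 'shares': 20}
```

Which form to prefer depends on what you need afterwards: the generator expression if only the number matters, the key argument if you want the record it belongs to.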

Written by Nicholas An

Backend Server Developer in South Korea
