(Python) Combining and Concatenating Strings
You want to combine many small strings into a larger string. For a handful of strings, the + operator works well enough.
>>> a = 'Is Chicago'
>>> b = 'Not Chicago?'
>>> a + ' ' + b
'Is Chicago Not Chicago?'
However, when you have many strings to join, using + is inefficient because of the repeated memory copies and garbage collection.
Don’t do this:
s = ''
for p in parts:
    s += p
This is much slower than using the join() method, mainly because each += operation creates a new string object. It is better to collect all of the parts first and join them in a single step at the end.
One related trick is to convert data to strings and concatenate them at the same time using a generator expression:
>>> data = ['ACME', 50, 91.1]
>>> ','.join(str(d) for d in data)
'ACME,50,91.1'
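As a point of comparison with the += loop above, here is a minimal sketch of the join() version. The contents of parts are made up for illustration; any iterable of strings would do.

```python
# Hypothetical sample data; the original leaves `parts` unspecified.
parts = ['Is', 'Chicago', 'Not', 'Chicago?']

# Loop version: builds the result one += at a time.
s = ''
for p in parts:
    s += p

# join() version: assembles the result in a single call.
joined = ''.join(parts)

# Both approaches produce the same text.
print(joined == s)        # True
print(' '.join(parts))    # Is Chicago Not Chicago?
```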
Also, be on the lookout for unnecessary string concatenations. For example, do not do this:
print(a + ':' + b + ':' + c)
print(':'.join([a, b, c]))
Instead, do this:
print(a, b, c, sep=':')
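One practical difference, sketched below: print() converts each argument with str() on its own, whereas join() accepts only strings. The values here are illustrative, and io.StringIO is used only to capture the output so it can be inspected.

```python
import io

# Illustrative values (not from the original); note the mixed types.
a, b, c = 'ACME', 50, 91.1

# print() applies str() to each argument, so no manual conversion is needed.
buf = io.StringIO()
print(a, b, c, sep=':', file=buf)
printed = buf.getvalue()    # includes the trailing newline added by print()

# join() requires strings, so the conversion has to be explicit.
joined = ':'.join(str(x) for x in (a, b, c))

print(printed == joined + '\n')    # True
```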
Last, but not least, if you are writing code that builds output from lots of small strings, consider writing it as a generator function that uses yield to emit fragments:
def sample():
    yield 'Is'
    yield 'Chicago'
    yield 'Not'
    yield 'Chicago'
The interesting thing about this approach is that it makes no assumption about how the fragments are to be assembled. For example, you could simply join the fragments using join():
text = ''.join(sample())
Or you could redirect the fragments to I/O:
for part in sample():
    f.write(part)
Or you could come up with some kind of hybrid scheme that’s smart about combining I/O operations:
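def combine(source, maxsize):
    parts = []
    size = 0
    for part in source:
        parts.append(part)
        size += len(part)
        # Once enough text has accumulated, emit it as one larger chunk.
        if size > maxsize:
            yield ''.join(parts)
            parts = []
            size = 0
    # Emit whatever is left over after the source is exhausted.
    yield ''.join(parts)

for part in combine(sample(), 32768):
    f.write(part)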
The key point is that the original generator function doesn’t have to know the precise details. It just yields the parts.
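To make the effect of buffering concrete, the following sketch counts how many write() calls reach the output with and without combine(). The CountingWriter class, the fragments() generator, and the small maxsize value are made up for illustration; combine() is repeated from above so the block runs on its own.

```python
def combine(source, maxsize):
    parts = []
    size = 0
    for part in source:
        parts.append(part)
        size += len(part)
        if size > maxsize:
            yield ''.join(parts)
            parts = []
            size = 0
    yield ''.join(parts)

def fragments(n):
    # Hypothetical producer: emits n short fragments.
    for i in range(n):
        yield f'line {i}\n'

class CountingWriter:
    # Hypothetical stand-in for a file object that counts write() calls.
    def __init__(self):
        self.calls = 0
        self.chunks = []

    def write(self, s):
        self.calls += 1
        self.chunks.append(s)
        return len(s)

    def getvalue(self):
        return ''.join(self.chunks)

# Write every fragment individually.
direct = CountingWriter()
for part in fragments(1000):
    direct.write(part)

# Write the same fragments after grouping them into larger chunks.
buffered = CountingWriter()
for part in combine(fragments(1000), 256):
    buffered.write(part)

print(direct.getvalue() == buffered.getvalue())    # True: same output
print(direct.calls)                                # 1000
print(buffered.calls < direct.calls)               # True: fewer writes
```

Note that fragments() is unchanged between the two runs; only the consumer decides whether the pieces are written one by one or in batches.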