# A Pythonic n-wise Iterator for Any Iterable

01 June 2010

Over the weekend, I was working on upgrading python-ngrams because I had discovered a bug where the tokenization was incorrect. I was reading a research paper that was describing q-grams and while following their examples, I realized I was getting incorrect results for a fundamental n-gram result.

The tokenization that's required here is quite simple, given some iterable consisting of values [x0, x1, x2, x3...,xi], produce an exhaustive iterator of n-tuples such that it satisfies (x0,...xn), (x1,...xn+1), (xm,...xi).

To make this easier to understand, if given a list of [0, 1, 2, 3, 4, 5, 6], I want to be able to return an iterator such that:

```for first, second, third in n_wise([0, 1, 2, 3, 4, 5, 6], 3):
print first, second, third
```

Will have a result of:

```0, 1, 2
1, 2, 3
2, 3, 4
3, 4, 5
4, 5, 6
```

Fortunately, this wasn't too difficult as the trivial implementation of this is already done in itertools, under pairwise. Below I've implemented a n-wise iterator implementation that can take any iterator and return n-iterators where each iterator is advanced by a step ahead of the other.

```from itertools import tee, izip

def n_wise(iterable, n):
"""
Returns n iterators for an iterable that are sequentially

n-wise

"""
n_iterators = tee(iterable, n)
zippables = [n_iterators[0]]

advance += 1  # since enumerate is 0 indexed.

next(iteratee, None)
# append everything to the zippables
zippables.append(iteratee)
# return the izip expansion of each iterator
return izip(*zippables)
```

I find that I sometimes need to open an iterator for a file and I need to read n-wise lines each step, this iterator will do just that. For sake of completeness, look at how concise, powerful, and easy to use this combination is:

```# assuming that n_wise is imported into this namespace

from functools import partial

for line1, line2, line3 in triple_wise_line_reader(open("some_file.log", "r")):
# do some computation with line1, line2 and line3
do_something(line1, line2, line3)

# next step:
# line1 <- line2
# line2 <- line3
# line3 <- line4
```

Imagine doing this in C ;) It will be just ugly!

Please let me know if you find this useful or you have a better implementation to the problem!

This entry was tagged as programming python