You've got a couple of groups and you want to get every possible combination of them. This is called the Cartesian Product of the groups. There are standard ways of doing this in R and Python.
Python: List Comprehensions
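Concretely we've got (in Python notation) the vectors x = [1, 2, 3] and y = [4, 5], and we want to get all possible pairs: [(1, 4), (2, 4), (3, 4), (1, 5), (2, 5), (3, 5)].
The "pythonic" way to do this is with a list comprehension:
[(x_, y_) for x_ in x for y_ in y]
This yields the same six pairs, but with y varying fastest: (1, 4), (1, 5), (2, 4), and so on. Swap the two for clauses to get the order listed above.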
Another possibility is to use itertools.product, which is especially useful for a large number of lists.
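As a minimal sketch of the itertools.product route, using the same x and y as above (the third list and the name groups are only there for illustration):

```python
from itertools import product

x = [1, 2, 3]
y = [4, 5]

# product varies its rightmost argument fastest, like nested for loops
pairs = list(product(x, y))
print(pairs)  # [(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5)]

# Identical to the list comprehension
assert pairs == [(x_, y_) for x_ in x for y_ in y]

# With many lists, keep them in one container and unpack it
groups = [x, y, [6, 7]]  # illustrative third list
triples = list(product(*groups))
print(len(triples))  # 12
```

The unpacking form is where product pays off: the number of lists does not have to be known when the code is written.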
In R we can use expand.grid to get a data.frame of all pairs:
expand.grid(x=x, y=y)
In this expression the x and y to the left of the = sign are the names of the columns in the data frame.
I find this really useful when creating plots of functions with ggplot2 to try every possible combination of parameters.
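For comparison, a rough Python analogue of a named parameter grid can be built from a dict and itertools.product. This is only a sketch: the name params and the choice of a list of dicts as the row format are assumptions for illustration, not an equivalent of a data.frame.

```python
from itertools import product

# Hypothetical parameter grid: keys play the role of column names
params = {"x": [1, 2, 3], "y": [4, 5]}

# One dict per combination; each dict is one "row" of the grid
rows = [dict(zip(params, combo)) for combo in product(*params.values())]

print(len(rows))  # 6
print(rows[0])    # {'x': 1, 'y': 4}
print(rows[1])    # {'x': 1, 'y': 5}
```

Here the last key varies fastest, whereas expand.grid varies its first argument fastest, so the rows come out in a different order.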
You can also do this manually using rep; for example:
data.frame(x=rep(x, length(y)), y=rep(y, each=length(x)))
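To make the two kinds of repetition concrete, here is a small Python sketch of the same construction. The names x_col and y_col are illustrative; list multiplication stands in for rep(x, times) and a nested comprehension for rep(y, each=...).

```python
x = [1, 2, 3]
y = [4, 5]

# Like rep(x, length(y)): repeat the whole list len(y) times
x_col = x * len(y)
print(x_col)  # [1, 2, 3, 1, 2, 3]

# Like rep(y, each=length(x)): repeat each element len(x) times
y_col = [y_ for y_ in y for _ in range(len(x))]
print(y_col)  # [4, 4, 4, 5, 5, 5]

# Pairing the columns row by row gives every combination
pairs = list(zip(x_col, y_col))
print(pairs)  # [(1, 4), (2, 4), (3, 4), (1, 5), (2, 5), (3, 5)]
```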
Python: More Complex List Comprehensions
What if we have a slightly harder problem: there's another vector z = [6, 7] and we want to take every aligned pair from y and z and combine it with every possible x.
So the output should be
[(1, 4, 6), (2, 4, 6), (3, 4, 6), (1, 5, 7), (2, 5, 7), (3, 5, 7)].
This is straightforward with list comprehensions by combining y and z with zip:
[(x_, y_, z_) for x_ in x for y_, z_ in zip(y, z)]
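Note that this produces the triples with the (y, z) pairs varying fastest, so the order differs from the listing above, though the set of triples is the same.
This is one of the strengths of Python list comprehensions: they are easy to extend with different variables and with functions acting on those variables.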
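A short sketch of how the clauses can be rearranged, using the same x, y and z. The names triples, via_product and x_fastest are illustrative only.

```python
from itertools import product

x = [1, 2, 3]
y = [4, 5]
z = [6, 7]

# x is the outer loop, so the (y, z) pairs vary fastest
triples = [(x_, y_, z_) for x_ in x for y_, z_ in zip(y, z)]
print(triples)
# [(1, 4, 6), (1, 5, 7), (2, 4, 6), (2, 5, 7), (3, 4, 6), (3, 5, 7)]

# The same thing with itertools.product over x and the zipped pairs
via_product = [(x_, y_, z_) for x_, (y_, z_) in product(x, zip(y, z))]
assert via_product == triples

# Putting the zip clause first makes x vary fastest instead
x_fastest = [(x_, y_, z_) for y_, z_ in zip(y, z) for x_ in x]
print(x_fastest)
# [(1, 4, 6), (2, 4, 6), (3, 4, 6), (1, 5, 7), (2, 5, 7), (3, 5, 7)]
```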
R: tidyr expand
I don't know how to do this harder task in R with expand.grid, and so I would have to fall back to the long way with rep.
This would be
data.frame(x=rep(x, length(y)), y=rep(y, each=length(x)), z=rep(z, each=length(x)))
This gets quite tedious to write!
With expand and nesting from tidyr it becomes:
expand(data.frame(y=y, z=z), x, nesting(y, z))
This gets all combinations of x, y and z, provided that the pairs of y and z are in the data.frame from the first argument.
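The idea of only keeping pairs that actually occur can be sketched in Python as follows. This is a loose analogue under stated assumptions, not how tidyr implements it: the observed rows (including the deliberate duplicate) are made up for illustration, and the row order tidyr returns may differ.

```python
x = [1, 2, 3]

# Hypothetical observed (y, z) rows; the duplicate is deliberate
observed = [(4, 6), (5, 7), (4, 6)]

# Keep each observed pair once, preserving first-seen order
pairs = list(dict.fromkeys(observed))
print(pairs)  # [(4, 6), (5, 7)]

# Cross every x with only the pairs that were observed
expanded = [(x_, y_, z_) for x_ in x for (y_, z_) in pairs]
print(len(expanded))  # 6

# A pair that never occurred together is not generated
assert (1, 4, 7) not in expanded
```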
Note that expand is not referentially transparent: the variables rely on their names in the data frame (as is typical of tidyverse functions).
For example, expand(data.frame(y=z, z=y), x, nesting(y, z)) will reverse the order of the last two columns.