socialroulette R package is a lightweight package for handling the recurrent problem of assigning individuals into groups of a fixed size. This happens, e.g., as part of mystery lunch scheduling or when assigning students into breakout rooms in video lectures. The work on the package was inspired by the blog post Long time, no see: Virtual Lunch Roulette,
The aim is to partition \(n\) participants into groups of size at least \(m\). We shall denote the resulting groupings a partition of the set of participants. If \(m\) is not a divisor of \(n\) then some of the groups have to contain more than \(m\) participants. As an example consider the scenario that 5 individuals have to be divided into groups of size at least 2. We shall adopt the convention, that group size shall be as close to \(m\) as possible and the group sizes should be as equal as possible. In the specific toy example this means that we will need 2 groups, one with 3 participants and one with 2 participants.
Remark: Partitions with \(m=2\) are also known as a matchings and can be handled particularly efficient using special-purpose graph algorithms. However, we use the \(m=2\) example in order to illustrate the package functionality with a simple to follow example. Later sections of the vignette contain some more realistic examples with \(m>2\).
We continue with a slightly more advanced example, where we have a class with 100 students, which for a weekly virtual lab exercise class need to be divided into groups of at least 4. Since for various reasons not all students show up to each class, the sampling frame of individuals to be partitioned each week changes accordingly. Still, we would like to make the partitioning of the current week s.t. students get as many new acquaintances as possible.
Create a history of previous partitions as well as the current frame from a population of 100 individuals participating in the social roulette.
# Class of 100 students with 4 previous lectures students <- tibble::tibble(id=sprintf("email@example.com", 1:100)) partition_dates <- seq(as.Date("2021-03-31"), length.out=4, by="1 week") # Simulate changing participation each week for the last 4 weeks (70% attendance) frames <- map_df( partition_dates, ~ students %>% slice_sample(n = rbinom(1,nrow(.), prob=0.7)) %>% mutate(date=.x)) # Generate some past partitions using simple random sampling past_partitions <- frames %>% group_split(date) %>% map(~rsocialroulette(current_frame=.x, m=4, algorithm="srs")) %>% setNames(partition_dates) #> Partitioning 61 individuals into groups of at least 4 (no past partitions). #> Created 15 groups of sizes 5 4 4 4 4 4 4 4 4 4 4 4 4 4 4. #> Partitioning 72 individuals into groups of at least 4 (no past partitions). #> Created 18 groups of sizes 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4. #> Partitioning 70 individuals into groups of at least 4 (no past partitions). #> Created 17 groups of sizes 5 5 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4. #> Partitioning 76 individuals into groups of at least 4 (no past partitions). #> Created 19 groups of sizes 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4.
For a more realist scenario, we pretend that each of the above previous partitions has been saved as a .csv file. For example like:
# Simulate the storage of each partition as a .csv file to disk # with 3 columns: date, id1 and id2, i.e. all pairs temp_dir <- tempdir() #adjust path to your setting if needed socialroulette::partitions_to_pairs( past_partitions ) %>% group_split(date) %>% walk(~ write_csv(x=.x, file=file.path(temp_dir, stringr::str_c("socialroulette-", .$date, ".csv"))))
We thus read the partitions from disk and convert from the stored pair-format (i.e. a
data.frame listing each pair being in the same group as id1, id2 together with the corresponding date of the partition) back to the list-format (i.e. a list of character vectors, where each vector denotes a group and the vector contains the ids of all members of that group). This can be done as follows:
# Read again from file pairs <- map_df(list.files(path=temp_dir, pattern="socialroulette.*", full.names=TRUE), ~read_csv(file=.x))
A small sanity check to ensure that the write and read did not really change any information:
# Convert into a pairs data.frame past_partitions2 <- pairs %>% socialroulette::pairs_to_partitions() # Sanity check all.equal(past_partitions, past_partitions2) #>  TRUE
As a next step we sample the students who are in the next class.
Our goal is now to partition the students 69 in
current_frame. For each of the 2346 possbile pairs of students in that class, we determine how long ago it has been, since they were in the same group the last time. This can be done using the internal package function
dist <- socialroulette::partitions_to_distance(current_frame, past_partitions) dist %>% head() #> # A tibble: 6 x 4 #> id1 id2 date dist #> <chr> <chr> <date> <dbl> #> 1 firstname.lastname@example.org email@example.com 2021-04-28 35 #> 2 firstname.lastname@example.org email@example.com 2021-04-28 35 #> 3 firstname.lastname@example.org email@example.com 2021-04-28 35 #> 4 firstname.lastname@example.org email@example.com 2021-04-28 35 #> 5 firstname.lastname@example.org email@example.com 2021-04-28 28 #> 6 firstname.lastname@example.org email@example.com 2021-04-28 21
Since the past partitions contain the last 4 weeks, students who have not been in a group so far are assigned a value of one time unit more than the last possible meeting opportunity. In the specific example this corresponds to \(7\cdot (4 + 1) = 35\) days. It thus seems natural to choose the groups, s.t. they do not contain pairs, which have already met in the past. However, for given size of the population, group sizes and meeting histories, this might not be possible to achieve altogether, so a more flexible criterion is to maximize the sum of distance since the last meet over all pairs in the same group of the partition. This problem is also called the maximally diverse grouping problem (MDGP) in operations research.
# Make the partition using a mdgp solver partition <- rsocialroulette(current_frame, past_partitions, m=4, algorithm="mdgp")
The partition can be visualized using the
Of course the partition can also be stored to file as before, in order to include it in the set of past partitions when doing the partitioning next week:
or can be stored in a Zoom compatible breakout room specification format:
For fully-remote companies, virtual coffee rounds can be a good way to bring together employees across different divisions in an informal way. One can use the proposed
rsocialroulette sampling to generate a weekly partition and then use any of the mail sending packages in R to subsequently mail the participants of each group that they have been assigned to a common coffee group and should schedule their own virtual meeting during the week.