Advent of Code

This is one of those puzzles where you know exactly what the twist will be. That doesn’t make it easier to solve, though. It’s time for combinatorial explosion again!

We start with some patterns of colors – represented by letters – and some sequences, presented like in this example:

This input would mean that we have at our disposal an infinite amount of r, wr, b, g, bwu, rb, gb and br patterns, which we need to use to somehow form the brwrr, etc. sequences. For instance, brwrr can be formed with b, r, wr, r.

Part 1 - the bare minimum

For part 1, we just need to determine how many of the sequences are possible. In my puzzle input, I had 447 patterns ranging between 1 and 8 characters, and 400 sequences ranging between 40 and 60 characters. First question, as usual, how do we structure this data in our code?

We don’t really need anything fancy to begin with: I’ll put the sequences and patterns in lists of strings. The only thing that I’m going to do is that I want to sort the patterns by length. The reasoning being: if I want to quickly check if a sequence is possible, it’s better to start with the longest possible pattern, so that we can get to the end of the sequence quicker.

We have a double line break to separate the patterns from the sequences, then commas to separate the patterns and line breaks to separate the sequences: we can do all of that with split.

def parse_input(sample: bool) -> tuple[list[str], list[str]]:
    with open('day19', 'r') as fp:
        data = fp.read().strip()

    patterns, sequences = data.split('\n\n')
    patterns = sorted(patterns.split(', '), key=lambda x: len(x), reverse=True)
    sequences = sequences.split('\n')
    return patterns, sequences

To sort the patterns, we use the key parameter of the sorted built-in function. With the key parameter, we can specify for each element a value that will be used to determine the order. It needs to be a function that takes the element as the input, and return that value as the output. As it’s a very simple function, we use a lambda: key = lambda x: len(x). We want it in descending order, so we specify reverse=True.

To check if a sequence is possible with the available patterns, I used some recursion: going through the patterns (starting from the longest), I check if the sequence starts with that pattern. If it does, I take whatever is left of the sequence after that pattern, and call the function again with that. The recursion ends either when we run out of patterns (in which case the sequence is not possible), or when we find a sequence that is an exact match for one of the patterns.

def is_possible(sequence: str, patterns: list[str]) -> bool:
    if sequence in patterns:
        return True
    for pattern in patterns:
        if sequence.startswith(pattern):
            if is_possible(sequence[len(pattern):], patterns):
                return True
    return False

And that’s it for Part 1, we just need to put it all together. I could just count the number of times is_possible returns True, but I keep all the possible sequences because that’s going to be useful for part 2…

patterns, sequences = parse_input(False)
possible_sequences = [sequence for sequence in sequences if is_possible(sequence, patterns)]
print(f"Part 1 answer: {len(possible_sequences)}")

Part 2 - of course it’s not that easy

Part 2 brings in the very obvious twist that we now need to count every possible way that we can form each sequence with the available patterns. Just for fun I checked what my recursive approach would do (not stopping when we find a correct sequence, but accumulating the number of possible ways as we go)… and I couldn’t get it to finish in a reasonable amount of time on even a single sequence, let alone the 400.

I didn’t have much time yesterday to work on the problem, so I had to stop after just a few attempts, and go back at it this morning. Once I had time to really think it through, I got a reasonably nice solution, I think.

Here’s the idea. Let’s take the example again, and look specifically at the gbbr sequence. Our available patterns are r, wr, b, g, bwu, rb, gb, br, so we can either start with g or with gb. So we start with two possible ways forward. ["g", "gb"]. Moving on to the next character: we can start the sequence bbr with b only, so now our two ways are ["gb", "gb"]. Since they are the same subsequence, we merge them and just keep track of the fact that gb is now two ways. Moving formard: br can be started with b or with br, so we have two new ways starting from two old ways, forming a total of four ways: ["gbb" (x2), "gbbr" (x2)]. Finally we get to the last character, r, which can only be started with r, so we can finish the gbb sequence and get to ["gbbr" (x2), "gbbr" (x2)], which we can again merge into ["gbbr" (x4)].

How to code that? First, I decided to make one pass on the sequence, noting at each character which pattern can start there:

def get_pattern_starts(sequence: str, patterns: list[str]) -> dict[int, list[str]]:
    pattern_start = defaultdict(list)
    for pattern in patterns:
        i = -1
        while (i:= sequence.find(pattern, i+1)) >= 0:
            pattern_start[i].append(pattern)
    return pattern_start

Then we can use that to compute the number of ways. We keep track in a dictionary of which subsequences are currently being explored, and how many ways they represent. This is initialized with all of the patterns that start at character 0, each with one way. Then we move forward in the sequence. If, for character i, the subsequence sequence[:i] (i.e. everything before that character) is not in that dictionary, that means we have no current ways that ends here and that we can keep building on, so we keep going. Otherwise, we take all the patterns that start at character i. If sequence[:i] + pattern is already in the dictionary, that means we are merging two different paths and adding the number of ways from each. Otherwise, we just add it, copying the number of ways from sequence[:i]. Once we have exhausted all the patterns starting at i, we remove sequence[:i] from the dictionary.

def compute_ways(sequence: str, pattern_start: dict[int, list[str]]):
    current_ways = {pattern: 1 for pattern in pattern_start[0]}
    for i in range(1, len(sequence)):
        if (old_seq := sequence[:i]) not in current_ways:
            # there is no node that ends here, so we can't start a new one
            continue
        for pattern in pattern_start[i]:
            if (seq := old_seq+pattern) in current_ways:
                # we reach a sequence that already exist: merging two paths
                current_ways[seq] += current_ways[old_seq]
                continue
            current_ways[seq] = current_ways[old_seq]
        del current_ways[old_seq]
    return current_ways

We can now use that on all the possible sequences, and accumulate the total number of ways, giving us a fast computation for the part 2 answer:

total_ways = 0
for sequence in possible_sequences:
    pattern_start = get_pattern_starts(sequence, patterns)
    ways = compute_ways(sequence, pattern_start)
    total_ways += sum(w for w in ways.values())
print(f"Part 2 answer: {total_ways}")