2024-12-12
[Go to index] [Advent of Code website] [My code on Gitlab]
Today is a puzzle about fencing. Not the swordfight kind of fencing though: the fence kind of fencing.
We have a map such as this one where each cell is marked by a letter:
RRRRIICCFF
RRRRIICCCF
VVRRRCCFFF
VVRCCCJFFF
VVVVCJJCFE
VVIVCCJJEE
VVIIICJJEE
MIIIIIJJEE
MIIISIJEEE
MMMISSJEEE
A region in that map is a set of positions which are connected (not diagonally) and have the same letter.
So for instance, the C
in the example has two distinct
regions:
......CC..
......CCC.
.....CC...
...CCC....
....C.....
....CC....
.....C....
..........
..........
..........
and
..........
..........
..........
..........
.......C..
..........
..........
..........
..........
..........
For the first part, we have to compute for each region its area (i.e. number of cells it covers) and its perimeter (i.e. number of cell borders which touch a cell from another region). To get the puzzle answer, we need to multiply for each region the area and the perimeter, then sum the results of all regions.
Labeling connected components and computing characteristics is a very
common task in image analysis, and I immediately thought about just
using scikit-image
with label
and regionprops
.
But in the spirit of the rules
I set for myself, I will keep to vanilla Python solutions.
First thing first, let’s parse the inputs to get the grid as a
dictionary mapping position to value (here: the letter). I also store
the gridsize. I keep positions as namedtuple
, as I’ve done
in previous puzzles.
= namedtuple("Position", ["row", "column"])
Position
def parse_input() -> tuple[dict[Position, str],
Position]:with open("inputs/day12", 'r') as fp:
= fp.readlines()
data for c in row.strip()] for row in data if len(row) != 0]
data [[c
= len(data)
n_rows = len(data[0])
n_cols
= {}
grid
for r, row in enumerate(data):
for c, v in enumerate(row):
= Position(r, c)
p = v
grid[p]
return grid, Position(n_rows, n_cols)
Then we have to find the connected components. This is not complicated per se, but I still ended up with some spaghetti code that I’m not particularly proud of.
The gist of it is to give a uid
to each distinct
region:
uid
.uid
.uid
from the left and we merge the two conflicting
uid
s.At the end of the process, we have a new grid with uid
s
of connected regions instead of the original letters. We also keep track
of a list of positions for each uid
, so it will be easier
to browser per region later.
To make our grid navigation a bit easier, I also added a few helper
lambda
functions.
= lambda p : Position(p.row-1, p.column)
up = lambda p : Position(p.row+1, p.column)
down = lambda p : Position(p.row, p.column-1)
left = lambda p : Position(p.row, p.column+1)
right
def connected_components(grid: dict[Position, str],
-> tuple[dict[Position, int],
gridsize: Position) dict[int, list[Position]]]:
= {}
uids = defaultdict(list)
pos_per_uids = uid_generator()
uid_gen
for r in range(gridsize.row):
for c in range(gridsize.column):
= Position(r, c)
p # v = letter of the cell
str = grid[p]
v:
# since we go top->down and left->right, we only worry about what's left and up
= uid_left = None
uid_up if (n := grid.get(up(p))) is not None and n == v:
= uids[up(p)]
uid_up if (n := grid.get(left(p))) is not None and n == v:
= uids[left(p)]
uid_left
# nothing up or left
if (uid_up is None and uid_left is None):
= next(uid_gen)
uids[p]
pos_per_uids[uids[p]].append(p)else:
# up or left are both with the same letter as our cell...
if uid_up is not None and uid_left is not None:
# ...and they have the same uid: great!
if uid_up == uid_left:
= uid_up
uids[p]
pos_per_uids[uid_up].append(p)else:
# ...they were given different uid, so we MERGE them.
= uid_up
uids[p]
pos_per_uids[uid_up].append(p)while len(pos_per_uids[uid_left]) > 0:
= pos_per_uids[uid_left].pop()
n = uid_up
uids[n]
pos_per_uids[uid_up].append(n)del pos_per_uids[uid_left]
# only up is set
elif uid_up is not None:
= uid_up
uids[p]
pos_per_uids[uid_up].append(p)# only left is set
else:
= uid_left
uids[p]
pos_per_uids[uid_left].append(p)
return uids, pos_per_uids
Phew, that’s a big function. Too big for sure, but I don’t have the time or energy to refactor: it works fine.
With our list of positions per uid
, computing areas and
perimeters is easy enough: the area is the length of the list, and the
perimeter is the sum, for all positions in the list, of the number of
neighbours that are not in the list.
= lambda p : [up(p), right(p), down(p), left(p)]
neighbours
def get_areas_and_perimeters(pos_per_uids: dict[int, list[Position]]) -> list[tuple[int, int]]:
= []
areas_and_perimeters for _, positions in pos_per_uids.items():
= len(positions)
area = 0
perimeter for p in positions:
+= sum(n not in positions for n in neighbours(p))
perimeter
areas_and_perimeters.append((area, perimeter))return areas_and_perimeters
Putting all together, we get our first answer:
= parse_input()
grid, gridsize = connected_components(grid, gridsize)
uids, pos_per_uids = get_areas_and_perimeters(pos_per_uids)
areas_and_perimeters print(f"Part 1 answer: {sum(a*p for a, p in areas_and_perimeters)}")
The twist is a small but important change in the definition of a border: we no longer consider the borders per-cell, but the sides per-region. So if we have an object like this:
....
.OO..
..O..
..O..
.....
In part one we would have counted 10 borders, but now we need to count 6 sides.
So… what makes a side a side?
I considered a few options, such as walking the perimeter and counting changes of direction, or defining a set of corner patterns and counting corners… but I ultimately decided on a method that I found easier to implement.
A “left side” is an uninterrupted vertical section where the left neighbour is not in the object. So, for a given column, we can extract all the “cells on a left side” from the list of positions in a region with:
= [p for p in positions if p.column == col and left(p) not in positions] left_sides
And then, assuming the positions are sorted top-down and left-right (this is foreshadowing), we can count the number of uninterrupted sections:
def count_sides_with_gaps(items: list[Position]) -> int:
if len(items) == 0:
return 0
if len(items) == 1:
return 1
= 1
sides for p in items[:-1]:
if down(p) not in items: # there is a gap -> there is a side
+= 1
sides return sides
Now we have to do that for every column of the object for the left and right sides, and for every row for the tops and bottoms:
def count_sides_with_gaps(items: list[Position], direction: Callable) -> int:
if len(items) == 0:
return 0
if len(items) == 1:
return 1
= 1
sides for p in items[:-1]:
if direction(p) not in items: # there is a gap -> there is a side
+= 1
sides return sides
def get_areas_and_sides(pos_per_uids: dict[int, list[Position]]) -> list[tuple[int, int]]:
= []
areas_and_sides for _, positions in pos_per_uids.items():
= len(positions)
area = sorted(positions) # make sure we go top->down left->right
positions # find minrow, mincol, maxrow, maxcol of region
= mincol = 10000
minrow = maxcol = 0
maxrow for p in positions:
= min(minrow, p.row)
minrow = max(maxrow, p.row)
maxrow = min(mincol, p.column)
mincol = max(maxcol, p.column)
maxcol = 0
sides # count vertical sides
for col in range(mincol, maxcol+1):
= [p for p in positions if p.column == col and left(p) not in positions]
left_sides += count_sides_with_gaps(left_sides, down)
sides = [p for p in positions if p.column == col and right(p) not in positions]
right_sides += count_sides_with_gaps(right_sides, down)
sides # count horizontal sides
for row in range(minrow, maxrow+1):
= [p for p in positions if p.row == row and up(p) not in positions]
top_sides += count_sides_with_gaps(top_sides, right)
sides = [p for p in positions if p.row == row and down(p) not in positions]
bottom_sides += count_sides_with_gaps(bottom_sides, right)
sides
areas_and_sides.append((area, sides))
return areas_and_sides
I initially didn’t put the positions = sorted(positions)
in there, because I had computed the list of positions in the right
order in connected_components
– or so I thought. I didn’t
immediately realize that it wasn’t necessarily the case when I had
merged two connected regions. I had to fire up the debugger to get to
the bottom of that one. Fortunately it was an easy fix, as tuples are
sorted by their first value, then second, etc. by default, which is
exactly what we need here.
With all of that, we can finally get our part 2 answer:
= get_areas_and_sides(pos_per_uids)
areas_and_sides print(f"Part 2 answer: {sum(a*s for a, s in areas_and_sides)}")
Done!
The puzzles are clearly becoming more complex. Which means they can no longer be solved in a short half-hour morning or evening session as I tended to do in the beginning. So I’m not sure that I’ll keep doing all of them going forward, as I occasionally like to sleep (when the 7 month-olds allows it). We’ll see!
scikit-image
is a great library for image processing. I
knew I could do the “connected components” part of the puzzle very
easily if I used it. How easily? Here’s the code to parse the input data
and return a connected components array of uids:
from skimage.measure import label
with open("inputs/day12", 'r') as fp:
= fp.readlines()
data = np.array([[c for c in row.strip()] for row in data if len(row) != 0])
data = np.unique(data)
keys = np.zeros(data.shape, dtype=int)
grid for key, v in zip(keys, range(1, len(keys)+1)):
==key] = v
grid[data= label(grid) labels
The part that does everything in my connected_components
method is the last line. It is a little bit shorter, I’ll admit.