Advent of Code 2015
2026-03-14
Puzzle: https://adventofcode.com/2015/day/8
My solution: https://codeberg.org/adfoucart/aocode15/src/branch/main/src/day8.c
Another string parsing puzzle, seemingly. We have a bunch of strings, and we need to count the difference between the number of characters “in code” and the number of characters “in memory”, with the examples:
"": 2 characters in code (the two quotes), 0 characters in memory (empty string)
"abc": 5 chars in code, 3 in memory
"aaa\"aaa": 10 in code, 7 in memory (six a, one escaped quote)
"\x27": 6 in code, 1 in memory (hexadecimal-encoded single-quote)
The value for this test input would be 23-11 = 12. Let’s
put that in a failing test to get started:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stdbool.h>
#include "futils.h"
#include "parser.h"
#include "testutils.h"
#define TEST_INPUT_FILE "./data/day8.sample.txt"
#define INPUT_FILE "./data/day8.txt"
int process_input(char* input){
return 0;
}
void run_test(){
char *input = freadall(TEST_INPUT_FILE);
assert(process_input(input)==12, "test input works");
}
int main(){
run_test();
}We’re in business, let’s get parsing.
The “number of characters in code” is just the number of
chars in the string, aka the result of strlen.
So we have:
int process_input(char* input){
str_list_t* list = parse_strings(input, '\n');
printf("%d lines in input\n", list->n);
int in_code = 0;
for (int i=0; i < list->n; i++){
in_code += strlen(list->str_list[i]);
}
printf("Total in code: %d\n", in_code);
return 0;
}Output: Total in code: 23.
The sequences to escape are:
\\ and \", which only count as one
character each “in memory” (\ and ").\x followed by two hexadecimal characters, which only
count as one character (the ASCII corresponding to the hex value).We also have to always remove the two surrounding quotes.
So I think the best strategy is simply to count the number of
occurrences of \\ and \" and remove one for
each; of \x and remove three for each, and then remove two
per line for the surrounding quotes, and we should be good. I think.
Let’s add a “count number of occurrences of a substring in a string”
function in parser:
// parser.c
int strcountseq(char* str, char* seq){
int i = 0;
char* nxt = strstr(str, seq);
while (nxt != NULL){
i++;
nxt = strstr(nxt+1, seq);
}
return i;
}A test never hurts:
// test_parser.c
bool test_count_str(){
char test[] = "how many :x: are there here? There should be two :x:s!";
if (strcountseq(test, ":x:") != 2)
return false;
char test2[] = "how many are there here? none!";
if (strcountseq(test2, ":x:") != 0)
return false;
return true;
}It passes. Now we can use it in process_input. First
attempt. The difficult part is to correctly escape the characters in my
search sequence:
int process_input(char* input){
str_list_t* list = parse_strings(input, '\n');
printf("%d lines in input\n", list->n);
int in_code = 0;
int to_remove = 0;
for (int i=0; i < list->n; i++){
int l = strlen(list->str_list[i]);
if (l == 0)
continue;
in_code += l;
to_remove += 2 + strcountseq(list->str_list[i], "\\\\")
+ strcountseq(list->str_list[i], "\\\"")
+ (strcountseq(list->str_list[i], "\\x")*3);
}
printf("Total in code: %d\n", in_code);
printf("Total to remove: %d\n", to_remove);
return in_code-to_remove;
}Gets me 12 to remove instead of 11, because
I didn’t return the right thing. The “total to remove” is what I’m
supposed to return, so I don’t even have to compute the total “in code”.
Simplify:
int process_input(char* input){
str_list_t* list = parse_strings(input, '\n');
printf("%d lines in input\n", list->n);
int to_remove = 0;
for (int i=0; i < list->n; i++){
if (strlen(list->str_list[i]) == 0)
continue;
int dbl_bck = strcountseq(list->str_list[i], "\\\\");
int bck_quo = strcountseq(list->str_list[i], "\\\"");
int hex = strcountseq(list->str_list[i], "\\x");
to_remove += 2 + dbl_bck + bck_quo + (hex*3);
}
return to_remove;
}Test input works. The real thing?
No, wrong answer. The test input didn’t have a \\
sequence, so let’s add one to check if we don’t have a problem
there.
""
"abc"
"aaa\"aaa"
"\x27"
"\\a"Now I should get 2+2+3+5+3=15. I do. Let me look at the
real input.
Ah, I see, I think. I have for instance this sequence:
"...\\\xa6..."
Where I will count two \\ instead of one, because I only
offset my strstr by one after finding an instance, which is
wrong. I should offset by strlen(seq). Fix, try again, and…
still too high.
Printing all the numbers for the different sequences found, I don’t see anything wrong. What did I misread again?
Wait, no, there is another edge case:
"...\\x\\..."
Since I’m looking at each sequence separately, both \\
will be found, and \x will be found as well. I had ignored
the “hexadecimal” part to make things easier, but I can’t.
\x\\ shouldn’t count. And in this case even if it was
hexadecimal, I shouldn’t count it:
"...\\x1e..."
The first \ escapes the second \, but the
x isn’t escaped. I need a new strategy:
\.
\ or ", add one to
“removed” count and move two characters right.x, and both next characters are
between 0-9 or a-f, then remove all four and
add three to “removed” count.This is too puzzle-specific to get in the parser.c, so
I’ll put it in day8 directly. Since I’m looking for a
single char, I should use strchr
instead of strstr this time.
The process is a tiny bit more hacky:
int process_string(char* str){
int count = 0;
char* end = str + strlen(str);
char* nxt = strchr(str, '\\');
while (nxt != NULL){
if (nxt+1 >= end)
break;
if (*(nxt+1) == '\\' || *(nxt+1) == '\"'){
count++;
nxt += 2;
}
else if (*(nxt+1) == 'x'){
if (nxt+3 >= end)
break;
if (((*(nxt+2) >= '0' && *(nxt+2) <= '9')
|| (*(nxt+2) >= 'a' && *(nxt+2) <= 'f')) &&
((*(nxt+3) >= '0' && *(nxt+3) <= '9')
|| (*(nxt+3) >= 'a' && *(nxt+3) <= 'f'))){
count += 3;
nxt += 4;
}
}
else {
nxt++;
}
nxt = strchr(nxt, '\\');
}
return count+2;
}But it works, both on the test and on the real puzzle input.
Part two is the reverse problem: find how many characters to
add in order to encode the string. So "aaa\"aaa"
should become "\"aaa\\\"aaa\"".
This is actually fairly easy, and we don’t need to compute the actual encoded string. The rules are simply:
"s
becoming "\" and \"".\\ and \" becoming
\\\\ and \\\" respectively.\x.. becoming
\\x...Since we already detect all those cases, we just have to add an element of response. Let’s create a response type:
typedef struct {
int to_remove;
int to_add;
} response_t;
response_t* response_t_create(int to_remove, int to_add){
response_t* res = malloc(sizeof(response_t));
res->to_remove = to_remove;
res->to_add = to_add;
return res;
}And modify the process:
response_t* process_string(char* str){
int to_remove = 0;
int to_add = 0;
// ...
while (nxt != NULL){
if (*(nxt+1) == '\\' || *(nxt+1) == '\"'){
to_remove++;
to_add += 2;
nxt += 2;
}
else if (*(nxt+1) == 'x'){
// ...
to_remove += 3;
to_add += 1;
nxt += 4;
// ...
}
return response_t_create(to_remove+2, to_add+4);
}And modify the rest to use the new type. Success! ⭐⭐