Day 12 - JSON Parsing

Advent of Code 2015

Adrien Foucart

2026-04-20

[Back to index]

Puzzle: https://adventofcode.com/2015/day/12

My solution: https://codeberg.org/adfoucart/aocode15/src/branch/main/src/day12.c

Today’s puzzle (well, today’s puzzle from eleven years ago) promises us a lot of parsing, with the input consisting of a JSON-formatted document. The first task is to sum all numbers found in the JSON string. This doesn’t sound too hard. As we go through the string character by character, we should just be able to find those with an ASCII value between that of 0 and that of 9, convert them to the corresponding int, sum, and voilà.

There is one small difficulty on the way: we have to take into account negative numbers as well, so we’ll also have to keep track of - signs and flip the following number accordingly.

Oh, and also we can have number with multiple digits, of course. There are none in the examples given in the puzzle, but there are in the real input. So let’s make a test input with all these cases just to make sure we get things right:

{"a":2,"b":4,"c":-1,"d":123}

This should give us 2+4-1+123, or 128, if everything goes right. Which it will, obviously.

Let’s think about it for one second. We want to:

Translated into C, this gives us:

int sum_numbers(char* content){
    int current_number = 0;
    int sign = 1;
    int current_sum = 0;
    for (char* c = content; c <= content+strlen(content); c++){
        char _c = *c;
        if (_c == '-') sign = -1;
        else if (_c >= '0' && _c <= '9'){
            current_number *= 10;
            current_number += (int) (_c - '0');
        }
        else {
            if (current_number > 0){
                // reset
                current_sum += sign*current_number;
                current_number = 0;
                sign = 1;
            }
        }
    }

    return current_sum+current_number;
}

I am heavily using here the fact that we can iterate directly on pointers in the for, and that chars are just ints under the hood so we can do arithmetic on the ASCII values directly, so _c - '0' will be equal to the numerical value of the digit.

It works on my sample (once I fixed the sign*current_number, as I had initially forgotten to multiply by the sign). But does it work on the real thing? It gives me a number… and it’s the right one. Amazing.

Part 2, now we’re thinking

I suspected the parsing part would become a tiny bit more difficult in part 2. Now we’re really going to need a little bit of thinking time. The new rule is:

Ignore any object (and all of its children) which has any property with the value "red". Do this only for objects ({...}), not arrays ([...]).

So for instance, if I change my test input to be:

{"a":2,"b":4,"c":-1,"d":123,"e":{"v":"red","w":2}}

It shouldn’t change my sum, because the new number is within an object which contains an attribute with the value red.

Let’s think about it a little bit. If "red" is a key in an object, we will find the string :"red". Any other occurrence of "red" doesn’t count, as it’s going to be either a key or a value in an array. If we find :"red", we then need to determine which object in the arborescence of objects in the JSON file needs to be ignored. This includes any sub-object. I think this may be done somehow through a stack structure. Looking at opening / closing { }, I should be able to track when to stop or restart counting. Or maybe just track the running count per element in the stack. I’m imagining something like this for the example input:

We’ll have to deal with the sub-objects also: if counting is off, it should remain off until we pop our element from the stack. So maybe the “counting” flag should be a pointer to the { of the current :"red". That we we can recognize when we pop the right red? This should work I think… if we code it right.

Let’s start by doing the stack things without the red things, just to check that we can get the same result as before.

A stack can be easily implemented with linked nodes. A node here will have a pointer to where the { was in the string, the sum of all numbers in that object, and a pointer to the next node in the stack. Because of the recursive definition, we cannot do the typedef and the struct at the same time:

typedef struct ObjectStackNode ObjectStackNode;

struct ObjectStackNode {
    char* start;
    int sum;
    ObjectStackNode* next;
};

Then we define a stack as something with a top and a size. I’m not sure we’ll need the size, but it feels right to have it.

typedef struct {
    ObjectStackNode* top;
    int size;
} ObjectStack;

We can have a node create method, taking a pointer to a char as an argument:

struct ObjectStackNode* node_create(char* start){
    ObjectStackNode* node = malloc(sizeof(ObjectStackNode));
    node->start = start;
    node->sum = 0;
    node->next = 0;

    return node;
}

And a stack create method, which also takes a pointer to a char and creates the first node from it.

ObjectStack* stack_create(char* start){
    ObjectStack* stack = malloc(sizeof(ObjectStack));
    stack->size = 1;
    stack->top = node_create(start);

    return stack;
}

A stack needs a push and a pop to add/remove elements from the top:

void push(ObjectStack* stack, char* start){
    ObjectStackNode* new = node_create(start);
    new->next = stack->top;
    stack->top = new;
    stack->size++;
}

ObjectStackNode* pop(ObjectStack* stack){
    ObjectStackNode* node = stack->top;
    stack->top = node->next;
    stack->size--;
    
    return node;
}

Ideally if we were focusing on the data structure I’d like not to return a ObjectStackNode, as that feels like an “internal” structure from the stack, but here it contains its own running total of numbers, so I need it. Otherwise I’d have to complexify things considerably, which I don’t want to do right now.

Now to compute the total sum, I’ll copy the method from part 1, but add the stack elements: initialize a stack, push a new node when we find a {, pop the node when we find a } and add the running sum from the node to the total.

int sum_without_red(char* content){
    int total = 0;
    // check that the first element is a { and that the size is > 0
    if (strlen(content) == 0 || content[0] != '{') return 0;

    ObjectStack *stack = stack_create(0);
    char *ptr = content+1;
    int current_number = 0;
    int sign = 1;
    while (ptr < content+strlen(content)){
        char _c = *ptr;
        if (_c == '-') sign = -1;
        else if (_c >= '0' && _c <= '9'){
            current_number *= 10;
            current_number += (int) (_c - '0');
        }
        else {
            if (current_number > 0){
                // reset number
                stack->top->sum += sign*current_number;
                current_number = 0;
                sign = 1;
            }
            if (_c == '{'){
                push(stack, ptr);
            }
            else if (_c == '}'){
                ObjectStackNode* node = pop(stack);
                total += node->sum;
                free(node);
            }
        }
        ptr++;
    }
    printf("Stack size: %d\n", stack->size);
    free(stack);

    return total;
}

Running this on the original test and real data, I get the same results as before, so I haven’t broken anything. So far so good. I realize, however, that I should make a small change: if I want to be able to not count “red” objects and their descendants, it would probably be useful to have every node hold the total of itself and its “children” nodes. Reasonably small change in the code:

int sum_without_red(char* content){
    // ...
    while (ptr < content+strlen(content)){
        // ...
            else if (_c == '}'){
                if (stack->size > 1){
                    ObjectStackNode* node = pop(stack);
                    stack->top->sum += node->sum;
                    free(node);
                }
            }
        // ...
    }
    printf("Stack size: %d\n", stack->size);
    free(stack);

    return stack->top->sum;
}

When we pop a node, we add its sum to the new top. When we get to the last node, we don’t pop it: it should now contain the total, so we can return it at the end.

Once that’s done, we can add a boolean is_red property to the nodes, setting it to FALSE by default.

When we pop the node from the stack, we can check if is_red is true, and if so set the sum to 0. We just need to determine when to set is_red to TRUE. We’ll do that by checking, whenever we find a :, if the following five characters are "red".

// ...
else if (_c == ':'){
    if (*(ptr+1)=='"' &&
        *(ptr+2)=='r' &&
        *(ptr+3)=='e' &&
        *(ptr+4)=='d' &&
        *(ptr+5)=='"')
        stack->top->is_red = TRUE;
}
// ...

Normally, it wouldn’t be safe to look ahead so far here, as we could go further than the end of the input string. But since we know the input is valid JSON, we are sure that if we find a " after a : there will be a string behind. Either it’s red" and we’re good, or it’s another string and we’ll fail one of the test before we get out of the string. Is it completely safe? No. Is it good enough for an Advent of Code puzzle ? Yes.

Also, it works, and I get the next star.