Python caching with mutable return

Adrien Foucart

2025-09-03

Back to all notes

Python’s functools library has some really useful methods in it. The one I use the most is certainly functools.partial, but one I probably don’t use often enough is functools.cache.

The main idea of cache is that, if you use it as a decorator for a method, it will store the arguments/return pair in a dictionary, so that if you call the same method with the same arguments, it just directly looks up what to return and doesn’t re-execute the code in the method. As I saw during the Advent of Code 2024 Day 21 puzzle, this can really speed up a program.

But here’s something that wasn’t clear to me before, and that I had to test because the documentation didn’t really help me: what happens when the cached method returns a mutable object? If we modify the object in some way after the first call to the method, and then call the method again, will the method return a new instance with the same value as the first time, or will it return the same, now modified, instance?

My instinct was that it would be the latter, as Python passes references rather than values, but I wasn’t sure, so I tested it. Let’s imagine that we have some method load_array that loads a numpy.ndarray from a file. We get the array, apply some modification to it, then later in the program we want to load the same file again:

from functools import cache
import numpy as np

@cache
def load_array(path: str) -> np.ndarray:
    print(f"loading {path}")
    return np.zeros((2, 2))

def first_method() -> None:
    a = load_array("/path/to/file.npy")
    print(a)
    a[0, 0] = 5
    return

def second_method() -> None:
    b = load_array("/path/to/file.npy")
    print(b)
    return

first_method()
second_method()

When we call first_method, we will “load” the array and get a two-by-two array filled with zeros. loading /path/to/file.npy should be written in the terminal. Then, we modify the array and set the value at position (0, 0) to 5, before returning from the method. We are not returning the array, so we no longer have access to it once we get out of the method.

When we call second_method, we reload the file from load_array. If we don’t have the @cache decorator, then we will see:

loading /path/to/file.npy
[[0. 0.]
 [0. 0.]]
loading /path/to/file.npy
[[0. 0.]
 [0. 0.]]

But when we run it with @cache, then we get:

loading /path/to/file.npy
[[0. 0.]
 [0. 0.]]
[[5. 0.]
 [0. 0.]]

So that first_method loads and modify the array, then when second_method calls load_array, it receives a reference to the array which has since then been modified, so we see no loading /path/to/file.npy and a 5 at position (0, 0).

I feel like this can be very useful, and also a source of incredibly frustrating bugs.