What is the difference between Caching and Memoization?

caching terminology memoization

19,724

Solution 1

Memoization is a specific form of caching that involves caching the return value of a function based on its parameters.

Caching is a more general term; for example, HTTP caching is caching but not memoization.

Wikipedia says:

Although related to caching, memoization refers to a specific case of this optimization, distinguishing it from forms of caching such as buffering or page replacement.
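To make "caching the return value of a function based on its parameters" concrete, here is a minimal Python sketch. The `memoize` decorator, the `square` function, and the `calls` list are hypothetical names used only for illustration, and the sketch assumes the wrapped function is pure and takes hashable positional arguments.

```python
import functools


def memoize(func):
    """Cache return values of `func`, keyed by its positional arguments."""
    results = {}  # maps argument tuples to previously computed return values

    @functools.wraps(func)
    def wrapper(*args):
        if args not in results:
            results[args] = func(*args)  # compute once per distinct argument tuple
        return results[args]

    return wrapper


calls = []  # records every real invocation, so reuse is observable


@memoize
def square(n):
    calls.append(n)
    return n * n


square(4)
square(4)  # served from `results`; `square`'s body does not run again
```

For everyday use, the standard library's `functools.lru_cache` provides the same idea with an optional size bound.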

Solution 2

As I have seen them used, "memoization" is "caching the result of a deterministic function" that can be reproduced at any time given the same function and inputs.

"Caching" includes basically any output-buffering strategy, whether or not the source value is reproducible at a given time. In fact, caching is also used to refer to input buffering strategies, such as the write-cache on a disk or memory. So it is a much more general term.
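As a contrast with memoization, the following Python sketch shows a cache for values that may not be reproducible, where a stale entry is tolerated until it expires. `TTLCache`, its `get` method, and the `load` callback are hypothetical names, not part of any library; the clock is injectable only so the behavior is easy to test.

```python
import time


class TTLCache:
    """Hypothetical cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get(self, key, load):
        """Return the cached value for `key`, calling `load()` if it is missing or expired."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # may differ from what `load()` would return right now
        value = load()
        self._entries[key] = (now, value)
        return value
```

Unlike a memoized pure function, the value returned here depends on when it was loaded, which is why an expiry policy is part of the design.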

Solution 3

I think the term caching is usually used when you store the results of IO operations, or basically any data that comes to you from the outside (files, network, db queries). The term memoization usually applies to storing the results of your own computations, for example in the context of dynamic programming.

Solution 4

Memoization is a special form of caching the result of a deterministic function. This means that caching the result outside the function is not memoization, because the function would have to mutate the cache when computing a new result (one not already in the cache), so it would no longer be a (pure) function. Memoization generally implies passing the cache as an additional argument (in a helper function). Memoization will optimize functions that need to compute the same values several times for a single access. Caching will optimize functions that are called several times with the same parameters. In other words, memoization will optimize the first access, whereas caching will only optimize recurrent accesses.
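One way to read "passing the cache as an additional argument" is sketched below in Python. The names `fib`, `helper`, and `memo` are illustrative assumptions. The table is created fresh on every top-level call, so nothing persists between calls, yet even the first call avoids recomputing overlapping subproblems.

```python
def fib(n):
    """Return the n-th Fibonacci number (fib(0) == 0, fib(1) == 1)."""

    def helper(k, memo):
        # `memo` is threaded through the recursion as an explicit argument.
        if k < 2:
            return k
        if k not in memo:
            memo[k] = helper(k - 1, memo) + helper(k - 2, memo)
        return memo[k]

    # A new table per call: no state is shared between separate calls to fib.
    return helper(n, {})
```

The dictionary is still mutated internally, so this is pure only as seen from the caller of `fib`; a strictly functional version would return the updated table alongside the result.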

Solution 5

I would like to add to the other great answers that memoization is also known as tabling. I think it is also important to know that term for those who learn what memoization and caching are.


Author by

John

Updated on June 23, 2022

Comments

John about 2 years

I would like to know what the actual difference between caching and memoization is.
As I see it, both involve avoiding repeated function calls to get data by storing it.

What's the core difference between the two?
- Sridhar Sarnobat almost 7 years
  
  I wonder if you could say "memoization is to caching" as "array is to sparse array". In other words, you only store things "on demand" rather than enumerating every possible input combination.
nicolas about 10 years

But you can always surround the part where the cache is used with a function and christen it 'memoization'. The difference, though, is that you are in control of the caching policy in your function, whereas memoization is higher-order and happens outside the function, I guess.
Gherman over 7 years

Are you sure that the function has to be deterministic?
harpo over 7 years

@Gherman, yes, memoization depends on determinism. The classic example is a recursive algorithm, such as the Fibonacci sequence or factorial. Instead of re-computing all the way down to the base case, a memoized function would short-circuit by reusing earlier results for values it's already computed. This obviously depends on the same input always yielding the same output, which is the definition of determinism. Caching, on the other hand, is frequently used for non-deterministic (e.g. random or timestamped) processes, with the understanding that results may not match a "refreshed" value.
Нет войне over 6 years

Why is HTTP caching not memoization? It is also based on a parameter (the URL of the requested resource).
SLaks over 6 years

@topomorto: Because of features like If-Match and expirations. Memoization only makes sense for pure functions, which an HTTP request rarely is.
Alexey almost 6 years

@nicolas, not quite, I think. I think that in memoization the term "function" is used in the pure/mathematical sense. Downloading a web page from a given address cannot be considered a function, because the page may change.
nicolas almost 6 years

@Alexey doesn't the same remark apply to caching? All those strategies rely on the same function call giving the same result, aka no upstream side effect.
Alexey almost 6 years

@nicolas, it is quite common to cache changing data AFAIK; see the Cache invalidation Wikipedia article, for example. I do not understand what you mean by an "upstream side effect".
nicolas almost 6 years

@Alexey indeed, that's my point. By "upstream side effect" I mean a side effect which affects your own inputs, and thus might change your result, versus, say, printing to a debug console: impure, yet it won't change your result.
Alexey almost 6 years

@nicolas, sorry, I do not understand what you are talking about (side effects affect inputs?). What "function" are you talking about?
nicolas almost 6 years

@Alexey That's my point: the remark is parametric in whichever notion of "function" you might want to think of. Just use the same notion for memoization and for caching when comparing the two. (For "inputs", consider the fact that pure does not mean deterministic. Whatever changes your result, deterministic or not, can quite naturally be seen as an input, as it would be if the function were pure.)
Alexey almost 6 years

@nicolas, I am talking about ordinary functions. There is no function that can associate a web page to a web address, because web pages change.
nicolas almost 6 years

@Alexey I agree, and it is a useful distinction to make between functions in the mathematical sense (arrows in the category Sets) and everything else. Half of CS is about recovering function-like behavior by restricting what's written or by creating advanced type wizardry. But that seems an orthogonal consideration to me.