Andy Glew's comp-arch.net wiki, http://semipublic.comp-arch.net

If you are reading this elsewhere, e.g. at site waboba.info, it is an unauthorized copy, and probably a malware site.
comp-arch.net wiki on hold from October 17, 2011

Content Addressable Memory (CAM)

From CompArch

(Redirected from CAM arrays)
Jump to: navigation, search

CAMs are a powerful concept often desired by computer architects, and often deprecated by circuit designers.

Whereas a RAM is indexed by the number of an entry, a CAM is indexed by something within an entry. Usually not the whole entry, usually a field within an entry.

Examples of structures that are, at least conceptually, CAMs:

Contents

CAM array structure

CAMs are usually, but not always, implemented as arrays. Specifically, an array of memory elements, although CAMs also include more logic than RAM arrays.

Pitch problems TBD

Split into separate arrays: CAM tag array and SRAM data array.

See Andy Glew's Method of Abstract Array Design

CAM Pseudocode

A CAM can usually be described with pseudocode as

for every entry E in CAM array
    if match(E,Input.address) then select E

The CAM match function may take various forms:

Encoded CAM
match(E,Input.address) ::== E.address = Input.address
Masked Encoded CAM, a generalization of Ternary CAM
match(E,Input.address) ::== E.address & E.mask = Input.address & E.mask

or

match(E,Input.address) ::== E.address = Input.address & E.mask
Decoded CAM, also known as a Mask CAM or Bitmask CAM
match(E,Input.address) ::== 0 != (E.address & Input.address)
Range CAM, encoded form
match(E,Input.address) ::== E.address_lo <= Input.address < E.address_hi
Interval CAM, encoded form
match(E,Input.address) ::== [E.lo,E.hi) INTERSECT [Input.lo, Input.hi) 
Bitmask CAMs lend themselves to a different implementation of range or interval CAMs.
Priority CAM
for every entry E in CAM array
    if match(E,Input.address) then assert E.matched
for all entries E that have E.matched asserted
    select one and only one according to some prioritization rule

Typical prioritization rules may be

  • lowest physical entry number E
  • entry E that has the lowest value of some other field, such as a timestamp


Are CAMs prohibitively expensive?

For many years it has been a truism that CAMs are prohibitively expensive. This is not always true.

It is always worth knowing just how much bigger a CAM decoder is than a RAM decoder. Often 2X-4X, and up.

Furthermore, RAM decoders can share much logic, essentially via common subexpression elimination at the gate level. CAM decoders cannot share such logic in general, if the addresses that are to be CAMmed are completely independent of each other.

Similarly CAMs may be more power hungry, because every bit is potentially active at all times.

However, after these basic differences, CAMs have often earned a bad rap because of the prioritization logic in Priority CAMs - sometimes called a picker circuit. Non-priority CAMs can be significantly less expensive that priority CAMs.

Duality between CAMs and Software Data structures

CAMs are a hardware data structure.

In hardware, CAMs may be quite simple and efficient - if you neglect things like power and circuit area. At least we know how to build hardware CAMs, no matter what their cost.

In software, there is no equivalent software data structure. At least not on a single threaded Von Neumann machine. Arguably, one can build a CAM out of a MIMD parallel machine, with a separate processor dedicated to each CAM comparison, but we are not yet at a point where that is a reasonable approach.

Moreover, even hardware CAMs are, more and more, deprecated. So the duality described here also applies to hardware - alternative ways of implementing a hardware CAM.

However, there are certain well-known dualities between hardware CAMs and software data structures:


Encoded versus Decoded CAMs

An encoded CAM (Content Addressable Memory) is indexed by an encoded number N. A decoded CAM is indexed by the corresponding decoded number, 1<<N.

I.e.

Encoded CAM:

for every entry E of the CAM array:
   if E.address = Input.address then select entry


Decoded CAM:

for every entry E of the CAM array:
   if E.decoded_address_bitmask & Input.decoded_address_bitmask then select entry

Where the Input.decoded_address_bitmask might be prepared by

Input.decoded_address_bitmask = 1 << Input.encoded_address,
   i.e. all bits 0 except for the single bit corresponding to the encoded address

i.e. it might be a one-hot.

However, a decoded CAM might also be indexed by multiple bits in the mask. I.e. it is a little bit easier to select multiple entries for output with an decoded CAM, whereas with an encoded CAM you would have to have several encoded addresses provided (although a ternary CAM provides a limited degree of flexibility in matching by allowing some bits to be "don't care").

(In either case, the CAM may need a policy as to what happens when multiple entries in a CAM are selected - does it act like a Priority CAM or are the multiple selected entries combined by a logic function such as AND or OR?)

---

  • Decoded CAM has slightly simpler logic per decoded address bit
  • Encoded CAM requires what amounts to an XOR at every address bit for every entry

As mentioned above

  • Decoded CAM or bitmask CAM permits multiple entries to be addressed
  • whereas an encoded CAM might require multiple encoded addresses

Most importantly

  • number of address bits scales more slowly for encoded CAMs than for decoded CAMs.

Overall, decoded CAMs are fine for small addresses. Encoded CAMs are more efficient for large addresses.



Priority CAMs versus Non-Priority CAMs

TBD.

May already have been discussed above.

CAM examples

Instruction cache or data cache
TLBs
Store-to-load forwarding buffer (STLF)
  • often a variant of Priority CAM, since a load must receive the data from the youngest store older than itself matching its address.
  • sometimes a Range CAM or Interval CAM
BTP (Branch Target Predictor)
See also CAM-indexed versus tagged arrays.


Instruction schedulers
Bypass detection
  • ...
Predictors, such as branch predictors and other types
Victim caches


---


Personal tools
No more shadowing