Friday, March 16, 2007

Memory Tags and JVMs

Came up with this talking to my mate Stanley over a coffee...

Why do Java Virtual Machines have a hard time finding/locating pointers? They need to do this as a precursor to doing a "Garbage Collection". At the CMGA conference I talked to an admin from a bank (St George?) who said Sun had talked about these "Stop The World" events. With more memory they take longer, e.g. 2-5 seconds of clock time. And nothing else can happen in the JVM while garbage collection is going on... The events can't necessarily be scheduled, so their occurrence is unpredictable.

One of the good points is that more CPUs get the job done faster.
As I understand it, each n-byte block of memory is read to see if it could be a pointer, i.e. has a value in the range of the heap addresses. Then the block it points at has to be examined to see if it's an object, and whether it contains pointers, and so on. They are looking for objects that are no longer used and can be returned to the 'free pool'... It sounds like a very hard way to get things done.
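
Here's a rough sketch of that kind of scan (usually called a "conservative" scan) in Java. It is only my illustration of the idea, not how any particular JVM does it. The class name, the heap range and the 8-byte alignment check are all assumptions of mine.

```java
// Hypothetical model: the memory being scanned is a long[] of 64-bit words,
// and the heap occupies the address range [heapStart, heapEnd).
final class ConservativeScan {

    /** True if the word's value lies inside the heap range and is 8-byte aligned (assumed). */
    static boolean couldBePointer(long word, long heapStart, long heapEnd) {
        return word >= heapStart && word < heapEnd && (word & 7L) == 0;
    }

    /** Counts the words that merely look like heap pointers. */
    static int countCandidates(long[] words, long heapStart, long heapEnd) {
        int count = 0;
        for (long w : words) {
            if (couldBePointer(w, heapStart, heapEnd)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long heapStart = 0x1000, heapEnd = 0x2000;
        // 0x1008 and 0x1FF8 pass the test; 42 and 0x3000 are out of range; 0x1004 is misaligned.
        long[] stack = { 0x1008, 42, 0x1FF8, 0x1004, 0x3000 };
        System.out.println(countCandidates(stack, heapStart, heapEnd)); // prints 2
    }
}
```

Note the scan can't tell whether 0x1008 is really a pointer or just an integer that happens to fall in the heap range. It has to treat both the same way.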

The Burroughs B5000 used Memory Tags and descriptors to describe to the hardware the contents/use of each 'word'. Descriptors allowed real Virtual Memory around a decade ahead of IBM 360/370.

And memory 'tags' eliminated a whole class of problems. The 'Mark Stack' tag prevented stack manipulation [negative stack offsets weren't allowed either] and accidental corruptions. A whole class of security problems and coding faults was avoided.

So my question: JVMs are in complete control of their environment. They know when an object is created. Why don't they 'tag' each object and each pointer to an object? That would make the garbage problem simple: you'd know the location of all objects and pointers. I guess it could be extended to a 'reference count', but I don't know enough about implementing these things...

Won't it take memory space?
Yes - this is a space/time tradeoff.

Because we don't have hardware memory tags, we have to do it for ourselves...
Like the 'free block list' in some file systems, a bit-map with one bit per "object sized" memory region would need to be kept. I'm thinking that objects can't be less than 64 bits and they'd be allocated on those boundaries.
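
To make the bit-map idea concrete, here's a minimal sketch in Java, assuming one tag bit per 64-bit heap word, with the bit set when an object starts at that word. The class and method names are made up for illustration.

```java
// Hypothetical software tag map: one bit per 64-bit heap word,
// kept apart from the heap itself, like a file system's free-block map.
final class ObjectTagMap {
    private final long[] bits; // bit i set => an object starts at heap word i

    ObjectTagMap(int heapWords) {
        bits = new long[(heapWords + 63) / 64]; // round up to whole 64-bit chunks
    }

    /** Set the tag when an object is allocated at this word. */
    void tag(int wordIndex) {
        bits[wordIndex >>> 6] |= 1L << (wordIndex & 63);
    }

    /** Clear the tag when the object is returned to the free pool. */
    void untag(int wordIndex) {
        bits[wordIndex >>> 6] &= ~(1L << (wordIndex & 63));
    }

    boolean isTagged(int wordIndex) {
        return (bits[wordIndex >>> 6] & (1L << (wordIndex & 63))) != 0;
    }

    /** Memory used by the tags themselves, in bytes. */
    int overheadBytes() {
        return bits.length * 8;
    }

    public static void main(String[] args) {
        ObjectTagMap map = new ObjectTagMap(1024); // tags for an 8192-byte heap
        map.tag(0);
        map.tag(65);
        System.out.println(map.isTagged(65));    // prints true
        System.out.println(map.isTagged(64));    // prints false
        System.out.println(map.overheadBytes()); // prints 128
    }
}
```

So the space cost in this model is one bit per 64-bit word, i.e. 1/64 of the heap (128 bytes of tags for 8192 bytes of heap).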

And the same for pointers - when an object pointer is allocated/created, set the tag for it.
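
Putting the two kinds of tag together, here's a toy sketch in Java of how the collector's "find what's still in use" pass might look. Everything here is an assumption of mine for illustration: the heap is a long[], a pointer is just the index of the word where the target object starts, and plain boolean arrays stand in for the bit-maps.

```java
// Hypothetical toy heap with software tags for object starts and pointer words.
final class TaggedHeap {
    final long[] words;
    final boolean[] objectStart; // tag: an object begins at this word
    final boolean[] isPointer;   // tag: this word holds an object pointer
    int top = 0;                 // next free word (simple bump allocation)

    TaggedHeap(int size) {
        words = new long[size];
        objectStart = new boolean[size];
        isPointer = new boolean[size];
    }

    /** Allocates an object of sizeInWords (at least 1) and tags its first word. */
    int allocate(int sizeInWords) {
        if (sizeInWords < 1 || top + sizeInWords > words.length) {
            throw new IllegalStateException("bad size or heap full");
        }
        int start = top;
        objectStart[start] = true;
        top += sizeInWords;
        return start;
    }

    /** Stores a pointer in a field and sets the pointer tag for that word. */
    void storePointer(int object, int field, int target) {
        words[object + field] = target;
        isPointer[object + field] = true;
    }

    /** Stores plain data in a field and clears the pointer tag for that word. */
    void storeData(int object, int field, long value) {
        words[object + field] = value;
        isPointer[object + field] = false;
    }

    /** Marks every object reachable from the roots, following only tagged words. */
    boolean[] mark(int... roots) {
        boolean[] live = new boolean[words.length];
        int[] pending = new int[words.length]; // each object is pushed at most once
        int n = 0;
        for (int r : roots) {
            if (!live[r]) { live[r] = true; pending[n++] = r; }
        }
        while (n > 0) {
            int obj = pending[--n];
            // An object runs until the next tagged object start (or the heap top).
            for (int i = obj; i < top && (i == obj || !objectStart[i]); i++) {
                if (isPointer[i]) {
                    int target = (int) words[i];
                    if (!live[target]) { live[target] = true; pending[n++] = target; }
                }
            }
        }
        return live;
    }

    public static void main(String[] args) {
        TaggedHeap heap = new TaggedHeap(16);
        int a = heap.allocate(2), b = heap.allocate(2), c = heap.allocate(1);
        heap.storePointer(a, 0, b); // a really points at b
        heap.storeData(a, 1, c);    // an integer that only looks like a pointer to c
        boolean[] live = heap.mark(a);
        System.out.println(live[b]); // prints true
        System.out.println(live[c]); // prints false
    }
}
```

With the tags there's no guessing: the integer in a's second word has the same value as c's address, but it isn't tagged as a pointer, so c is correctly left as garbage.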

No idea if this has already been tried... Just seems like A Good Idea :-)
