Nikita the Spider

IPC with Python - System V Shared Memory and Semaphores

Please note: This module isn't being developed anymore. However, there are some other good options for Python IPC.

This describes the module shm (written by Vladimir Marangozov), which gives access to System V shared memory and semaphores on *nix systems, as well as the module shm_wrapper (written by me), a companion module that offers more Pythonic access. Windows users, you're out of luck here; these modules only work on platforms that support System V shared objects. Most *nixes do (including OS X) but Windows does not.

You can download shm version 1.2.2 which contains the module shm_wrapper, the module shm, installation instructions and sample code. You might also want to read about some known bugs.

The Modules

There are two modules available here. shm is essential. It is a single C language file that compiles into a Python module. The module's functions and features map fairly closely to system calls like shmctl, but with nicer names.

The module shm_wrapper offers easier-to-use, higher-level access to shm's features. For example, shm memory objects support reading the object's permissions through the perm attribute but you must set them by calling setperm(). By contrast, shm_wrapper exposes a simple gettable/settable attribute called permissions.

shm_wrapper also allows you to ignore the messy details surrounding keys and ids for the objects. Because ftok() is broken on most (all?) modern systems, keys are not very useful, so there's no reason to spend time messing with them if you can avoid doing so.

Which Should You Use?

If you're an old (or young) Sys V hacker, you might prefer shm's "closer to the metal" feel. Otherwise, use shm_wrapper.

shm_wrapper

This module provides classes that act as handles to shared memory and semaphores as well as functions to create and destroy each. Memory segments and semaphores are distinguished from one another by a guaranteed-unique key that this module generates automatically when the object is created.

shm_wrapper Functions

create_memory(size, permissions = 0666, InitCharacter = ' ')
Creates a shared memory segment and returns a SharedMemoryHandle instance (described below). Each byte of the memory will be initialized to InitCharacter. If you pass a multibyte character for that parameter, the results are undefined. You can destroy the memory either by calling remove_memory() or by calling the .remove() method on a handle to said memory.
remove_memory(key)
Removes (releases) the shared memory identified by key. Raises KeyError if no shared memory has that key.
create_semaphore(InitialValue = 1, permissions = 0666)
Creates a semaphore and returns a SemaphoreHandle instance (described below). You can destroy the semaphore either by calling remove_semaphore() or by calling the .remove() method on a handle to said semaphore.
remove_semaphore(key)
Destroys the semaphore identified by key. Raises KeyError if no semaphore has that key.

shm_wrapper Class - SharedMemoryHandle

This is a handle to a piece of shared memory that allows you to read and write to the memory and manipulate its attributes. Important methods and attributes are below. Not all of the attributes are documented here; refer to the module itself for information on some of the more obscure ones.

read(NumberOfBytes = 0, offset = 0)
Reads the specified number of bytes from position offset and returns a string.
write(s, offset = 0)
Writes the string s to position offset.
remove()
Removes (releases) the shared memory. If any process tries to access it after calling .remove(), the results are system-dependent (but will probably be unpleasant).
key (read only)
The integer key that identifies this segment. By passing the key to other processes, the process that created the shared memory gives them the ability to find it easily.
size (read only)
The size of the segment in bytes.
permissions
The permissions on this memory.
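
As a rough illustration of the calls described above, the sketch below creates a segment, writes a string into it, reads it back and then releases the segment. It assumes shm_wrapper is installed and importable. The module is passed in as an argument so that the function itself does not depend on it. The name round_trip and the 0o600 permissions are my own choices for this example and are not part of the module.

```python
def round_trip(shm_wrapper, message="hello"):
    """Write message into a new segment, read it back, then release the segment.

    shm_wrapper is the imported module. Only calls documented above are used:
    create_memory(), write(), read() and remove().
    """
    # Ask for exactly as many bytes as the message needs, readable and
    # writable by the owner only.
    memory = shm_wrapper.create_memory(len(message), 0o600)
    try:
        memory.write(message)             # offset defaults to 0
        return memory.read(len(message))  # read the same bytes back
    finally:
        # Always release the segment, even if the write or read failed.
        memory.remove()
```

With the module installed, a call such as round_trip(shm_wrapper, "spam") should hand back the string that was written. The try/finally matters because a segment that is never removed stays allocated after the Python process exits.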

shm_wrapper Class - SemaphoreHandle

This is a handle to a semaphore that allows you to wait on the semaphore and also read and write to its attributes. Important methods and attributes are below. Not all of the attributes are documented here; refer to the module itself for information on some of the more obscure ones.

P()
Blocks while the semaphore is zero and, once unblocked, decrements it. Stands for prolaag or probeer te verlagen (try to decrease).
V()
Increments the semaphore. Stands for verhoog (increase).
Z()
Blocks until zee zemaphore is zero.
remove()
Removes (destroys) the semaphore. If any process tries to access it after calling .remove(), the results are system-dependent (but will probably be unpleasant).
key (read only)
The integer key that identifies this semaphore. By passing the key to other processes, the process that created the semaphore gives them the ability to find it easily.
value
The semaphore's value.
permissions
The permissions on this semaphore.
blocking
Turns blocking mode on or off. In non-blocking mode, P() and Z() will merely raise shm.error if the semaphore is unavailable.
undo
Turns the SEM_UNDO flag on or off, but I believe support for SEM_UNDO is OS dependent.
WaitingForZero (read only)
The number of processes waiting for this semaphore to hit zero (i.e. waiting in a call to .Z()).
WaitingForNonZero (read only)
The number of processes waiting for this semaphore to hit non-zero (i.e. waiting in a call to .P()).
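
One common way to use P() and V() together is to wrap them so that the semaphore is always released, even when the guarded code raises. The following is a minimal sketch of that pattern. The helper name holding is hypothetical, and the only things it assumes about the handle are the P() and V() methods described above.

```python
from contextlib import contextmanager


@contextmanager
def holding(semaphore):
    """Call P() on entry and V() on exit, even if the body raises."""
    # If P() itself raises (for example in non-blocking mode), the semaphore
    # was never acquired, so V() must not run; that is why P() sits outside
    # the try block.
    semaphore.P()
    try:
        yield semaphore
    finally:
        semaphore.V()
```

A caller would then write "with holding(sem):" around the code that reads or writes the shared memory, where sem is a SemaphoreHandle obtained elsewhere.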

shm

Vladimir Marangozov's shm module is the core of the onion, so to speak. It is a single C language file that compiles into a Python module. The module's functions and features map fairly closely to the system calls like shmctl, but with nicer names. The module has the following features:

shm Functions

create_memory(Key, Size, [Perm=0666])
Creates a new shared memory segment and returns an shm.Memory object (described below). Fails if the key is not unique.
create_semaphore(Key, [Value=1, Perm=0666])
Creates a new semaphore and returns an shm.Semaphore object (described below). Fails if the key is not unique.
ftok(string Path, int ProjId)
Calls the system's ftok() which is supposed to map each filename to a unique integer but is probably broken on your operating system.
getshmid(Key)
Maps a memory key to an integer id. Raises KeyError if the key doesn't exist.
getsemid(Key)
Maps a semaphore key to an integer id. Raises KeyError if the key doesn't exist.
memory(Shmid)
Returns an shm.Memory handle to a shared memory segment if one exists with the id Shmid, otherwise returns a memory object with the addr attribute set to 0. I haven't yet figured out what purpose this serves.
memory_haskey(Key)
True if a shared memory segment with the given key exists.
remove_memory(Shmid)
Destroys (removes from the system) the shared memory segment identified by Shmid.
remove_semaphore(Semid)
Destroys (removes from the system) the semaphore identified by Semid.
semaphore(Semid)
Returns an shm.Semaphore handle to a semaphore if one exists with the id Semid, otherwise returns a semaphore object that points to ???. I haven't yet figured out what purpose this serves.
semaphore_haskey(Key)
True if a semaphore with the given key exists.
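
To show how the key and id functions above fit together, here is a small sketch that looks up an existing segment by key and returns a handle to it. It uses only getshmid() and memory() as documented above, and it takes the shm module as an argument so the function can be read without the module installed. The name open_existing_memory is my own, and I am assuming (not asserting) that the returned handle may still need a call to attach() before it can be read or written.

```python
def open_existing_memory(shm, key):
    """Return a Memory handle for the segment with this key, or None."""
    try:
        # getshmid() is documented to raise KeyError for an unknown key.
        shmid = shm.getshmid(key)
    except KeyError:
        return None
    return shm.memory(shmid)
```

Catching KeyError instead of calling memory_haskey() first avoids a window in which another process removes the segment between the check and the lookup.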

shm Errors

The shm module defines its own shm.error.

shm Class - Memory

A Memory object is a handle to a chunk of shared memory created by shm.create_memory(). Memory objects have these attributes and methods:

attach([addr = 0, how = 0])
Attaches to the memory segment. See your system's man page for shmat for valid parameter values
detach()
Detaches from the memory segment
read(NumberOfBytesToRead, [offset = 0]), returns a string
Reads bytes from the memory segment.
write(s, [offset = 0])
Writes the string s to the memory segment
setperm(perm)
Sets the permissions
setgid(gid)
Sets the gid
setuid(uid)
Sets the uid
shmid (read only)
The segment's id
key (read only)
The segment key or IPC_PRIVATE (0)
size (read only)
The segment's size in bytes
attached (read only)
A Boolean that reports whether or not the memory has been attached by a call to .attach()
nattch (read only)
The number of processes currently attached to this segment
perm (read only)
The segment's permissions
addr (read only)
Attachment address in the process address space
cgid (read only)
The gid of the creator
cpid (read only)
The pid of the creator
cuid (read only)
The uid of the creator
gid (read only)
The gid of the owner
uid (read only)
The uid of the owner
lpid (read only)
The pid of the last process to touch(?) this object

shm Class - Semaphore

A Semaphore object is a handle to a semaphore created by shm.create_semaphore(). Semaphore objects have these attributes and methods:

P()
Blocks while .val == 0; then decrements before returning. Stands for prolaag or probeer te verlagen (try to decrease).
V()
Increments .val. Stands for verhoog (increase).
Z()
Blocks until .val == 0.
setblocking(block)
Turns blocking on or off.
setundo(undo)
Turns the undo flag on or off. (See your system's man page notes about semop and SEM_UNDO.)
setval(Value)
Sets the value.
setperm(perm)
Sets the permissions
setgid(gid)
Sets the gid
setuid(uid)
Sets the uid
semid (read only)
The semaphore's id
key (read only)
The semaphore key or IPC_PRIVATE (0)
blocking (read only)
A Boolean that reports whether blocking mode is on.
cgid (read only)
The gid of the creator
cuid (read only)
The uid of the creator
gid (read only)
The gid of the owner
uid (read only)
The uid of the owner
lpid (read only)
The pid of the last process to touch(?) this object
ncnt (read only)
The number of processes waiting for this semaphore's value to become > 0.
perm (read only)
The semaphore's permissions
val (read only)
The value of the semaphore's counter.
zcnt (read only)
The number of processes waiting for this semaphore's value to become = 0.

Interesting Tools

Many systems (although not some versions of OS X) come with ipcs and ipcrm. The former shows existing shared memory, semaphores and message queues on your system and the latter allows you to remove them.

SHM and Threads

As of version 1.2, shm should be safe to use in threaded applications. (Previous versions were not.)

Sample Code

The tarball includes code demonstrating the use of both shm_wrapper and shm. The demo code comes in the form of two complementary applications (two apps for each demo = a total of four apps). The apps are called Mrs. Premise and Mrs. Conclusion (as in, "Four hours to bury a cat?") and they converse with one another through shared memory. Run Mrs. Premise in one terminal and Mrs. Conclusion in another.

The conversation starts with Mrs. Premise creating the shared memory and then seeding it with a random string. Mrs. Conclusion then calculates the md5 hash of this string and writes that back into the shared memory. Mrs. Premise calculates the md5 hash of that string and writes it to the shared memory, and so it goes back and forth for as many iterations as are specified. Using md5 hashes allows Mrs. Premise and Mrs. Conclusion to verify that the other process correctly read what was written because it makes the response predictable. This is important for detecting memory corruption (see below).
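
The verification idea can be modelled in a single process without any shared memory at all. The sketch below is not the demo code from the tarball. It is a simplified stand-in in which an ordinary variable plays the role of the shared segment, and the names md5_hex, converse and corrupt_on_turn are hypothetical. Each speaker remembers what it last wrote and expects to find the md5 hash of that string on its next turn.

```python
import hashlib


def md5_hex(text):
    """Return the hexadecimal md5 digest of an ASCII string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def converse(seed, iterations, corrupt_on_turn=None):
    """Simulate the exchange and raise AssertionError on an unexpected reply."""
    memory = seed  # stands in for the shared segment, seeded by Mrs. Premise
    last_written = {"Mrs. Premise": seed, "Mrs. Conclusion": None}
    for turn in range(iterations):
        speaker = "Mrs. Conclusion" if turn % 2 == 0 else "Mrs. Premise"
        if turn == corrupt_on_turn:
            memory = "garbage"  # simulate a stray write by the other process
        observed = memory
        previous = last_written[speaker]
        # The other side should have replied with md5(what this speaker wrote
        # last). On her first turn Mrs. Conclusion has nothing to compare to.
        if previous is not None and observed != md5_hex(previous):
            raise AssertionError("memory corruption detected on turn %d" % turn)
        memory = md5_hex(observed)
        last_written[speaker] = memory
    return memory
```

Because every reply is a pure function of the previous message, any overwrite of the segment between turns shows up as a mismatch on the next read.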

You can specify whether or not you want the flow of the conversation to be controlled with a semaphore. (This option as well as the number of iterations are specified in DemoConstants.py.) If you opt not to use the semaphore, memory corruption (i.e. a "simultaneous" write by both processes to the shared memory) will probably occur if you run enough iterations. On my test systems -- a G4 Powerbook and a PIII running FreeBSD -- memory corruption always happened in fewer than 5000 iterations. If you use the semaphore, memory corruption will not occur. When either process detects corruption, it raises an AssertionError.

In addition to demonstrating the use of shm_wrapper and shm, this demo also illustrates why one needs to be careful with shared memory programming. Consider that even without semaphores these two processes writing to the same bit of memory as fast as they can still require thousands of iterations to step on one another. Now imagine a similarly infrequent bug, but instead of one that's caused by two deliberately careless demo programs, imagine one buried in thousands of lines of your code that causes Some Random Event every two weeks or so. You don't want to have to track down a bug like that. Speaking of bugs...

Known Bugs

Bugs? My code never has bugs! However, there is a suboptimal anomaly...

If I'm correctly interpreting what I see when I run top, the newest version of shm still leaks memory during the create/destroy cycle of both Memory and Semaphore objects. This happens despite plugging several existing memory leaks and despite the fact that Python's garbage collector doesn't report anything amiss. The leak is tiny (about 12 or 13 bytes per create/destroy cycle) and it is released when the Python process ends, so this will only be a problem if you have a long-running process that creates and destroys lots of these objects. Note that the leaks occur only when creating (or destroying) new objects. Getting a handle to an existing object doesn't cause a problem.

I created a simple program called MemoryLeakDemo.py that demonstrates the problem. Start it in one window with top running in another window and you can watch the python process eat memory.

About ftok – Use It At Your Own Peril

Most sample code that you see involving the use of System V semaphores/shared memory recommends ftok() to generate an integer key that's guaranteed to be unique on that machine. The main convenience of this is that processes can get the key (and thus a handle to the shared memory or semaphore) simply by using a previously agreed-upon filename. However, most modern implementations of ftok don't guarantee that it returns a unique key, which means it creates a key that may or may not work. If it doesn't work you have to fall back on a reliable alternative method of key generation, so you might as well just use the alternative in the first place.

The operating systems affected by this include OS X, Open/Net/FreeBSD, and Linux. (See the BUGS or NOTES section of the referenced man pages.)
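
To see why collisions are possible, it helps to model the best-effort scheme that the Linux man page describes: the low 8 bits of the project id, the low 8 bits of the device number and the low 16 bits of the inode number are packed into one 32-bit key. The function below is only a model of that scheme, written for illustration. It is not the code of any particular libc, and other systems may combine the bits differently.

```python
def ftok_model(inode, device, proj_id):
    """Pack the low bits of inode, device and proj_id into a 32-bit key."""
    return (
        (inode & 0xFFFF)             # only the low 16 bits of the inode survive
        | ((device & 0xFF) << 16)    # only the low 8 bits of the device number
        | ((proj_id & 0xFF) << 24)   # only the low 8 bits of the project id
    )


# Two different files whose inode numbers differ by a multiple of 65536
# end up with the same key under this scheme.
first = ftok_model(inode=0x12345, device=0x801, proj_id=ord("A"))
second = ftok_model(inode=0x22345, device=0x801, proj_id=ord("A"))
```

Here first and second are equal even though the inode numbers differ, because everything above the low 16 bits of the inode is discarded.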

In my experience, ftok frequently returned duplicate keys for different files in the same directory on my G4 Powerbook. Rather than probing and documenting ftok's limitations, I decided to just avoid it entirely and rely on Python's random number generator to provide keys for me. It's not as convenient as using a previously agreed-upon filename as a key, but buggy implementations of ftok don't (reliably) provide that either. One alternative is to have the creating process generate a random key and write that to a previously agreed-upon file.
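
The alternative mentioned above, in which the creating process picks a random key and writes it to an agreed-upon file, might look something like the sketch below. The function names and the file layout (one decimal integer per file) are hypothetical. The key range assumes a 32-bit signed key_t and skips 0 because that value is IPC_PRIVATE.

```python
import os
import random


def publish_key(path, key=None):
    """Pick a random key (unless one is given) and write it to path."""
    if key is None:
        # 0 is IPC_PRIVATE, so start at 1. The upper bound assumes that
        # key_t is a 32-bit signed integer on this platform.
        key = random.randint(1, 2**31 - 1)
    temporary = path + ".tmp"
    with open(temporary, "w") as f:
        f.write("%d\n" % key)
    # Rename into place so that readers never see a half-written file.
    os.replace(temporary, path)
    return key


def read_key(path):
    """Read back a key written by publish_key()."""
    with open(path) as f:
        return int(f.read().strip())
```

A random key can still collide with one that is already in use, so the creating process should be prepared to pick another key and publish again if creating the segment or semaphore fails.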

Potential Compile Problems

Please let me know if you find a platform for which shm doesn't compile as-is.

On some systems, you might get this error when compiling:

shmmodule.c:186: error: redefinition of `union semun'

If you see this compile error, then you need to add your platform in setup.py to the if statement just above where HAVE_UNION_SEMUN is added.

Also, I suspect that BSD users (other than FreeBSD-ers) might see something like this:

shmmodule.c:1375: error: `PAGE_SIZE' undeclared (first use in this function)

You probably need to #include <machine/param.h>. That's already done for FreeBSD thanks to an #ifdef.

Last but not least, version 1.1.4 introduced some fancy code in setup.py. The need for this comes from the ipc_perm struct in ipc.h. It contains a member that's called key, _key or __key depending on your system. setup.py should now autodetect which version you have, but the autodetection code is new and hasn't been well exercised (or exorcised). If it doesn't work for you and you're in a rush to get shm working, you can hack shmmodule.c yourself. Find the definition of the ipc_perm structure in /include/sys/ipc.h (/include/bits/ipc.h on some systems) and note which version of key it uses in its definition. Then in shmmodule.c, replace the instances of IPC_PERM_KEY_NAME with the member name from ipc.h.
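
If you want to check by script which spelling your header uses, something along these lines may help. This is not the detection code from setup.py. It is a hypothetical sketch that searches header text for a member declared with a key type and named key, _key or __key, and the function name find_key_member is my own.

```python
import re

# Matches declarations such as "key_t key;", "key_t _key;" or "__key_t __key;".
_KEY_MEMBER = re.compile(r"\b_{0,2}key_t\s+(_{0,2}key)\s*;")


def find_key_member(header_text):
    """Return 'key', '_key' or '__key' if the header declares one, else None."""
    match = _KEY_MEMBER.search(header_text)
    return match.group(1) if match else None
```

You would pass it the contents of the ipc.h file that defines ipc_perm on your system, for example find_key_member(open(path_to_header).read()), where path_to_header is wherever that file lives on your machine.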