Cheap and Secure Web Hosting Provider : See Now

[Solved]: What is the fastest way to a Publish/Subscribe pattern?

, , No Comments
Problem Detail: 

I am running in a conceptual problem in my development which I previously also encountered in the Drupal CMS and was wondering what would be the best solution to circumvent the problem.

Drupal, and my own Snap Websites software, make use of a publish / subscribe pattern where one can write functions that automatically get called whenever a certain event happens. This can be shown in the following process (note that neither are using threads in those specific processes, it runs within one single main thread in one single main process--although Snap! supports backends—separate processes, I have no concerns with those at this point):

+----+    +----+    +----+    +----+      +----+ | P1 |--->| P2 |--->| P3 |--->| P4 |-...->| Pn | +----+    +----+    +----+    +----+      +----+             |                   |             |                   | p4 signal  +------+    +------+      +------+             |                   +----------->| O1/4 |--->| O2/4 |-...->| On/4 |             |                                +------+    +------+      +------+                 |             | p2 signal  +------+    +------+      +------+             +----------->| O1/2 |--->| O2/2 |-...->| On/2 |                          +------+    +------+      +------+ 

Shown here we see that process P2 emits a signal (p2 signal) and that generates functions of observers O12, followed by O22, etc. up to On2 to be called. The exact same thing happens in process P4.

In most cases that works just fine. However, once in a while, a change made by one of the observers is required by the next observer. In other words, the order O12 then O22 is important. However, if the end user has the possibility to switch the two observers around, then the process becomes O22 then O12 which breaks the proper sequence and thus something will not happen.

The Drupal awkward solution is to use a weight to sort their modules in a given order. This is an issue because that forces all observers of that module to happen in that order when some of the events may need to happen before and others after. Drupal modules with equal weights are sorted alphabetically, which at least provides some form of consistency.

It seems to me that the proper way would be to generate signal p2 again if one of the observers made a change that warrant the signal p2 to be re-run. There are two problems in such an implementation:

  1. We have to be careful to not generate an infinite loop, so an observer should not re-generate the signal if what it needs to do was already done in a previous call [this can be complicated though]; and

  2. We are probably going to re-execute the code of many observers for nothing.

Called again:

+----+    +----+    +----+      +----+ | P1 |--->| P2 |--->| P3 |-...->| Pn | +----+    +----+    +----+      +----+             |             |             |<--------------------------+             |                           | p2 signal after P2-like change             |                           |             | p2 signal  +------+    +------+      +------+             +----------->| O1/2 |--->| O2/2 |-...->| On/2 |                          +------+    +------+      +------+ 

The other way would be for each observer to tell us whether they should run before or after certain other observers. This means observers have knowledge that they probably should not have, although I am not so sure that this is really a bad thing.

A concrete example of the problem in my Snap! Websites environment:

[Client Side]

  1. User uploads a PDF file
  2. Client process (i.e. browser) sends PDF to server
  3. Server transforms PDF to a preview image
  4. Client checks to see whether the PDF preview is ready

In step 3 the server process is something like this:

[Server Side--i.e. front end server]

  1. The Editor plugin receives POST event with the PDF file in it
  2. The Editor plugin calls a Content function to save the PDF in the database
  3. A Page plugin observer is called as the Content plugin sends a "created content" signal
  4. The Page plugin observer links the PDF file to a script that will be used to generate the PDF preview image (this is a Magick++ script, you can define different sizes, quality, flipping, etc.)
  5. The Images plugin also has an observer on the "created content" signal, that observer detects the link created by the Page plugin observer and mark that PDF for future processing by our Image plugin backend
  6. The server process returns to the client (i.e. this was very fast, a few ms)

In the Images plugin backend process (a completely separate process which can even run on a computer by itself):

[Backend Side--i.e. server possibly not accessible from the front end]

  1. The Images plugin backend process wakes up
  2. The Images plugin backend checks for new data to process and finds the PDF
  3. The Images plugin backend runs the script attached to the PDF document
  4. The Images plugin backend saves the preview of the PDF along its other file in the database, ready to be retrieved by the client (step 4 of the first process.)

As we can see in the Server process, step (5) fails if step (4) does not happen before it, which in the Snap Websites environment, at this point, means the Page plugin must be loaded before the Images plugin.

Asked By : Alexis Wilke

Answered By : D.W.

Since you are asking on CS.SE rather than StackOverflow, I presume you are looking for a principled look at the fundamental underlying problem and principled solutions to the general problem, from a scientific/conceptual perspective (as opposed to a "quick hack" or a engineering solution that'll work for your specific situation). So, that's what I'll try to provide.

The problem

What you are describing is an instance of the plan interference problem.

Let me summarize the problem in abstract terms. You are in the middle of delivering an event to the observers, one by one, and the code of one of the observers happens to mutate the underlying state; this mutation might cause the notifications to the other observers to happen in the wrong order, or to expose an inconsistent state to them.

For a more general description of the plan interference problem, I recommend the following papers:

Some candidate solutions

One possible solution, proposed in the first paper above, is to stop using threads with mutable shared state as your form of concurrency. Instead, look at systems like Erlang and others that are more event-based and use alternative models of concurrency.

Another possible solution, proposed in the second paper above, is to stop using threads with mutable shared state as your form of concurrency. Instead, look at systems that are inspired by actors. These systems treat invoking a method on an object $O$ as sending a message to the object $O$, and instead of using a stack discipline, they use a message queue. This ensures that messages are delivered in an order that are consistent with the causality, and avoids causing plan interference. For instance, if you're in the middle of delivering an event to a bunch of observers, and the third observer triggers a state-changing method call $c$, then these systems would queue up $c$ and not deliver/invoke it until all of the observers have been informed. This avoids unexpected violations of atomicity and avoids many kinds of plan interference bugs. See the second paper for details.

A third alternative solution is to continue using threads with mutable shared state as your form of concurrency, and just be really careful. You could make yourself aware of the common kinds of plan interference hazards, and write your code so you won't shoot yourself in the foot in this way. For instance, the state-holding object could make itself responsible for queuing up notifications about state changes, to ensure that notifications happen in a consistent order. It could also establish a contract that forbids observers from making mutations to itself (so there is no mutual recursion where a change to the state of object $x$ triggers a notification to observer $O_i$ which in turn mutates the state of object $x$ some more). You could structure your code so that the order in which observers are invoked should never matter. There are all sorts of ways you could structure your code to try to avoid these hazards from ever affecting you.

Obviously, this last approach is very challenging, because it is difficult to reason about all the things that could happen in such a system. It's easy to screw something up. But hey, concurrency is really hard to think about. If you write your code using shared-state concurrency, don't be surprised if it makes your code harder to reason about.

Plan interference can occur even in single-threaded programs, so this problem is not exclusive to multi-threading. (I mention multi-threading only because it is the most common cause of plan interference problems.) The second paper above does propose solutions to plan interference that will apply even to single-threaded programs.

Best Answer from StackOverflow

Question Source :

3.2K people like this

 Download Related Notes/Documents


Post a Comment

Let us know your responses and feedback