If you want to make money online : Register now

# [Answers] Monotone Frameworks: Transfer functions for flow edges instead of labels

, ,
Problem Detail:

So, in generic program analysis, we have a lattice $L$ with a join operation $\sqcup$, program with statements labelled, and for each label $b$, a transfer function $F_b : L \rightarrow L$.

The goal is to find, for each label $b$, a fix-point $x$ such that $F_b(x)=x$. This is a common approach used in dataflow analysis,

In "Principles of Program Analysis" by Nielson, Nielson and Hankin, they reference that "Some formulation of monotone frameworks associate transfer functions with edges (or flows) rather than nodes (or labels)." However, in the textbook, they simply reference an exercise where the reader is asked to implement such a framework.

I'm wondering if anyone has a reference to either a book or paper where such a formulation is given. In particular, I'm wondering how a typical worklist algorithm for finding a fixed-point would be formulated.

#### Answered By : Wandering Logic

Your formulation is not stated in the way I typically think of it. Traditionally rather than thinking of looking for a fixed point at each node, we're looking for a solution to a system of simultaneous equations:

$$x_b = F_b \left( \bigsqcup_{p \in \mathtt{pred}[b]}x_p \right) \; \mathrm{for\:all\:nodes}\:b$$

If $F_b$ has the distributive property over $\sqcup$ then you can simply rewrite this as

$$x_b = \bigsqcup_{p \in \mathtt{pred}[b]} F_b(x_p) \; \mathrm{for\:all\:nodes}\:b$$

and since you have a function evaluation for every edge $p \rightarrow b$ you might as well just rewrite it as

$$x_b = \bigsqcup_{p \in \mathtt{pred}[b]} F_{p \rightarrow b}(x_p) \; \mathrm{for\:all\:nodes}\:b$$

If you don't like that handwaving, then here's some more: think of it as taking $F_b$ for every node with one predecessor ($p$) and pushing $F_b$ onto the edge making it $F_{p \rightarrow b}$. For nodes $b$ with more than one predecessor split them into two nodes, $b$ and $b'$. The predecessors of $b$ are still the same, then there's an edge $b \rightarrow b'$ and then the successors of the original $b$ become the successors of $b$. Now the original label $F_b$ becomes $F_{b\rightarrow b'}$. The join operation still happens on the nodes.

Note that while transfer functions are associated with edges, data flow facts are still associated with nodes. So the worklist algorithm doesn't really change. Actually, with the normal (transfer functions on nodes) formulation, the worklist algorithm is really a worklist of edges, not nodes, so nothing changes. (For an example of this take a look at these slides that came up when I googled for "dominance frontier". http://www.seas.harvard.edu/courses/cs252/2011sp/slides/Lec04-SSA.pdf. They give the worklist algorithm on page 7, and it's based on edges.)

A paper using the edge formulation is:

G. Ramalingam: Data Flow Frequency Analysis. PLDI 1996: 267-277.

(I haven't read the paper beyond the first couple of pages to see that they were using the edge formulation, it's what came up after a bit of Googling.)

I wasn't familiar with the edge transfer function approach until you asked this question. But I also know of at least two cases where it is important to take edges into account.

The first is when dealing with (post)dominance frontiers. The natural definition of the dominance frontier of a node is a set of edges, but the original (more complicated) formulation was in terms of nodes. See for example the (brief) discussion of this in

Pingali, Keshav; Bilardi, Gianfranco: Optimal Control Dependence Computation and the Roman Chariots Problem, ACM ToPLaS, 19(3):462-491, 1997.

Another example is in the classic

Knoop, J; Rüthing, O; Steffen, B: Lazy Code Motion, PLDI, 1992.

In Lazy Code Motion you sometimes need to move computations onto edges, but in the original formulation in the paper they were trying to move those computations to different nodes, which gave them trouble with branch nodes that target join nodes. They called these critical edges and had to add a node along the edge. This is explained in the following lecture notes by Cooper and Torczon: https://www.cs.utexas.edu/~pingali/CS380C/2013/lectures/LazyCodeMotion.pdf