Cheap and Secure Web Hosting Provider : See Now

Showing Optimal Substructure for Stacking Boxes DP Problem?

, , No Comments
Problem Detail: 

I am having trouble wrapping my head around the Stacking Boxes dynamic programming problem, and understanding how it has optimal substructure and overlapping subproblems. If it has the optimal substructure and overlapping subproblems, then I can use dynamic programming to solve the problem. How do I know that the stacking boxes exhibits the optimal substructure?

I have read that it is similar to the longest-increasing-subsequence problem which exhibits optimal substructure and overlapping subproblems, but I am unable to see how it relates to the stacking boxes problem.

Asked By : Gary

Answered By : Lee Gao

When people talk about this notion of an optimal substructure, they're just saying that you can solve a bigger problem by looking at the best solutions to smaller (sub)problems and somehow transforming these solutions to these subproblems back into a solution of the bigger problem.

So let's look at a modified version of this box-stacking problem that is also used in G4G's solution. Let $\text{Stack-Boxes-No-Rotation}(\mathrm{B})$ be the problem of finding a maximum stack of boxes in $B$ such that we are not allowed to rotate any of the boxes. Later on, we'll relax this restriction of not being able to perform rotations.

Now, let's get to the meat and potato of this concept. First, let's break down the phrase. What is a substructure? Well, the prefix "sub" suggests that this is a smaller problem structure than the whole problem. However, it's still a bit hazy what we mean by smaller. Are we talking about the number of boxes in $B$ to stack? Are we talking about volume? The height?

Unfortunately, there's no singular prescription that you can follow to figure this out. You'll have to rely on your intuition, and there are usually multiple valid ways to compare two things. The intuition however always comes back to the idea that if you have the solutions to the smaller problems, then you should be able to easily use them to create a solution to the big problem.

It's pretty hard to get a good feel for this unless we work through an example. Here, I'll start off with a way of defining "smaller" subproblems that is different from the way that G4G is doing, but which may have been one of your first instincts. Let's say a problem is smaller than another problem if its height is smaller than the other.

First thing first, since we're defining comparisons in terms of height, we need to create a parameter for it. So let $\text{Stackable}(B, h)$ determine whether it's possible to stack boxes in $B$ to a height of $h$. We wish to find the largest $h$ such that $\text{Stackable}(B, h)$ is still possible.

Now, we've talked about using the height to determine which problem is smaller. We say that $\text{Stackable}(B', h') \preceq \text{Stackable}(B, h)$ if $h' \le h$ and $B' \subseteq h'$. Now, we haven't shown that $\text{Stackable}(B, h)$ has the optimal-substructure property with respect to the order $\prec$ yet. Before we do so, let's review what optimal-substructure claims if it applies to this problem.

If we know $\text{Stackable}(B', h')$ for every subset of boxes $B' \subset B$ or every height $h' < h$ (that is, $(B', h') \prec (B, h)$, then we would be able to use that information to find out whether $\text{Stackable}(B, h)$ is possible.

Okay, so how do we do that?

Well, a good approach is usually to think backwards. If it's possible for a set of boxes in $B$ to stack to height $h$, then let's consider the last move that we could have made. It could be any of the boxes in $B$. Suppose that we've picked a box $b$ to use as the top of the stack, then what? Well, it better be the case that the stack under it uses only boxes from $B - \{b\}$ and has height $h - h(b)$. In addition, it better be the case that the box immediately under $b$ must have an area large enough to accommodate $b$.

If you think about it for a moment, figuring out whether the stack under $b$ has height $h - h(b)$ can be accomplished by knowing $\text{Stackable}(B - \{b\}, h - h(b))$ (which literally tells me if it's possible to stack boxes from $B - \{b\}$ to a height of $h - h(b)$). However, there's no way to ensure that the box immediately under $b$ has to have a large enough area! Therefore, just knowing the solutions to the subproblems is not enough (in this formulation of the problem) to piece together a solution for the bigger problem.

But what went wrong here? It turns out that to stack a box on top of another box, we also need to know the area of the box on the bottom. Now, you have a few choices that you can make here. The problem at hand is that we're not tracking enough information, so one solution is to track more information. The other is to formulate an altogether different problem-structure.

Let's go with the former. How do we amend our process to keep track of the area of the box on top of a stack? Well, we could just add them as parameters of $\text{Stackable}$. Now, we interpret $\text{Stackable}(B, h, w, l)$ as whether it is possible to stack boxes from $B$ to a height of $h$ such that the box on the top of this stack has area $w \times l$. How's that for a mouthful. In addition, we need to also consider whether we should amend our denotation $\preceq$ of whether one subproblem is smaller than the other. While it's tempting to say that $(B', h', w', l') \preceq (B, h, w, l) \iff B' \subseteq B, h' \le h, w' \ge w, l' \ge l$, as you'll see, it's also okay if we just keep the original $\preceq$.

Now, let's see what optimal substructure claims here: If I have the answer to whether it is possible to stack boxes to a height of $h'$ using subsets of boxes that I am given, such that the final box of that stack is some $w' \times l'$, then I should be able to use that information to figure out whether it is possible to stack boxes to my desired height $h$ such that the final box has size $w \times l$.

Fortunately for us, it is now possible to do this. Once again, let's think backwards and consider the final box that we wish to stack: $b$. Now, we know that $b$ has an area of $w \times l$, so in order to stack it, we need to make sure that the preceding stack of boxes in $B - \{b\}$ has height $h - h(b)$ and whose final area is at least $w' \times l'$ where $w' > w$ and $l' > l$. Well, fortunately for us, this is literally what $\text{Stackable}(B - \{b\}, h - h(b), w', l')$ (for each $w', l' > w, l$) answers. Therefore, we know that our subproblem candidate space is $$ \text{candidates}(B, h, w, l) = \bigcup_{b = (h(b), w, l) \in B, \\ w' > w, l' > l} \text{Stackable}(B - \{b\}, h - h(b), w', l'). $$ Now, given these candidates, how do we find $\text{Stackable}(B, h, w, l)$? Well, it's simple; if any of the candidate subproblems are possible, then so is the bigger problem. Therefore, we just have $$ \text{Stackable}(B, h, w, l) = \bigvee_{b = (h(b), w, l) \in B, \\ w' > w, l' > l} \text{Stackable}(B - \{b\}, h - h(b), w', l'). $$

Now, there are $2^{|B|} \times H \times W \times L$ possible ways to fill the parameters $(B, h, w, l)$, so that's definitely a horrible solution, since you might have to find the value of each of these possible arguments in order to find the best result. Don't worry; this actually happens quite a lot in practice. Not everyone finds an acceptable solution on their first try.

After one iteration of a possible Dynamic Programming formulation that you find lacking, you can think about alternative formulations of your problem structure. In my example, I used the formulation $\text{Stackable}(B, h, w, l)$ to structure the solution space. There, I interpreted the result of $\text{Stackable}(B, h, w, l)$ as whether it is possible to stack boxes in $B$ to a height of $h$ such that the final box has area $w \times l$. Now, is there a better formulation that I can make?

Well, in the spirit of gradually refining your process, it's quite obvious that in the formulation $\text{Stackable}(B, h, w, l)$, the final box must have an area of $w \times l$. However, it's probably the case that not that many boxes share the same area. Therefore, we can probably gain back some time if we just outright say what the last box is: $\text{Stackable}(B, h, b \in B)$. Now, our dynamic programming formula becomes $$ \text{Stackable}(B, h, b \in B) = \bigvee_{b' \in B-\{b\}\\w(b') \ge w(b) \\ l(b') \ge l(b)} \text{Stackable}(B, h - h(b'), b') $$ Still, we're potentially looking at $|B| 2^{|B|} H$ arguments here.

Unfortunately, this is the extent of what the stackability formulation can do for us. However, we can try other substructure formulations.

And this is where the idea of the G4G solution comes into place. Let's number each of the boxes in $B$, so that $B = b_1 \le b_2 \le \dots \le b_n$ (with respect to their area). Now, instead of looking at this problem in terms of whether it is possible to stack some boxes to some height, let's look at what the maximum height of the set of boxes $b_1, \dots, b_k$ can reach.

This then gives another sub-structure ordering. We say that a problem on $(b_1, \dots, b_k)$ is smaller ($\prec$) than another problem on $(b_1, \dots, b_{k'})$ if $k < k'$. Now, let's consider the formulation $\text{Tallest}(b_1, \dots, b_k)$, which gives you the tallest stack you can build using just the boxes in $\{b_1, \dots, b_k\}$. Can we show that it has an optimal substructure?

Unfortunately, it's not all rainbows and sunshine: we run into the same problem that we've had before. Suppose we're considering the last box on a stack, we have no way to ensure that the box preceding it can fit the final box, just that its total area is larger. Therefore, let's try our parameterization trick from before; let $\text{Tallest}(n, b_k)$ denote the tallest tower that we can build using just boxes in $(b_1, \dots, b_n)$, such that the final box is $b_k$.

Third time's a charm, right? To see that this formulation has an optimal substructure property, let's try to build the tallest stack using the first $n$ widest boxes that ends at some $b_k$. Well, if we have $\text{Tallest}(n-1, b_{k'})$ where $b_{k'}$ is both wider and lengthier than $b_k$, then we could build an even taller stack. The tallest stack we can build is then the maximum of these candidates: $$ \text{Tallest}(n, b_k) = \max_{b_{k'} \sqsubset b_k} \left(\text{Tallest}(n - 1, b_{k'})\right) $$ where $b_{k'} \sqsubset b_k$ if $k' < k$ and $w(b_{k'}) \ge w(b_k), l(b_{k'}) \ge l(b_k)$. Now, shrinking $n$ is valid, but the astute reader will also notice that $$ \text{Tallest}(n, b_k) = \max_{b_{k'} \sqsubset b_k} \left(\text{Tallest}(n, b_{k'})\right) $$ will work just as well. In fact, we no longer need to keep track of the range of boxes to consider, and we can simplify this formulation down to just $$ \text{Tallest}(b_k) = \max_{b_{k'} \sqsubset b_k} ~ \text{Tallest}(b_{k'}) $$ which can be interpreted as the tallest stack that you can build if the final block is $b_k$. This then concludes the formulation that G4G constructed.

I want to sign off with a few pieces of advice. Dynamic Programming is a very intuition-driven technique for problem solving. It might seem daunting at first, but with enough experience, you'll eventually view it as an extremely powerful tool. Personally, I find that the approaches that works best for me are:

  • Don't worry about the implementation of the actual dynamic program, and instead focus on its interpretation. When you see $\text{Tallest}(b_k)$, don't worry too much about what its code should look like; rather, stick to the high level interpretation and just read it out as "the tallest stack ending at the box $b_k$". This way, it makes reasoning about the correctness of the approach much easier.
  • Try to work backwards if possible. Dynamic programming is essentially just structural induction. It might seem daunting to have to build up the recursion tree in your head, so don't. Instead, the "inversion" trick of looking at the final operation of your program and working backwards is much less mentally taxing. On top of that, working this way will quickly tell you if your "induction hypothesis", or your "optimal substructure" hypothesis, actually holds or not.
  • Dynamic Programming is about decisions. Whenever you formulate a dynamic programming problem, always ask yourself: what is it that I need to choose and decide at every point? In the case of $\text{Tallest}(b_k)$, for each $b_{k'} \sqsubset b_k$, you have to decide whether you will include that box in your stack or not. You can explore all of these possibilities simultaneously, but at the end of the day, you have to commit to a concrete event. Therefore, it's important to figure out what your options/candidate subproblems are and how you can either combine them or choose the best outcome.
Best Answer from StackOverflow

Question Source :

3200 people like this

 Download Related Notes/Documents


Post a Comment

Let us know your responses and feedback