Free Market

There is a set of $$n$$ items ($$1 \leq n \leq 50$$), with values $$c_0, c_1, ..., c_{n-1} \leq 10,000$$. You are also given a value $$d$$ ($$1 \leq d \leq 10,000$$). On any given day, you have some items (you start out at day 0 with no items). You can take some subset of your items to the "market", and trade them for another set. If $$X$$ is the set of items you bring to the market, and $$Y$$ is the set of items you want to take home, then you can perform this trade if and only if: $$S(Y) \leq S(X) + d$$ and $$X \cap Y = \{\}$$ (where $$S(X)$$ denotes the sum of the values of the items in the set $$X$$). What is the maximum value you can have after any finite sequence of trades. And what is the minimum number of days it will take to achieve this maximum value, assuming you can make one trade per day?
  1. DP
  2. Greedy
  3. Shortest Path?
  4. Flows?
  5. Binary Search?
First sort the items. So I assumed that $$c_i \geq c_{i-1}$$ for all items $$0 \leq i \leq n-1$$, for simplicity. To make any major gains, I simplified the problem.
Instead of asking, "what is the maximum value and the minimum number of days you can reach it in?" I simply focused on the first part of the question.
One notices that (by a greedy observation), if you can get some item with cost $$c_i$$, you can end up with both item $$c_i$$ and $$c_{i-1}$$ simultaneously. Through some sequence of trades (starting from nothing), you eventually got item $$c_i$$. Leave this item at home, and take all the rest of your items to the market. Then you can "pretend" you start at nothing again by never using the $$c_i$$ item again, and you can make the same sequence of trades. Eventually you will have some set with enough value to get item $$c_i$$, but you already have item $$c_i$$ at home, so you can trade those items for item $$c_{i-1}$$. And then you can repeat this while leaving items $$c_i$$ and $$c_{i-1}$$ at home, etc. This means, eventually you will have all items $$c_1, c_2, ..., c_{i-1}, c_i$$ for some $$i$$, in an optimal solution (ignoring the number of days it takes).
The above observation, however, isn't totally useful, except that it implies that the problem has a highly greedy structure. The problem is, I told you a specific strategy for how to get the maximum value. The actual strategy you use will vastly change the number of days it takes.
At this point, I looked at an example or two (because I was really stuck), and tried to characterize what trades I could make, and what subsets were possible.
After looking at examples, I made major progress as follows: I wrote out a table of what values were achievable, and on what days they were achievable on. I tried to write down formulas for when I was able to achieve a subset with a certain value. By actually writing out my table and example, I was able to more easily come up with the final characterization, which is the observation below.
Suppose I start out at home on some day with a set $$X$$ of items. Further suppose that I want to end the day at home with a set $$Y$$ of items ($$X$$ and $$Y$$ can arbitrarily overlap, or even be the same set). Then this is possible if and only if $$S(Y) \leq S(X) + d$$.
(First of all, this is not "obvious". This is not equivalent to the condition in the problem statement. The problem statement talks about which items you can bring to the market and take home from the market. This lemma characterizes the sets you can start and end your day with (after all trading is over), which loosens the conditions, because these items may overlap and intersect.) Let $$X \cap Y$$ denote the intersection of the two sets, and let $$X \backslash Y$$ denote the items in $$X$$ but not in $$Y$$. Given set $$X$$, leave at home every item in $$X \cap Y$$, and take to the market all the items in $$X \backslash Y$$. Then take home $$Y \backslash X$$. Now, notice that: $$S(X) = S(X \backslash Y) + S(X \cap Y)$$ (since $$X = (X \backslash Y) \cup (X \cap Y)$$). Similarly, $$S(Y) = S(Y \backslash X) + S(X \cap Y)$$. If $$S(Y) \leq S(X) + d$$, then $$S(Y \backslash X) \leq S(X \backslash Y) + d$$, so we can make the trade $$X \backslash Y$$ for $$Y \backslash X$$ at the market. And conversely, if we can make that trade at the market, then $$S(Y \backslash X) \leq S(X \backslash Y) + d$$, and so that implies that $$S(Y) \leq S(X) + d$$. Thus it is shown.
To the reader, the above lemma may seem "obvious". Again, for me, it was not immediately obvious (until I went to prove it); But thinking about the problem in terms of "what sets you have" instead of "what trades you make" makes the problem easier to characterize. For example, if we want to "simulate" (i.e.: on paper), a series of trades, we would have had to write out the list of trades. At the end of our list, it would be hard to remember which items we actually HAVE and which ones we don't (i.e.: we would have to scan back through the list to see which items we acquired but have not sold yet). This can be confusing, and makes the problem seem so much more irregular. But, instead of writing out the "sequence of trades made", we simply just write out the "sequence of sets we have at home at the end of the day". Then it is a trivial problem to determine what set we have at the end (just look at the last entry in our list); and to determine the sequence of trades made, we just look at pairs of adjacent sets and see what differed. This little "problem transformation" just seems a little easier to work with. And the above lemma also makes it really easy to determine if the sequence is valid (without even having to check what specific trades were made), by just looking at the difference in the $$S(.)$$ values for each set. As long as the $$S(.)$$ values increase by no more than $$d$$ each time, we're good.
Besides simplifying our view of the problem, the above lemma also yields one more powerful key observation.
We don't care about what items are in our set. For a set $$X$$, we only care about whether the set is actually a valid set (i.e.: it is composed of some actual subset of items), and we care about the value of $$S(X)$$. The optimal solution starting out at (or ending up on) a set $$X$$ only depends on the value $$S(X)$$!!! This seems somewhat counter-intuitive (to me, at least). I kept getting stuck on this; I couldn't believe it. But the proof makes total sense in the context of the above lemma! Here is the proof:
Suppose we have two possible (valid) sets $$X$$ and $$Y$$, such that $$S(X) = S(Y)$$. That is, we have two sets that may be different in some way (in terms of items inside them), but they have the same value. Suppose, in an optimal solution (somehow), we have set $$X$$ at some point; We claim that we could have used set $$Y$$ in place of set $$X$$. Why? Because, look at the trade done that day we had $$X$$. That is, we start with set $$X$$, and make some trade, and end up with some set $$Z$$ at the end of the day. Then $$S(Z) \leq S(X) + d$$ (by the lemma above). But $$S(Y) = S(X)$$, so $$S(Z) \leq S(Y) + d$$ as well. Again, by the lemma, this means we could have instead started with $$Y$$ and ended up with $$Z$$ by some valid trade. Similarly, suppose we start on some day with some set $$W$$. And we go to the market and make a trade that leaves us with set $$X$$. Then $$S(X) \leq S(W) + d$$, and so $$S(Y) \leq S(W) + d$$, so we could have instead ended up with set $$Y$$ after set $$W$$. Both of these arguments show that, in any optimal solution that uses $$X$$, we can replace ALL occurrences of $$X$$ with $$Y$$, without changing the number of days or the value of the solution. So $$Y$$ and $$X$$ are completely interchangeable as long as $$S(X) == S(Y)$$!!!!!!
Again, the observation, for some reason, feels intrinsically unintuitive. This is because we naturally reason about "what trades we can make" rather than what sets we end up with. But the above lemma showed that it is always possible to reason about the sets and make interesting conclusions. Generally speaking, two sets may be composed of different things, but as long as their values are the same, it is possible to make "similar" trades (where "similar" means: the trades may be composed of different items, but they will end up with the same set). Anyway, we continue.
This problem has optimal substructure. It can be completely characterized by the value of a set $$S(X)$$ (without even knowing $$X$$). And we can do some kind of DP on the values. Specifically, let $$v = S(X)$$ and let $$w = S(Y)$$. If we know, say, the minimum number of days it takes to get to some set of value $$v$$, and if $$w \leq v + d$$, then, the minimum number of days it takes to get to some set of value $$w$$ is at most one more than that of $$v$$.
With all of the above in mind, and the initial observations that the problem is probably solvable by DP, we can now generate a valid algorithm. The reader is encouraged to pause here and try to come up with the remainder of the algorithm independently. There are still a few details to work out and to conceptualize.
Let $$A$$ be the set of possibly achievable values. That is, $$A := \{v : v = S(X) \text{ for some set of items } X\}$$.
We know that the entire problem really only depends on these values, and not the subsets $$X$$ that make them up.
For some $$v \in A$$, let $$days(v)$$ denote the minimum number of days it takes to end up at home with some set $$X$$ such that $$S(X) = v$$. For example, $$days(0) := 0$$ (because we start off on day 0 with an empty set).
If $$v, w \in A$$ and $$w \leq v + d$$, then $$days(w) \leq days(v) + 1$$.
This is a corollary to our major lemma. If $$v \in A$$, let $$X$$ be any set so that $$S(X) = v$$. Similarly, define $$Y$$ so that $$S(Y) = w$$. Then we can achieve $$X$$ in $$days(v)$$ days. Then we take to the market everything in $$X \backslash Y$$, and trade them for everything in $$Y \backslash X$$. Since $$w \leq v + d$$, then $$S(Y) \leq S(X) + d$$, thus $$S(Y \backslash X) \leq S(X \backslash Y) + d$$, so the trade is valid. And we end up at the end of the day (and the beginning of the next day) with set $$Y$$, and a value $$S(Y) = w$$. So $$days(w) \leq days(v) + 1$$.
We can compute the set $$A$$ of possible values by a knapsack or subset sums DP algorithm. Once this is known, we can build the $$days[]$$ function as a dynamic programming array, bottom-up. For a fixed $$w$$, we check the $$A$$ to see if any value exists in the range $$w-d..w-1$$, and set $$days[w] = days[v]+1$$. We take the maximum $$v$$ such that $$days[v]$$ is known, and then print $$(v, days[v])$$.
Note, there are some implementation details that I skipped over to make this run in time.
This problem is a beautiful example of a "combination" problem, where you need to decompose the problem into simpler parts to be able to solve it. It took me a long time to solve; I did not get it in the contest. I came back to it the next day, and solved it after an hour or two. For some reason it was hard for me to comprehend. Here is how I solved it. The major observation is that, if I start out at home with a set $$X$$ of items on some day, then I can end the day at home with a set $$Y$$ of items if and only if $$S(Y) \leq S(X) + d$$ (where $$S(X)$$ is the sum of costs of items in the set). Notice that this is completely independent of which items are in $$X$$ and $$Y$$. To see this: suppose we start with set $$X$$. We leave at home any items in the set $$X \cap Y$$, and take to the market the items in $$X \backslash Y$$ and return home with the items $$Y \backslash X$$. If $$S(Y) \leq S(X) + d$$, then it is also true that $$S(Y \backslash X) \leq S(X \backslash Y) + d$$, so we can make this trade (and the converse is true too). So we can make the trade if and only if $$S(Y) \leq S(X) + d$$. This means that, regardless of which items you have in set $$X$$, you can always make a trade to a set $$Y$$ as long as the total value of $$Y$$ isn't too large. With this in mind, the problem becomes much easier to solve greedily, and by dynamic programming because we don't care about the compositions of the sets anymore, but only their values. First, solve the "knapsack" problem on the items you have. That is, compute all possible values that are achievable exactly, using the items given. Let $$A$$ be the set of achievable values. For a value $$v \in A$$ let $$days(v)$$ be the number of days it takes to get to value $$v$$. By the above observation about trading: if $$v \in A$$, $$w \in A$$, and $$w \leq v + d$$, then $$days(w) \leq days(v) + 1$$. This is because, if we have any set with value $$v$$ on some day, then we can always make some trade to end up with value $$w$$ by the end of the day. We know $$days(0) = 0$$ (i.e.: you start off on day 0 with 0 value). The question is to find the maximum shortest path over any node in this implied graph, where the nodes are the "achievable values" in $$A$$, and there is an edge with weight 1 from $$v$$ to $$w$$ if and only if $$w \leq v + d$$. (Actually, since all the weights are 1, we ca assume the graph is unweighted.) Hence, we solve the shortest path problem. Lastly, we notice that the size of $$A$$ can be at most 500,000 (since there are at most 50 coins of value at most 10,000; the maximum achievable value is 50*10,000 which is 500,000). And the number of "edges" in this graph can be very large (there can be at most $$d$$ edges for each node; and this is unreasonably large); so we do not apply any breadth-first-search or shortest path algorithm directly. Instead we do a fairly natural DP, with some optimizations: The base case is to set $$days[0] := 0$$. Suppose we know $$days[v]$$ for all $$v \leq w-1$$. If we want to figure out $$days[w]$$, we want to search (somehow) in the range $$days[w-d..w-1]$$ and take the value in that range. So, we binary search in the set $$A$$ for the smallest value $$v \in [w-d..w-1]$$ (if one exists). And if it does, we set $$days[w] = days[v]+1$$. This works because we might as well start with the smallest value possible (since the $$days[.]$$ function is monotonically increasing over the set $$A$$, or at least non-decreasing). And if we can achieve that value $$v$$ in $$days[v]$$ days, then we can achieve $$w$$ the next day. If there is no such $$v$$, we set $$days[w] = \infty$$ (i.e.: impossible). We fill this table up completely, and then return the maximum $$v$$ so that $$days[v]$$ was set properly (i.e.: so that $$days[v]$$ was not infinity).
  1. Suppose I start out at home on some day with a set $$X$$ of items. Further suppose that I want to end the day at home with a set $$Y$$ of items ($$X$$ and $$Y$$ can arbitrarily overlap, or even be the same set). Then this is possible if and only if $$S(Y) \leq S(X) + d$$. This observation, while appearing a bit "obvious" actually allowed us to transform our thinking of the problem. Instead of thinking in terms of "what trades can I make at the market", you can think in terms of "what sets can I end up with at the end of the day". The latter characterization makes it easier to see that the sets you can have only depend on what values they have, not on their internal composition.
  2. If two sets $$X$$ and $$Y$$ have the same value, then any day that we start or end with set $$X$$ can be replaced by an equivalent solution where we start or end end the day with $$Y$$ instead. Although we can't make the same trades with $$X$$ and $$Y$$ in general, we can always make "similar" trades (i.e.: ones which end up with the same resulting set and value).
  3. If $$v$$ and $$w$$ are the values of some subsets of items, and $$w \leq v + d$$, then $$days(w) \leq days(v) + 1$$.This was a result of our major lemma, and it provided the optimal substructure necessary to solve the problem by dynamic programming.
  1. I first focused on trying to achieve an optimal solution (regardless of number of days it takes). This yielded some greedy structure. Then I focused on trying to characterize the set of achievable solutions. This yielded the rest of the optimal substructure (after some time and work) necessary. This naturally (eventually) led to a solution that also minimized the number of days taken.
  2. It was hard to even conceptualize the problem, because keeping track of trades made didn't really make sense in my head (i.e.: having to deal with non-overlapping subsets). So I tried simulating instances of the problem by hand (on paper). Eventually, in just trying to describe and write down the examples, I came to a couple observations that made the entire problem easier to handle / conceptualize.
  3. During my examples / hand-simulations, I tried writing out the table of values that were achievable. In my head, I still thought that the actual sets used to construct them mattered as well, so it was hard to fully characterize and understand the solutions; but by writing out this table, it helped me somehow generalize.
  4. Instead of thinking about "what kinds of trades can I make at the market", I transformed the problem by looking at the sequence of item-sets that one can start each day with. By looking at these sequences, it is trivial to find out what trades were made each day (by looking at the differences between adjacent sets), and also it became a much easier matter to characterize "which sequences were possible" (i.e.: the characterization was, a set $$Y$$ can follow a set $$X$$ in the sequence if and only if $$S(Y) \leq S(X) + d$$). And this characterization was much easier to describe, and much more regular (i.e.: it didn't depend on the intersection of the two sets, etc., or the existence of any particular items). It was very powerful.
  • "Check my understanding." The reason I couldn't solve this problem was that I honestly could not conceptualize how the trading worked (for some reason). This is remedied by seeking out many examples, and writing down patterns and observations. If one doesn't truly understand the problem, then just sitting and thinking about the problem often won't yield much progress. In the end, the key observations needed to solve the problem were not difficult, just a little unintuitive based on my own limited understanding of the problem.
  • "Use the problem solving process early." Sometimes, problems are easy enough that there is a "natural flow" to the problem solving process. That is, in many (most) cases, just by reading the problem statement, noting down some initial observations, and following whatever train of thought (i.e.: intuition) your mind takes you on, will relatively quickly yield the correct solution. This is faster, and it is very useful when you have an intuitive understanding of the problem. There is no need to break your concentration by asking "forced" questions. On the other hand, in today's problem, I was blocked. I had not really much idea of how to solve the problem, and I could only come up with a couple observations, which turned out to be somewhat useless. It is exactly at these points, when the "natural flow" or "intuition" fails, when I should start employing the more formal problem-solving heuristics that I've come to learn. For instance, by asking direct questions like: "Is there a simpler, or related, problem?" or "What happens if I ignore / drop one of the conditions?", or "What does a simple example look like, if done by hand?", (all questions that I might regularly associate with "the problem solving process"), I could have come to the solution more quickly. For example, this problem was directly related to the knapsack problem, as well as subset-sums. Also, by ignoring the "minimum number of days" constraint, it was much easier to solve. And by looking at examples, I was able to come to a better way of characterizing solutions.
  • Combination Problem / Three Parts
  • Dynamic Programming / Knapsack / Subset Sums
  • Greedy / Matroids / Independent Sets over a Matroid
  • Shortest Path / Breadth-First-Search / on an Implicit Graph
  • Problem Solving Process
  • Observation-based / Easy to prove / Hard to come up with