Analysis: Milk Scheduling by Travis Hance


We can solve this problem with a greedy approach.

Note first that the most straightforward greedy approach doesn't work -- that would be to construct the scheduling starting from t=0, choosing the available cow of largest milk output g_i at each time step. A case where this would not work would be where we have two cows, C and D, with deadlines at t=1 and t=2, respectively. If cow D has a larger output, this rule would cause us to pick D first, but then C becomes unavailable, even though we could have chose both cows by picking C first.

Instead, let us try doing the greedy approach starting at t=10000 and working towards the beginning. Again, the rule will be, at each time step, to choose the best available cow available at that time. The key difference here is that once a cow becomes available (i.e., t decreases below the cow's deadline d_i) it will always be available thereafter. Hence, we can never miss a cow by delaying to take it (unless we haven't taken it before reaching t=0).

Let us prove correctness more rigorously. Consider an execution of the algorithm, and suppose that at some point we chose a cow C, even though another cow D was available and had a better output. We will show that, no matter what schedule S results, we could have done at least as well by choosing D instead. If both C and D end up getting milked in S, we could simply swap the times of the C and D; this remains a valid schedule, and now D is taken at the given time. If D is never milked in the schedule, we could simply replace C with D and get a schedule with a higher output.

Now that we have a greedy algorithm planned out, we need to figure out how to implement it. A naive implementation would take O(dN) time (where d is the maximum deadline); there are d time steps and at each time step you have to determine the best of N cows. We can improve upon this by using a priority queue. As t decreases, any cow that becomes available gets added to the priority queue. When we need to find the best cow, we can just pop a cow off the top of the queue. Using the priority queue gives a solution of O((d + N) log N) time.

#include <cstdio>
#include <queue>
#include <algorithm>
using namespace std;

#define N_MAX 10005

struct cow {
	int g, d;

	// comparison function that will be used to make a max-heap keyed by g
	bool operator<(cow const& c) const {
		return g < c.g;
	}
};
cow cows[N_MAX];

// comparison function used to sort in decreasing order of d
inline bool sort_by_d(cow const& a, cow const& b) {
	return a.d > b.d;
}

int main() {
	freopen("msched.in", "r", stdin);
	freopen("msched.out", "w", stdout);
	
	int n;
	scanf("%d", &n);
	for (int i = 0; i < n; i++) {
		scanf("%d", &cows[i].g);
		scanf("%d", &cows[i].d);
	}

	sort(cows, cows + n, sort_by_d);

	// Index of the next cow to become available (when t decreases
	// below a cow's deadline d, we say the cow becomes available)
	int curcow = 0;

	// Total milk output
	int total = 0;

	// Stores the currently available cows that haven't been scheduled yet
	priority_queue<cow> q;

	for (int t = 10000; t >= 1; t--) {
		// Find all the cows that become available at time t.
		while (curcow < n && t <= cows[curcow].d) {
			q.push(cows[curcow]);
			curcow++;
		}

		// If any cow is available, take the best one and schedule it
		// for time t, adding its output to the total.
		if (q.size() > 0) {
			total += q.top().g;
			q.pop();
		}
	}

	printf("%d\n", total);
}