Analysis: Cow Lineup by Travis Hance


This problem is equivalent to finding, out of all contiguous subintervals containing at most K+1 distinct breed IDs, the maximal number of cows of a single breed contained within such an interval.

The idea is to sweep down the array of cows, keep tracking of the left and right endpoints of an interval. Each time we increment the right endpoint, we may need to increment the left endpoint by some amount so that the interval contains at most K+1 distinct IDs. Of course, when we do this, we will increment the left endpoint as little as possible. To do this, we just need to keep track of (i) for each breed ID, how many cows of that ID are in the interval and (ii) how many distinct breed IDs have a nonzero number of cows in the interval.

Now, when examining any interval, we need to know the maximal number of cows of a single breed in that interval. One approach is to use a data structure such as a set or priority queue to maintain the maximum. Since at most K+1 IDs are nonzero at any given time, this solution takes O(N log(K)) time.

An even simpler approach involves the observation that during this sweep process, at some point the left endpoint will actually be pointing to the correct breed ID, and at this time the interval will contain as many cows of that ID as possible. In other words, rather than asking "Given an interval, what is the maximal number of cows of a single ID within this interval?", we ask "Given a cow, what is the maximum number of cows of that ID which can be in an interval with that cow as the left endpoint?" This solution takes O(N) time.

Here is Mark Gordon's solution in C++:
#include <iostream>
#include <cstdio>
#include <map>
#include <set>

using namespace std;

int A[100010];

int main() {
  freopen("lineup.in", "r", stdin);
  freopen("lineup.out", "w", stdout);

  int N, K; cin >> N >> K;
  for(int i = 0; i < N; i++) {
    cin >> A[i];
  }

  int res = 0;
  int nz_cnt = 0;
  map<int, int> cnt;
  for(int i = 0, j = 0; i < N; i++) {
    int& ci = cnt[A[i]];
    if(ci == 0) nz_cnt++;
    ci++;

    for(; nz_cnt > K + 1; j++) {
      int& cj = cnt[A[j]];
      --cj;
      if(cj == 0) nz_cnt--;
    }

    res = max(res, ci);
  }
  cout << res << endl;

  return 0;
}