(Analysis by Andi Qu, Arpan Banerjee, Benjamin Qi)

Case 1: $N$ is even

Note that if some starting $N$-tuple can make all cows end up with hunger level $x$, then it can also make all cows end up with hunger level $0$ (by feeding each consecutive pair of cows $x$ bags of corn). So it suffices to find the number of $N$-tuples that can be converted to all $0$'s.

If we want to check whether a specific $N$-tuple can be converted to all $0$'s, then the following pseudocode suffices:

for (i in 2..N) {
h[i] -= h[i-1]
assert(h[i] >= 0)
}
assert(h[N] == 0)


The reasoning behind this is that if $h_1,\ldots,h_{i-2}$ are all equal to zero already, then the only way to make $h_{i-1}=0$ is by feeding $h_{i-1}$ bags of corn to each of cows $i-1$ and $i$.

Motivated by this, we can define a DP array where $\texttt{dp}[i][j]$ is the number of ways to choose $h_1\ldots h_i$ satisfying $h_k\le H_k$ for all $1\le k\le i$ such that after the $i$-th iteration of the loop in the pseudocode above, $h_i=j$. Then

$$\texttt{dp}[i][j] = \sum_{x=0}^{H_i-j} \texttt{dp}[i-1][x].$$

That is, if $h_{i-1}=x$ and $h_i=j+x\le H_i$ before the $i$-th iteration of the loop, then $h_{i-1}=0$ and $h_i=j$ after the $i$-th iteration of the loop. The final answer will be $\texttt{dp}[N]$.

A straightforward implementation of this algorithm runs in $\mathcal O(N (\max H)^2)$ time, which is already fast enough. However, we can speed this up by using prefix sums for the transition since we are summing over a contiguous range. Specifically, define $\texttt{pref}[i][j]=\sum_{x=0}^j\texttt{dp}[i][x]$; then $\texttt{dp}[i][j]=\texttt{pref}[i-1][H_i-j]$. This results in a runtime of $\mathcal O(N (\max H))$.

Andi's code:

#include <iostream>
#include <algorithm>
#include <cstring>
typedef long long ll;
using namespace std;

const ll MOD = 1e9 + 7;

int main() {
cin.tie(0)->sync_with_stdio(0);
int n, h;
cin >> n;
for (int i = 1; i <= n; i++) cin >> h[i];
ll dp, pref;
for (int i = 0; i <= h; i++) pref[i] = i + 1;
for (int i = h + 1; i <= 1000; i++) pref[i] = h + 1;
for (int i = 2; i <= n; i++) {
memset(dp, 0, sizeof dp);
for (int j = 0; j <= h[i]; j++) {
dp[j] = pref[h[i] - j];
if (dp[j] >= MOD) dp[j] -= MOD;
}
for (int j = 0; j <= 1000; j++) {
pref[j] = dp[j];
if (j) pref[j] += pref[j - 1];
if (pref[j] >= MOD) pref[j] -= MOD;
}
}
cout << pref;
return 0;
}


Case 2: $N$ is odd

In this case, it is not true that if a starting $N$-tuple can make all cows end up with hunger level $x$, then it can also make all cows end up with hunger level $0$. In fact, if a starting $N$-tuple can make all cows end up with hunger level $x$, then $x$ must be unique (as mentioned in the Bronze analysis). So it suffices to sum the number of starting $N$-tuples over all final hunger levels $x$ satisfying $0\le x\le \min H$.

To count this quantity for a fixed $x$, consider subtracting $x$ from all $h_i$ and $H_i$. Then the problem reduces to converting tuples to all $0$'s, which was described above.

The final time complexity for this case is $\max H$ times that of the complexity for the previous case, or $O(N(\max H)^2)$.

Andi's code:

#include <iostream>
#include <algorithm>
#include <cstring>
typedef long long ll;
using namespace std;

const ll MOD = 1e9 + 7;

int main() {
cin.tie(0)->sync_with_stdio(0);
int n, h;
cin >> n;
for (int i = 1; i <= n; i++) cin >> h[i];
int mn = *min_element(h + 1, h + n + 1);
ll dp, pref, ans = 0;
do {
// on x-th iteration of loop, count number of tuples
// that can make all heights equal to x-1
for (int i = 0; i <= h; i++) pref[i] = i + 1;
for (int i = h + 1; i <= 1000; i++) pref[i] = h + 1;
for (int i = 2; i <= n; i++) {
memset(dp, 0, sizeof dp);
for (int j = 0; j <= h[i]; j++) {
dp[j] = pref[h[i] - j];
if (dp[j] >= MOD) dp[j] -= MOD;
}
for (int j = 0; j <= 1000; j++) {
pref[j] = dp[j];
if (j) pref[j] += pref[j - 1];
if (pref[j] >= MOD) pref[j] -= MOD;
}
}
ans += pref;
if (ans >= MOD) ans -= MOD;
for (int i = 1; i <= n; i++) h[i]--;
} while (n % 2 && mn--);
cout << ans;
return 0;
}


Interestingly, each DP transition is equivalent to reversing a subarray of the previous prefix sum array, so we can code a very short (and fast) solution using functions from C++'s standard library.

#include <algorithm>
#include <cstdio>
#include <numeric>
using namespace std;

int mod_add(int x, int y) {
int res = x + y;
if (res >= 1000000007) res -= 1000000007;
return res;
}

int main() {
int n, h, mn, mx, dp, ans = 0;
scanf("%d", &n);
for (int i = 0; i < n; i++) scanf("%d", h + i);
mn = *min_element(h, h + n), mx = *max_element(h, h + n);
do {
fill(dp, dp + mx + 1, 1);
for (int i = 0; i < n; i++) {
reverse(dp, dp + h[i] + 1);
fill(dp + h[i] + 1, dp + mx + 1, 0);
partial_sum(dp, dp + mx + 1, dp, mod_add);
}