Partial Credit: $N,Q\le 5000$
To solve this subtask, we need to be able to answer each query in $O(N)$ time. Define $S=a_1+a_2+\cdots +a_N$, and suppose that the candidate we are considering is $d$.
This immediately leads to an $O(NQ)$ solution.
Danny Mittal's code:
import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader; import java.util.StringTokenizer; public class SleepingInClassHarder { public static void main(String[] args) throws IOException { BufferedReader in = new BufferedReader(new InputStreamReader(System.in)); int n = Integer.parseInt(in.readLine()); long[] times = new long[n]; long sum = 0; StringTokenizer tokenizer = new StringTokenizer(in.readLine()); for (int j = 0; j < n; j++) { times[j] = Long.parseLong(tokenizer.nextToken()); sum += times[j]; } StringBuilder out = new StringBuilder(); for (int q = Integer.parseInt(in.readLine()); q > 0; q--) { long targetNumber = Long.parseLong(in.readLine()); if (sum % targetNumber == 0L) { long res = ((long) n) + (sum / targetNumber); long curr = 0; for (long time : times) { curr += time; if (curr % targetNumber == 0L) { res -= 2L; } } out.append(res); } else { out.append(-1); } out.append('\n'); } System.out.print(out); } }
Full Solution: We need a faster way to compute $x_d$.
Define $q_i=\gcd(p_i,S)$. If $d|S$, then $d|p_i\Leftrightarrow d|q_i$, so we can ignore all the $p_i$ and instead count the number of $i$ such that $d|q_i$. Each $q_i$ is a factor of $S$. Knowing the number of times each factor occurs is enough to answer all queries, so we first count them.
Let $\sigma_0(S)$ denote the number of factors of $S$. The maximum number of factors that can occur within the input constraints is achieved by the largest highly composite number not exceeding $10^{18}$. This is
which has $103680$ factors (see OEIS).
In the worst case, there are $O(\sigma_0(S))$ distinct nontrivial queries. If we can answer each query naively in $O(\sigma_0(S))$, this part by itself will take $O(\sigma_0(S)^2)$, which is too slow.
A faster way is to observe that if we arrange the factors in a multidimensional array, with a dimension for each prime factor, answering the queries from the frequencies is just computing multidimensional prefix sums.
Using inclusion-exclusion, we can compute each prefix sum in $O(2^{\omega(S)})$, where $\omega(S)$ counts the number of distinct prime factors of $S$ (not to be confused with the $\omega$ from complexity theory), but this is still slow.
A better way is to iterate over the dimensions, and compute prefix sums along that direction. This is a generalization of what is commonly referred to as Sum Over Subsets (SOS) and runs in $O(\omega(S)\sigma_0(S))$.
To compute the dimensions of the array, it helps to know the prime factorization of $S$. One way to do this is to use a factorization algorithm that is fast enough to factor numbers up to $10^{18}$. Alternatively, one can use trial division up to $10^6$ to eliminate all but at most two prime factors, then compute $\gcd$s of the remaining factor with the prefix sums and the queries. It is possible that a semiprime is left unfactored, but treating it as prime will not affect the correctness of the queries in that case since no query could distinguish it from being prime.
#include <cstdio> #include <vector> #include <numeric> int N; long long prefix[100005]; int Q; long long queries[100005]; std::vector<std::pair<long long,int> > primes; int num_factors(){ int cnt=1; for(int i=0;i<primes.size();i++){ cnt*=primes[i].second+1; } return cnt; } int freq[103680]; int& get_freq(long long num){ int code=0; for(int j=0;j<primes.size();j++){ int cnt=0; while(num%primes[j].first==0){ num/=primes[j].first; cnt++; } code=code*(primes[j].second+1)+std::min(cnt,primes[j].second); } return freq[code]; } int main(){ scanf("%d",&N); for(int i=1;i<=N;i++){ scanf("%lld",&prefix[i]); prefix[i]+=prefix[i-1]; } long long rest=prefix[N]; for(int p=2;p<=1000000;p++){ if(rest%p==0){ int cnt=0; while(rest%p==0){ rest/=p; cnt++; } primes.emplace_back(p,cnt); } } //now rest has at most two prime factors for(int i=1;i<N;i++){ long long tmp=std::gcd(rest,prefix[i]); if(tmp!=1&&tmp!=rest){ if(tmp>rest/tmp){ tmp=rest/tmp; } if(tmp*tmp==rest){ primes.emplace_back(tmp,2); }else{ primes.emplace_back(tmp,1); primes.emplace_back(rest/tmp,1); } rest=1; } } scanf("%d",&Q); for(int i=0;i<Q;i++){ scanf("%lld",&queries[i]); long long tmp=queries[i]; if(rest%tmp==0&&tmp!=1&&tmp!=rest){ if(tmp>rest/tmp){ tmp=rest/tmp; } if(tmp*tmp==rest){ primes.emplace_back(tmp,2); }else{ primes.emplace_back(tmp,1); primes.emplace_back(rest/tmp,1); } rest=1; } } if(rest!=1){ //assume it is prime; can't tell anyway primes.emplace_back(rest,1); } for(int i=1;i<=N;i++){ get_freq(prefix[i])++; } int block=1; for(int i=primes.size()-1;i>=0;i--){ for(int code=num_factors()-1;code>=0;code--){ if(code/block%(primes[i].second+1)!=0){ freq[code-block]+=freq[code]; } } block*=primes[i].second+1; } for(int i=0;i<Q;i++){ if(prefix[N]%queries[i]!=0){ printf("-1\n"); }else{ long long ans=N+prefix[N]/queries[i]; ans-=get_freq(queries[i])*2; printf("%lld\n",ans); } } }
In the implementation above, each factor $f$ of $S$ gets encoded to a location $\text{code}(f)$ in the $\texttt{freq}$ array. This encoding is generalized to other numbers $g$ by mapping them to the location $\text{code}(\gcd(S,g))$.
The part that performs the multidimensional prefix sum is as follows:
int block=1; for(int i=primes.size()-1;i>=0;i--){ for(int code=num_factors()-1;code>=0;code--){ if(code/block%(primes[i].second+1)!=0){ freq[code-block]+=freq[code]; } } block*=primes[i].second+1; }