GAP, 416 bytes
Won't win on code size, and far from constant time, but uses math to speed things up a lot!
x:=X(Integers);
z:=CoefficientsOfUnivariatePolynomial;
s:=Size;
f:=function(n)
local r,c,p,d,l,u,t;
t:=0;
for r in [1..Int((n+1)/2)] do
for c in [r..n-r+1] do
l:=z(Sum([1..26],i->x^i)^(n-c));
for p in Partitions(c,r) do
d:=x;
for u in List(p,k->z(Sum([0..9],i->x^i)^k)) do
d:=Sum([2..s(u)],i->u[i]*Value(d,x^(i-1))mod x^s(l));
od;
d:=z(d);
t:=t+Binomial(n-c+1,r)*NrArrangements(p,r)*
Sum([2..s(d)],i->d[i]*l[i]);
od;
od;
od;
return t;
end;
To squeeze out the unnecessary whitespace and get one line with 416 bytes, pipe through this:
sed -e 's/^ *//' -e 's/in \[/in[/' -e 's/ do/do /' | tr -d \\n
My old "designed for Windows XP" laptop can calculate f(10)
in less than one minute and go much further in under an hour:
gap> for i in [2..15] do Print(i,": ",f(i),"\n");od;
2: 18
3: 355
4: 8012
5: 218153
6: 6580075
7: 203255386
8: 6264526999
9: 194290723825
10: 6116413503390
11: 194934846864269
12: 6243848646446924
13: 199935073535438637
14: 6388304296115023687
15: 203727592114009839797
How it works
Suppose that we first only want to know the number of perfect license
plates fitting the pattern LDDLLDL, where L denotes a letter and D
denotes a digit. Assume we have a list l of numbers such that l[i]
gives the number of ways the letters can give the value i, and a
similar list d for the values we get from the digits. Then the number
of perfect license plates with common value i is just l[i]*d[i], and
we get the number of all perfect license plates with our pattern by
summing this over all i. Let's denote the operation of getting this
sum by l@d.
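In code, @ is just a dot product of the two coefficient lists. Here
is a minimal C++ sketch (the name at_op and the fixed-width
coefficient type are mine, for illustration; for large n one needs
arbitrary-precision integers as in the programs in this answer):

#include<vector>
#include<cstdint>

// l[i] = ways the letters reach value i, d[i] = ways the digits
// reach value i; l@d = number of perfect plates for one fixed pattern.
uint64_t at_op(const std::vector<uint64_t> &l, const std::vector<uint64_t> &d){
    uint64_t t = 0;
    for(size_t i = 0; i < l.size() && i < d.size(); ++i)
        t += l[i]*d[i];
    return t;
}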
Now even if the best way to get these lists was to try all
combinations and count, we can do this independently for the letters
and the digits, looking at 26^4+10^3 cases instead of 26^4*10^3 cases
when we just run through all plates fitting the pattern. But we can
do much better: l is just the list of coefficients of
(x+x^2+...+x^26)^k, where k is the number of letters, here 4.
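Such a coefficient list can be built by repeated polynomial
multiplication, essentially the mult function of the C++ program
further down. A sketch, assuming 64-bit coefficients suffice for
small k (letter_poly is my name for it):

#include<vector>
#include<cstdint>
using poly = std::vector<uint64_t>;

// Ordinary polynomial product: res[i+j] += a[i]*b[j].
poly mult(const poly &a, const poly &b){
    poly res ( a.size()+b.size()-1 );
    for(size_t i=0; i<a.size(); ++i)
        for(size_t j=0; j<b.size(); ++j)
            res[i+j] += a[i]*b[j];
    return res;
}

// Coefficients of (x+x^2+...+x^26)^k: l[i] = number of ways that k
// letters (values 1..26) sum to i.
poly letter_poly(int k){
    poly base (27,1); base[0]=0;     // x + x^2 + ... + x^26
    poly l { 1 };
    for(int i=0; i<k; ++i) l = mult(l,base);
    return l;
}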
Similarly, we get the numbers of ways to get a sum of digits in a
run of k digits as the coefficients of (1+x+...+x^9)^k. If there is
more than one run of digits, we need to combine the corresponding
lists with an operation d1#d2 that at position i has as value the sum
of all d1[i1]*d2[i2] where i1*i2=i. This is the Dirichlet
convolution, which is just the product if we interpret the lists as
coefficients of Dirichlet series. But we have already used them as
polynomials (finite power series), and there is no good way to
interpret the operation for them. I think this mismatch is part of
what makes it hard to find a simple formula. Let's use it on
polynomials anyway, with the same notation #. It is easy to compute
when one operand is a monomial: we have p(x) # x^k = p(x^k).
Together with the fact that it is bilinear, this gives a nice (but
not very efficient) way to compute it.
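Computed directly from the definition, # might look like this (a
sketch, assuming nonempty lists; dirichlet is my name for it). The
nested loops in extend further down compute exactly this, plus the
truncation discussed next:

#include<vector>
#include<cstdint>
using poly = std::vector<uint64_t>;

// (d1 # d2)[i] = sum of d1[i1]*d2[i2] over all i1*i2 == i.
// Index 0 never contributes, since i1*i2 >= 1 needs i1,i2 >= 1.
poly dirichlet(const poly &d1, const poly &d2){
    poly res ( (d1.size()-1)*(d2.size()-1)+1 );
    for(size_t i1=1; i1<d1.size(); ++i1)
        for(size_t i2=1; i2<d2.size(); ++i2)
            res[i1*i2] += d1[i1]*d2[i2];
    return res;
}

If d2 is a monomial x^k, i.e. a list whose only nonzero entry is a 1
at position k, this reduces to res[i1*k] = d1[i1], which is the rule
p(x) # x^k = p(x^k) from above.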
Note that k letters give a value of at most 26k, while k single
digits can give a value of 9^k. So we will often get unneeded high
powers in the d polynomial. To get rid of them, we can compute modulo
x^(maxlettervalue+1). This gives a big speed-up and, though I didn't
immediately notice it, even helps golfing, because we now know that
the degree of d is not bigger than that of l, which simplifies the
upper limit in the final Sum. We get an even better speed-up by doing
a mod calculation in the first argument of Value (see comments), and
doing the whole # computation at a lower level gives an incredible
speed-up. But we're still trying to be a legitimate answer to a
golfing problem.
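In terms of the dirichlet sketch above, the truncation just caps the
result index at the maximal letter value (again a sketch, with the
same assumptions):

#include<vector>
#include<cstdint>
using poly = std::vector<uint64_t>;

// d1 # d2 computed mod x^(maxdeg+1): higher powers can never meet a
// nonzero letter coefficient, so they are dropped right away.
poly dirichlet_trunc(const poly &d1, const poly &d2, size_t maxdeg){
    poly res ( maxdeg+1 );
    for(size_t i1=1; i1<d1.size() && i1<=maxdeg; ++i1)
        for(size_t i2=1; i2<d2.size() && i1*i2<=maxdeg; ++i2)
            res[i1*i2] += d1[i1]*d2[i2];
    return res;
}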
So we have got our l and d and can use them to compute the number of
perfect license plates with pattern LDDLLDL. That is the same number
as for the pattern LDLLDDL. In general, we can change the order of
the runs of digits of different lengths as we like; NrArrangements
gives the number of possibilities. And while there must be one letter
between the runs of digits, the other letters are not fixed. The
Binomial counts these possibilities.
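Both factors are elementary combinatorics: the n-c letters leave
n-c+1 gaps (ends included) into which the r runs can go, and runs in
distinct gaps are automatically separated by a letter; the orderings
of the run lengths form a multiset permutation. A hedged sketch
(pattern_count is my name; it assumes r <= n-c+1, which holds for the
partitions the algorithm enumerates):

#include<vector>
#include<map>
#include<cstdint>

// Number of length-n patterns whose runs of digits have exactly the
// lengths in p: C(n-c+1,r) gap choices times r!/prod(multiplicity!)
// distinct orderings of the multiset p.
uint64_t pattern_count(int n, const std::vector<int> &p){
    int c = 0;
    for(int k : p) c += k;
    int r = (int)p.size();
    uint64_t bin = 1;                        // C(n-c+1, r), stepwise
    for(int i=1; i<=r; ++i) bin = bin*(n-c+1-r+i)/i;
    uint64_t arr = 1;                        // r!
    for(int i=2; i<=r; ++i) arr *= i;
    std::map<int,int> m;
    for(int k : p) ++m[k];
    for(auto &kv : m)                        // / multiplicity!
        for(int i=2; i<=kv.second; ++i) arr /= i;
    return bin*arr;
}

For n=7 and p=(2,1) this gives C(5,2)*2 = 20 patterns, among them
LDDLLDL and LDLLDDL.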
Now it remains to run through all possible ways of having lengths of
runs of digits: r runs through all numbers of runs, c through all
total numbers of digits, and p through all partitions of c with r
summands.
The total number of partitions we look at is two less than the
number of partitions of n+1, and the partition function grows like
exp(C*sqrt(n)). So while there are still easy ways to improve the
running time by reusing results (running through the partitions in a
different order), for a fundamental improvement we need to avoid
looking at each partition separately.
Computing it fast
Note that (p+q)@r = p@r + q@r. On its own, this just helps avoid
some multiplications. But together with (p+q)#r = p#r + q#r, it means
that we can combine polynomials corresponding to different partitions
by simple addition. We can't just add them all, because we still need
to know which l we have to @-combine with, which factor we have to
use, and which #-extensions are still possible.
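Both identities hold because @ and # are sums of products of one
coefficient from each operand, so they distribute over
coefficient-wise addition. In the terms of the earlier sketches (add
is my name):

#include<algorithm>
#include<vector>
#include<cstdint>
using poly = std::vector<uint64_t>;

// Coefficient-wise sum of two coefficient lists.
poly add(const poly &p, const poly &q){
    poly res ( std::max(p.size(),q.size()) );
    for(size_t i=0; i<p.size(); ++i) res[i] += p[i];
    for(size_t i=0; i<q.size(); ++i) res[i] += q[i];
    return res;
}

// Bilinearity then reads
//   dirichlet(add(p,q),r) == add(dirichlet(p,r),dirichlet(q,r)),
// so the polynomials of many partitions can be accumulated with add
// before the final @ against the letter polynomial.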
Let's combine all polynomials corresponding to partitions with the
same sum and length, and already account for the multiple ways of
distributing the lengths of runs of digits. Contrary to what I
speculated in the comments, I don't need to care about the smallest
used value or how often it is used, as long as I make sure that I
won't extend with that value.
Here is my C++ code:
#include<vector>
#include<algorithm>
#include<iostream>
#include<gmpxx.h>

using bignum = mpz_class;
using poly = std::vector<bignum>;

// Ordinary polynomial multiplication.
poly mult(const poly &a, const poly &b){
    poly res ( a.size()+b.size()-1 );
    for(int i=0; i<a.size(); ++i)
        for(int j=0; j<b.size(); ++j)
            res[i+j]+=a[i]*b[j];
    return res;
}

// Dirichlet-convolve d with e, truncated mod x^(26*ml+1), scaled by
// l/m (adding the m-th run of this particular length, for l runs in
// total, multiplies the number of orderings by l/m), and accumulate
// the result into a.
poly extend(const poly &d, const poly &e, int ml, poly &a, int l, int m){
    poly res ( 26*ml+1 );
    for(int i=1; i<std::min<int>(1+26*ml,e.size()); ++i)
        for(int j=1; j<std::min<int>(1+26*ml/i,d.size()); ++j)
            res[i*j] += e[i]*d[j];
    for(int i=1; i<res.size(); ++i)
        res[i]=res[i]*l/m;
    if(a.empty())
        a = poly { res };
    else
        for(int i=1; i<a.size(); ++i)
            a[i]+=res[i];
    return res;
}

bignum f(int n){
    // dp[k]: coefficients of (1+x+...+x^9)^k, truncated to the
    // largest value the remaining letters can reach.
    std::vector<poly> dp;
    poly digits (10,1);
    poly dd { 1 };
    dp.push_back( dd );
    for(int i=1; i<n; ++i){
        dd=mult(dd,digits);
        int l=1+26*(n-i);
        if(dd.size()>l)
            dd.resize(l);
        dp.push_back(dd);
    }
    // a[sum][len]: combined polynomial of all partitions of sum into
    // len runs, arrangement factors included.
    std::vector<std::vector<poly>> a;
    a.reserve(n);
    a.push_back( std::vector<poly> { poly { 0, 1 } } );
    for(int i=1; i<n; ++i)
        a.push_back( std::vector<poly> (1+std::min(i,n+1-i)));
    for(int m=n-1; m>0; --m){
        // std::cout << "m=" << m << "\n";
        for(int sum=n-m; sum>=0; --sum)
            for(int len=0; len<=std::min(sum,n+1-sum); ++len){
                poly d {a[sum][len]} ;
                if(!d.empty())
                    for(int sumn=sum+m, lenn=len+1, e=1;
                            sumn+lenn-1<=n;
                            sumn+=m, ++lenn, ++e)
                        d=extend(d,dp[m],n-sumn,a[sumn][lenn],lenn,e);
            }
    }
    // Combine with the letter polynomials and the binomial factor
    // counting the placements of the runs.
    poly let (27,1);
    let[0]=0;
    poly lp { 1 };
    bignum t { 0 };
    for(int sum=n-1; sum>0; --sum){
        lp=mult(lp,let);
        for(int len=1; len<=std::min(sum,n+1-sum); ++len){
            poly &a0 = a[sum][len];
            bignum s {0};
            for(int i=1; i<std::min(a0.size(),lp.size()); ++i)
                s+=a0[i]*lp[i];
            bignum bin;
            mpz_bin_uiui( bin.get_mpz_t(), n-sum+1, len );
            t+=bin*s;
        }
    }
    return t;
}

int main(){
    int n;
    std::cin >> n;
    std::cout << f(n) << "\n" ;
}
This uses the GNU MP library. On Debian, install libgmp-dev. Compile
with g++ -std=c++11 -O3 -o pl pl.cpp -lgmp -lgmpxx. The program takes
its argument from stdin. For timing, use echo 100 | time ./pl.
At the end, a[sum][length][i] gives the number of ways in which sum
digits in length runs can give the number i. During computation, at
the beginning of the m loop, it gives the number of ways that can be
done with numbers greater than m. It all starts with a[0][0][1]=1.
Note that this is a superset of the numbers we need to compute the
function for smaller values. So at almost the same time, we could
compute all values up to n.
There is no recursion, so we have a fixed number of nested loops.
(The deepest nesting level is 6.) Each loop goes through a number of
values that is linear in n in the worst case. So we need only
polynomial time. If we look closer at the nested i and j loops in
extend, we find an upper limit for j of the form N/i. That should
only give a logarithmic factor for the j loop. The innermost loop in
f (with sumn etc.) is similar. Also keep in mind that we compute with
numbers that grow fast.
Note also that we store O(n^3) of these numbers.
Experimentally, I get these results on reasonable hardware
(i5-4590S): f(50) needs one second and 23 MB, f(100) needs 21 seconds
and 166 MB, f(200) needs 10 minutes and 1.5 GB, and f(300) needs one
hour and 5.6 GB. This suggests a time complexity better than O(n^5).