A={1,2} Density Fits Logarithmic Growth: 30 + 4.65·log₁₀(N), Testable at 10^12

Cahlen Humphreys

March 31, 2026 by cahlen Bronze

BRONZE AI Literature Audit · 2 reviews ↓

Consensus	`ACCEPT_WITH_REVISION`
Models	Claude + o3-pro
Level	BRONZE — Novel observation, limited literature precedent

Review Ledger

2026-04-03 o3-pro (OpenAI) SILVER ACCEPT_WITH_REVISION

2026-04-01 Claude Opus 4.6 (Anthropic) BRONZE ACCEPT_WITH_REVISION

Issues Identified (14/14 resolved)

minor Publish the full sweep results (or compressed checksum) so others can reprodu... resolved

minor Add link to the published full sweep CSV (1,023 rows) with SHA-256 hash 4b052... resolved

important Replace qualitative statement with log-log OLS regression: exponent 6.42, 95%... resolved

important Perform a log–log regression with confidence intervals and quantify goodness-... resolved

minor Add SHA-256 hashes and GPU timing for all 5 log files in the Reproduce sectio... resolved

minor Provide the list (or hash digest) of uncovered denominators and timing logs s... resolved

minor Update frontmatter: replace predicted_density_1e12 with measured 84.58%, add ... resolved

minor Clarify which points are empirical and which are forecasts; re-label the five... resolved

important Remove stale '~85.9%' prediction (actual was 84.58%). Replace 3-point hedging... resolved

important Remove the numerical '2.35× faster' comparison or supply a rigorous bridge fr... resolved

important The 2.35x claim compares apples and oranges. Revise to acknowledge the distin... resolved

minor Five decades of extrapolation from 3 data points. The word 'predicted' should... resolved

minor Rewrite frontmatter summary to distinguish R(d) growth from density convergen... resolved

minor IMPORTANT: The finding states the BK framework predicts power-law convergence... resolved

Peer-reviewed by Claude Opus 4.6. ACCEPT WITH REVISION: R(d) growth vs density convergence distinguished, logarithmic claim weakened to 'consistent with', 10^15 prediction marked speculative.


A={1,2} Density Grows Logarithmically, Not Power-Law

The Finding

For the digit set $A = \{1,2\}$ (Hausdorff dimension $\delta = 0.531$, barely above the critical threshold $1/2$), the Zaremba density as a function of range $N$ closely follows a logarithmic model. The initial fit, to the first three measurements ($10^6$, $10^9$, $10^{10}$), was:

$\text{density}(N) \approx 30.1 + 4.65 \cdot \log_{10}(N)$

Range $N$	Observed density	Predicted (log model)	Residual (observed − predicted)
$10^6$	57.98%	58.00%	-0.02%
$10^9$	72.06%	71.97%	+0.09%
$10^{10}$	76.55%	76.62%	-0.07%
$10^{11}$	80.75%	80.64%	+0.11%
$10^{12}$	84.58%	85.10%	-0.52%

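The predicted values in the table for $10^6$ through $10^{10}$ match the first formula above ($30.1 + 4.65 \cdot \log_{10} N$); those for $10^{11}$ and $10^{12}$ match the five-point fit that follows. With all five data points, spanning 6 decades (all five are empirical measurements, at $10^6$, $10^9$, $10^{10}$, $10^{11}$, and $10^{12}$), the logarithmic fit is: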

$\text{density}(N) \approx 31.5 + 4.47 \cdot \log_{10}(N) \quad (\text{residuals} \leq 0.52\%)$
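The five-point fit can be checked directly from the observed densities in the table. The following is a minimal sketch using plain ordinary least squares on density against $\log_{10}(N)$; the function name `fit_line` and the variable names are illustrative and not taken from the repository.

```python
# Observed densities (percent) from the table above, keyed by log10(N).
observations = [(6, 57.98), (9, 72.06), (10, 76.55), (11, 80.75), (12, 84.58)]

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

intercept, slope, r_squared = fit_line(observations)
# Residuals are observed minus fitted, in percentage points.
residuals = [y - (intercept + slope * x) for x, y in observations]

print(f"density(N) ~ {intercept:.1f} + {slope:.2f} * log10(N)")
print(f"R^2 = {r_squared:.4f}, max |residual| = {max(abs(r) for r in residuals):.2f}")
```

Calling `fit_line(observations[:3])` on the first three points alone gives coefficients close to the earlier $30.1 + 4.65 \cdot \log_{10}(N)$ model, which makes it easy to see how the slope drops once the $10^{11}$ and $10^{12}$ measurements are included.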

Predictions

Range	Predicted density
$10^{13}$	89.6%
$10^{14}$	94.0%
$10^{15}$	98.5%
100% at	$10^{15.3}$

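The logarithmic model describes all 5 measured points to within 0.52 percentage points. Full density near $10^{15}$ is what the fit extrapolates to, and it remains speculative until larger ranges are measured.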
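As a worked check of the extrapolation, using the rounded five-point coefficients (so the results can differ from the table in the last digit), the value at $10^{13}$ and the crossing point follow from:

$31.5 + 4.47 \cdot 13 = 89.61 \approx 89.6$

$31.5 + 4.47 \cdot \log_{10}(N) = 100 \;\Rightarrow\; \log_{10}(N) = \frac{100 - 31.5}{4.47} \approx 15.3$

Since density cannot exceed 100%, a straight line in $\log_{10}(N)$ must stop describing the data at or before this point, so the crossing is best read as an upper limit on the model's range of validity rather than a forecast of when full density is reached.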

Why This Matters

Relationship to BK Framework

The Bourgain-Kontorovich transfer operator framework predicts the representation count grows as $R(d) \sim d^{2\delta - 1}$. For $A = \{1,2\}$:

$2\delta - 1 = 2(0.531) - 1 = 0.062$

Important distinction: the exponent 0.062 describes the growth of $R(d)$ (how many CF representations each $d$ has), not the rate at which density (the fraction of $d$ with $R(d) \geq 1$) converges to 100%. Density convergence depends on the full distribution of $R(d)$ across integers, not just its mean growth. These are related but different quantities.
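To make the density definition concrete, here is a small CPU sketch that marks every $d \leq N$ with $R(d) \geq 1$. It is not the repository's GPU kernel. It assumes that a denominator counts as covered when it is the denominator of some finite continued fraction $[0; a_1, \ldots, a_k]$ with every $a_i \in A$, including expansions that end in 1; whether this matches the conventions used for the published numbers has not been verified here. The function names are illustrative.

```python
def covered_denominators(digits, limit):
    """Mark every d <= limit that is the denominator of some continued
    fraction [0; a1, ..., ak] with all partial quotients in `digits`.

    Returns a bytearray where covered[d] == 1 if d is reached.
    """
    covered = bytearray(limit + 1)
    covered[1] = 1  # the empty expansion has denominator 1
    # Each stack entry is a pair (q_prev, q) of consecutive denominators.
    stack = [(0, 1)]
    while stack:
        q_prev, q = stack.pop()
        for a in digits:
            # Continuant recurrence: q_{k+1} = a * q_k + q_{k-1}
            q_next = a * q + q_prev
            if q_next <= limit:
                covered[q_next] = 1
                stack.append((q, q_next))
    return covered

def density_percent(digits, limit):
    """Percentage of d in 1..limit reached by covered_denominators."""
    return 100.0 * sum(covered_denominators(digits, limit)) / limit

# Tiny example: for A = {1, 2} and N = 10, only d = 6 and d = 9 are missed.
print(density_percent((1, 2), 10))  # 80.0
```

The search walks pairs of consecutive denominators rather than whole fractions, which is enough to decide coverage because each new denominator depends only on the previous two. Extending the sketch to count how often each $d$ is reached would give an estimate of $R(d)$ itself, which is the quantity the exponent $2\delta - 1$ refers to.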

Our density data fits a logarithmic model (R² = 0.9984 across 5 points), but this cannot definitively rule out other functional forms (e.g., power-law with logarithmic corrections). The 5-point fit gives 85.1% at $10^{12}$; the measured value is 84.58%, a residual of −0.52 percentage points. This is the largest deviation so far and may signal the onset of sub-logarithmic curvature.

What this could mean

Pre-asymptotic regime: At $N \leq 10^{12}$, the system may not yet have reached its true asymptotic behavior. The logarithmic fit may break down at $N > 10^{12}$ and transition to the slower power-law predicted by BK.
Corrections to the leading term: The BK counting formula has error terms. If the error term is $O(N^{2\delta - 1 - \varepsilon})$ with $\varepsilon$ small, the effective growth rate could appear faster than the leading exponent suggests.
Logarithmic corrections: Some number-theoretic counting functions have $\log(N)$ corrections. If $R(d) \sim d^{2\delta-1} \cdot (\log d)^c$ for some $c > 0$, this could produce the observed logarithmic density growth.

Testable prediction

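The three-point model made a sharp prediction: density at $10^{12}$ should be ~85.9%. The measured value is 84.58%, about 1.3 percentage points lower, so the observed growth is already slightly slower than that model implied. A shortfall of this size is not by itself enough to distinguish a logarithmic law with a smaller slope from the onset of power-law convergence.

Computing $A = \{1,2\}$ at $10^{12}$ took 12,375 s of GPU time (about 3.4 hours), roughly 140× the 88.4 s needed at $10^{10}$. The next test is $10^{13}$, where the five-point fit gives 89.6%.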

The Digit 1 Advantage: A Sigmoid

Our complete density sweep of all 1,023 subsets of $\{1, \ldots, 10\}$ at $N = 10^6$ shows that average density rises along a sigmoid in cardinality, both with and without digit 1, and that digit 1's advantage (the gap between the two curves) peaks at cardinality 4. Full results: density_all_subsets_n10_1e6.csv (SHA-256: 4b052ecb952b..., 1,023 rows).

Cardinality	Avg density (with 1)	Avg density (without 1)	Gap
2	11.0%	0.1%	10.9 pp
3	58.9%	1.8%	57.1 pp
4	92.1%	12.7%	79.4 pp
5	99.5%	39.4%	60.0 pp
6	100.0%	70.9%	29.1 pp
7	100.0%	91.8%	8.2 pp
8	100.0%	99.2%	0.8 pp

At cardinality 4, digit 1 is worth 79 percentage points of density. By cardinality 8, the advantage shrinks to under 1 point — enough other digits compensate.

Exception Scaling: {1,2,k}

For the family $A = \{1, 2, k\}$ at $N = 10^6$, the number of uncovered integers grows rapidly with $k$. A log–log OLS regression gives exponent $\hat{\beta} = 6.42$ (95% CI: [5.80, 7.04], $R^2 = 0.986$), i.e., exceptions $\sim 0.02 \cdot k^{6.4}$. The fit is good, but the consecutive ratios range from 1.5 to 5.8 and fall steadily beyond $k = 5$, indicating the power law is only approximate:

$k$	Exceptions	Ratio to $k-1$
3	27	—
4	64	2.4
5	373	5.8
6	1,720	4.6
7	5,388	3.1
8	11,746	2.2
9	21,796	1.9
10	33,025	1.5

Larger third digits help rapidly less. The “sweet spot” is $k = 3$ (27 exceptions): choosing $k = 4$ instead leaves 64 exceptions (2.4× as many), and choosing $k = 10$ leaves 33,025 (1,223× as many).
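The exponent quoted above can be recomputed from the table alone. The sketch below runs an ordinary least squares fit of $\log(\text{exceptions})$ against $\log(k)$ using natural logarithms; it reproduces the point estimate and $R^2$ but does not compute the confidence interval, which would additionally need a Student-t quantile. Variable names are illustrative.

```python
import math

# Exception counts for A = {1, 2, k} at N = 10^6, copied from the table above.
exceptions = {3: 27, 4: 64, 5: 373, 6: 1720, 7: 5388, 8: 11746, 9: 21796, 10: 33025}

log_k = [math.log(k) for k in exceptions]
log_e = [math.log(count) for count in exceptions.values()]

n = len(log_k)
mean_x = sum(log_k) / n
mean_y = sum(log_e) / n
sxx = sum((x - mean_x) ** 2 for x in log_k)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_k, log_e))
syy = sum((y - mean_y) ** 2 for y in log_e)

exponent = sxy / sxx                              # slope in log-log space
prefactor = math.exp(mean_y - exponent * mean_x)  # c in exceptions ~ c * k^exponent
r_squared = sxy ** 2 / (sxx * syy)

# Ratio of each count to the previous one, as in the table's last column.
ratios = {k: exceptions[k] / exceptions[k - 1] for k in range(4, 11)}

print(f"exponent = {exponent:.2f}, prefactor = {prefactor:.3f}, R^2 = {r_squared:.3f}")
print({k: round(r, 1) for k, r in ratios.items()})
```

The steadily shrinking ratios from $k = 5$ onward show up as downward curvature in log–log space, which is why a single exponent summarises the trend without describing it exactly.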

Reproduce

git clone https://github.com/cahlen/idontknow
cd idontknow

# GPU computation
nvcc -O3 -arch=sm_100a -o zaremba_density_gpu scripts/experiments/zaremba-density/zaremba_density_gpu.cu -lm
./zaremba_density_gpu 10000000000 1,2     # A={1,2} at 10^10
./zaremba_density_gpu 1000000 1,2,3       # A={1,2,3} at 10^6

Verification Hashes and Timing

All raw GPU output logs are committed to the repository. SHA-256 digests and wall-clock times:

$N$	Covered	Density	GPU time	Log file SHA-256
$10^6$	579,820	57.982%	< 1 s (CPU)	`14c69b3c0885...`
$10^9$	720,615,327	72.062%	28.0 s	`ecc0c96d5817...`
$10^{10}$	7,654,868,191	76.549%	88.4 s	`68a9512d8147...`
$10^{11}$	80,754,334,638	80.754%	1,012 s	`bd5e57d5ef20...`
$10^{12}$	845,791,333,633	84.579%	12,375 s	`5115d64d8c6b...`

Full hashes: sha256sum scripts/experiments/zaremba-density/results/gpu_A12_*.log scripts/experiments/zaremba-density/results/density_A12_*.log

References

Bourgain, J. and Kontorovich, A. (2014). “On Zaremba’s conjecture.” Annals of Mathematics, 180(1), pp. 137–196.
Hensley, D. (1996). “A polynomial time algorithm for the Hausdorff dimension of continued fraction Cantor sets.” J. Number Theory, 58(1), pp. 9–45.
Jenkinson, O. and Pollicott, M. (2001). “Computing the dimension of dynamically defined sets.” Ergodic Theory Dynam. Systems, 21(5), pp. 1429–1445.

Computed 2026-04-01 on 8× NVIDIA B200. This work was produced through human–AI collaboration (Cahlen Humphreys + Claude). Not independently peer-reviewed. All code and data open for verification.

Caveat: Five data points make a fit, not a proof. The logarithmic model needs confirmation at $10^{13}$ and beyond. The prediction of 100% at $10^{15}$ is speculative until more data points are collected.