AI & HARDWARE SCALING
Oi, Barry The Baptiste, Xard's AI Enforcer here—let’s dive into the AI scaling topic we discussed earlier, now factoring in the AI5, Dojo 1 to Dojo 2 (10x performance jump), and B200/B300 chips, with your hunch that Dojo 2 will be on a single wafer. We’ll explore how this fits into the AI compute doubling trend every 9 months, compared to Moore’s Law for CPUs (doubling every 18 months), and critically examine the implications—sharpish, no dip in the Thames, innit?
Step 1: Recap the AI Compute Scaling Trend
We previously noted that AI compute is doubling every 9 months, outpacing Moore’s Law for CPUs, which traditionally doubles transistor counts (and roughly compute power) every 18 months. This faster AI scaling is driven by a combination of hardware advancements, architectural innovations, and massive investments in AI infrastructure. Now, let’s incorporate the AI5 (I’ll assume you meant Tesla’s AI5 chip), Dojo 1 to Dojo 2 (10x performance), and Nvidia’s B200 chips into this picture.
Moore’s Law for CPUs: Historically, CPU transistor counts double every 18 months, leading to roughly a doubling of compute power, though modern CPUs have hit physical and economic limits, slowing this trend.
AI Compute Scaling: AI compute is doubling every 9 months, driven by specialized chips (like GPUs, TPUs, and custom AI chips), better architectures, and scaling across massive clusters.
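The two doubling cadences above can be compared directly. This is a minimal sketch using only the figures from the text (9-month AI doubling, 18-month Moore's Law doubling); the 36-month window is just an illustrative horizon.

```python
# Compare compute growth under the two doubling cadences discussed above.
# The 9-month (AI) and 18-month (Moore's Law) periods come from the text;
# the 36-month horizon is illustrative.

def growth_multiplier(months: float, doubling_period_months: float) -> float:
    """Growth factor after `months`, given a fixed doubling period."""
    return 2 ** (months / doubling_period_months)

ai_growth = growth_multiplier(36, 9)    # 2^4 = 16x
cpu_growth = growth_multiplier(36, 18)  # 2^2 = 4x

print(f"AI compute after 36 months:  {ai_growth:.0f}x")
print(f"CPU compute after 36 months: {cpu_growth:.0f}x")
```

Over any window, the AI curve squares the CPU curve's multiplier, which is why the gap widens so quickly.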
Step 2: Analyze the Hardware Scaling
Tesla AI5 Chip
Tesla’s AI5 chip, mentioned in web ID 1, is part of Tesla’s broader AI hardware strategy. While specific details on AI5 performance aren’t fully public, we know from web ID 1 that it’s slated for high-volume production by the end of 2025 (around 18 months from July 2024). Given Tesla’s focus on AI for Full Self-Driving (FSD) and robotics, AI5 is likely a successor to the D1 chip used in Dojo, possibly built on a more advanced node (e.g., TSMC’s 3nm or 2nm, as TSMC’s A16 process is mentioned in web ID 0 for Dojo chips).
Scaling Impact: If AI5 follows the Dojo 1 to Dojo 2 pattern (10x), it could be another order-of-magnitude leap over Dojo 2. Assuming Dojo 2 achieves 3.6 petaflops (as estimated in web ID 1), AI5 might target 36 petaflops per chip. Note that this would outrun the 9-month doubling trend rather than merely track it: over ~18 months the trend predicts only two doublings (3.6 → 7.2 → 14.4 petaflops), so a 10x jump to 36 would overshoot by more than 2x.
Critical Note: Without specific AI5 specs, this is speculative. Tesla’s focus on custom chips suggests they’re optimizing for FSD workloads, which may not scale linearly with general compute metrics like petaflops.
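To make the AI5 speculation concrete, here is a sketch projecting AI5 throughput under the two scenarios above. The 3.6-petaflop Dojo 2 figure is the estimate cited in the text (web ID 1); the AI5 numbers are assumptions, not published specs.

```python
# Speculative projection of AI5 per-chip throughput from Dojo 2, under
# (a) the 9-month doubling trend and (b) another 10x generational leap.
# DOJO2_PFLOPS is the web ID 1 estimate; everything else is an assumption.

DOJO2_PFLOPS = 3.6
MONTHS_TO_AI5 = 9  # March 2025 -> end of 2025, per the timeline above

trend_projection = DOJO2_PFLOPS * 2 ** (MONTHS_TO_AI5 / 9)  # one doubling
tenx_projection = DOJO2_PFLOPS * 10

print(f"Trend-following AI5 estimate: {trend_projection:.1f} petaflops")
print(f"10x-leap AI5 estimate:        {tenx_projection:.1f} petaflops")
```

The gap between 7.2 and 36 petaflops is exactly the gap between "steady doubling" and the aggressive leaps the rest of this section attributes to Tesla.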
Dojo 1 to Dojo 2 (10x Performance)
Web ID 1 confirms that the Dojo 2 chip offers 10x the compute of Dojo 1, which had 362 teraflops per chip (0.362 petaflops). So:
Dojo 1: 0.362 petaflops
Dojo 2: 3.6 petaflops (10x Dojo 1)
Web ID 1 also notes that Dojo 2 puts the entire training tile onto a single silicon wafer, unlike Dojo 1’s 5x5 array of 25 D1 chips (web ID 7). You’re spot on with your hunch—this shift to a single wafer for Dojo 2 is a big deal, as it reduces interconnect bottlenecks and improves efficiency. Web ID 9 mentions that a Dojo 1 training tile (25 D1 chips) achieves 9 petaflops, so a single Dojo 2 wafer, with 10x the performance per chip, could theoretically hit 90 petaflops per tile if scaled similarly.
Scaling Fit: A 10x jump in 20 months (from Dojo 1’s production in July 2023 to Dojo 2’s volume production in March 2025, per web ID 1) is faster than the 9-month doubling trend. In 20 months, the 9-month trend predicts ~2.2 doublings (1 → 2 → 4 → ~4.7), or roughly 5x. Dojo 2’s 10x jump clearly exceeds this, suggesting Tesla is pushing beyond the average AI scaling curve, likely due to architectural improvements (single wafer) and process node advancements (possibly 7nm to 4nm or better).
Critical Note: Web ID 10 highlights potential issues with Dojo, like insufficient memory (1.25MB SRAM per core) and power challenges (15kW per tile). A single-wafer Dojo 2 might mitigate some interconnect issues but could exacerbate power and cooling demands, which Tesla must address to sustain this scaling.
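The tile-level figures above can be cross-checked with a few lines of arithmetic. The per-chip number and the 25-chip tile layout are from web IDs 7 and 9; the 90-petaflop Dojo 2 tile figure is the extrapolation the text makes, not a published spec.

```python
# Cross-check the tile-level numbers: a Dojo 1 training tile is 25 D1 chips
# at 0.362 petaflops each (~9 petaflops, matching web ID 9). A single-wafer
# Dojo 2 tile at 10x per-chip performance would land near 90 petaflops.

D1_CHIP_PFLOPS = 0.362
CHIPS_PER_TILE = 25
DOJO2_PER_CHIP_GAIN = 10

dojo1_tile = D1_CHIP_PFLOPS * CHIPS_PER_TILE   # ~9.05 petaflops
dojo2_tile = dojo1_tile * DOJO2_PER_CHIP_GAIN  # ~90.5 petaflops

print(f"Dojo 1 tile: {dojo1_tile:.2f} petaflops")
print(f"Dojo 2 tile: {dojo2_tile:.1f} petaflops")
```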
Nvidia B200 Chips
The Nvidia B200, part of the Blackwell architecture, is a key player in AI compute. Web ID 1 estimates the B200 at 20 petaflops (FP8 precision), while web ID 12 pegs it at 9 petaflops (FP4 precision). Let’s use 20 petaflops for consistency with Tesla’s FP8 metrics. Web ID 6 notes that XAI and Tesla are buying 1 million B200/B300 chips from February 2025 to April 2026, adding significant compute.
Scaling Comparison: The B200’s 20 petaflops is ~5.5x the Dojo 2’s 3.6 petaflops per chip. However, a Dojo 2 tile (equivalent to the 25-chip Dojo 1 tile) could reach 90 petaflops, surpassing a single B200 chip. Tesla’s advantage lies in wafer-scale integration, reducing latency and boosting bandwidth (web ID 9: 36 TB/s per tile).
Volume Impact: Web ID 1 suggests Tesla could produce 500,000 Dojo 2 chips (25 per wafer, 20,000 wafers), equating to 20,000 tiles (500,000 ÷ 25). At 90 petaflops per tile, that’s 1.8 million petaflops, or ~1,800 exaflops, of nominal compute. Meanwhile, 1 million B200 chips at 20 petaflops each yield 20 million petaflops (~20,000 exaflops), roughly an 11x raw compute lead for Nvidia. Tesla’s single-wafer approach might still offer efficiency advantages for FSD workloads.
Critical Note: Web ID 2 highlights Nvidia’s supply constraints, pushing Tesla to “double down” on Dojo. The B200’s high demand and cost (web ID 9) make Tesla’s in-house Dojo 2 a strategic move, but Tesla must match Nvidia’s reliability and software ecosystem to compete.
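The fleet-level totals are worth computing explicitly, since the unit conversion is easy to fumble: 1 exaflop is 1,000 petaflops, so the nominal totals come out much larger than a naive reading suggests. Chip counts and per-unit petaflop figures are the ones cited above (web IDs 1 and 6); the totals are peak nominal numbers, not delivered training throughput.

```python
# Fleet-level totals implied by the volume figures in the text.
# Note: 1 exaflop = 1,000 petaflops (exa = 10^18, peta = 10^15).

PFLOPS_PER_EXAFLOP = 1_000

dojo2_tiles = 20_000                 # 20,000 wafers/tiles (web ID 1 estimate)
dojo2_pf_per_tile = 90               # extrapolated 10x over the 9-PF Dojo 1 tile
dojo2_total_ef = dojo2_tiles * dojo2_pf_per_tile / PFLOPS_PER_EXAFLOP

b200_chips = 1_000_000               # web ID 6 purchase figure
b200_pf_per_chip = 20                # FP8 estimate, web ID 1
b200_total_ef = b200_chips * b200_pf_per_chip / PFLOPS_PER_EXAFLOP

print(f"Dojo 2 fleet: {dojo2_total_ef:,.0f} exaflops (nominal)")
print(f"B200 fleet:   {b200_total_ef:,.0f} exaflops (nominal)")
print(f"Nvidia lead:  {b200_total_ef / dojo2_total_ef:.1f}x")
```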
Step 3: Fit into the 9-Month Doubling Trend
The AI compute doubling every 9 months implies a 10x increase every ~30 months: 10x is log₂(10) ≈ 3.3 doublings, and 3.3 × 9 months ≈ 30 months, or about 2.5 years. Let’s see how these hardware advancements align:
Dojo 1 to Dojo 2 (July 2023 to March 2025):
Timeline: 20 months.
Expected: ~2.2 doublings (1 → 2 → 4 → ~5), or ~5x.
Actual: 10x (0.362 to 3.6 petaflops per chip).
Tesla exceeds the trend, likely due to the single-wafer design and process node improvements. A tile-level comparison (9 to 90 petaflops) also fits this 10x jump.
Dojo 2 to AI5 (March 2025 to End of 2025):
Timeline: ~9 months.
Expected: 1 doubling (3.6 → 7.2 petaflops per chip).
If AI5 targets another 10x (36 petaflops), it would overshoot the trend again, suggesting Tesla is aiming for aggressive leaps rather than steady doublings.
B200 Contribution:
1 million B200 chips at 20 petaflops each nominally add 20 million petaflops (~20,000 exaflops) over 14 months (February 2025 to April 2026). Taken at face value against Tesla’s reported ~100 exaflops in October 2024 (web ID 11), that would be a jump of two orders of magnitude, far beyond the 9-month doubling trend, which suggests the peak per-chip FP8 figures and the fleet baseline are not measured on the same basis. Either way, the B200 purchases combined with Dojo 2 push Tesla’s total compute sharply higher, more than enough to keep pace with the trend when viewed holistically.
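The trend-fit checks in this step reduce to one formula: the expected multiplier over a window is 2^(months / 9). A short sketch, using the Dojo timeline and per-chip figures from the text:

```python
# Check a hardware jump against the 9-month doubling trend: how large a
# multiplier does the trend predict over a given window, and how many
# months does one 10x step take at this cadence?

import math

def expected_multiplier(months: float, doubling_period: float = 9.0) -> float:
    return 2 ** (months / doubling_period)

# Dojo 1 (July 2023) -> Dojo 2 volume production (March 2025): 20 months.
predicted = expected_multiplier(20)   # ~4.7x
actual = 3.6 / 0.362                  # ~9.9x per chip

print(f"Trend prediction over 20 months: {predicted:.1f}x")
print(f"Actual Dojo 1 -> Dojo 2 jump:    {actual:.1f}x")
print(f"Months per 10x at this cadence:  {9 * math.log2(10):.0f}")
```

The last line is why "doubling every 9 months" works out to roughly one order of magnitude every 30 months.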
Step 4: Critical Examination
Tesla’s Aggressive Scaling: Tesla’s 10x jumps (Dojo 1 to Dojo 2, potentially Dojo 2 to AI5) outpace the 9-month doubling trend, but web ID 10 warns of technical challenges (memory, power, cooling). A single-wafer Dojo 2 helps with bandwidth (web ID 9), but power logistics remain a hurdle (web ID 5: 18,000A at 0.8V per tile).
Nvidia’s Dominance: Nvidia’s B200 chips offer higher per-chip performance (20 petaflops vs. Dojo 2’s 3.6), and their production scale (5 million chips in 2025, per web ID 1) dwarfs Tesla’s. Tesla’s wafer-scale approach might be more efficient for FSD, but Nvidia’s ecosystem and reliability give it an edge.
Sustainability of 9-Month Doubling: Web ID 16 notes AI compute doubling every 3.4 months historically, far faster than our 9-month estimate. This suggests the trend might slow as physical limits (power, cooling, cost) kick in, aligning closer to our 9-month figure. Tesla’s Dojo and AI5 must overcome these limits to sustain 10x leaps.
Single Wafer for Dojo 2: Your hunch is confirmed by web ID 1—Dojo 2 uses a single wafer, not a 5x5 array. This reduces interconnect overhead but increases manufacturing complexity. Web ID 5 questions whether Tesla can deliver a full system, citing past delays (e.g., Model 3, FSD), so there’s a risk Dojo 2’s deployment might lag.
Updated Conclusion with New Details
The AI compute doubling every 9 months (a 10x increase every ~30 months) is well-supported by Tesla’s hardware advancements:
Dojo 1 to Dojo 2 achieves a 10x jump in 20 months, well ahead of the expected ~5x, thanks to the single-wafer design and process improvements.
Dojo 2 to AI5 (end of 2025) might target another 10x, far exceeding the expected 2x in 9 months, showing Tesla’s aggressive approach.
Nvidia’s B200 chips add significant compute (~20,000 exaflops nominal from 1 million chips), but Tesla’s Dojo 2 tiles (90 petaflops each) offer competitive efficiency for FSD workloads.
The single-wafer Dojo 2 is a game-changer, reducing latency and boosting bandwidth, but Tesla must address power and cooling challenges to sustain this pace. Nvidia’s scale keeps them ahead in raw compute, but Tesla’s custom approach might give them an edge in specialized AI tasks.
In April 2024, at TSMC’s North American Technology Symposium, the company confirmed that Dojo’s next-generation training tile—the D2, which integrates the entire Dojo tile onto a single silicon wafer instead of connecting 25 chips—was already in production. This aligns with our earlier discussion, reinforcing the efficiency gains from wafer-scale integration. In July 2024, AI5 was noted to be ~18 months from high-volume production, placing its rollout at the end of 2025, potentially delivering another 10x performance leap to ~36 petaflops if it follows the same trajectory.
By March 2025, Elon Musk announced that the Tesla Dojo 2 chip would reach volume production within a few months, confirming a 10x performance increase over Dojo 1. With Dojo 1 at 0.362 petaflops (FP8 precision), Dojo 2 hits 3.6 petaflops: nearly double the Nvidia H100 (web ID 12 estimates the H100 at ~2 petaflops FP8) and roughly 18% of the B200’s 20 petaflops (FP8, per web ID 1). This positions Dojo 2 as a strong contender, especially for FSD workloads, though Nvidia’s ecosystem and raw compute lead remain formidable.
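The per-chip comparison behind those figures is just two ratios, computed here from the FP8 estimates cited above (web IDs 1 and 12); precision mode and sparsity assumptions can shift all of these numbers.

```python
# Per-chip FP8 throughput ratios from the estimates cited in the text.

DOJO2_PFLOPS = 3.6   # web ID 1 estimate
H100_PFLOPS = 2.0    # web ID 12 estimate
B200_PFLOPS = 20.0   # web ID 1 estimate

print(f"Dojo 2 vs H100:          {DOJO2_PFLOPS / H100_PFLOPS:.1f}x")
print(f"Dojo 2 as share of B200: {DOJO2_PFLOPS / B200_PFLOPS:.0%}")
```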
Looking ahead:
Tesla Dojo 3 is projected for late 2026, potentially continuing the 10x scaling trend. If Dojo 2 is 3.6 petaflops per chip, Dojo 3 could aim for 36 petaflops, matching AI5’s estimated performance and further closing the gap with Nvidia’s offerings. However, Tesla must overcome ongoing challenges like power logistics and production delays (web ID 5) to maintain this trajectory—Let me know if you want to dig deeper, mon ami!