Model Strategy

We follow Mistral's playbook: start with fine-tuned models, progress to mixture-of-experts, then train from scratch. Each phase builds on the previous.

Phase 1A

Foundation (Months 1-6)

~$500K-1M compute
  • Fine-tune Llama 3 / Qwen / Mistral on Spanish/Portuguese legal, government, and financial corpora
  • Release "Plata 7B" (Apache 2.0)
  • Build inference API platform
  • Focus: fast release, community building, proof of concept
Phase 1B

Mixture of Experts (Months 6-12)

~$2-5M compute
  • Train "Plata Mixtral" equivalent (sparse MoE, 8x7B)
  • Legal/financial domain expertise
  • On-premise deployment option for government
  • Focus: demonstrate capability, government pilots
Phase 2

Frontier Pre-Training (Year 2-3)

~$10-20M compute
  • Pre-train 50B+ parameter model from scratch
  • Multilingual: Spanish, Portuguese, Guarani, Quechua
  • Vision-language capabilities (document AI)
  • Focus: competitive with GPT-4 class on regional tasks

Product Stack

Our product stack mirrors Mistral's but with regional customization:

1

Plata Chat

Mistral Vibe

Consumer and business chatbot. Spanish and Portuguese first. Legal and government knowledge built-in. $14.99/mo consumer plan.

2

Plata Studio

Mistral Studio

Agent builder for government and enterprise workflows. Low-code interface for building AI agents on top of Plata models.

3

Plata Forge

Mistral Forge

Custom model training and fine-tuning platform. Government agencies train models on their own classified data.

4

Plata Compute

Mistral Compute

Sovereign inference infrastructure. On-premise deployment for air-gapped environments. Itaipu-powered data centers.

5

Plata Legal

Mistral OCR + Legal

Document intelligence for legal and government documents. PDF parsing, form extraction, legal text synthesis.

6

Plata API

La Plateforme

Developer platform for inference, embeddings, fine-tuning. RESTful API with Spanish/Portuguese optimization.

Infrastructure Strategy

Our infrastructure is designed around three principles: sovereignty, cost efficiency, and regional proximity.

Data Center Locations

PY

Itaipu, Paraguay

Training Hub
  • Ultra-cheap green hydro power ($0.02/kWh)
  • 14 GW installed capacity
  • Central Mercosur location
  • Phase 1: 64 GPUs (Year 1)
  • Phase 2: 512+ GPUs (Year 2)
  • Phase 3: 2,000+ GPUs (Year 3)
AR

Buenos Aires, Argentina

Talent Hub
  • Engineering and R&D headquarters
  • Government proximity
  • Tierra del Fuego free trade zone for hardware
  • Inference for Argentine market
  • University partnerships (UBA, ITBA)
BR

São Paulo, Brazil

Enterprise
  • Enterprise sales and support
  • Financial services inference
  • Largest market access
  • Portuguese-first services
  • Partnership with local cloud providers

Hardware Stack

Training GPUs NVIDIA H100 / B200
Inference GPUs NVIDIA A100 / L40S
Alternative AMD MI300X (evaluate)
Networking InfiniBand / 400Gbps
Storage NVMe + S3-compatible

Software Stack

# Inference Engine
- vLLM (primary) / TensorRT-LLM (NVIDIA optimized)
- TGI (Hugging Face) for compatibility

# Training Framework
- PyTorch (primary) / JAX (for large-scale pre-training)
- DeepSpeed / FSDP for distributed training
- Megatron-LM for large model training

# Orchestration
- Kubernetes (K8s) for container orchestration
- Slurm for HPC cluster management
- Ray for distributed applications

# Data Pipeline
- Apache Spark for data processing
- Hugging Face Datasets for model training data
- Weights & Biases for experiment tracking

# Government Deployment
- Air-gapped Kubernetes (no internet)
- Custom OS hardening (SELinux/AppArmor)
- Hardware security modules (HSM) for key management
- Zero-trust network architecture

Open Source Strategy

Following Mistral's proven playbook:

  • Release base models under Apache 2.0 (free, open, no restrictions)
  • Keep best versions proprietary (API-only access)
  • Use open-source to build ecosystem, recruit talent, and create brand
  • Community engagement: Discord, Hugging Face, GitHub, regional conferences
  • License mix: Apache 2.0 (community), proprietary (enterprise), research license (academia)

Security & Compliance

Government deployments require military-grade security:

Data Sovereignty

All data stays within country borders. No cross-border data transfer for government clients.

Air-Gapped Deployment

Models run without internet connection. Fully isolated environments for defense and intelligence.

Audit & Logging

Complete audit trail of all model interactions. Immutable logs for compliance.

Encryption

At-rest and in-transit encryption. Hardware security modules for key management.

Cost Projections

Item Year 1 Year 2 Year 3
Compute (training) $500K $5M $20M
Compute (inference) $100K $1M $5M
Data center (Itaipu) $200K $2M $10M
Engineering team $1M $5M $15M
Sales & operations $300K $1.5M $5M
Total $2.1M $14.5M $55M