
AI Infrastructure Hosted in Australia

GPU-attached compute and inference hosting in Perth, Western Australia. Real Australian data sovereignty for training data, model weights, and inference logs — not just a region tag inside a foreign cloud.

Hosted on Green Racks infrastructure in Osborne Park, WA.

What you get

  • GPU-attached bare-metal hosts in Perth — currently NVIDIA Tesla P40 24 GB cards, with L40S and H100 nodes on the roadmap.
  • Pre-imaged Ubuntu LTS templates with CUDA, Docker, PyTorch, TensorFlow, vLLM, and llama.cpp ready to go (see the verification sketch after this list).
  • Persistent NVMe storage attached locally — keep your model weights, datasets, and inference logs on the same physical machine.
  • 10 GbE intra-rack fabric for multi-host distributed inference and training-data shuffling.
  • iDRAC out-of-band access for power control, console, and ISO virtual media — recovery does not require SSH.
  • All workloads run inside our Osborne Park facility under Australian law — no overseas data egress unless you choose to send it.
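
A quick way to confirm the pre-imaged stack is healthy on first login is to ask PyTorch what it can see. A minimal sketch, assuming a single-card host; the matmul at the end proves end-to-end CUDA execution:

```python
import torch

# Confirm the NVIDIA driver and CUDA toolkit are visible to PyTorch.
assert torch.cuda.is_available(), "CUDA not visible - check nvidia-smi first"

# On an Inference Starter host this should report one Tesla P40.
for idx in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(idx)
    print(f"GPU {idx}: {props.name}, "
          f"{props.total_memory / 1024**3:.1f} GiB VRAM, "
          f"compute capability {props.major}.{props.minor}")

# A small matmul on the device exercises the full driver/CUDA/framework path.
x = torch.randn(1024, 1024, device="cuda")
print((x @ x).sum().item())
```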

GPU configurations

The tiers shipping today, plus the roadmap targets. If you need a primer on the silicon, NVIDIA's data-centre GPU family is documented in Wikipedia's NVIDIA Tesla article.

Inference Starter

Shipping now

$ — contact for quote

  • 1× NVIDIA Tesla P40 24 GB
  • 8 vCPU, 64 GB RAM
  • 1 TB NVMe local storage
  • 10 GbE uplink
  • Sized for inference + light fine-tuning

Inference Pro

Shipping now

$ — contact for quote

  • 2× NVIDIA Tesla P40 24 GB (48 GB total VRAM)
  • 16 vCPU, 192 GB RAM
  • 2 TB NVMe local storage
  • 10 GbE uplink
  • Handles 13B–70B parameter open-weight LLMs (see the sizing sketch below)
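
The 70B figure is not marketing rounding. A rough back-of-envelope, assuming 4-bit quantized weights; actual headroom depends on quantization format, context length, and KV-cache size:

```python
params = 70e9          # 70B-parameter model
bytes_per_param = 0.5  # 4-bit quantization: half a byte per weight

weights_gib = params * bytes_per_param / 1024**3
print(f"~{weights_gib:.0f} GiB of weights")  # ~33 GiB

# That leaves roughly 15 GiB of the 48 GiB pool for KV cache,
# activations, and framework overhead.
```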

Training (roadmap)

Pre-order

$ — contact for quote

  • NVIDIA L40S 48 GB or H100 80 GB target
  • Dedicated GPU node chassis (not R620)
  • Higher RAM and NVMe budget per node
  • Lead time depends on confirmed demand
  • Contact us with your workload profile

How it works

The stack between your CUDA call and the silicon.

Silicon

NVIDIA Tesla P40 24 GB GDDR5 cards attached over PCIe x16 to Dell R620 hosts. Pascal generation, mature CUDA support, sized for inference on 7B–70B open-weight LLMs.
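
Pascal predates the tensor cores that make fp16 fast on Volta and later, so inference on a P40 is usually best pinned to fp32. A hypothetical dtype-selection helper, using compute capability as a proxy for GPU generation:

```python
import torch

def pick_dtype(device: int = 0) -> torch.dtype:
    """Choose an inference dtype based on GPU generation.

    Volta (compute capability 7.0) and newer have tensor cores with
    fast fp16; Pascal parts like the Tesla P40 (6.1) are generally
    better served by fp32.
    """
    major, minor = torch.cuda.get_device_capability(device)
    return torch.float16 if (major, minor) >= (7, 0) else torch.float32

print(pick_dtype())  # torch.float32 on a Tesla P40
```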

Software

Ubuntu LTS, current NVIDIA driver and CUDA toolkit, Docker with NVIDIA Container Toolkit. PyTorch, TensorFlow, vLLM, and llama.cpp pre-pulled.
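
Day-one usage looks something like the following. A minimal sketch of vLLM's offline inference API; the model path is a placeholder for whichever open-weight checkpoint you have pulled onto local NVMe:

```python
from vllm import LLM, SamplingParams

# Placeholder path - substitute the open-weight checkpoint you pulled.
llm = LLM(model="/data/models/my-open-weight-model")

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain data sovereignty in one sentence."], params)
print(outputs[0].outputs[0].text)
```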

Storage

Locally attached NVMe for weights, datasets, and inference logs. 10 GbE intra-rack fabric for distributed batch jobs and shared object storage targets.
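
For multi-host jobs over the 10 GbE fabric, the usual entry point is torch.distributed. A minimal all-reduce sketch; the launch command, ranks, and the gloo backend choice are illustrative assumptions:

```python
import torch
import torch.distributed as dist

# Launch one copy per host, e.g.:
#   torchrun --nnodes=2 --nproc-per-node=1 \
#     --rdzv-endpoint=<head-node>:29500 this_script.py
dist.init_process_group(backend="gloo")  # "nccl" for GPU-side reductions

rank = dist.get_rank()
t = torch.ones(4) * rank
dist.all_reduce(t, op=dist.ReduceOp.SUM)  # summed across every host
print(f"rank {rank}: {t}")

dist.destroy_process_group()
```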

Sovereignty

All workloads run inside our Osborne Park facility under Australian law. Useful when datasets fall under the Australian Privacy Principles or when customers ask where their training data physically lives.

Who this is for

Australian SaaS doing private LLM inference

A WA-based product running an open-weight model behind a customer-facing feature, where sending customer prompts to a US-based API is not acceptable to its enterprise buyers.

Mining-sector ML on telemetry

A contractor in the BHP, Rio, or Fortescue supply chain training anomaly-detection models on operational telemetry that cannot leave Australia.

Research teams needing burst compute

An academic or applied-research group that wants a known monthly cost on GPU compute instead of a metered cloud bill, plus the ability to keep datasets onshore.

Frequently asked questions

What GPUs do you have available?

The current fleet is NVIDIA Tesla P40 24 GB cards, attached to R620 hosts over PCIe x16. Higher-tier accelerators (NVIDIA L40S 48 GB and H100 80 GB) are on the roadmap as dedicated GPU nodes; talk to us about pre-orders and timelines.

Is this for training or inference?

Both, with different recommended configurations. The Tesla P40 fleet is sized for inference workloads — running open-weight LLMs, classification models, and embedding pipelines — and for light fine-tuning. Heavier training jobs are better suited to the L40S and H100 nodes once those land.
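
As a sense of scale, "light fine-tuning" on a 24 GB card usually means parameter-efficient methods rather than full-weight training. A LoRA sketch using the Hugging Face peft and transformers libraries; these are not in the pre-imaged list, and the model path is a placeholder:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder path - any small open-weight causal LM on local NVMe.
model = AutoModelForCausalLM.from_pretrained("/data/models/my-7b-base")

# LoRA trains small adapter matrices instead of the full weights,
# which is what keeps the job inside a single P40's 24 GB.
config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% trainable
```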

Can I run my own models?

Yes. You get the host, the GPU, and root access — pull whatever weights, container, or framework you need. We ship pre-imaged templates with CUDA, Docker, PyTorch, TensorFlow, vLLM, and llama.cpp so you can be productive on day one.
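
For example, pulling a quantized checkpoint onto local NVMe and loading it through llama-cpp-python (the Python binding around the pre-pulled llama.cpp); the repo ID, file name, and paths are all placeholders:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Placeholder repo/file - substitute the GGUF you actually want.
path = hf_hub_download(
    repo_id="someone/some-model-GGUF",
    filename="model.Q4_K_M.gguf",
    local_dir="/data/models",
)

# n_gpu_layers=-1 offloads every layer to the GPU.
llm = Llama(model_path=path, n_gpu_layers=-1)
out = llm("Q: Where does this model run? A:", max_tokens=32)
print(out["choices"][0]["text"])
```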

How is this different from AWS, RunPod, or other GPU clouds?

Two practical differences: the hardware sits in Western Australia under Australian jurisdiction, and you can talk directly to the engineer running the rack. The trade-off is that we offer specific configurations rather than the hyperscaler menu — pick what you need and we will tell you when it can be online.

What about data sovereignty for AI workloads?

Training data, model weights, embeddings, and inference logs all stay on hardware physically located in WA, operated by an Australian-owned company. That matters when your dataset includes personal information covered by the Australian Privacy Principles or commercial data your customers expect to keep in Australia.

When will higher-spec GPUs be available?

Dedicated GPU nodes for L40S- and H100-class accelerators are scoped but not yet racked. Lead time depends on parts availability and on confirmed customer demand — get in touch with your workload profile and we will give you a realistic date instead of a marketing one.

Ready to move your AI workloads to Australian infrastructure?