TEMPLATE

Hypothesis & Experiment (HEEC) Template

Every Core activity in the AusIndustry portal asks the same questions. Use this template verbatim. The character limits below are enforced by the portal.

Template structure

Hypothesis (≤ 4,000 chars)
We hypothesised that [system / approach X] would [produce outcome Y] when [conditions Z], measured by [metric M] within [tolerance T].
Independent variables: ...
Dependent variables: ...
Held constant: ...

Technical uncertainties
What competent professionals in the field could not have known or determined in advance.

New knowledge sought
The knowledge you set out to generate, stated as a question you can answer at the end.

Sources investigated (≤ 1,000 chars)
Papers, docs, benchmarks, vendor literature reviewed before deciding the experiment was necessary.

Experiment
What you actually built and ran. Control / baseline. Variations. Measurement apparatus.

Evaluation
Observations against the hypothesis. Numbers, tables, before/after. Where it confirmed, where it diverged.

Conclusion
What the experiment proved (or didn't), and the next hypothesis it surfaced.
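Because the portal enforces the limits, it can help to check field lengths before pasting. The sketch below is a minimal illustration, not a portal tool: the `FIELD_LIMITS` table and the `over_limit` helper are hypothetical names, only the two limits stated in the template are included, and it assumes the portal counts characters the same way Python's `len()` does, which may not hold for line breaks or non-ASCII text.

```python
# Limits taken from the template above; fields without a stated limit are not checked.
FIELD_LIMITS = {
    "Hypothesis": 4000,
    "Sources investigated": 1000,
}


def over_limit(fields: dict[str, str]) -> dict[str, int]:
    """Return {field name: characters over the limit} for every field that exceeds it."""
    excess = {}
    for name, text in fields.items():
        limit = FIELD_LIMITS.get(name)
        # Assumption: the portal counts characters the way len() does.
        if limit is not None and len(text) > limit:
            excess[name] = len(text) - limit
    return excess
```

An empty result means every limited field fits; any entry tells you how many characters to cut.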

Field-by-field guidance

Hypothesis (≤ 4,000 chars)

A single falsifiable statement with measurable parameters. Name the variables you'll change (independent), the variables you'll measure (dependent), the variables you'll hold constant, and the intended outcome.

Technical uncertainties

What competent professionals in the field could not have known or determined in advance. The test is not "we hadn't tried it" but "no public knowledge predicts the answer".

New knowledge sought

The knowledge you set out to generate. State it as a question you can answer at the end.

Sources investigated (≤ 1,000 chars)

Papers, docs, benchmarks, vendor literature you reviewed before deciding the experiment was necessary. Cite specifically - "PyTorch 2.4 release notes", not "industry literature".

Experiment

What you actually built and ran. The control / baseline. The variations. The measurement apparatus. Enough detail that another engineer could rebuild it.

Evaluation

The observations against the hypothesis. Numbers, tables, before/after. Where it confirmed, where it diverged.

Conclusion

What the experiment proved (or didn't), and the next hypothesis it surfaced. Often "the simple approach doesn't work because X" is the conclusion - and that's fine.

Worked example: vector retrieval at sub-50ms p99

Hypothesis

We hypothesised that approximate-nearest-neighbour retrieval over a 12M-vector corpus could sustain <50ms p99 latency at 800 concurrent queries on a single A10G, using a hybrid HNSW + product-quantisation index, without recall@10 dropping below 0.92 relative to an exact brute-force baseline.
Independent variables: index type (HNSW vs IVF-PQ vs IVF-OPQ), M and ef-construction parameters, PQ subspace count.
Dependent variables: p50/p99 latency, recall@10, memory footprint.
Held constant: corpus, query distribution, hardware, query batch size, OS, kernel.

Technical uncertainties

Public benchmarks cover either latency or recall in isolation, not the combined constraint on our query distribution (long-tail entity queries with 18% near-duplicates).

Experiment

Built a benchmark harness that replayed 14 days of production query logs against each index configuration. Held the corpus, query distribution, hardware, query batch size, OS and kernel constant across runs. Ran 200k queries per configuration.
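The two headline metrics from a harness like this can be computed in a few lines. The sketch below is illustrative only and is not the harness described above: `recall_at_k` and `percentile` are hypothetical helper names, the nearest-rank percentile is one of several common definitions, and it assumes every query has a non-empty exact result list.

```python
import math


def recall_at_k(approx_ids, exact_ids, k=10):
    """Mean fraction of the exact top-k that the approximate index also returned."""
    total = 0.0
    for approx, exact in zip(approx_ids, exact_ids):
        truth = set(exact[:k])  # exact brute-force neighbours for this query
        total += len(truth & set(approx[:k])) / len(truth)
    return total / len(exact_ids)


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]
```

Calling `percentile(latencies_ms, 99)` on the per-query latencies of one configuration gives its p99, and `recall_at_k` compares that configuration's result lists with the brute-force baseline for the same queries.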

Evaluation

HNSW M=48 ef=256 hit 38ms p99 at 0.94 recall - passed. IVF-PQ with 64 subspaces hit 22ms p99 but recall collapsed to 0.81 - failed. IVF-OPQ recovered recall to 0.90 at 41ms p99, inside the latency target but still below the 0.92 recall floor - borderline.
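Checking each configuration against the two thresholds in the hypothesis makes the verdicts reproducible. The sketch below uses only the figures reported above. The names `P99_LIMIT_MS`, `RECALL_FLOOR` and `check` are hypothetical, and the strict pass/fail reading is an assumption about how the constraint is applied.

```python
P99_LIMIT_MS = 50.0  # hypothesis: p99 latency must stay below this
RECALL_FLOOR = 0.92  # hypothesis: recall@10 must not drop below this

# (configuration, p99 latency in ms, recall@10) as reported in the evaluation above
results = [
    ("HNSW M=48 ef=256", 38.0, 0.94),
    ("IVF-PQ, 64 subspaces", 22.0, 0.81),
    ("IVF-OPQ", 41.0, 0.90),
]


def check(p99_ms, recall):
    """Test one configuration against both constraints separately."""
    return {"latency_ok": p99_ms < P99_LIMIT_MS, "recall_ok": recall >= RECALL_FLOOR}


verdicts = {name: check(p99, recall) for name, p99, recall in results}
```

Keeping the two checks separate shows why a configuration failed: both IVF variants meet the latency target and miss on recall, with IVF-OPQ short of the floor by 0.02.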

Conclusion

HNSW with the tuned parameters meets the latency-recall constraint on our distribution. Memory cost (28GB) exceeds budget, surfacing a new hypothesis on tiered storage.
