In 2020, I was handed a PDF — an ILO working paper titled Spotting Export Potential and Implications for Employment in Developing Countries (Cheong, Decreux & Spies, 2018) — and asked to turn it into a working algorithm.
The paper describes a methodology developed by the International Trade Centre to identify a country’s unrealized export opportunities, and then estimate how many jobs realizing those opportunities would create. Across six developing countries. At the product-market-sector level.
My first instinct was: this sounds like a job for machine learning. My second instinct, after actually reading the paper, was: no, it really isn’t. And my third realization, after spending several weeks getting the implementation wrong before getting it right, was about something more fundamental than the ML-versus-econometrics debate: thinking in matrices rather than loops is not just a performance concern, it’s an epistemological one.
This post is about all three.
The Paper: What It Actually Does #
The methodology has two parts.
Part one computes the Export Potential Indicator (EPI) — a score for every (exporting country, product, target market) triple that represents how much more a country could export given its current supply capacity, the target market’s demand, and how easy it is for those two to trade with each other. The gap between potential and actual exports is “untapped potential.”
The formula structure is:

```
EPI(country, product, market) = min(supply, demand) × ease_of_exporting
```

The supply component projects future market share based on current export share and relative GDP growth. The demand component projects future import volume adjusted for tariff advantages and bilateral distance. The ease component is a ratio of actual to hypothetical trade, capturing proximity, language, and commercial history.
Part two translates that unrealized export potential into employment, using Leontief input-output analysis. If a sector’s exports increase by $X, production must increase by at least $X (direct effect), and that production requires inputs from upstream sectors, who need inputs of their own — a multiplier chain formalized as:
```
dy = (I - BA)^{-1} dx
```

Where A is the matrix of technical coefficients (input intensities), B is a diagonal matrix of domestic supply shares, and (I - BA)^{-1} is the Leontief inverse — the total production change required throughout the economy per unit of final demand increase. Employment follows proportionally: dl = diag(l/y) · dy.
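To see the multiplier chain concretely, here is a toy two-sector example (all numbers invented for illustration) confirming that the Leontief inverse is the limit of the chain I + BA + (BA)² + …:

```python
import numpy as np

# Hypothetical 2-sector economy, numbers chosen only for illustration.
A = np.array([[0.2, 0.3],    # technical coefficients (input intensities)
              [0.1, 0.4]])
B = np.diag([0.9, 0.8])      # domestic supply shares

n = A.shape[0]
leontief_inverse = np.linalg.inv(np.eye(n) - B @ A)

# The inverse equals the limit of the chain I + BA + (BA)^2 + ...
chain = sum(np.linalg.matrix_power(B @ A, k) for k in range(50))
assert np.allclose(leontief_inverse, chain)

# A $1 increase in sector 0's final demand requires more than $1 of total
# production, because the upstream inputs must be produced too.
dx = np.array([1.0, 0.0])
dy = leontief_inverse @ dx
```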
The paper applies this across Benin, Ghana, Guatemala, Morocco, Myanmar, and the Philippines. The output is sector-level employment creation estimates, disaggregated by gender and skill level.
The Implementation Mistake I Made First #
My initial implementation looked like this:

```python
results = []
for country in countries:
    for product in products:
        for market in markets:
            epi = compute_epi(country, product, market)
            results.append((country, product, market, epi))
```

This is wrong. Not just slow — wrong in a way that obscures what you’re actually doing.
The EPI is not a scalar function applied to individual triples. It is a computation over tensors. Supply is a matrix indexed by (country, product). Demand is a matrix indexed by (market, product). Ease is a matrix indexed by (country, market). The EPI is their combination — a three-dimensional array.
When you implement it as nested loops, you lose the structure. You can’t see that you’re taking the element-wise minimum of two matrices projected into the same space. You can’t see that ease is being broadcast across all products. You can’t vectorize it later because the logic is buried in conditional branches inside the loop body.
The matrix formulation, by contrast, forces clarity:

```python
# supply[country, product], demand[market, product], ease[country, market]
supply_projected = supply * (1 + gdp_growth_relative)[:, np.newaxis]
demand_projected = demand * (1 + pop_growth + rev_elasticity * gdppc_growth)

# EPI: one score for each (country, product, market) triple
epi = np.minimum(
    supply_projected[:, :, np.newaxis],    # (countries, products, 1)
    demand_projected.T[np.newaxis, :, :],  # (1, products, markets): transposed so products align
) * ease[:, np.newaxis, :]                 # (countries, 1, markets)
```

Now the math is transparent. The broadcasting operations correspond exactly to the formula in the paper. You can audit each step against the appendix. And it runs in seconds instead of hours.
The Leontief computation is even more explicit:
```python
A = Z / y                   # technical coefficients matrix
B = np.diag(d / (m + d))    # domestic supply share diagonal matrix
leontief_inverse = np.linalg.inv(np.eye(n) - B @ A)

# employment multiplier
dl = np.diag(l / y) @ leontief_inverse @ dx
```

Three lines. Directly from the technical appendix. Zero ambiguity.
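To make that pipeline runnable end to end, here is a toy two-sector version with made-up numbers; the variable names follow the snippet above (Z inter-industry flows, y gross output, d and m domestic and imported inputs, l employment, dx the export shock):

```python
import numpy as np

# Toy 2-sector economy; every number here is invented for illustration.
Z = np.array([[20., 30.],    # inter-industry flows
              [10., 40.]])
y = np.array([100., 200.])   # gross output by sector
d = np.array([80., 150.])    # domestically supplied inputs
m = np.array([20., 50.])     # imported inputs
l = np.array([50., 40.])     # employment by sector
dx = np.array([10., 0.])     # export shock: +$10 in sector 0

A = Z / y                    # technical coefficients a_ij = z_ij / y_j
B = np.diag(d / (m + d))     # domestic supply shares on the diagonal
n = A.shape[0]
leontief_inverse = np.linalg.inv(np.eye(n) - B @ A)

dy = leontief_inverse @ dx   # total production change (direct + indirect)
dl = np.diag(l / y) @ dy     # employment change by sector
```

The $10 shock in sector 0 induces more than $10 of total production, and some of the resulting employment shows up in sector 1, the upstream supplier — exactly the indirect effect the multiplier chain is meant to capture.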
Why ML Would Have Been the Wrong Choice #
By the time I had the correct implementation running, the ML question had answered itself. But it’s worth articulating why.
1. The question is “how many jobs”, not “which sector will grow” #
Machine learning is excellent at prediction. Given historical data on which sectors in which countries expanded exports, a gradient boosted tree could probably rank future opportunities with decent accuracy. But that’s not what the ILO/ITC methodology is trying to answer.
The question is: if we help a country realize its export potential in sector X, how many jobs would that create, and where — directly in sector X, and indirectly in the upstream industries that supply it?
That question requires a structural model. You need technical coefficients (how much steel goes into making cars). You need labor intensity ratios (how many workers per unit of output). You need the IO matrix to trace the supply chain. A black-box model trained on historical correlations cannot give you a number that a policymaker can use to justify a budget allocation.
Mullainathan & Spiess (2017), in what remains the best paper on this topic, put it precisely: machine learning solves the problem of prediction, while many economic applications revolve around parameter estimation. The policy question here is a parameter estimation problem.
2. Interpretability isn’t a luxury — it’s the deliverable #
The EPI gives three named, decomposable components: supply, demand, ease. A country looking at its results can say: “We have high untapped potential in processed food exports to Europe — supply capacity is there, European demand for this product is growing, but our ease score is low because of non-tariff barriers.” That’s an actionable diagnosis.
A machine learning model might rank the same sector just as highly — but it could not tell you why. The diagnosis drives the policy response. Without it, you have a priority list with no prescription.
3. Developing countries don’t have ML-scale data #
ML models for gravity-based trade prediction typically require bilateral trade data across many country-pairs over many years, plus a rich feature set. That data exists for OECD countries. For Benin or Myanmar, the statistical infrastructure is thinner, the time series shorter, and key variables (input-output tables, detailed employment surveys) may be from outdated vintages.
The Leontief approach is robust to this. Technical coefficients are considered relatively stable over medium time horizons — a 2012 IO table is still useful in 2018. The employment calculation needs only a sectoral employment snapshot, not a panel. You can work with what exists.
This is a genuine practical advantage, not a consolation prize.
4. The assumptions are features #
The ILO paper is notably transparent about its assumptions: constant returns to scale, stable technical coefficients, no macroeconomic feedback through exchange rates, no skill mismatch. Each assumption is named, its direction of bias is discussed, and readers are told when and why the results might be over- or understated.
This is what makes the methodology defensible in a policy setting. A government minister can challenge a specific assumption. An NGO can ask what happens if we relax the constant returns assumption for agriculture. The model is auditable.
A deep learning model for the same task would have implicit assumptions embedded in architecture choices, training data selection, and regularization hyperparameters. These are not auditable in the same way. In development economics — where the outputs influence resource allocation for millions of people — that opacity is a serious problem.
Where ML Does Add Value in This Context #
This isn’t an argument against ML in economics. It’s an argument for knowing which tool solves which problem.
There are genuine ML applications adjacent to this work:
Feature construction for the EPI supply component: The paper uses a modified PRODY index (GDP-per-capita-weighted export intensity) that requires careful handling of re-exports and tariff preferences. An ML model trained on trade data could potentially identify products with genuine comparative advantage more robustly, by learning the complex interaction between tariff preferences, re-export patterns, and true production capacity.
Anomaly detection in trade data: The methodology requires a “reliability check” to identify reporters whose trade statistics are inconsistent with their trading partners’ mirror statistics. This is pattern recognition — a natural ML task.
Demand elasticity estimation: The demand component uses estimated revenue elasticities of import demand. These come from econometric estimates in the literature. Modern ML approaches (double/debiased ML, causal forests) could potentially improve these estimates, particularly for products with unusual demand curves.
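The mirror-statistics reliability check, for instance, could be sketched as simple outlier detection. This is a hypothetical illustration, not the ITC’s actual procedure, and the threshold is arbitrary:

```python
import numpy as np

# Hypothetical mirror-statistics check: compare a reporter's stated
# exports with its partners' stated imports of the same flows, and flag
# reporters whose log discrepancies are unusually large.
reported_exports = np.array([100., 250., 80., 500., 60.])
mirror_imports   = np.array([105., 240., 85., 200., 58.])  # partners' view

log_gap = np.log(reported_exports) - np.log(mirror_imports)
z = (log_gap - log_gap.mean()) / log_gap.std()
flagged = np.abs(z) > 1.5   # arbitrary illustrative threshold
```

Here only the fourth reporter — claiming $500 of exports against $200 of mirrored imports — gets flagged; the small gaps typical of freight and timing differences do not.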
But the core structural calculation — the Leontief inverse, the employment multiplier chain — remains an econometric construct. Its value is precisely that it is grounded in an explicit theory of production.
What the Debate Gets Wrong #
The econometrics vs ML debate in economics is often framed as a competition, with ML seen as the newer, more powerful approach that economists are reluctantly adopting.
This framing misses the point in both directions.
The economists who argue ML is “just curve fitting” are wrong: ML methods like causal forests and double/debiased ML can recover structural parameters in high-dimensional settings where classical approaches break down.
The ML practitioners who argue econometrics is “just old statistics” are also wrong: an econometric model forces you to specify what you’re trying to estimate, makes causal assumptions explicit, and connects to theory in a way that constrains what conclusions you can draw.
For the export potential problem specifically, the right framing is simpler: the ILO/ITC methodology works because it is fit for purpose. It asks a structural question (what is the employment multiplier of export growth in sector X?) and answers it with a structural tool (Leontief IO analysis). Substituting ML would be like using a regression model to invert a matrix — not wrong in principle, but solving the problem badly when the right solution is available.
The Technical Lesson That Stayed With Me #
Five years on, the thing I think about most from this project is not the econometrics vs ML question. It’s the matrix formulation lesson.
When I switched from nested loops to vectorized operations, the code didn’t just get faster. It became easier to verify. Each matrix operation corresponded to a named economic concept. The broadcasting rules enforced dimensional consistency that the loop version hid. Bugs that would have taken hours to diagnose appeared immediately as shape mismatches.
There’s an epistemological point here: how you implement a computation shapes how you understand it. Writing np.linalg.inv(np.eye(n) - B @ A) forces you to know what n is, what B and A represent, and why you’re subtracting rather than adding. Writing a nested loop lets you avoid all of that — until something goes wrong.
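A small demonstration of that dimensional enforcement, with hypothetical shapes: feed the matrix pipeline a vector of the wrong length and it fails immediately, whereas a loop would happily index its way into plausible-looking nonsense.

```python
import numpy as np

# A 3-sector coefficient matrix and a shock vector of the wrong length.
A = np.eye(3) * 0.2
dx_wrong = np.ones(4)   # should have length 3

try:
    np.linalg.inv(np.eye(3) - A) @ dx_wrong
    raised = False
except ValueError:
    # NumPy's matmul refuses the (3, 3) @ (4,) product outright.
    raised = True
```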
The Technical Appendix of the ILO paper is written entirely in matrix notation for exactly this reason. The notation is the specification. Implementation is translation, and the closer the translation stays to the original language, the less gets lost.
References #
- Cheong, D., Decreux, Y., & Spies, J. (2018). Spotting Export Potential and Implications for Employment in Developing Countries. ILO STRENGTHEN Working Paper No. 5.
- Decreux, Y., & Spies, J. (2016). Export Potential Assessments: A Methodology to Identify Export Opportunities for Developing Countries. ITC.
- Mullainathan, S., & Spiess, J. (2017). Machine Learning: An Applied Econometric Approach. Journal of Economic Perspectives, 31(2), 87–106.
- Hausmann, R., Hwang, J., & Rodrik, D. (2007). What You Export Matters. Journal of Economic Growth, 12(1), 1–25.
- Leontief, W. (1941). The Structure of the American Economy. Oxford University Press.
- O’Hagan, J., & Mooney, D. (1983). Input-Output Multipliers in a Small Open Economy. Economic and Social Review, 14(4), 273–280.