Understanding terrestrial ecosystems and their response to anthropogenic climate change requires quantification of land-atmosphere carbon exchange. However, top-down and bottom-up estimates of large-scale land-atmosphere fluxes, including the northern extratropical growing season net flux (GSNF), show significant discrepancies. We developed a data-driven metric for the GSNF using atmospheric carbon dioxide concentration observations collected during the High-Performance Instrumented Airborne Platform for Environmental Research Pole-to-Pole Observations and Atmospheric Tomography Mission flight campaigns. This aircraft-derived metric is bias-corrected using three independent atmospheric inversion systems. We estimate the northern extratropical GSNF to be 5.7 ± 0.3 Pg C and use it to evaluate net biosphere productivity from the Coupled Model Intercomparison Project phase 5 and 6 (CMIP5 and CMIP6) models. While the model-to-model spread in the GSNF has decreased in the CMIP6 models relative to that of the CMIP5 models, there is still disagreement on the magnitude and timing of seasonal carbon uptake with most models underestimating the GSNF and overestimating the length of the growing season relative to the observations. We also use an emergent constraint approach to estimate annual northern extratropical gross primary productivity to be 56 ± 17 Pg C, heterotrophic respiration to be 25 ± 13 Pg C, and net primary productivity to be 28 ± 12 Pg C. The flux inferred from these aircraft observations provides an additional constraint on large-scale gross fluxes in prognostic Earth system models that may ultimately improve our ability to accurately predict carbon-climate feedbacks.

Plain Language Summary

The exchange of carbon between the land and atmosphere is an important part of the Earth's climate, and this exchange might change due to human-caused climate change.
However, estimates of land-atmosphere carbon fluxes made using different techniques do not agree with each other. We use atmospheric carbon dioxide observations collected during two flight campaigns to show that 5.7 Pg C is exchanged between the atmosphere and the land in the northern hemisphere during the summer growing season. This estimate is used to evaluate the performance of two generations of climate prediction models. The newer generation of models shows less spread than the older generation, but the models still disagree significantly on the magnitude and timing of land-atmosphere carbon exchange. Most models underestimate the growing season net flux and overestimate the length of the growing season. We also use our observational estimate to reduce the spread in estimates of the component fluxes of carbon exchange, namely uptake by photosynthesis and release by respiration.