Final report of my JSMF fellowship

I recently finished my JSMF fellowship at Sant’Anna Pisa, and took up a position as a researcher at CENTAI. I took the opportunity of submitting the final report to JSMF to reflect on the last 3 years, and thought I would share this report as a somewhat autobiographical note for this blog, which I’m unfortunately not maintaining as much as I would like to.

_______________________________________________________________

Thanks to the unique flexibility that the JSMF fellowship provides, during my 3-year postdoctoral period I have been able to translate my relatively vague research proposal into a precise agenda. This happened through the collaboration with multiple people coming from several institutions, and while adapting my plans to contingent events such as the Covid-19 pandemic. In the spirit of the fellowship, I broadened my research areas from theoretical to data-driven modeling.

My JSMF project is titled “A theory of prediction for economic agent-based models”. Agent-based models (ABMs) are computational representations of complex systems in which individual agents interact following simple behavioral rules, and non-obvious patterns emerge from their interactions. I started working on ABMs during my PhD, when I worked on stylized ABMs in game theory, but I wanted to make my ABMs more data-driven. Moreover, I was interested in understanding when ABMs outperform traditional economic models in out-of-sample prediction. In such situations, ABMs would be a more accurate description of reality than traditional models, and this would justify their use for both scientific understanding and policy advice, making them more widely accepted. In the bigger picture, in my view using ABMs rather than traditional models would lead to a better representation of the economy as a complex system.

My initial plans were mostly on the “theory” of data-driven ABMs. I wanted to understand the theoretical conditions under which ABMs outperform traditional models in forecasting, mostly by using synthetic data generated by other models used as ground truth, before attempting to compare predictions in the real world. This has been the focus of Ref. [1]. A key problem for prediction with ABMs is that many variables of individual agents are unobserved, or latent. To the extent that the ABM dynamics depend on the values of these latent individual-level variables, unless these are estimated precisely, the model cannot produce reliable forecasts. Using a specific ABM, we have been able to learn some lessons on the properties of ABMs that make it possible to estimate latent variables. First, the amount of stochasticity of the ABM must be commensurate with data availability. If the model has many stochastic elements providing outcomes that cannot be observed, it is very difficult to write a computationally tractable likelihood function that enables precise estimates of latent variables. Second, the model must be continuous when possible, keeping discrete elements only when discreteness is crucial for the mechanisms that the ABM represents. This work [1] paves the way to a research agenda to make agent-based models “learnable”, i.e. such that their latent variables can be estimated from real-world data, and thus amenable to forecasting.

Zooming further into the structure of ABMs, I started thinking of ABMs as dynamic causal networks in which nodes correspond to the values of variables at a given time step and links indicate a dependency in the computer code describing the ABM. For instance, when z(t) <- x(t) + y(t-1), the causal network would have links from x(t) and y(t-1) to z(t), for all time steps t. Together with colleagues at Sant’Anna Pisa (my JSMF host institution), we developed a programming language that makes it possible to automatically derive the dynamic causal network of an ABM from the model code as it is executed [8]. In work in progress, we are using this causal network formalism to classify simulation models into a taxonomy with several dimensions, such as how stochastic, discrete, interactive, heterogeneous, or complex a model is [9]. This taxonomy is not restricted to ABMs, as it extends to all simulation models. As a first goal, we hope to use this causal network to see which features make ABMs unique; we are the first to analyze this from a formal, rather than conceptual, point of view. This is useful in several respects. First, it would make it possible to open the “black box” of an ABM. Indeed, usual methods treat the ABM as a black box that takes inputs and produces outputs [2, 11], but since we know the code of the model it is a big waste of information not to use it. Moreover, building the causal network of the ABM makes it possible to replace certain parts or the entire model with machine learning metamodels, which may be easier to deal with in the presence of latent variables [1, 10]. Putting all the pieces together, I am getting closer to understanding the theoretical conditions under which ABMs can be used for forecasting.
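To make the causal-network idea concrete, here is a minimal Python sketch of how dependencies could be recorded while a model runs. It is only a toy illustration under stated assumptions, not the programming language described in [8]: the names `Node`, `Traced`, `Tracer` and `simulate` are hypothetical, and x and y are treated as exogenous inputs with made-up values so that the only update rule is z(t) <- x(t) + y(t-1).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A node of the dynamic causal network: one variable at one time step."""
    name: str
    t: int


class Traced:
    """A value that remembers the nodes it was computed from (hypothetical helper)."""

    def __init__(self, value, parents):
        self.value = value
        self.parents = frozenset(parents)

    def __add__(self, other):
        # The result depends on everything either operand depends on.
        return Traced(self.value + other.value, self.parents | other.parents)


class Tracer:
    """Stores variable values and records a link for every read-then-write dependency."""

    def __init__(self):
        self.store = {}     # Node -> numeric value
        self.edges = set()  # (parent Node, child Node) pairs

    def set_exogenous(self, name, t, value):
        self.store[Node(name, t)] = value

    def read(self, name, t):
        node = Node(name, t)
        return Traced(self.store[node], {node})

    def write(self, name, t, traced):
        child = Node(name, t)
        self.store[child] = traced.value
        for parent in traced.parents:
            self.edges.add((parent, child))


def simulate(steps):
    """Run z(t) <- x(t) + y(t-1) for t = 1..steps and return the tracer."""
    tracer = Tracer()
    tracer.set_exogenous("y", 0, 1.0)  # assumed initial condition
    for t in range(1, steps + 1):
        tracer.set_exogenous("x", t, 0.5 * t)  # assumed exogenous input
        tracer.set_exogenous("y", t, 1.0 + t)  # assumed exogenous input
        tracer.write("z", t, tracer.read("x", t) + tracer.read("y", t - 1))
    return tracer


if __name__ == "__main__":
    tracer = simulate(3)
    for parent, child in sorted(tracer.edges, key=lambda e: (e[1].t, e[0].name)):
        print(f"{parent.name}({parent.t}) -> {child.name}({child.t})")
```

Because the links are collected as a side effect of executing the update rule, the network reflects what the code actually did rather than what a modeler believes it does. A real ABM would need tracing for many more operations (branching, random draws, interactions between agents) than the single addition handled here.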

My postdoctoral research also involved applications of data-driven ABMs. This line of work started from the Covid-19 pandemic. Stuck at home during the first lockdown, in spring 2020, my co-authors and I started working intensely on macroeconomic ABMs with an industry-to-industry input-output structure. We wanted to forecast the economic impacts of lockdowns, both at the aggregate level and across specific industries, and also to provide policy recommendations on which industries could be closed to minimize economic harm and maximize health benefits. In an early paper representing the UK economy [3], we predicted a 21.5% reduction in UK GDP in spring 2020, two months before the official release stating a 22.1% contraction. Our forecast was much closer to reality than the one by the Bank of England (around 30%) and the median forecast by several commercial banks and institutions (around 16%). At the policy level, we recommended against closing manufacturing industries, because they are relatively “upstream”, in the sense that they may provide outputs that are necessary inputs for other industries, and at the same time do not involve as many face-to-face contacts as more “downstream” industries such as entertainment and food. Our paper was widely circulated within the UK Treasury.

Building on this paper [3], I proposed to join forces with a team of epidemiologists to create an integrated epidemic-economic ABM that could address the most debated epidemic-economic tradeoffs [6]. This project took more than two years, but in the end we came up with what we think is the most granular and data-driven epidemic-economic model to date. We represent the New York metropolitan area, simulating the mobility and consumption decisions of a synthetic population of half a million individuals that closely resembles the real population. Mobility decisions are obtained from a privacy-preserving algorithm that reads individual-level mobility traces extracted from cell phone data and associates them with synthetic individuals. Households may reduce consumption for fear of infection as the number of Covid-related deaths increases. We find several results, including that epidemic-economic tradeoffs affect low-income more than high-income individuals, and that mandated government closures have tradeoffs similar to those of spontaneous consumption avoidance due to fear of infection.

A last line of research on the applications of data-driven ABMs, which is still in progress, is about housing markets and climate change [13]. We obtained access to a very rich dataset comprising all properties, transactions and mortgages in the Miami area, and we used it to initialize an ABM in which households buy and sell houses. In this ABM, buyers may avoid the properties that are most at risk from sea level rise, and this brings down the value of these properties. We reproduce interesting patterns of climate gentrification, such as the fact that prices are increasing in low-income but relatively high-altitude areas such as Little Haiti and decreasing in high-income low-altitude areas such as Miami Beach, because affluent individuals are moving from low-altitude areas at high risk of sea level rise to safer areas. We plan to use this model to test several climate adaptation strategies and study scenarios according to different climate pathways.

Finally, in addition to the new research lines that the JSMF fellowship enabled me to start, I had the chance to conclude papers on game theory [4, 14] and business cycle synchronization [7].

Following such an ambitious and wide-ranging research agenda has only been possible thanks to the unique characteristics of the JSMF fellowship. First of all, I greatly benefited from interacting with multiple coauthors coming from different backgrounds, many of whom I met when searching for a host institution. As the JSMF fellowship is not tied to a host institution, the search period is a great opportunity for finding new collaborators. Second, because I did not have to adhere to a strict reporting schedule, I had the flexibility to adapt my research to the circumstances, such as the Covid-19 pandemic, enriching my initial plans. Third, the 3-year period of the fellowship gave me time and independence to build my own research agenda. Fourth, the generous research budget made it possible to organize an international workshop on the topics of my fellowship, better connecting with the community working on data-driven ABMs, which I believe is in a great position to bridge theoretical and empirical approaches across disciplines [12].

Thank you for giving me the opportunity to pursue this research line. In my view, many postdoctoral programs around the world should follow in the footsteps of the JSMF Postdoctoral Fellowship.

___________________________________________________

[1] Monti, C., Pangallo, M., Morales, G. D. F., & Bonchi, F. (2023a). On learning agent-based models from data. arXiv:2205.05052.

[2] Borgonovo, E., Pangallo, M., Rivkin, J., Rizzo, L., & Siggelkow, N. (2022). Sensitivity analysis of agent-based models: a new protocol. Computational and Mathematical Organization Theory, 28(1), 52-94.

[3] Pichler, A., Pangallo, M., del Rio-Chanona, R. M., Lafond, F., & Farmer, J. D. (2022). Forecasting the propagation of pandemic shocks with a dynamic input-output model. Journal of Economic Dynamics and Control, 144, 104527.

[4] Heinrich, T., Jang, Y., Mungo, L., Pangallo, M., Scott, A., Tarbush, B., & Wiese, S. (2023). Best-response dynamics, playing sequences, and convergence to equilibrium in random games. arXiv:2101.04222.

[5] Loberto, M., Luciani, A., & Pangallo, M. (2022). What Do Online Listings Tell Us about the Housing Market? International Journal of Central Banking, 18(4), 325.

[6] Pangallo, M., Aleta, A., Chanona, R., Pichler, A., Martín-Corral, D., Chinazzi, M., Lafond, F., Ajelli, M., Moro, E., Moreno, Y., Vespignani, A., & Farmer, J.D. (2022). The unequal effects of the health-economy tradeoff during the COVID-19 pandemic. arXiv:2212.03567.

[7] Pangallo, M. (2023). Synchronization of endogenous business cycles. arXiv:2002.06555.

[8] Comparing causal networks extracted from model code and derived from time series. In preparation.

[9] Quantifying features of simulation models directly from model code. In preparation.

[10] Learning agent-based models through graph neural networks. In preparation.

[11] Pangallo, M., Giachini, D., & Vandin, A. (2023c). Statistical Model Checking of NetLogo Models. In preparation.

[12] Prediction and understanding in data-driven agent-based models. In preparation.

[13] Pangallo, M., Coronese, M., Lamperti, F., Cervone, G., & Chiaromonte, F. (2023d). Climate change attitudes in a data-driven agent-based model of the housing market. In preparation.

[14] Best-response dynamics in multiplayer network games. In preparation.