Episode 123 - 2025-11-16
In this episode we had the opportunity to talk with Josiah Parry, who currently works at ESRI on the spatial analytics and data science team (spatial statistics). Josiah started in sociology, moved on to urban informatics, worked at Posit (formerly RStudio) with the government sector, and then joined ESRI, where he plays an important role in bridging the company with the R ecosystem.
Josiah shares his origin story, which many in the data world will relate to: starting in the social sciences (sociology), he realized he needed a way to test theories empirically. On the advice of his uncle, he explored R, and a single simulation in a regression class "pulled him in." That set off a period of self-learning (via podcasts and side projects) in ML, statistics, and GIS, and eventually a master's in urban informatics. He recounts how presenting at a UseR meeting led to a role at RStudio (now Posit) before he moved to ESRI to focus on applied geospatial statistics.
At ESRI, Josiah’s work involves applying machine learning and deep learning to spatial data, a field with unique challenges like the "first law of geography" (near things are more related than distant things). We dove into the role of LLMs in this space. Josiah is pragmatic: while he uses LLMs daily as a "personal tutor" (e.g., to understand complex papers), he cautions that we "should not be trusting models trained on Wikipedia for data." He sees their current strength in ingesting documentation and planning an analysis, not in executing it.
A prolific author of 22 R packages (like sfdep), Josiah explains his motivation: "When there is something I want, that does not exist... then I develop it." He discusses his move into Rust (rextendr), noting that while he defends R against claims of being "slow," he found Rust easier to learn than C++ thanks to its helpful compiler and strong community. He also shares his favorite new developer tools, including the Zed editor and the just command runner.
We also discussed the barriers to R adoption in large organizations like government. Josiah argues the main issue is bureaucracy and IT misunderstanding what R is, rather than a lack of desire from analysts. He emphasizes that the "ggplot gateway drug is real" and that helping people solve their specific problems with R is the key to adoption.
Finally, looking at the future of AI, Josiah warns of a "cycle of non-innovations" if people rely on LLMs to create content without critical oversight. He stresses that these tools need to be "babysat" and treated as interactive guides, not as human-like, autonomous agents.