Hi there!

I’m a scientist at the Helmholtz Centre for Environmental Research and a trained computational social scientist studying the impacts of climatic extreme events using text-as-data. In my Ph.D. I focused on assessing, modeling, and understanding the impacts of natural hazards (droughts) on social-ecological systems. My methods include text mining, machine learning, natural language processing, and participatory modeling. Alongside my academic work, I’m deeply interested in sports analytics based on large-scale data analysis and in the integration of computational tools into journalism.

You can contact me via jansodoge(at)protonmail.de

Interests
  • Natural language processing
  • Natural hazards and disasters
  • Machine learning
  • Big data approaches for handling text data
  • Participatory modeling
Education
  • PhD in Geography - Environmental Risk Research, 2021 - 2024

    Helmholtz Centre for Environmental Research, Germany

  • MSc in Computational Social Science, 2021

    Linköping University, Sweden

  • BSc in Environmental System Science & Geography, 2019

    University of Osnabrück, Germany

Portfolio

Research and non-research projects

Monitoring Impacts of Natural Hazards Using Natural Language Processing
Understanding how natural hazards affect societies across different sectors is critical for developing effective adaptation strategies and early warning systems. Yet, data on these impacts remain scarce.
Combining machine learning and stakeholder expertise to uncover drought-society interactions
Understanding how drought affects society is key to building more resilient systems. In my research, I explore these interactions between drought and society using (1) stakeholder expertise mapped through participatory modelling and (2) machine learning on large datasets of biophysical drought indicators and socio-economic impacts. Both approaches aim to uncover patterns, feedbacks, and pathways of vulnerability that are often missed in conventional analyses.
Digital tools for recommending actually relevant talks at conferences using deep learning
Attending the European Geosciences Union (EGU) General Assembly for the first time in 2022 was both eye-opening and overwhelming. With more than 15,000 presentations spread across hundreds of sessions, I faced the classic conference conundrum: how do you find the talks that truly matter to your work?
aweSOM – Training and Visualizing Self-Organizing Maps in R
Dimensionality reduction is a powerful technique for making sense of high-dimensional data—and self-organizing maps (SOMs) are one of the more intuitive and visually appealing approaches in this space. After reading this paper, I became especially interested in how SOMs can be used to explore complex social and economic datasets.
Basketball Analytics – Supporting Professional Coaching with Data
During my university studies, I consulted for professional German basketball teams to integrate advanced analytics into their coaching workflows. Collaborating with the coaching staff, I developed customized tools and metrics that go beyond standard box scores, such as lineup efficiencies and spatial shooting patterns.

Recent Publications

(2025). Comprehensive assessment of flood socioeconomic impacts through text-mining.

(2025). What constitutes sustainable agriculture for different audiences in Germany? A comparative analysis of large-scale text data.

(2024). Flash droughts and their impacts—using newspaper articles to assess the perceived consequences of rapidly emerging droughts.

(2024). Text mining uncovers the unique dynamics of socio-economic impacts of the 2018–2022 multi-year drought in Germany.

(2024). Unified in diversity - Unravelling emerging knowledge on drought impact cascades via participatory modeling.