Request for Expression of Interest: STC - Data Scientist



The Global Environment Facility (GEF) provides support to address global environmental concerns related to biodiversity, climate change, international waters, land degradation, and chemicals and waste. Since its inception in 1991, the GEF has provided US $25 billion in grants to developing countries and countries with economies in transition and has mobilized more than US $138 billion in cofinancing. These grants are implemented on the ground through a network of 18 accredited agencies (GEF Agencies). The GEF receives its funds through a four-year replenishment cycle.

The Independent Evaluation Office (GEF IEO) has a central role in ensuring the independent evaluation function within the GEF. The GEF IEO is based in Washington DC. It is administered by the World Bank but is independent of its management as well as the management of the GEF. The GEF IEO reports directly to the GEF Council, the GEF governing body. All contracts with the IEO are World Bank contracts.

The GEF IEO undertakes independent evaluations on issues relevant to the GEF's performance. These cover GEF policies and processes as well as the projects and programs funded by the GEF. The GEF IEO is undertaking the Eighth Comprehensive Evaluation (OPS8) to inform the replenishment process for the GEF-9 period. The audience for OPS8 comprises replenishment participants, the GEF Council, the GEF Assembly, members of the GEF, and external stakeholders. To prepare OPS8, the GEF IEO will draw from its evaluations, including evaluations on leveraging technologies for the environment and on GEF support to policy coherence.

The GEF IEO seeks a highly motivated Data Scientist to use advanced text analytics, machine learning, and artificial intelligence to extract and analyze text from evaluations and program/project documents, in particular (but not limited to) terminal evaluation reports (TEs), terminal evaluation report reviews (TERs), requests for project endorsement/approval, project documents, and program framework documents (PFDs). The goal of this consultancy is to extract, share, and regularly update data and lessons not currently available in a structured format, help identify portfolios of projects relevant to specific themes (e.g., leverage of new technologies, support to policy coherence), and distill further insights from existing evaluative evidence.

Responsibilities and Accountabilities

The consultant will perform the following tasks:

  • Extract, analyze, and prepare source data from documents produced by different GEF Agencies, working with structured and mainly unstructured text data
  • Apply Data Science/Natural Language Processing/AI/Machine Learning techniques to unstructured data
  • Build and/or optimize models and algorithms for identifying documents that are relevant to the identified themes
  • Consult with internal experts to create and validate taxonomies, and to modify existing ones where needed
  • Consult with internal sectoral experts to create and validate robust training data
  • Provide expert guidance on prompt engineering


Candidates must have the following qualifications:

  • Master’s degree in Computer Science, Mathematics, Statistics or related field.
  • Must have extensive programming experience with Python – especially for text analytics and topic modeling.
  • Must also be proficient in R programming.
  • At least three years of relevant experience in setting up supervised and unsupervised ML/NLP models, including data cleaning, data analytics, model selection, and performance metrics.
  • Strong Microsoft Office skills, with advanced knowledge of MS Excel.
  • Strong communication and collaboration skills with the ability to present and translate complex information to non-technical teams in relevant business terms.
  • Attention to detail and advanced organizational skills.
  • Fluent in English.
  • Knowledge of data visualization (especially Tableau) is an advantage.
  • Experience in the environment sector, especially in relation to innovative and novel technologies and policy coherence, is a major advantage.

Contract duration

The tasks will be undertaken in the period from August 2024 to June 2025. The selected consultant will work for 25 days, with potential for extension based on initial results.


Interested candidates should submit their CV along with a cover letter at with "STC Data Science" as the subject line. The deadline for submission is August 2, 2024.