One Health Vector-Borne Diseases Hub (VBD Hub) - Learn

Training workshop on data sharing and analysis

Travel, accommodation and food

All of the training activities and most of the social activities (including lunches) will take place at the Silwood Park campus of Imperial College London. Participants who are not commuting and who requested accommodation are housed at Travelodge Egham in nearby Egham. Please keep your receipts for your inward and outward travel. Trains are preferred whenever possible, but taxi transfers from airports are also acceptable when appropriate. Egham train station is well served, with trains to London Waterloo every 30 minutes (please check Trainline).

Travel between Silwood Park and Egham

We have arranged taxis to collect you each day at 8:50 AM and bring you to Silwood Park. Please meet in the lobby of Travelodge Egham. Taxis will also take you back to the hotel each evening.

Food and refreshments

Hotel bookings include a breakfast box; please ask for it at the hotel. Lunches and refreshments will be provided during the workshop. Please let us know if you have any dietary requirements or allergies. We have also made group dinner plans for Wednesday and Thursday evenings. Details will be provided during the workshop.

Prerequisite material (optional but recommended)

We are using the R language for all data manipulation, analysis and model fitting. Any operating system (Windows, Mac, Linux) will do, as long as you have R (version 4 or higher) installed.
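One quick way to confirm your setup before the workshop is to check the installed version from the R console. The snippet below is a minimal sketch using base R only; the `4.0.0` threshold is our reading of the "version 4 or higher" requirement above.

```r
# Report the installed R version and check it against the workshop requirement.
r_version <- getRversion()
meets_requirement <- r_version >= "4.0.0"

cat("Installed R version:", as.character(r_version), "\n")
if (meets_requirement) {
  cat("OK: R 4 or higher is installed.\n")
} else {
  cat("Please update R to version 4 or higher before the workshop.\n")
}
```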

You may use any IDE for R (VS Code, RStudio, DataSpell, etc.). RStudio is a good option for most people. Whichever one you choose, please make sure it is installed and tested before the workshop.

We assume familiarity with R basics. In case you need a refresher, here's a 30-minute article on syntax.

In addition, we recommend that you do the following:

Although most sessions will start with some input from the instructors, much of the synchronous time is dedicated to helping you work through exercises and activities.

If you find any of this material difficult, please do not feel discouraged. The workshop is intentionally challenging, but it is designed for people with various levels of proficiency.

Course timetable

  • Arrival and check-in to accommodation.
  • Day 1
    • 09:30 Welcome and Overview: Informatics for Vector Borne Disease Responses
    • 10:00 Introduction of the Hub and Data Wrangling (where you can get data)
    • 11:00 Break
    • 11:15 Tutorial: Data wrangling and visualising data
    • 12:00 Lunch
    • 13:00 Tick Drag Activity
    • 15:00 Break
    • 15:30 Continue: Data wrangling and visualising data
  • Day 2
    • 09:00 Linear vs Nonlinear Models
    • 10:30 Break
    • 10:50 Introduction to time series
    • 12:00 Lunch
    • 13:00 Parallel tracks (choose one):
      • Track 1: Time series (classical models, decomposition, lags, autoregressive/ARIMA models)
      • Track 2: Modelling vector distributions through time and space (forecast modelling frameworks, GBIF & biomod2, spatial & temporal trends)
    • 16:30 Choosing capstone projects
  • Day 3
    • 09:00 Capstone: Work on analysis
    • 10:30 Break
    • 11:00 Continue: Work on analysis
    • 12:00 Lunch
    • 13:00 Continue: Work on analysis
    • 15:00 Break
    • 15:30 Presentations
    • 16:30 Discussion & Wrap-up

Course content

Day 1

Welcome and overview (9:30)

Welcome and overview of the workshop, including introductions of the participants. Includes a short presentation by Dr. Lauren Cator.

Resources:

Introduction of the VBD Hub (10:00)

Introduction to the VBD Hub, its purpose, and how it can be used for data sharing and collaboration. Presented by Dr. Francis Windram, Stanislav Modrak and Sarah Kelly.

Resources:

Tick drag activity (11:15)

A hands-on activity where participants will try a field technique used for collecting ticks when conducting abundance studies. Led by Dr. Lauren Cator, Dr. Marion England and Dr. Hannah Vineer.

Activities:

  1. go to the field and collect ticks using a drag method

Resources:

Introduction to data wrangling and visualising data (13:00)

Overview of data wrangling techniques in R. Led by Dr. Francis Windram.

Activities:

  1. (optionally) read and work through Biological Computing in R
  2. read and work through Data Management and Visualisation (only up to, but not including, the section "Beautiful Graphics in R")

Resources:

Tutorial on data wrangling and visualising data (15:00)

A hands-on tutorial on data wrangling and visualising data in R. Participants will learn how to manipulate and visualise data using R packages such as dplyr and ggplot2. Led by Dr. Francis Windram and Prof. Samraat Pawar.
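To give a flavour of this kind of workflow, here is a minimal sketch that is not taken from the course material. It assumes dplyr and ggplot2 are installed, and the `trap_counts` data frame and its column names are invented purely for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical trap counts, invented for illustration only.
trap_counts <- data.frame(
  site  = c("A", "A", "A", "B", "B", "B"),
  week  = c(1, 2, 3, 1, 2, 3),
  count = c(4, 9, 7, 2, NA, 5)
)

# Wrangle: drop rows with missing counts.
clean_counts <- trap_counts %>%
  filter(!is.na(count))

# Summarise: mean count and number of sampled weeks per site.
site_summary <- clean_counts %>%
  group_by(site) %>%
  summarise(mean_count = mean(count), n_weeks = n(), .groups = "drop")

print(site_summary)

# Visualise: weekly counts per site as lines with points.
count_plot <- ggplot(clean_counts, aes(x = week, y = count, colour = site)) +
  geom_line() +
  geom_point() +
  labs(x = "Week", y = "Count", colour = "Site")

print(count_plot)
```

The `%>%` pipe is used here rather than the native `|>` pipe so that the sketch also runs on R 4.0.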

Activities:

  1. read and work through Data Management and Visualisation (starting from the section "Beautiful Graphics in R")

Resources:

Day 2

Linear vs Nonlinear Models (9:00)

Introduction to linear and nonlinear models, including their applications in ecological data analysis. Led by Prof. Samraat Pawar.

Activities:

  1. work through the Linear regression
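As a rough illustration of the contrast between linear and nonlinear models, the sketch below fits both to the same simulated data using base R. The data, the saturating curve, and the starting values are all assumptions chosen for the example, not part of the course material.

```r
# Simulated data, invented for illustration: a saturating response plus noise.
set.seed(1)
x <- seq(1, 20, length.out = 40)
y <- 10 * x / (3 + x) + rnorm(length(x), sd = 0.3)

# Linear model: y changes by a constant amount per unit of x.
fit_linear <- lm(y ~ x)

# Nonlinear model: a saturating curve fitted by nonlinear least squares.
# nls() needs starting values; these are rough guesses.
fit_nonlinear <- nls(y ~ a * x / (b + x), start = list(a = 8, b = 2))

# Compare the two fits; a lower AIC indicates better support from the data.
print(AIC(fit_linear, fit_nonlinear))
print(coef(fit_nonlinear))
```

Because the data were simulated from a saturating curve, the nonlinear fit should be preferred here; with real data the comparison can go either way.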

Introduction to time series (10:50)

Overview of time series analysis and its applications in ecological data. Led by Dr. Francis Windram.

Activities:

  1. work through the Introduction to Time Series
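For orientation, the following minimal sketch shows how a time series can be represented and inspected in base R. The monthly counts are simulated, and the start date is arbitrary; neither comes from the course material.

```r
# Simulated monthly counts, invented for illustration: seasonal cycle plus noise.
set.seed(42)
n_months <- 36
month_index <- seq_len(n_months)
counts <- 20 + 8 * sin(2 * pi * month_index / 12) + rnorm(n_months, sd = 2)

# Store as a monthly time series starting in January 2021 (arbitrary start).
counts_ts <- ts(counts, start = c(2021, 1), frequency = 12)

# Inspect the time attributes of the series.
print(start(counts_ts))
print(end(counts_ts))
print(frequency(counts_ts))

# Extract a single year using window().
year_two <- window(counts_ts, start = c(2022, 1), end = c(2022, 12))

# Plot the full series.
plot(counts_ts, xlab = "Year", ylab = "Count")
```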

Choosing a track (13:00)

Participants will choose one of the two tracks. In the first track, they will learn about classical time series models, decomposition, lags, and autoregressive/ARIMA models. In the second track, they will learn about modelling vector distributions through time and space, including forecast modelling frameworks, GBIF & biomod2, and spatial & temporal trends.

Track 1: Time series - classical models, decomposition, lags, autoregressive/ARIMA models

Led by Dr. Francis Windram.

Activities:

  1. continue the Introduction to Time Series
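
The topics in this track can be sketched with base R functions as follows. This is an illustrative example on simulated data, not the course's own exercise; the AR(1) order and all parameter values are assumptions.

```r
# Simulated monthly series, invented for illustration:
# a seasonal cycle plus autocorrelated (AR(1)) noise.
set.seed(7)
n_months <- 120
seasonal <- 5 * sin(2 * pi * seq_len(n_months) / 12)
noise <- as.numeric(arima.sim(model = list(ar = 0.6), n = n_months))
series <- ts(30 + seasonal + noise, start = c(2015, 1), frequency = 12)

# Decomposition: split the series into trend, seasonal and random parts.
parts <- decompose(series)

# Lags: autocorrelation of the series at lags of up to 24 months.
series_acf <- acf(series, lag.max = 24, plot = FALSE)

# Autoregressive model: fit an AR(1) to the seasonally adjusted series.
adjusted <- series - parts$seasonal
fit <- arima(adjusted, order = c(1, 0, 0))
print(fit)

# Forecast the adjusted series six months ahead.
forecast_six <- predict(fit, n.ahead = 6)
print(forecast_six$pred)
```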

Track 2: Modelling vector distributions through time and space - forecast modelling frameworks, GBIF & biomod2, spatial & temporal trends

Led by Dr. Will Pearse and Dr. Josh Tyler, with help from Nathan Clark. This track is sponsored by the SPHERE-PPL project.

Activities:

  1. follow the external resources on Temporal & Spatial Modelling of Disease Vectors

Resources:

Choosing capstone projects (16:30)

Participants will choose their capstone projects for the final day of the workshop. They will work in groups to develop a project plan and outline. Led by Dr. Lauren Cator.

Day 3

Working on capstone projects (9:00)

Participants will work in groups on their capstone projects, applying the skills and knowledge gained during the workshop. They will receive guidance and support from the instructors.

Presentations (15:30)

Participants will present their capstone projects to the group. Each group will have 5 minutes to present, followed by a Q&A session. Led by Dr. Lauren Cator.

Instructors

Lauren Cator researches the role of mosquito behaviour and ecology in disease transmission at Imperial College London. Lauren leads the Hub and is responsible for overall project management, coordination of the team, and engagement with the wider UK VBD research community.

Samraat Pawar studies how individual-level metabolism scales up through species (population) interactions to community- and ecosystem-level dynamics at Imperial College London. Samraat is supporting integration of existing repositories with the Hub and the development of software for working with the data.

Will Pearse develops new statistical and computational tools to answer fundamental questions about the origins and future of biodiversity, and applies those insights to improve human wellbeing at Imperial College London. In this project, Will is focussed on how best to link environmental data with other types of biological data important for understanding VBD transmission.

Francis Windram is a postdoctoral research associate (PDRA) on the Hub at Imperial College London, where he develops tools and visualisations for disease vector trait and population data. During his PhD, he created computational imaging methods to extract traits from the webs of UK orb-weaving spiders. Aside from science, Francis is also an avid musician, climber, and nature enthusiast.

Josh Tyler is a postdoc at the Turing Institute with a focus on biodiversity and modelling. He is particularly interested in understanding the extent to which evolution and ecology are predictable and how we can use advances in simulation and statistics to model past and future biodiversity. His current project looks at how Bayesian methods, such as PGLMMs, can be used to better elucidate patterns in macroecology & macroevolution.

Supporting staff

Sarah Kelly is the data curator for the Hub. She predominantly focuses on relationship building with data depositors and on data wrangling. For the last 9 years Sarah has worked as part of VEuPathDB, funded by NIAID, curating both entomological and epidemiological data. When she isn’t curating data, you will find her running, swimming and cycling around the coastline and camping on hilltops.

Stanislav Modrak, based at Imperial College London, is the software engineer behind the Hub platform. He has previously worked on risk analysis and compliance in cryptocurrency markets, digital bureaucracy and e-government platforms.