Women into Maths PhDs! Or: why qualified women don’t apply.

Recently, a friend of mine shared an article by the Harvard Business Review on the different application strategies of men and women: While men apply to a job when they meet around 60% of the criteria mentioned in the advert, women only apply if they are 100% qualified! The article then carefully explains how women are not confident enough in their abilities to apply, even if they are highly skilled and would excel in the advertised role.

Of course, this issue appears across all domains, but it is especially prevalent in fields where women are already underrepresented – one of them being mathematical research. According to the LMS report 2013, only about 6% of maths professors in the UK are female, even though women make up 40% of undergraduate mathematics students in the UK! Women do not continue their academic careers as often as their male colleagues do, a phenomenon known as the “leaky pipeline”.

What we can do.

To change the status quo, I have teamed up with two colleagues and founded the Piscopia Initiative, named after Elena Piscopia, the first woman ever to be awarded a PhD. Keeping the aforementioned article in mind, we want to boost the confidence of already highly qualified women and non-binary students to encourage them to submit competitive PhD applications in maths and adjacent fields.

After highly successful campus and online events across Scotland, we are now organising PiFORUM 2020, a virtual conference featuring an application bootcamp. Interested students can join us from the 7th to the 11th of September to connect with other students and learn more about PhD life. There will be copious opportunities to ask questions about the application process and to work on their own applications. We will even conduct individual mock interviews to prepare the participants in the best way possible for their steps towards their PhD.

Applications are now open!

Thanks to our sponsors, the event is free to attend, and all participants will receive a goody bag with supplementary materials and a few nice things to get them through the week. So if you like maths but are not convinced a PhD is for you, please apply for PiFORUM 2020 to find out more about the possibilities you have! Similarly, if you are a lecturer, personal tutor, or professor, please encourage your women and non-binary students to join us for PiFORUM 2020!

Combatting the leaky pipeline. One application at a time.

All info:

Data Science in a Post-Corona World

The recent Corona pandemic is a world-wide catastrophe, impacting every aspect of our lives. As a PhD student in Statistics, I want to share some thoughts on how this crisis could substantially change the way we do Data Science in a post-Corona world. While these aspects are minuscule in comparison to the humanitarian crises we are facing, I nevertheless want to add this fragment to the many aspects in which COVID-19 influences us.

Data Science has two main goals. First, we want to explain the past data we have observed: Why are people switching from product A to product B? How are the votes in the recent election and the Brexit referendum related? All these questions relate to the data we have seen. The second goal of Data Science is to predict the future: How many units of product A will be sold next week? How will people vote in the next election? Here, we utilise the data we have observed to make “educated guesses” about the future.

Usually, the more data is available, the better our predictions become. This mechanism works well when the future is rather similar to the past. However, we may observe some events that don’t fit in – so-called “outliers”. For example, when looking at stock return data, the financial crisis of 2007/08 is clearly visible. However, despite this big shock, most stocks returned to their normal levels a year or two later.

Return of the Dow Jones index. Picture generated from Yahoo Finance.

Life will not be the same after Corona, and neither will Data Science.

Virtually any data is influenced by the current Corona pandemic. The stock market is hit heavily. Unemployment rates sky-rocket. Sales of most products go down, while they soar for toilet paper and pasta. CO2 emissions are at a historic low. The impact of the pandemic on any data collection gives rise to new challenges for Data Scientists. Corona changes how humans buy, eat, work – well, live. These changes don’t just last for a few days, but for a substantial amount of time. As restrictions will only be lifted gradually, it might take a while until we are “back to normal”.

Besides, we don’t even know if and how that “normal” will be similar to what we were used to before 2020. Maybe we’ll spend more time with our family and friends instead of going on the fifth city break this year. Maybe we will buy more regional products as we have realised how important a strong local economy is. While this is only speculation, it is essential to realise that we just don’t know what the future will look like compared to a pre-COVID past.

Both issues, a long abnormal period and a shift of what is “normal”, constitute special challenges for Data Science. From now on, anyone doing data analysis will have to deal with those questions. We can’t just exclude the first half of 2020, as that would leave too big a gap in our analysis. Additionally, past data might not be particularly useful to predict the future anyway. For example, we wouldn’t use data from 2010 to predict today’s internet usage. In the same spirit, data collected before the pandemic might not be that informative for predicting the post-Corona world.

Luckily, statisticians are not new to these questions.

So-called “Change Point Models” are a well-established method for spotting changes in a time series and estimating what happens before and after these change points. From now on, any time series model has to work with data influenced by the Corona pandemic, and Change Point Models are an elegant way of dealing with the resulting issues.
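To make the idea a bit more tangible, here is a minimal sketch of the simplest possible change point model: a single shift in the mean, located by trying every split and keeping the one with the smallest squared error. The function names and the weekly sales numbers are made up for illustration, and real change point methods handle multiple changes, trends and noise far more carefully than this.

```python
from statistics import fmean


def sum_sq_dev(values):
    """Sum of squared deviations of the values from their own mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def find_single_change_point(series, min_segment=2):
    """Return (index, mean_before, mean_after) for the best single split.

    An index k means series[:k] is the 'before' segment and
    series[k:] is the 'after' segment.
    """
    if len(series) < 2 * min_segment:
        raise ValueError("series too short for the requested segment length")
    best_k, best_cost = None, float("inf")
    # Try every admissible split and keep the one with the lowest total error.
    for k in range(min_segment, len(series) - min_segment + 1):
        cost = sum_sq_dev(series[:k]) + sum_sq_dev(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, fmean(series[:best_k]), fmean(series[best_k:])


# Hypothetical weekly sales: stable around 100, then a drop to around 60.
weekly_sales = [101, 99, 100, 102, 98, 100, 61, 59, 60, 62, 58, 60]
change_index, mean_before, mean_after = find_single_change_point(weekly_sales)
print(f"change at index {change_index}: {mean_before:.1f} -> {mean_after:.1f}")
```

On this toy series the split lands right where the drop happens, and the two segment means describe the “before” and “after” regimes separately instead of averaging them into one misleading number.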

This relates both to explaining past data and predicting new data. Change Point Models could do a good job at capturing this extraordinary period. However, predictions as to what the future holds could be less certain, as past data (i.e. pre-Corona) becomes less relevant. Any predictions we make will be based on less – or less relevant – data.
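A rough way to see why less data means less certainty: the estimated standard error of an average shrinks with the square root of the number of observations. The sketch below compares a long and a short series with the same spread. The numbers are invented, and treating observations as independent is a simplifying assumption that real time series usually violate.

```python
from math import sqrt
from statistics import fmean, stdev


def standard_error_of_mean(values):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Hypothetical series with the same spread but very different lengths.
long_history = [58, 62] * 50   # 100 observations, e.g. years of "normal" data
short_history = [58, 62] * 2   # 4 observations, e.g. only post-change data

se_long = standard_error_of_mean(long_history)
se_short = standard_error_of_mean(short_history)

print(f"n={len(long_history)}: mean={fmean(long_history):.1f}, standard error={se_long:.2f}")
print(f"n={len(short_history)}: mean={fmean(short_history):.1f}, standard error={se_short:.2f}")
```

Both series have the same average, but the estimate based on only four points is several times less precise. If only the data after a change point is still relevant, this is the price we pay.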

The Corona crisis substantially changes the way we analyse data, as virtually any data set will be affected by the pandemic. Fortunately, statisticians have already developed tools to deal with some of these issues. At the same time, older historical data might become worthless as it is no longer relevant for what the future holds. Generally, this leads to more uncertainty in our predictions. Back in 2017, the Economist proclaimed that data would be the new oil. Well, relevant data might be the new diamond – and data scientists its diamond cutters.