Machine Learning for Cost-Efficient Reaction Yield Prediction in Chemistry
Applications are now CLOSED
Supervisors: Dr Thai-Son Mai (EEECS) and Dr Paul Dingwall (CCE)
Predicting the yield of a chemical reaction is one of the most important emerging tasks in chemistry. Reaction yield models, which describe the percentage of the desired product obtained from the reactants, can be used to select high-performing reactions, thus reducing unnecessary attempts, time and cost. However, most existing models require a huge number of reaction results as input, which are very expensive and time-consuming to obtain. The goal of this project is to develop effective machine learning techniques (supervised or unsupervised) that predict reaction yield from the input chemical compounds and other conditions, such as temperature and pressure, under a limited budget. That is, these techniques should work well with a small number of input reactions, reducing overall cost while still maintaining high prediction accuracy.
Predicting the performance of a reaction is a challenging task, even more so when also attempting to predict optimal ‘above-the-arrow’ conditions, an integral element in automated retrosynthetic planning (Nature 2019, 570, 175). The rare reports that do so (Science, 2018, 360, 186; JACS, 2018, 140, 5004; Nature 2018, 559, 377) use machine learning algorithms that are fed a large volume of single timepoint yield (STPY) data, generated via high-throughput experimentation, and an even larger volume of computed molecular descriptors, which quantitatively describe properties of the molecules involved in the reaction (Acc. Chem. Res. 2016, 49, 1292; Chem. Cent. J. 2015, 9, 38), to create predictive models. Although these methods have proven very effective, with high prediction accuracy, they require a huge amount of STPY data to train the prediction models, and obtaining these data is very time-consuming and expensive. How to predict reaction yield effectively from a limited amount of STPY data, and so reduce these costs, is therefore a crucial and interesting question for both the chemistry and machine learning communities.
Impact: This project directly addresses two Grand Challenge areas highlighted in the UK’s industrial strategy. The chemical sector contributes £258 bn per annum to the UK economy, highlighting its importance to the UK. The sector also contributes 4.4% of the UK’s CO2 emissions and consumes 30% of the UK’s industrial energy supply. In accord with the Paris Agreement, the Government is required to reduce emissions by 80% by 2050. Developing more efficient and effective chemical reactions, particularly through catalysis, to reduce consumption of energy and raw materials is a crucial part of achieving this ambitious target.
Approach: We propose to build an active, iterative machine learning model to predict reaction yield. The model can start with a very small amount of STPY training data to initialise the prediction process. It then iteratively assesses its overall performance and actively chooses a subset of reactions whose yields, once measured, are most likely to improve prediction performance; these new results are added to the training data. The whole process is repeated until the budget limit is reached. In this way, we can maximise overall performance at a much lower cost than traditional prediction methods. Our approach consists of two major parts:
1) Model Training: We propose to develop an iterative ensemble approach that combines multiple supervised learning methods (e.g., Support Vector Machines (SVMs) or Neural Networks (NNs)) in a single model instead of using only one learning algorithm, as in existing works. This is expected to bring a significant boost to performance and accuracy, especially when dealing with small training sets. Importantly, sub-optimal reaction conditions will be included in the data gathered. Lack of this negative data is a major failing of mining literature data, where there is a persistent bias of reporting only successful results (Science 2018, 361, 569).
2) Active Learning: The most significant element in our approach. Starting with a small training set, our algorithm will suggest experiments to gather new reaction profiles as it progresses (Aggarwal, C. C. et al. In Data Classification, 2014; Settles, B. Active learning literature survey, 2009). The user collects these data and the active learning process is iterated. Active learning for this purpose is an emerging research topic (e.g., ACS Cent. Sci. 2017, 3, 1337). It results in a cost-effective procedure that will minimise the total number of reaction profiles used for training models, thus reducing experimental costs. It can also enhance prediction accuracy by strengthening decision boundaries between close groups.
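The two parts above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a query-by-committee strategy (one of the strategies covered in the Settles survey), uses a committee of simple k-nearest-neighbour regressors as a stand-in for the SVM/NN ensemble members, and replaces the laboratory measurement with a hypothetical function `toy_yield`. All function names, the two scaled input conditions and the budget values are assumptions made for illustration only.

```python
import random
import statistics


def knn_predict(train, x, k):
    """Average the yields of the k training reactions closest to x."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return statistics.fmean(y for _, y in nearest)


def fit_committee(train, rng, n_members=5):
    """Build committee members as (bootstrap resample, k) pairs.

    Stand-in for the SVM/NN ensemble: each member sees a different
    bootstrap resample, with k cycling through 1, 2, 3.
    """
    return [([rng.choice(train) for _ in train], 1 + m % 3) for m in range(n_members)]


def predict_all(committee, x):
    """Return one yield prediction per committee member for conditions x."""
    return [knn_predict(sample, x, k) for sample, k in committee]


def active_learning(pool, run_experiment, n_initial=4, budget=12, seed=0):
    """Label reactions one at a time until `budget` experiments have been run."""
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    # Start from a small random training set.
    train = [(x, run_experiment(x)) for x in pool[:n_initial]]
    pool = pool[n_initial:]
    while len(train) < budget and pool:
        committee = fit_committee(train, rng)
        # Query the candidate on which the committee disagrees most.
        query = max(
            pool, key=lambda x: statistics.pvariance(predict_all(committee, x))
        )
        pool.remove(query)
        train.append((query, run_experiment(query)))
    return train, fit_committee(train, rng)


def toy_yield(x):
    """Hypothetical stand-in for a lab measurement: yield (%) peaks at (0.6, 0.4)."""
    t, c = x
    return max(0.0, 100.0 * (1.0 - 2.0 * ((t - 0.6) ** 2 + (c - 0.4) ** 2)))


# Candidate reactions: two conditions (e.g. temperature, pressure) scaled to [0, 1].
grid = [(i / 9, j / 9) for i in range(10) for j in range(10)]
labelled, committee = active_learning(grid, toy_yield, n_initial=4, budget=12)
prediction = statistics.fmean(predict_all(committee, (0.6, 0.4)))
print(len(labelled), round(prediction, 1))
```

Here the budget counts every experiment, including the initial ones, so the loop stops after 12 of the 100 candidate reactions have been measured. In practice `run_experiment` would be replaced by the step in which the user collects the requested yield data.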
Current funding: Part of this project has been accepted as an EPSRC pilot research project between the School of Electronics, Electrical Engineering and Computer Science (Dr. Son Mai) and the School of Chemistry and Chemical Engineering (Dr. Paul Dingwall), in collaboration with industry partners.
Electrical & Electronic Engineering overview
The School of Electronics, Electrical Engineering and Computer Science (EEECS) aims to enhance the way we use technology in communication, data science, computing systems, cyber security, power electronics, intelligent control, and many related areas.
You’ll be part of a dynamic doctoral research environment and will study alongside students from over 40 countries worldwide; we supervise students undertaking research in key areas of electronics and electrical engineering, including: power electronics, robotics, wireless communications, cybersecurity and sensor-based systems. As part of a lively community of over 100 full-time and part-time research students you’ll have the opportunity to develop your research potential in a vibrant research community that prioritises the cross-fertilisation of ideas and innovation in the advancement of knowledge.
Within the School we have a number of specialist research centres including a Global Research Institute, the Institute of Electronics, Communications and Information Technology (ECIT) specialising in Cyber Security, Wireless Innovation and Data Science and scalable computing.
Many PhD studentships attract scholarships and top-up supplements. PhD programmes provide our students with the opportunity to acquire an extensive training in research techniques.
Electrical & Electronic Engineering Highlights
- ECIT brings together, in one building, internationally recognised research groups specialising in key areas of advanced digital and communications technology.
- CSIT brings together research specialists in complementary fields such as data security, network security systems, wireless-enabled security systems, intelligent surveillance systems; and serves as the national point of reference for knowledge transfer in these areas.
- Electric Power and Energy Systems research is focused on problems related to distributed sources of energy and their integration into power networks. The cluster is a member of the IET Power Academy and is a major collaborator on all-island energy research.
- SoCaM is dedicated to the design of advanced, integrated, high-speed wireless and couples activities in High Frequency Electronics, System-on-Chip, Signals and Systems and Digital Signal Processing, and for Gigabit/sec wireless.
World Class Facilities
- The Institute of Electronics, Communications and Information Technology, with state-of-the-art technology, offers a bespoke research environment.
Internationally Renowned Experts
- You will be working under the supervision of leading international academic experts.
Research students are encouraged to play a full and active role in relation to the wide range of research activities undertaken within the School and there are many resources available including:
- A wide range of personal development and specialist training courses offered through the Personal Development programme
- Access to the Queen's University Postgraduate Researcher Development Programme
- Office accommodation with access to computing facilities and support to attend conferences for full-time PhD students
Research within the School is organised into research themes.
PhD opportunities are available in a wide range of subjects in electronics and electrical engineering, aligned to the specific expertise of our PhD supervisors.
Queen’s is a leader in commercial impact and one of the five highest performing universities in the UK for intellectual property commercialisation. We have created over 80 spin-out companies. Three of these - Kainos, Andor Technology and Fusion Antibodies - have been publicly listed on the London Stock Exchange.
Queen’s has strong collaborative links with industry in Northern Ireland, and internationally. It has a strong funding track record with EPSRC and the EC H2020 programme.
For further information on career opportunities at PhD level please contact the Faculty of Engineering and Physical Sciences Student Recruitment Team on askEPS@qub.ac.uk. Our advisors - in consultation with the School - will be happy to provide further information on your research area, possible career prospects and your research application.
People teaching you
There is no specific course content as such. You are expected to take research training modules that are supported by the School which focus on quantitative and qualitative research methods. You are also expected to carry out your research under the guidance of your supervisor.
Over the course of study you can attend postgraduate skills training organised by the Graduate School.
You will normally register, in the first instance, as an ‘undifferentiated PhD student’ which means that you have satisfied staff that you are capable of undertaking a research degree.
The decision as to whether you should undertake a PhD is delayed until you have completed ‘differentiation’.
Differentiation takes place about 8-9 months after registration for full time students and about 16-18 months for part time students. You are normally asked to submit work to a panel of up to two academics, and this is followed up with a formal meeting with the ‘Differentiation Panel’. The Panel then makes a judgement about your capacity to continue with your study. Sometimes students are advised to revise their research objectives or to consider submitting their work for an MPhil qualification rather than a doctoral qualification.
To complete with a doctoral qualification you will be required to submit a thesis of approx 80,000 words and you will be required to attend a viva voce [oral examination] with an external and internal examiner to defend your thesis.
A PhD programme runs for 3-4 years full-time or 6-8 years part-time. Students can apply for a writing up year should it be required.
The PhD is open to both full and part time candidates and is often a useful preparation for a career within academia or consultancy.
Full time students are often attracted to research degree programmes because they offer an opportunity to pursue in some depth an area of academic interest.
The part time research degree is an exciting option for professionals already working in the education field who are seeking to extend their knowledge on an issue of professional interest. Often part time candidates choose to research an area that is related to their professional responsibilities.
If you meet the Entry Requirements, the next step is to check whether we can supervise research in your chosen area. We only take students to whom we can offer expert research supervision from one of our academic staff. Therefore, your research question needs to engage with the research interests of one of our staff.
Assessment processes for the Research Degree differ from taught degrees. Students will be expected to write up and present their work at regular intervals to their supervisor, who will provide written and oral feedback; a formal assessment process takes place annually.
This Annual Progress Review requires students to present their work in writing and orally to a panel of academics from within the School. Successful completion of this process will allow students to register for the next academic year.
The final assessment of the doctoral degree is both oral and written. Students will submit their thesis to an internal and external examining team who will review the written thesis before inviting the student to orally defend their work at a Viva Voce.
Supervisors will offer feedback on the research work at regular intervals throughout the period of registration on the degree.
Full time PhD students will have access to a shared office space and access to a desk with personal computer and internet access.
The minimum academic requirement for admission to a research degree programme is normally an Upper Second Class Honours degree from a UK or ROI HE provider, or an equivalent qualification acceptable to the University. Further information can be obtained by contacting the School.
For information on international qualification equivalents, please check the specific information for your country.
English Language Requirements
Evidence of an IELTS* score of 6.0, with not less than 5.5 in any component, or equivalent qualification acceptable to the University is required (*taken within the last 2 years).
International students wishing to apply to Queen's University Belfast (and for whom English is not their first language), must be able to demonstrate their proficiency in English in order to benefit fully from their course of study or research. Non-EEA nationals must also satisfy UK Visas and Immigration (UKVI) immigration requirements for English language for visa purposes.
For more information on English Language requirements for EEA and non-EEA nationals see: www.qub.ac.uk/EnglishLanguageReqs.
If you need to improve your English language skills before you enter this degree programme, INTO Queen's University Belfast offers a range of English language courses. These intensive and flexible courses are designed to improve your English ability for admission to this degree.
| Fee status | Tuition fee |
| Northern Ireland (NI) [1] | £4,500 |
| Republic of Ireland (ROI) [2] | £4,500 |
| England, Scotland or Wales (GB) [1] | £4,500 |
| EU Other [3] | £22,000 |
1 EU citizens in the EU Settlement Scheme, with settled or pre-settled status, are expected to be charged the NI or GB tuition fee based on where they are ordinarily resident; however, this is provisional and subject to the publication of the Northern Ireland Assembly Student Fees Regulations. Students who are ROI nationals resident in GB are expected to be charged the GB fee; however, this is provisional and subject to the publication of the Northern Ireland Assembly Student Fees Regulations.
2 It is expected that EU students who are ROI nationals resident in ROI will be eligible for NI tuition fees, in line with the Common Travel Area arrangements. The tuition fee set out above is provisional and subject to the publication of the Northern Ireland Assembly Student Fees Regulations.
3 EU Other students (excludes Republic of Ireland nationals living in GB, NI or ROI) are charged tuition fees in line with international fees.
All tuition fees quoted are for the academic year 2021-22, and relate to a single year of study unless stated otherwise. Tuition fees will be subject to an annual inflationary increase, unless explicitly stated otherwise.
Electrical & Electronic Engineering costs
There are no specific additional course costs associated with this programme.
Additional course costs
Depending on the programme of study, there may also be other extra costs which are not covered by tuition fees, which students will need to consider when planning their studies. Students can borrow books and access online learning resources from any Queen's library. If students wish to purchase recommended texts, rather than borrow them from the University Library, prices per text can range from £30 to £100. Students should also budget between £30 and £100 per year for photocopying, memory sticks and printing charges. Students may wish to consider purchasing an electronic device; costs will vary depending on the specification of the model chosen. There are also additional charges for graduation ceremonies, and library fines. In undertaking a research project students may incur costs associated with transport and/or materials, and there will also be additional costs for printing and binding the thesis. There may also be individually tailored research project expenses and students should consult directly with the School for further information.
How do I fund my study?
1. PhD Opportunities
Find PhD opportunities and funded studentships by subject area.
2. Funded Doctoral Training Programmes
We offer numerous opportunities for funded doctoral study in a world-class research environment. Our centres and partnerships aim to seek out and nurture outstanding postgraduate research students, and provide targeted training and skills development.
3. PhD loans
The Government offers doctoral loans of up to £26,445 for PhDs and equivalent postgraduate research programmes for English- or Welsh-resident UK and EU students.
4. International Scholarships
Information on Postgraduate Research scholarships for international students.
Funding and Scholarships
The Funding & Scholarship Finder helps prospective and current students find funding to help cover costs towards a whole range of study related expenses.
How to Apply
Find a supervisor
If you're interested in a particular project, we suggest you contact the relevant academic before you apply, to introduce yourself and ask questions.
To find a potential supervisor aligned with your area of interest, or if you are unsure of who to contact, look through the staff profiles linked here.
You might be asked to provide a short outline of your proposal to help us identify potential supervisors.