Research Engineer, Data (Foundational Research, Machine Learning)
Are you a curious and open-minded individual with an interest in state-of-the-art machine learning engineering and research? Thomson Reuters Labs is seeking a Data Engineer with a passion for solving challenging machine learning problems in a data-rich, complex and innovative environment.
What does Thomson Reuters Labs do? We experiment, we build, we deliver. We support the organization and our product teams through foundational research and development of new products and technologies. The Labs innovate collaboratively across our core segments in Legal, Tax & Accounting, Government, and Reuters News. We undertake a diverse portfolio of projects while investing in long-term research for the future.
As a Research Engineer, you will be part of a diverse global team of experts. We hire world-leading specialists in SWE/Applied ML, as well as Research, to drive the company's leading internal AI model development, fueled by an unprecedented wealth of data and powered by cutting-edge technical infrastructure. You will have the opportunity to contribute to a data curation & filtering system that combines the best of real-world scalable data processing systems with the latest insights into what training data leads to the best LLMs. Thomson Reuters Labs is known for consistently delivering successful data-driven Artificial Intelligence solutions in support of high-growth products that serve Thomson Reuters customers in new and exciting ways.
About the Role:
In this opportunity as a Research Engineer - Data, you will:
Innovate: You will work at the very cutting edge of AI Research at an institution with some of the richest data sources in the world. Through your work, you will help us make the best use of this resource, in a dynamic flywheel that connects data collection & annotation with model training and expert evaluation, helping us continuously improve our training data. You will also develop novel performance-driven data sub-selection methods together with the latest training insights from our researchers.
Engineer and Develop: Design, develop, and optimize scalable data pipelines to support LLM training and evaluation. You will also help us build these pipelines in a robust and testable way, through careful source control and a solid back-up system for various data versioning methods.
Collaborate: Work on a collaborative global team of engineers and scientists, both within Thomson Reuters and at our academic partners at world-leading universities. In addition, you will work closely with world experts in the legal domain, who can provide feedback on your work, evaluate your outputs, or annotate training data.
About You:
You're a fit for the role of Research Engineer - Data, if your background includes:
Required qualifications:
Relevant degree in a technical discipline.
Interest in & experience working with (applied) machine learning, e.g. few-shot learning with out-of-the-box language models, training of smaller NLP classifiers, etc.
Excellent programming, debugging and system design skills.
Excellent communication skills to report and present software designs and findings clearly, both orally and in writing.
Curious and innovative disposition capable of devising novel, well-founded algorithmic solutions to relevant problems.
Self-driven attitude and ability to work with limited supervision.
Experience with relational and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra).
Experience with data pipeline orchestration tools.
Experience with cloud-based data platforms such as AWS, GCP, or Azure (e.g., S3, BigQuery, Azure Data Lake Storage).
Comfortable working in fast-paced, agile environments, managing uncertainty and ambiguity.
Preferred qualifications:
Additional legal knowledge as evidenced by a degree or interest in the legal domain.
Ability to communicate with multiple stakeholders, including non-technical legal subject matter experts.
Experience with big data technologies such as Spark, Hadoop, or similar.
Experience conducting world-leading research, e.g. by contributions to publications at leading ML venues.
Previous experience working on large-scale data processing systems.
Strong software and/or infrastructure engineering skills, as evidenced by code contributions to popular open-source libraries.
What's in it For You?
Join us to inform the way forward with the latest AI solutions and address real-world challenges in legal, tax, compliance, and news. Backed by our commitment to continuous learning and market-leading benefits, you'll be prepared to grow, lead, and thrive in an AI-enabled future. This includes:
Industry-Leading Benefits: We offer comprehensive benefit plans that include flexible vacation, two company-wide Mental Health Days off, access to the Headspace app, retirement savings, tuition reimbursement, employee incentive programs, and resources for mental, physical, and financial wellbeing.
Flexibility & Work-Life Balance: Flex My Way is a set of supportive workplace policies designed to help manage personal and professional responsibilities, whether caring for family, giving back to the community, or finding time to refresh and reset. This builds upon our flexible work arrangements, including work from anywhere for up to 8 weeks... For full info follow application link.
As a global business we rely on diversity of culture and thought to deliver on our goals. To ensure we can do that, we seek talented, qualified employees in our operations around the world regardless of race, color, sex/gender, including pregnancy, gender identity and expression, national origin, religion, sexual orientation, disability, age, marital status, citizen status, veteran status, or any other protected classification under country or local law. Thomson Reuters is proud to be an Equal Employment Opportunity Employer providing a drug-free workplace.