Job Description
About Us
We're building AI-native tools that harness the power of large language models (LLMs) to help customers solve high-impact, real-world problems in financial crime compliance. Our platform integrates structured and unstructured data, enabling rapid prototyping, seamless collaboration, and fast iteration with a strong focus on end-to-end delivery.
As we scale, we're looking for a data scientist who thrives at the intersection of research, implementation, and cross-functional collaboration — someone who can own analytical work end-to-end and contribute directly to customer-facing solutions.
What You'll Do
- Lead data exploration and analysis on large-scale financial crime datasets, including sanctions, PEP (Politically Exposed Person), and adverse media data, to uncover patterns, identify false positives and false negatives, and drive feature improvements.
- Develop and evaluate agents and rule-based models by running experiments, validating hypotheses, and fine-tuning thresholds to improve alert efficiency.
- Build and deliver production-ready API integrations, coordinating with software engineers and product teams to ensure components are properly integrated, tested, and merged.
- Conduct customer-focused data studies across multiple enterprise clients (e.g., financial institutions) to benchmark model performance, assess data quality, and propose data-driven solutions that reduce investigation workloads.
- Prototype and iterate quickly, using PySpark, Jupyter notebooks, and Python to explore data, build reproducible pipelines, and generate insights that inform product decisions.
- Investigate and resolve product issues in collaboration with engineering and product teams.
- Contribute to R&D on emerging techniques, including graph-based approaches (GNNs, graph embeddings) for transaction monitoring, LLM-based feature exploration, and RAG-based models.
- Communicate findings clearly through well-organized Jupyter notebooks, internal documentation, and stakeholder presentations, translating complex analytical results into actionable business insights.
What We're Looking For
- Based in Singapore or London (remote-first team; flexible working environment).
- Bachelor's or Master's degree in Data Science, Computer Science, Statistics, or a related field.
- At least 3 years of hands-on experience delivering data science projects, ideally in financial crime compliance, name screening, or AML/KYC domains.
- Strong proficiency in Python (data manipulation, modelling, pipeline development) and SQL / Spark SQL for large-scale data querying and transformation.
- Hands-on experience with PySpark or similar distributed data platforms.
- Familiarity with NLP techniques and entity resolution concepts.
- Experience working with LLMs or RAG-based models for information extraction or classification tasks is an advantage.
- Solid understanding of data quality assessment, including profiling, anomaly identification, and merging logic across complex multi-source datasets.
- Comfortable working with Git, Docker, Linux, and collaborative development workflows (including code reviews and pull requests).
- Strong analytical and problem-solving skills: able to investigate ambiguous data issues, form hypotheses, and validate findings rigorously.
- Good communication skills: able to document findings in a structured and reproducible manner (Jupyter notebooks, Confluence) and present results clearly to both technical and non-technical stakeholders.
- A mindset of ownership and curiosity: you take initiative, ask the right questions, and follow through to delivery.
