At BBVA AI Factory we work with the latest techniques in Artificial Intelligence applied to the financial sector, developing concepts such as bias mitigation or the interpretability of algorithmic models.
This dissertation is devoted to developing theoretical and practical tools to enable adaptation of machine learning models in company production environments. More precisely, we focus on devising mechanisms to exploit the knowledge acquired by models to train future generations that are better fit to meet the stringent demands of a changing ecosystem. We introduce copying as a mechanism to replicate the decision behaviour of a model using another that presents differential characteristics, in cases where access to both the models and their training data is restricted. We discuss the theoretical implications of this methodology and show how it can be performed and evaluated in practice. Under the conceptual framework of actionable accountability we also explore how copying can be used to ensure risk mitigation in circumstances where deployment of a machine learning solution results in a negative impact on individuals or organizations.
Irene Unceta Doctoral Thesis, FEBRUARY 2021
The analysis of end-of-day account balance offers key indicators of customers’ situation and can help to define advisory and proactive actions towards their financial well-being. The detection of unexpected variations in its evolution arises as a key matter, given that they may expose events that require immediate attention or substantial changes in customers’ context. We present a system that puts together (i) time series forecasting with uncertainty; (ii) statistical detection of extreme values; and (iii) analysis of financial transactions to notify customers of these types of events and improve their visibility.
KDD WORKSHOP ON MACHINE LEARNING IN FINANCE, AUGUST 2020 · VIDEO
In regression tasks, aleatoric uncertainty is commonly addressed by considering a parametric distribution of the output variable, which is based on strong assumptions such as symmetry, unimodality or a restricted shape. These assumptions are too limiting in scenarios where complex shapes, strong skews or multiple modes are present. In this paper, we propose a generic deep learning framework that learns an Uncountable Mixture of Asymmetric Laplacians (UMAL), which allows us to estimate heterogeneous distributions of the output variable, and we show its connections to quantile regression.
NEURIPS, VANCOUVER (CANADA), 2019 · FURTHER READING
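As an illustration of the link between Asymmetric Laplacians and quantile regression mentioned in the abstract above, the following is a minimal sketch and not the paper's implementation. It uses the standard Asymmetric Laplace density with location mu, scale b and asymmetry tau; the function names are hypothetical. For a fixed scale, the negative log-likelihood differs from the pinball (quantile) loss only by terms that do not depend on mu, so minimizing either with respect to mu gives the same estimate.

```python
import numpy as np


def pinball_loss(y, mu, tau):
    """Quantile (pinball) loss rho_tau(y - mu) for a quantile level tau in (0, 1)."""
    u = np.asarray(y, dtype=float) - mu
    # Residuals below the prediction are weighted by (tau - 1), the rest by tau.
    return u * np.where(u < 0, tau - 1.0, tau)


def ald_neg_log_likelihood(y, mu, b, tau):
    """Negative log-density of an Asymmetric Laplace distribution.

    Assumes the standard parameterization
        f(y) = tau * (1 - tau) / b * exp(-rho_tau(y - mu) / b),
    with location mu, scale b > 0 and asymmetry tau in (0, 1).
    """
    return np.log(b) - np.log(tau * (1.0 - tau)) + pinball_loss(y, mu, tau) / b
```

With b held fixed at 1, the difference in negative log-likelihood between two candidate locations equals the difference in pinball loss, which is the connection to quantile regression in its simplest form.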
The abundance of data about user transactions and interactions has fostered the use of machine learning techniques in financial institutions and startups. Going beyond classical application areas, such as campaign management, these methods can be used to personalize services at different levels. In this paper we explore a use case, related to mobile banking apps, to forecast unusual expenses.
KDD WORKSHOP ON ANOMALY DETECTION IN FINANCE, ANCHORAGE (USA), AUGUST 2019
Ensuring classification models are fair with respect to sensitive data attributes is a crucial task when applying machine learning models to real-world problems, particularly in company production environments, where the decisions output by models may have a direct impact on individuals and predictive performance should be maintained over time. In this article, we propose copies as a technique to mitigate the bias of trained algorithms in circumstances where the original data is not accessible and/or the models cannot be re-trained.
IBPRIA, MADRID (SPAIN), 2019 · FURTHER READING
BBVA was a global financial group that established a data science center of excellence (CoE) in 2014 to co-create high-value analytics-based solutions with BBVA project teams; create innovative BBVA data science capabilities; and lead BBVA’s data-driven culture change.
MIT CISR, CAMBRIDGE MA (USA), 2018
In regression tasks, the straightforward application of Deep Learning models provides a point estimate of the target. In addition, the model does not take into account the uncertainty of a prediction. This represents a great limitation for tasks where communicating an erroneous prediction carries a risk. In this paper we tackle a real-world problem of forecasting impending financial expenses and incomings of customers, while displaying predictable monetary amounts on a mobile app.
ECML PKDD, DUBLIN (IRELAND), 2018
Domestic tourism is harder to analyse than international tourism because it generates a smaller data footprint: private means of transport are used most of the time, no border is crossed, and no lodging is registered. Digital data sources can be a useful, but still underused, complement to official survey-based statistics to fill this lack of reliable information. These digital sources can not only extend the scope of the scientific literature, but will also increase the level of detailed information across time and space, which will enable improved management of the tourism sector. In particular, the present paper covers a research gap in the use of card transaction data (on-site payments and cash withdrawals) to provide an innovative methodology for a clearer view of domestic tourism dynamics.
MADRID (SPAIN), 2019
Unfair pricing policies have been shown to be one of the most negative perceptions customers can have concerning pricing, and may result in long-term losses for a company. Despite the fact that dynamic pricing models help companies maximize revenue, fairness and equality should be taken into account in order to avoid unfair price differences between groups of customers. This paper shows how to solve dynamic pricing by using Reinforcement Learning (RL) techniques so that prices are maximized while keeping a balance between revenue and fairness.
MADRID (SPAIN), 2018
The scored task at FEIII Challenge 2018 proposed the identification of competitor relationships in a network of companies from the financial and IT sectors. This article describes our BBVA Data & Analytics submission to the challenge and our experiments with three different approaches to predict competitor links: a local classifier, a recommender and a relational classifier.
MADRID (SPAIN), 2018
In this paper we present a high-dimensionality Retail Trade Index (RTI) constructed to nowcast the economic performance of the retail trade sector in Spain, using Big Data sources and techniques. The data are the footprints of BBVA clients from their credit or debit card transactions at Spanish point of sale (PoS) terminals.
MADRID (SPAIN), 2018
Digital data sources can be useful in measuring the evolution of tourism. In particular, card transactions are a good way of analysing domestic tourism. To do so, firstly, transactions have to be classified as touristic or non-touristic. This paper presents a methodology to identify the usual environment of cardholders, so as to determine whether their transactions are carried out inside or outside that area.
ENTER CONFERENCE, JÖNKÖPING (SWEDEN), 2018
This is a proposal for a presentation on the relation between Machine Learning and design for trust at the Designing the User Experience of Artificial Intelligence symposium as part of the 2018 AAAI Spring Symposium Series in Palo Alto, CA. Trust is at the bedrock of our human social system. We will share our experiments and approaches that use Machine Learning techniques to tackle mistrust and foster a trustworthy relation with our customers.
AAAI SPRING SYMPOSIA, PALO ALTO (USA), 2018
In the present paper, we study the application of time series forecasting methods to massive datasets of short financial time series. In our example, the time series arise from analyzing monthly expenses and incomings in personal financial records. Unlike traditional time series forecasting applications, we work with very short series (as short as 24 data points), which does not allow us to use classical exponential smoothing methods.
People are increasingly leaving digital traces of their daily activities through interacting with their digital environment. Among these traces, financial transactions are of paramount interest since they provide a panoramic view of human life through the lens of purchases, from food and clothes to sport and travel. In this paper, we aim to elucidate how macro-socioeconomic patterns could be understood based on individual financial decisions.
PLoS ONE, 2017
Millions of euros are lost every year due to fraudulent card transactions. The design and implementation of efficient fraud detection methods is mandatory to minimize such losses. In this paper, we present a neural network based system for fraud detection in banking systems.
Socioeconomic inequalities in cities are embedded in space and result in neighborhood effects, whose harmful consequences have proved very hard to counterbalance efficiently by planning policies alone. Considering redistribution of money flows as a first step toward improved spatial equity, we study a bottom-up approach that would rely on a slight evolution of shopping mobility practices.
APPLIED NETWORK SCIENCE, 2017
The increasing capacity to capture and feed behavioral data for systems to learn is transforming the design of user experiences. In this paper, we discuss two effects of this emerging toolkit. On the one hand, creating experiences with learning algorithms is pushing designers to consider how users begin, evolve, and end their interactions, which themselves produce and consume data. On the other hand, the design of experiences powered by machine learning is now occurring in new, multidisciplinary teams, which presents a variety of frictions and opportunities for misunderstandings which must be overcome.
The AAAI SPRING SYMPOSIUM ON DESIGNING THE USER EXPERIENCE OF MACHINE LEARNING SYSTEMS TECHNICAL REPORT, PALO ALTO (USA), 2017
Scientific studies of society increasingly rely on digital traces produced by various aspects of human activity. In this paper, we exploit a relatively unexplored source of data, anonymized records of bank card transactions collected in Spain by a big European bank, and propose a new classification scheme of cities based on the economic behavior of their residents.
PLoS ONE, 2016
This research explores the potential to analyze bank card payments and ATM cash withdrawals in order to map and quantify how people are impacted by and recover from natural disasters. Our approach defines a disaster-affected community’s economic recovery time as the time needed to return to baseline activity levels in terms of number of bank card payments and ATM cash withdrawals.
DATA FOR GOOD EXCHANGE, BLOOMBERG, NEW YORK (USA), 2016
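The recovery-time definition in the abstract above can be sketched as follows. This is an illustrative reading, not the authors' method: the function name, the use of the pre-disaster mean as the baseline, and the `tolerance` fraction are all assumptions introduced here. The input is a series of activity counts per period (for example, daily numbers of card payments or ATM withdrawals).

```python
from typing import Optional, Sequence


def recovery_time(
    activity: Sequence[float], disaster_idx: int, tolerance: float = 0.95
) -> Optional[int]:
    """Number of periods from the disaster until activity returns to baseline.

    Assumptions (illustrative only): the baseline is the mean of all
    observations before `disaster_idx`, and "recovered" means reaching at
    least `tolerance` * baseline after having dropped below it.
    Returns 0 if activity never drops, and None if it never recovers
    within the observed series.
    """
    if disaster_idx <= 0 or disaster_idx >= len(activity):
        raise ValueError("disaster_idx must leave data before and after the event")

    pre_disaster = activity[:disaster_idx]
    threshold = tolerance * (sum(pre_disaster) / len(pre_disaster))

    dropped = False
    for offset, value in enumerate(activity[disaster_idx:]):
        if value < threshold:
            dropped = True  # activity is below baseline
        elif dropped:
            return offset  # first period back at baseline after the drop
    return None if dropped else 0
```

For instance, with the series `[100, 100, 100, 100, 40, 60, 80, 97, 100]` and the disaster at index 4, activity first returns to at least 95% of the baseline three periods after the event.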
Pricing is a fundamental problem in the banking sector and is closely related to many financial products, such as insurance or credit scoring. The purpose of this research is to provide a simple framework to calculate the Pareto frontier of several pricing strategies through Random Optimization (RO) driven by a probabilistic model, maximum decay points (MDP) and business constraints.
10th INTERNATIONAL CONFERENCE ON COMPUTATIONAL AND FINANCIAL ECONOMETRICS, UNIVERSITY OF SEVILLE (SPAIN), 2016
In order to produce reliable predictions from series of only 12 points, we have adopted an ML approach, developing a deep-network-based regression method whose output is a single point estimate.
Workshop on Machine Learning for Spatiotemporal Forecasting, NIPS, Barcelona, Spain, 2016
Scientific studies investigating laws and regularities of human behavior nowadays rely increasingly on the wealth of widely available digital information produced by human social activity. In this paper we leverage big data created by three different aspects of human activity (i.e., bank card transactions, geotagged photographs and tweets) in Spain to quantify city attractiveness for foreign visitors.
IEEE INTERNATIONAL CONGRESS ON BIG DATA, SANTA CLARA (USA), 2015
For centuries, quality of life has been a subject of study across different disciplines. However, only with the emergence of the digital era did it become possible to investigate this topic on a larger scale. Over time it became clear that quality of life depends not on one but on three relatively different parameters: social, economic and well-being measures. In this study we focus only on the first two, since the last one is often very subjective and consequently hard to measure.
IEEE INTERNATIONAL CONGRESS ON BIG DATA, SANTA CLARA (USA), 2015
Human mobility has traditionally been studied using surveys that deliver snapshots of population displacement patterns. The growing accessibility of ICT information from portable digital media has recently opened the possibility of exploring human behavior at high spatio-temporal resolutions. In this work, we analyze credit-card records from Barcelona and Madrid. By examining the geolocated credit-card transactions of individuals living in the two provinces, we find that mobility patterns vary according to gender, age and occupation.
Intensive development of urban systems creates a number of challenges for urban planners and policy makers in order to maintain sustainable growth. In this paper we propose a novel approach of city scoring and classification based on quantitative scale-free metrics related to economic activity of city residents, as well as domestic and foreign visitors.
ASE INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE, STANFORD UNIVERSITY (USA), 2014
Increasing availability of big data, which documents human activity in space and time, offers new solutions to well-known operational problems. In this study, we demonstrate the potential of a new type of extensive data, namely bank card transactions executed in a variety of businesses by the domestic and foreign customers of a Spanish bank.
IEEE INTERNATIONAL CONGRESS ON BIG DATA, WASHINGTON DC (USA), 2014