Who will validate my model? How to apply peer review to Data Science projects

BBVA AI Factory Team Ways of Working

“Two heads are better than one”. This traditional aphorism reminds us that solutions assessed by several people tend to be sounder than those based on a single opinion. A critical phase of academic research rests on the same idea: the so-called peer review process, which evaluates the suitability of a manuscript for publication. …

Applying Transfer Learning to Natural Language models

Maria Hernandez Data Processing

Natural Language Processing (NLP) has been one of the key fields of Artificial Intelligence since its inception. After all, language is one of the things that defines human intelligence. In recent years, NLP has undergone a new revolution similar to the one that took place 20 years ago with the introduction of statistical and Machine Learning techniques. This revolution is …

From BBVA Data & Analytics to BBVA AI Factory

BBVA AI Factory News&References, Vision&Values

Ten years ago, a group of inquiring minds at BBVA created a small innovation program to explore opportunities to exploit financial data. A decade may not seem like a long time, but when it comes to technology, it feels like time travels twice as fast. By 2011 we hadn’t yet seen a machine beat the ‘Go’ world champion in the …

Uncertainty Models and Detection of Balance Anomalies

Jose Antonio Rodriguez Serrano Data Processing

The development of Artificial Intelligence in the financial sector drives the creation of new data-based products, which are often linked to a new form of relationship between customers and financial institutions. In this sense, one of the main trends is related to the generation of personalized services and products that allow us to better manage our finances. The BBVA app …

Explaining the reliability of algorithms to humans

Jose Antonio Rodriguez Serrano Data Processing

Machine learning systems have a problem: they are imperfect and can sometimes err. And we humans have a problem too: we are not yet used to working with imperfect results. In 2018, coinciding with the Football World Cup, a company ventured to forecast the probabilities of each team becoming champion -the original report is not available but you can still …

Diversity and inclusion as a policy decision

Joan Llop Vision&Values

In the science and technology sector we see an undeniable reality: there is still a shortage of women to balance the gender gap. This is an objective fact that, in the vast majority of companies in the sector, can be measured. Even though at BBVA Data & Analytics we have a clear commitment to diversity and inclusion, it …

A random search at NeurIPS 2019

Pablo de Jesús Campos Viana AI Factory was there

Last December we had the opportunity to attend the 33rd edition of NeurIPS, one of the most prestigious machine learning conferences in the world. The conference was held in Vancouver, Canada from December 8-14, and was organized with tutorials, workshops, demos, presentations and poster sessions. In this article we summarize some relevant aspects of the event. Some numbers With approximately …

Text categorization and tag suggestion in a single model

Pau Batlle Data Processing

In this post, I would like to explain the topic of my work during the 2018 Internship, continuing the research I did in 2017 and explained in another post. The problem we try to solve is the joint classification and tag prediction for short texts. Tag prediction and classification This machine learning problem arises in practical applications such as categorizing …

An AI Factory team releases internal software to analyze relations

Santiago Basaldúa News&References

A 283-year-old field of mathematics, founded by Leonhard Euler with his “Seven Bridges of Königsberg” problem, is changing the way we tackle customer analytics at BBVA. Rather than studying clients, corporations, managers, products or ATMs as static objects with standard attributes or “labels” such as those emerging from traditional customer segmentation, this field, called Graph Theory, focuses on …

What we saw (and what we showed) at KDD 2019

Jose Antonio Rodriguez Serrano AI Factory was there

One of our first and most successful applications of machine learning to a retail financial tool in the BBVA app is the one that shows customers a forecast of their recurring expenses and income for the next month. Knowing what day you will receive the car insurance charge or a recurring transfer -and its amount- is key to managing your …

Accelerating Data Science Workflows

Jose Antonio Rodriguez Serrano Ways of Working

There are many possible answers to the question “What does a Data Scientist do?”. But a short one that we especially like is: “Use data to help reduce inefficiencies in products, services or processes”. So we see retail companies employing Data Scientists to try to make shipments more efficient, while in banking we use it to predict the next likely …

Talking about inequality with Esteban Moro

Joan Llop AI Factory Talks

On April 10 we had the opportunity to learn about “The Atlas of Inequality” project, which has been developed by the Human Dynamics group of the MIT Media Lab in collaboration with the Mathematics Department from the Universidad Carlos III, based in Madrid (UC3M). One of the main researchers of this project is Esteban Moro, associate lecturer at UC3M and …

Call for Participants: Data Challenge in Financial Macro-modeling

Jairo Mejía News&References

As data scientists we love participating in various initiatives outside the scope of our daily jobs. This gives us the chance to learn new things that are not directly related to our field of expertise and take a fresh look into complex analytical problems. At the same time, these kinds of experiences allow us to collaborate with colleagues who normally …

More women and diversity in technology

Ana Pombo Vision&Values

To celebrate International Women’s Day, we organized and participated in debates regarding the importance of adding more diversity to technology and Data Science teams, as well as addressing challenges for women in the labor market. We invited the co-founder of Mujeres Tech, Cristina Aranda, to talk about the challenges facing women in science and technology. One of the recurrent issues in …

The best events for a traveling data scientist

Jairo Mejía News&References

We get it! You love gatherings where the best data science and advanced analytic knowledge is shared. The research papers, the workshops, the posters, the keynote speakers, and the applied use cases are a great way of getting fresh perspectives and ideas. We also know that you love the chance to enjoy the best cities where these events are organized. …

The #10yearchallenge of Data Science

Jairo Mejía AI Factory Recap

Ten years ago the term “Data Science” was only 7% of what it is today in Google Trends. It was almost non-existent in the news, and only timidly gaining ground in the corporate narrative. One has to go back to 2010 to see a first comprehensive definition of the nascent discipline of data science in the media. The Economist ran …

A few Recommendations for a Data Scientist who wants to get started in Recommender Systems

E052179 Data Processing

As a Data Scientist, you are expected to be able to build all sorts of data products. These may involve simple yet highly valuable business trends extracted through data querying and cleansing and, sometimes, more sophisticated Machine Learning algorithms for prediction, classification, or even recommendation. However, the cold start in a specific topic may be tough for Data Scientists, especially for …

The most important developments in data science of 2018

Jairo Mejía AI Factory Recap

The year 2018 has been one of the most important years in terms of breakthroughs in Machine Learning technologies. It has also been important vis à vis the debate on how to move forward beyond pure optimization, into a more advanced discipline of Data Science and real applied Artificial Intelligence. Ranging from a realization of the challenges of the industrialization …

The Best Online Courses for Data Scientists

Jairo Mejía News&References

The Data Scientist profile is one of the most demanded profiles in the labor market. At the same time, the Data Science toolset is becoming more diverse and the skills demanded are broader. Luckily, for those trying to take the first steps into Data Science or mastering techniques, there are many excellent online courses. After publishing a list of recommended …

Bayesian Deep Learning meets Google Cloud for a better forecasting engine at BBVA

Jairo Mejía Data Processing, News&References

BBVA Data & Analytics have just published a white paper in partnership with Google Cloud that showcases an end-to-end solution to deploy to production a Deep Learning model for time series forecasting. The model incorporates uncertainty of the predictions, which, we believe, will have a powerful impact on improving the customer experience of products such as BBVA’s expected expense tracker …

How Data-Driven Initiatives can Save Young Lives

Jairo Mejía Data Stories

In one of the largest cities on the planet, a lot of things happen every day. Mexico City is one of the largest megalopolises in the world, and adequate sensorization could make it an ideal laboratory for the use of data for the good of its citizens. Furthermore, the city could encourage public participation in data-driven initiatives. One such …

Self-Service Performance Tuning for Hive

Angel Puerto Data Processing

Hive is a very powerful data warehouse framework based on Apache Hadoop. The two together provide stable storage and processing capabilities for big data analysis. In this article, we will analyze how to monitor metrics, tune and optimize the workflow in this environment with Dr. Elephant. Hive is designed to enable easy data summarization, ad-hoc queries, and big data analysis. …

Building Open Source Software in a Large Corporation

Santiago Basaldúa News&References

The world runs on data. However, without the dynamic, accessible and adaptable nature of OSS (Open Source Software) the pace of exploitation of data-rich fields would be painfully slow. Imagine a world of Data Science without Linux, Python, Anaconda or TensorFlow, just to cite some relevant examples of Open Source Software. During the last few years, the trend of using …

Improving Predictions in Deep Learning by Modelling Uncertainty

Axel Brando Data Processing

At BBVA we have been working for some time to leverage transactional data of our clients and Deep Learning models to offer a personalized and meaningful digital banking experience. Our ability to foresee recurrent income and expenses in an account is unique in the sector. This kind of forecasting helps customers plan budgets, act upon a financial event, or avoid overdrafts. All …

What Will the Bank of the Future Look Like?

Jairo Mejía Vision&Values

Until not long ago many people would still show up at a bank branch with a paycheck from their employer, collect the money and leave with a fresh stack of bills, never to be seen between the columns of the gargantuan building until the next payday. They did not see the use case for a bank account, since the mattress …

Fairness by Design in Machine Learning is Going Mainstream

Jairo Mejía Vision&Values, Ways of Working

The consideration of fairness in the development of machine learning-based solutions is gaining traction as a key aspect of artificial intelligence and modelling of social behaviours. This week the Harvard Business Review published a story authored by the leading forces behind a health analytics project that is using Deep Learning to detect people who are at risk of cardiovascular disease. The …