Awa: Tell me about your role at Rubyx.
Giorgio: I’m a Chief Data Officer at Rubyx, and I’ve got two main jobs. First, I designed our data model – our data warehouse. Second, I verify that we have the data the client needs to answer both their questions and ours.
How do you verify that?
I’ve worked as a data analyst in a number of different sectors, and for the last ten years in microfinance. Thanks to that experience I have some knowledge of what must be analysed, and the questions that we have to ask in operations. My knowledge isn’t perfect – it’s always improving, and always must be challenged – but I can use that knowledge to verify we have the data we need to answer our questions, and the client’s.
What kind of questions are you looking to answer through data?
Whether it’s a financial institution or a digital platform, we want to understand how we can best monitor the evolution of a portfolio, of the disbursements, and the evolution of risk. Or how to best segment end customers in order to understand customer behaviour. These are the practical questions, and in order to answer them we need to check the client’s system, their data, and map it out to see what’s there, what gaps exist, and how we need to treat the data to translate it – convert it – to our data model.
Changing topics slightly, why did you make the move to Rubyx?
I wanted to do things better and smarter. Most of us who founded Rubyx came from Baobab, a microfinance institution. A few of us, including Thomas Carrié, our Chief Customer Officer, had been working together for about five years, and we understood each other very well. Since then, we’ve grown quite a bit, but the goal is still the same: to help our clients offer loans in a sustainable way.
What does “sustainable” mean in this context?
Loans that end customers can repay – the goal isn’t to drive customers into overindebtedness. That’s for both altruistic reasons and egoistic reasons: if you drive customers into overindebtedness, the business model will fail in the mid to long term, and you’re going to end up with big problems. It’s a choice between trying to create a virtuous circle or ending up destroying your business in a vicious circle.
Is that a problem that you’ve seen in lending before – lenders creating unsustainable debt for their customers?
Let’s say that in other industries I’ve worked in, I’ve seen situations where the focus wasn’t really on the end customer, but on short-term financial goals. That might be due to the size of the company. In big companies run by a board and lots of shareholders, you can often lose sight of the impact on people. In our own small company we stay focused and make the decisions – backed by our values – that we think are best for our clients and their end customers.
What do you think about alternative scoring data – social data, stuff like that?
We don’t work with alternative scoring data because the best data – if available – is always going to be payment data. We’ve built our scoring around it. So we need to understand how this data is structured in the client’s operational systems and translate it so we have the data to perform the scoring. Our scoring is based on observing how customers have paid in the past and predicting future payment behaviour. Based on that, we provide the score. It’s the most powerful data you can have.
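As a rough illustration of what "observing how customers have paid in the past" can look like in practice, the sketch below summarises a repayment history into a few simple features. It is not Rubyx's scoring method: the function name `payment_features`, the input layout (pairs of due date and paid date) and the chosen features are all assumptions made for this example, and it assumes every installment listed is already due.

```python
from datetime import date


def payment_features(installments):
    """Summarise a repayment history.

    `installments` is a list of (due_date, paid_date) pairs, where
    paid_date is None for an installment that has not been paid.
    Hypothetical layout: real operational systems will differ.
    """
    paid = [(due, paid_on) for due, paid_on in installments if paid_on is not None]
    # Early payments count as zero days late.
    days_late = [max((paid_on - due).days, 0) for due, paid_on in paid]
    n = len(installments)
    return {
        "n_installments": n,
        # Share of all installments paid on or before the due date.
        "on_time_rate": sum(1 for d in days_late if d == 0) / n if n else 0.0,
        "max_days_late": max(days_late, default=0),
        "n_unpaid": n - len(paid),
    }


history = [
    (date(2023, 1, 1), date(2023, 1, 1)),   # paid on the due date
    (date(2023, 2, 1), date(2023, 2, 6)),   # paid five days late
    (date(2023, 3, 1), None),               # never paid
    (date(2023, 4, 1), date(2023, 3, 30)),  # paid early
]
features = payment_features(history)
```

Features like these would typically feed a predictive model; which features and which model are used is not stated in the interview.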
But doesn’t that mean you can only work with existing customers?
For sure. Which is why we want to keep exploring additional options. We already use transactional data – with ecommerce platforms you have huge amounts of data.
In terms of getting the data you need, do platforms pose additional challenges vs. financial institutions?
I’d say there’s a bit of extra creativity required from the data science team. When we talk with prospective digital platforms, the data science team looks at the data available and brainstorms all the possible ways of profiling customers and connecting these behaviours to our goal – forecasting the creditworthiness of potential end customers. After they’ve gone through these exercises, they tell me the data they need.
So it’s never just a question of plugging client data into your data model.
No, because clients have very different operations, different systems, and very different ways to represent the same things. Usually the systems are operational, meaning they weren’t created for the purpose of doing analysis, but created for the daily operation of a bank, or a digital platform. They’re not structured so you can run an algorithm – I mean, maybe you can, but that’s not what they were designed for. The structure might be different, and tables and fields might be missing, but we still need to convert the data to the tables and fields of our data model. Then we check the data quality with the client to make sure everything that comes after – analysis and reporting – will be based on consistent, high-quality data.
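To make the conversion step concrete, here is a minimal sketch of mapping one client's column names onto a single standard record type and collecting the rows that fail basic quality checks. Everything in it is hypothetical: the `LoanRecord` fields, the `CLIENT_A_MAPPING` column names and the checks are invented for illustration and do not describe Rubyx's actual data model.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical target schema shared by all clients.
@dataclass(frozen=True)
class LoanRecord:
    loan_id: str
    customer_id: str
    disbursed_on: date
    principal: float


# Hypothetical mapping: target field -> column name in one client's system.
CLIENT_A_MAPPING = {
    "loan_id": "LoanNo",
    "customer_id": "ClientRef",
    "disbursed_on": "DisbDate",
    "principal": "Amount",
}


def convert_row(row, mapping):
    """Convert one source row to a LoanRecord, or raise ValueError."""
    missing = [src for src in mapping.values() if row.get(src) in (None, "")]
    if missing:
        raise ValueError(f"missing source fields: {missing}")
    return LoanRecord(
        loan_id=str(row[mapping["loan_id"]]),
        customer_id=str(row[mapping["customer_id"]]),
        disbursed_on=date.fromisoformat(row[mapping["disbursed_on"]]),
        principal=float(row[mapping["principal"]]),
    )


def convert_all(rows, mapping):
    """Return (converted records, list of (row index, error message))."""
    records, problems = [], []
    for i, row in enumerate(rows):
        try:
            records.append(convert_row(row, mapping))
        except ValueError as exc:
            problems.append((i, str(exc)))
    return records, problems


rows = [
    {"LoanNo": "L1", "ClientRef": "C1", "DisbDate": "2023-01-15", "Amount": "500"},
    {"LoanNo": "L2", "ClientRef": "", "DisbDate": "2023-02-01", "Amount": "300"},
]
records, problems = convert_all(rows, CLIENT_A_MAPPING)
```

The list of problems is the kind of output that could be reviewed with the client during a data quality check, while only the mapping changes from one client to the next.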
Does Rubyx have one data model, or multiple?
We decided to have one data model rather than many in order to have one unique base code, which is the underlying structure behind the data model. It’s easier to make modifications, and helps us be more flexible and faster. But there’s always a tradeoff. For a number of clients, like digital platforms, we have to make adaptations to convert the client’s data to our model, which was originally built for financial institutions.
And then you can make apples-to-apples comparisons between clients, right?
Yes, because after you convert the data, treatments are standard. Look, I’m not saying this is going to be forever, but having one base code for all clients makes things much easier. In the past I’ve worked in companies where we had different base codes for different clients, and it was much messier. Every client is unique, and brings challenges of specificity and complexity, but we prefer to deal with those things right at the beginning when we onboard them. That’s why we spend time with each client figuring out how to adapt their data.
How can clients like financial institutions structure their data to best exploit it for lending?
I think what’s really important is that the process of collecting the data in a data warehouse or data lake is flexible and agile. I hope this doesn’t sound too esoteric, but the idea is to have modular processes so the data flows can be split into different parts, with the fewest interdependencies possible. That way, when you have an issue, you can intervene and fix only the specific part you need. And everything should be replicable, so that every time you execute a data treatment you get the same result. It’s much cleaner and easier to manage. So, the two things I would strongly recommend are modularity and replicability.
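One way to picture modularity and replicability is a pipeline built from small, independent steps, each a pure function that returns new rows instead of modifying its input. The sketch below is only an illustration under that assumption: the step names, the `amount` field and the fingerprint check are hypothetical and not taken from any particular system.

```python
import hashlib
import json


def drop_incomplete(rows):
    """Step 1: keep only rows that have an amount."""
    return [r for r in rows if r.get("amount") is not None]


def add_amount_band(rows):
    """Step 2: label each row by size without mutating the input rows."""
    return [{**r, "band": "small" if r["amount"] < 1000 else "large"} for r in rows]


# Each step can be fixed, replaced or tested on its own.
PIPELINE = [drop_incomplete, add_amount_band]


def run(rows, steps=PIPELINE):
    """Apply the steps in order and return the final rows."""
    for step in steps:
        rows = step(rows)
    return rows


def fingerprint(rows):
    """Hash the output so two runs can be compared for identical results."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


data = [
    {"id": 1, "amount": 500},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 2500},
]
first = run(data)
second = run(data)
same_result = fingerprint(first) == fingerprint(second)
```

Because no step depends on hidden state or changes its input, running the pipeline twice on the same data produces the same fingerprint, and a faulty step can be corrected without touching the others.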
Finally, what does it take to succeed here at Rubyx?
You need to be curious and make every effort to understand people, the clients, and their end customers. What are their challenges? What are their goals? You need to like this specific context, too. Be interested in microfinance, and the markets we work with, and understanding customer behaviour in developing countries. And you need to understand what the client wants and be proactive, looking for things the client didn’t see, identifying trends in the data. That’s what helps you set up a virtuous cycle where you always learn more and get better.