With more than 100 million users and close to $10 billion in revenue, Intuit faces enterprise-scale obstacles and data processing needs. Instead of developing a separate architecture for each big data project, which would have made the data silo issue worse, Intuit adopted a unified strategy that relies on a lakehouse as the enterprise-wide standard for data.
Every day, more than 100 million people use Intuit's worldwide technology platform. The company's products, including TurboTax, QuickBooks, Mint, Credit Karma, and Mailchimp, are among the most efficient and user-friendly on the market, and are made to assist clients with their most urgent financial issues. Rajat Khare, a software architect at Intuit, has experience with the entire QuickBooks ecosystem of products. Rajat has worked with the GraphQL ecosystem since its creation and has built a number of services and applications that use it.
QuickBooks and TurboTax are two of the most popular consumer finance products ever produced, and both are trademarks of Intuit. However, Intuit is forging ahead with aspirations to transform itself into an AI juggernaut, thanks to recent advancements in machine learning and its purchases of Credit Karma and Mailchimp.
ML & AI
For instance, the business uses machine learning to automate the categorization of transactions in QuickBooks, sparing clients that time-consuming manual effort. This calls for highly customised machine learning, similar to the Credit Karma service that suggests to users how to raise their credit score depending on their inputs.
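To make the categorization use case concrete, here is a minimal sketch of automated transaction categorization using a TF-IDF plus logistic regression pipeline. The labels, training data, and `categorize` helper are all hypothetical illustrations; the article does not describe Intuit's actual models, which are far more personalized than this toy.

```python
# Toy sketch of transaction categorization (hypothetical data/labels),
# using scikit-learn: character n-gram TF-IDF features feed a
# logistic regression classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled transaction descriptions
train_texts = [
    "UBER TRIP HELP.UBER.COM",
    "SHELL OIL 5744",
    "STARBUCKS STORE 1234",
    "DELTA AIR LINES ATLANTA",
    "CHEVRON GAS STATION",
    "DUNKIN DONUTS 99",
]
train_labels = ["Travel", "Fuel", "Meals", "Travel", "Fuel", "Meals"]

# Character n-grams cope with the noisy, abbreviated text of bank feeds
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(train_texts, train_labels)

def categorize(description: str) -> str:
    """Predict a spending category for a raw transaction description."""
    return model.predict([description])[0]
```

A production system would additionally personalize predictions per customer, which is the "highly customised" aspect the article highlights.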
Building these new data-driven solutions requires a strong data architecture. Rather than creating a separate data system for each of its projects, the corporation preferred to have thousands of analysts, data scientists, and software engineers on the same page, with a single view of the data.
When Amit and Manish Amde, Intuit's director of engineering, joined the business three years ago, that architecture was not yet in place. Both Amit and Amde came from Origami Logic, which Intuit bought to aid in the development of its data and AI infrastructure.
During a keynote talk at the Data + AI Summit, Amde noted, “Our data journey started at a place many of you would be familiar with. Our data environment was large, intricate, and disorganised. We required a plan to make the most of this data for consumers and small business clients.”
The existence of numerous data silos was one of Intuit's major problems. The hundreds of thousands of database tables containing decades' worth of historical data were essential, helping Intuit's analysts and data scientists understand client wants and develop new products. However, that data was dispersed around the company, making it challenging to access. The simplest option is to copy the data; however, copying brings its own set of issues with accuracy and latency.
Like most successful $10 billion firms, Intuit carried a certain amount of technology baggage when Amit and Amde first started working there. First off, Intuit was an AWS company and operated a sizable “Parquet cluster” in the cloud. Teams were content with Redshift and Athena and saw no reason to switch, and Apache Flink was loved for the low latency it could provide for streaming.
To find a solution, Amit and Amde would have to work within these limitations (and others). From their time at Origami Logic, both were already familiar with Databricks and were aware of the capabilities of the platform. Amde had previously collaborated with the Databricks founders when Spark was still a largely unheard-of computing project.
The new data architecture had to meet a number of criteria laid out by the data leaders. First, to promote a culture of experimentation, internal users needed to be able to quickly create data pipelines and get answers to their queries. Additionally, they wanted a storage repository that could handle transactions.
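Why do transactions matter in a storage repository? Readers must never observe a half-written table. The toy sketch below illustrates the idea with an atomic file rename that publishes a new "table version" all at once, loosely mirroring how lakehouse table formats such as Delta Lake commit via an append-only transaction log. All names and the JSON-file "table" here are illustrative assumptions, not how any lakehouse actually stores data.

```python
# Conceptual sketch of an all-or-nothing (atomic) table commit.
# Readers of `current.json` see either the old version or the new
# version in full, never a partial write.
import json
import os
import tempfile

def commit_table_version(table_dir: str, rows: list) -> str:
    """Atomically publish a new version of a toy 'table'."""
    os.makedirs(table_dir, exist_ok=True)
    # Stage the new data in a temp file, out of readers' view
    fd, tmp_path = tempfile.mkstemp(dir=table_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(rows, f)
    final_path = os.path.join(table_dir, "current.json")
    # Atomic commit: os.replace swaps the file in a single step
    os.replace(tmp_path, final_path)
    return final_path

def read_table(table_dir: str) -> list:
    """Read the latest committed version of the toy 'table'."""
    with open(os.path.join(table_dir, "current.json")) as f:
        return json.load(f)
```

Real lakehouse formats extend this idea with versioned logs, concurrent-writer conflict detection, and time travel, which is what makes a transactional storage repository viable at Intuit's scale.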