Data Quality Revolution
How Powerimprove.ai Transformed Heinen & Hopman's Product Data.
Over the years, Heinen & Hopman has grown into an international market leader in heating, ventilation, air conditioning and refrigeration (HVAC&R) systems. Founded in 1965, the organization now has offices in sixteen countries. Heinen & Hopman strives to provide the best HVAC&R systems in the world by continuously innovating and expanding its worldwide service network. The company faces a number of challenges in product data quality, and Squadra Machine Learning Company has used Powerimprove.ai to help address them.
Challenge
The challenge for Heinen & Hopman was to bring its entire product database, approximately 28,000 products, together in a single environment: a Product Information Management (PIM) system. To this end, Heinen & Hopman started an internal process to set up a PIM system and a PIM department. Several factors, however, make it difficult to match the various sources of article data with the central article file quickly and accurately. First, many articles currently lack a unique article number. Second, Heinen & Hopman purchases articles from different suppliers, who in turn regularly purchase from the same manufacturers, so the ranges of different suppliers overlap. Third, because of its international ambitions, Heinen & Hopman wants to start working with GTIN article numbers, while at the moment its own material codes are still in use alongside supplier and manufacturer article codes and manually entered GTINs, which contain a fair number of errors. These are data quality problems that Powerimprove.ai can help resolve. Finally, Heinen & Hopman is considering using a data pool, namely 2BA, to obtain product data from manufacturers.
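Many errors in manually entered GTINs can be caught with the standard GTIN check digit, in which the last digit is computed from the digits before it. The Python sketch below illustrates that check. It is not taken from Powerimprove.ai, and the function name gtin_is_valid is a hypothetical helper chosen for this example.

```python
def gtin_is_valid(code: str) -> bool:
    """Return True if `code` is a GTIN-8/12/13/14 with a correct check digit.

    Hypothetical helper for illustration only; not part of Powerimprove.ai.
    """
    digits = code.strip()
    # A GTIN consists only of ASCII digits and has one of four fixed lengths.
    if not (digits.isascii() and digits.isdigit()):
        return False
    if len(digits) not in (8, 12, 13, 14):
        return False

    body, check = digits[:-1], int(digits[-1])
    # Working from the right of the body, weights alternate 3, 1, 3, 1, ...
    total = sum(
        int(d) * (3 if i % 2 == 0 else 1)
        for i, d in enumerate(reversed(body))
    )
    # The check digit brings the weighted sum up to a multiple of 10.
    return (10 - total % 10) % 10 == check
```

A passing check digit only shows that the number is well formed. It does not prove that the GTIN belongs to the article it was entered for, so matching against supplier or manufacturer data is still needed.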
Solution
Heinen & Hopman asked Squadra Machine Learning Company to carry out a Proof of Concept (POC) project because these factors, combined with the size of the range, made matching the data manually an extremely time-consuming and complicated task. The question was whether product matching could be (partly) automated using Data Science and Machine Learning techniques. During the POC, Squadra Machine Learning Company matched the Heinen & Hopman master article file with two representative article files and a manufacturer article file from 2BA, scraped by the Powerenrich.ai software.
These datasets give Heinen & Hopman insight into the extent to which the data can be matched automatically. The POC solution offers a number of smart features, such as a validation screen in which users approve or reject the matched articles. The screen is driven by the matching algorithm, which ranks candidate matches in order of probability. If an EAN/GTIN, manufacturer number or supplier number is present, the algorithm matches on it; if not, it falls back to matching on the article name or article description, or a translation of these.
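The tiered logic described above, identifiers first and text as a fallback, can be sketched in a few lines of Python. This is a minimal illustration and not the actual Powerimprove.ai algorithm. The field names (gtin, manufacturer_code, supplier_code, name) and the functions match_score and rank_matches are assumptions made for the example, and the string similarity from Python's difflib stands in for whatever text matching the real system uses. The score is a similarity between 0 and 1, not a calibrated probability.

```python
from difflib import SequenceMatcher

# Hypothetical field names; the real article files may use other columns.
ID_FIELDS = ("gtin", "manufacturer_code", "supplier_code")


def match_score(master: dict, candidate: dict) -> tuple[float, str]:
    """Score one candidate against a master article: (score, basis)."""
    # Tier 1: an exact match on any identifier present in both records.
    for field in ID_FIELDS:
        a, b = master.get(field), candidate.get(field)
        if a and b and a == b:
            return 1.0, field
    # Tier 2: fall back to case-insensitive similarity of the article names.
    name_a = (master.get("name") or "").lower()
    name_b = (candidate.get("name") or "").lower()
    if not name_a or not name_b:
        return 0.0, "none"
    return SequenceMatcher(None, name_a, name_b).ratio(), "name"


def rank_matches(master: dict, candidates: list[dict]) -> list[tuple[float, str, dict]]:
    """Return (score, basis, candidate) tuples, best match first."""
    scored = [(*match_score(master, c), c) for c in candidates]
    # Sort on the score only, so ties never compare the dictionaries.
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

A ranked list of this kind is what a validation screen could present, with the most likely candidate at the top for a person to approve or reject.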
The POC is based on processing Excel files because this fits the way Heinen & Hopman has worked to date. However, the solutions of Squadra Machine Learning Company can also be accessed via a REST API, so that article data can be exchanged with a PIM system at a later stage.
Result
By using our Powerimprove.ai software, Heinen & Hopman can now automatically clean datasets that contain contaminated data. This saves a great deal of time and benefits processes that depend on product data.