Helping Van Gogh Museum harness the power of AI Language Models

The latest advances in natural language processing (NLP) and low-code machine learning pipelines have opened the door for many organizations to perform a wide range of new language-related tasks and integrate them into their core business. Microsoft Azure Language Services, for instance, supports a wide range of languages and can be integrated with other Azure services to build powerful solutions for language-related tasks. Its user-friendly interface makes it accessible even to organizations with little to no experience in natural language processing.

Analyzing Visitor Experience at Van Gogh Museum

Our collaboration with the Van Gogh Museum in Amsterdam is a good example of how these tools can be used and implemented. The museum makes the life and work of Vincent van Gogh and the art of his time accessible to the broad public and strives to inspire as many people as possible and enrich their lives. Every year, the museum welcomes more than 1.6 million visitors from all around the globe, making it one of the top 25 most-visited art museums in the world.

The museum systematically collects visitors’ feedback to gain deep insight into their experience and appreciation, with the aim of enhancing both. Because the museum's audience is international, the feedback is entered in multiple languages. Every month, about 1,500 anonymous comments are sifted through and analyzed to find out what can be improved in the museum. Extracting that insight requires manually categorizing each comment by sentiment and by more than 15 broad topics, which is a lengthy and tedious task.

With our help, the museum can now train an AI language model on Microsoft Azure Language Services, using past, manually labelled reviews as training data, to automatically classify feedback by topic and sentiment. After the initial training, the sentiment model reached an impressive 89% F1 score (the harmonic mean of precision and recall), and the topic model reached an average F1 score of 77% across around 20 different topics, making the automation of future feedback classification reliable and feasible. Automating the expensive, manual process with a high degree of confidence in the validity of the output allows the museum to free up valuable time for extracting high-level insight about the visitors’ experience. If you want to learn more about the F1 score, check out our article on the difference between accuracy, precision, and recall.
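To make the metric concrete, here is a minimal Python sketch of how an F1 score is derived from true positives, false positives, and false negatives, and how per-topic scores can be averaged. The topic names and counts are made up for illustration, and the unweighted (macro) average is an assumption: the article does not state which averaging method was used for the museum's topic model.

```python
from typing import Dict, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Return the unweighted average of per-topic F1 scores.

    `counts` maps a topic name to (true positives, false positives,
    false negatives). Macro averaging is an assumption made here.
    """
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical counts, not the museum's actual results.
example_counts = {
    "queue": (8, 2, 2),
    "audio guide": (6, 2, 4),
}

for topic, (tp, fp, fn) in example_counts.items():
    print(f"{topic}: F1 = {f1_score(tp, fp, fn):.2f}")
print(f"macro average: {macro_f1(example_counts):.2f}")
```

Because F1 is a harmonic mean, a topic only scores well when both precision and recall are high, so the model cannot score well by over-predicting or under-predicting a label.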

In our collaboration with the museum, we have found that despite the user-friendly interface of Microsoft Azure Language Services, experience in data preparation was needed to convert visitor feedback into a format that can be used in training and evaluation. The original format of the labelled and analyzed customer feedback was a large Excel file, chosen for ease of access and sharing within the museum. With our help, the individual reviews and their sentiment and topic labels were then separated and converted to text files to be fed into Microsoft Azure Language Services.
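The conversion step described above can be sketched roughly as follows. This is an illustration only, not the pipeline built for the museum: the column names (`comment`, `sentiment`, `topics`), the semicolon-separated topic field, the file naming, and the `labels.json` manifest layout are all hypothetical, and the manifest is not the exact label format the Azure service expects. Rows are passed in as plain dictionaries, so the step of reading the Excel file is left out.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List


def export_reviews(rows: List[Dict[str, str]], out_dir: Path) -> Path:
    """Write one text file per review plus a JSON manifest of its labels."""
    out_dir.mkdir(parents=True, exist_ok=True)
    manifest = []
    for index, row in enumerate(rows, start=1):
        text = row["comment"].strip()
        if not text:
            continue  # skip rows without any feedback text
        file_name = f"review_{index:05d}.txt"
        (out_dir / file_name).write_text(text, encoding="utf-8")
        # Topics are assumed to be stored as a semicolon-separated string.
        topics = [t.strip() for t in row["topics"].split(";") if t.strip()]
        manifest.append(
            {
                "file": file_name,
                "sentiment": row["sentiment"].strip().lower(),
                "topics": topics,
            }
        )
    manifest_path = out_dir / "labels.json"
    manifest_path.write_text(
        json.dumps(manifest, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return manifest_path


# Made-up example rows; feedback may arrive in several languages.
sample_rows = [
    {"comment": "Prachtige collectie, maar de rij was lang.",
     "sentiment": "Mixed", "topics": "collection; queue"},
    {"comment": "The audio guide was excellent.",
     "sentiment": "Positive", "topics": "audio guide"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = export_reviews(sample_rows, Path(tmp) / "training_data")
    print(path.read_text(encoding="utf-8"))
```

Keeping the text and the labels in separate files means the same export can feed both the sentiment model and the topic model.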

To ensure that model training and the future automation of feedback classification go smoothly, we helped the museum create a data preparation pipeline for training the sentiment analysis and topic prediction models. In addition, we advised the museum on its options for model deployment and inference, so that it can deploy the trained AI model, classify visitor feedback at scale, and integrate the results into its visitor insight process.

Eraneos Analytics bridging the gaps in the data pipeline

We have learnt from our experience with the Van Gogh Museum that users with limited knowledge of machine learning can use innovative natural language processing technology. But while the tool makes accurate text analytics accessible, getting there still requires programming knowledge in Python and expertise in data structures and preprocessing, machine learning and natural language processing, and model deployment. This means that Microsoft Azure Language Services is not yet as seamless and code-free as one would expect.

If you would like help harnessing the potential of Microsoft Azure Language Services, we will be happy to assist you with all the steps where the tool is not yet well integrated into the overall process: data ETL (Extract, Transform, and Load), your warehouse solution, and the subsequent machine learning and analytics modules. We can also share our experience and advice on the decisions required in the modelling step, specifically how to translate a business problem into an effective machine learning solution and how to ensure that model performance improves with every training iteration.




By Yaron McNabb
Senior Data Scientist
By Wido van Heemstra
Applied AI Lead – Data & AI Consultant
