This is a crosspost from DEV Community: oneAPI Community (@oneapi). See the original post here.

Using oneAPI AI Toolkits from Intel and Accenture, Part 2

Introduction

This is part 2 of our blog series. These posts are really about inspiring others to think of cool projects to build with oneAPI. In the last blog post, we discussed several toolkits that I thought were interesting. If you have come up with an idea that uses those toolkits, I would love to know!
In this blog post, I want to focus on two other AI toolkits and how to use them.
There is one other aspect we should consider when looking at these toolkits: what might the unintended consequences be? As I go over these toolkits, treat it as an exercise; I would love to hear which unintended consequences you think have the potential to be overlooked.

Personalized Retail Experiences with Enhanced Customer Segmentation

Accenture has over 30 toolkits to showcase oneAPI, with more to come. In this post, I am going to look at the idea of using AI to personalize your shopping experience. I think all of us who do any kind of online shopping know the importance of a personalized shopping experience. First, we must understand how one might implement one.
Today, retailers have an incredible amount of data at their disposal. The global retail analytics market is worth roughly $20 billion and is growing at a 19.3 percent compound annual growth rate (CAGR). Retailers are eager to understand customer behavior so that they can provide a better shopping experience and thus drive brand loyalty.

To access the AI toolkit, clone the GitHub repository:

$ git clone https://github.com/oneapi-src/customer-segmentation

The reference kit shows how to analyze customer purchasing data and segment customers into clusters based on their behavior. It also shows how to optimize the reference solution using the Intel Extension for Scikit-learn.

The reference example uses an experimental dataset: roughly 500k transactions from about 4,000 customers of a UK-based multinational online retailer over a one-year period. The dataset is fed into the KMeans and DBSCAN algorithms to label clusters based on different customer behaviors.
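If you want a feel for what that looks like in code, here is a minimal sketch using the documented patching entry point of the Intel Extension for Scikit-learn. The features and synthetic data are my own placeholders, not the reference kit's actual schema:

```python
# Minimal sketch: clustering customer-behavior features with the
# Intel Extension for Scikit-learn. All feature values below are
# illustrative placeholders, not the reference kit's real data.
import numpy as np

# Patch scikit-learn so KMeans/DBSCAN dispatch to Intel-optimized
# implementations; this must run before importing from sklearn.
from sklearnex import patch_sklearn
patch_sklearn()

from sklearn.cluster import KMeans, DBSCAN
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-customer features
# (e.g., recency, frequency, monetary value).
rng = np.random.default_rng(42)
X = rng.normal(size=(4000, 3))

X_scaled = StandardScaler().fit_transform(X)

# KMeans: partition customers into a fixed number of segments.
kmeans_labels = KMeans(n_clusters=5, random_state=42).fit_predict(X_scaled)

# DBSCAN: density-based clustering; flags outliers with label -1.
dbscan_labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(X_scaled)

print("KMeans segment sizes:", np.bincount(kmeans_labels))
print("DBSCAN outliers:", np.sum(dbscan_labels == -1))
```

The nice part of this approach is that, with patch_sklearn() in place, the same scikit-learn code runs on Intel's optimized implementations without further changes.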

Try it out and send me some feedback. Also, keep in mind the challenge of what could go wrong. (Disclaimer: I don't know myself; I'm curious to hear your theories.)

Faster Session Notes with Speech-to-Text AI for Healthcare Providers

My second example goes back to healthcare, always a fun one. The same challenge applies as in the previous example.

The premise is that mental health providers are required to document their sessions using progress notes. These recorded sessions then need to be transcribed into written notes and stored for later reference.

Managing these notes can take quite a bit of time. The idea, then, is to feed these recordings to a speech-to-text AI algorithm and produce a summary. This summary can then be used to coordinate care, create a paper trail, support compliance, and keep track of the client's progress.

By reducing the bookkeeping, a therapist would have more time for their patients or the capacity to see more patients. Given the shortage of mental health professionals, being more efficient and allowing more "human" contact time will help mental health professionals provide their clients with better care.

You can find the code for this implementation at:

$ git clone https://github.com/oneapi-src/ai-transcribe

At a high level, the implementation works like this: the conversion from speech to text is achieved using a sequence-to-sequence framework called Fairseq. Sequence-to-sequence modeling is a type of machine learning commonly used for summarization, text translation, and similar tasks; it was initially conceived at Google. Fairseq is an open source sequence-to-sequence framework from Facebook.
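As an illustrative sketch (not the reference kit's exact pipeline), here is what the speech-to-text step can look like using torchaudio's pretrained wav2vec 2.0 bundle, a model family developed in fairseq. The file name "session.wav" is a placeholder:

```python
# Sketch of a speech-to-text step with a pretrained wav2vec 2.0
# model via torchaudio. Illustrative only; the reference kit's
# actual pipeline and decoding may differ.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model()

# "session.wav" is a hypothetical recording of a session.
waveform, sample_rate = torchaudio.load("session.wav")
if sample_rate != bundle.sample_rate:
    waveform = torchaudio.functional.resample(
        waveform, sample_rate, bundle.sample_rate
    )

with torch.inference_mode():
    # emission: per-frame scores over the character vocabulary
    emission, _ = model(waveform)

# Greedy CTC decoding: best label per frame, collapse repeats,
# drop the blank token ('-'), map '|' to spaces.
labels = bundle.get_labels()
indices = torch.argmax(emission[0], dim=-1)
tokens = torch.unique_consecutive(indices)
transcript = "".join(
    labels[i] for i in tokens.tolist() if labels[i] != "-"
).replace("|", " ")
print(transcript)
```

A greedy decode like this is the simplest option; more sophisticated decoders (beam search with a language model) generally give better transcripts.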


I think one of the more interesting parts of this pipeline is the GAN (generative adversarial network), which consists of two algorithms: one acts as a generator and the other as a discriminator. The two work against each other until the generator produces output that the discriminator can no longer reliably tell apart from real data.

One other piece needed for training the algorithm is a database of English text corpus data. This database contains speech audio files and their text transcriptions. It is used to learn the relationship between an audio signal and phonemes as part of speech recognition.

Where the GAN comes in is that one neural network is trained to generate what it thinks are the representations of the phonemes, as opposed to the real-world data obtained from the corpus. The other neural network, trained on the corpus data, acts as the validator. As the two networks work against each other, the generator eventually produces results that match what the validator expects.

It is through this adversarial process that we can validate that the output is correct.
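For intuition, here is a toy sketch of that adversarial loop in PyTorch. Every dimension, name, and data distribution below is made up for illustration and has nothing to do with the reference kit's actual model; it just shows the generator/discriminator dynamic described above:

```python
# Toy GAN sketch: the generator learns to mimic "real" data while
# the discriminator learns to tell real from generated. All sizes
# and data here are illustrative stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)
FEATURE_DIM, NOISE_DIM, BATCH = 16, 8, 64

generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 32), nn.ReLU(), nn.Linear(32, FEATURE_DIM)
)
discriminator = nn.Sequential(
    nn.Linear(FEATURE_DIM, 32), nn.ReLU(), nn.Linear(32, 1)
)

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(1000):
    # Stand-in for real corpus features (e.g., phoneme representations).
    real = torch.randn(BATCH, FEATURE_DIM) + 2.0
    fake = generator(torch.randn(BATCH, NOISE_DIM))

    # Discriminator step: score real samples as 1, generated as 0.
    d_loss = loss_fn(discriminator(real), torch.ones(BATCH, 1)) \
           + loss_fn(discriminator(fake.detach()), torch.zeros(BATCH, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to fool the discriminator into scoring
    # generated samples as real.
    g_loss = loss_fn(discriminator(fake), torch.ones(BATCH, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

The key design point is that the generator never sees the real data directly; it improves only through the discriminator's feedback.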

All of the software to do this is open source. I would love to hear from people who have tried it; please share your results!

I think it would be interesting to evaluate this with human reviewers to determine how accurate it is, so the corpus and the generator algorithm can be further trained for better results.

Summary

I've reviewed two Accenture toolkits that demonstrate, with real examples, how AI can be used practically. As a newcomer to this area, there is so much I don't know. Ironically, I used ChatGPT to help explain some of the salient bits about how GANs work vis-à-vis audio data, especially with regard to mapping words to phonemes.

I'm looking forward to people's responses to this post and to a great conversation about AI and its potential uses and applications, grounded in real-world examples.

Call to Action

Has this blog post inspired you to build something based on the oneAPI AI toolkits? Let me know; I would love to hear how it works out for you!