By Max Kühn
Retrieval Augmented Generation (RAG) enables Large Language Models (LLMs) to access information from new, previously unknown documents and thus incorporate in-depth factual knowledge into their answers. However, practice shows that RAG applications reach their limits when questions target implicit knowledge that only emerges from the sum of several documents.
Knowledge graphs can help to expand these boundaries by linking and enriching information from different documents. In the third article in our "AI Deep Dive" series, we therefore explain ways in which knowledge graphs can be integrated into RAG applications for this purpose.
Under the hood: How RAG enriches user requests for AI
In the articles on customized generative AI in companies and on harnessing corporate knowledge with GenAI and Retrieval Augmented Generation, we described how RAG can be used to enrich generative AI applications with new, company-internal knowledge without time-consuming training.
The RAG approach described there, which is very widespread in general, works roughly as follows: a user poses a question to an LLM via an application. Before the question reaches the LLM, it is "embedded", i.e. converted into a numerical representation of its semantic meaning. Using search algorithms, this representation can then be matched against equivalent embeddings of the company's internal documents, which are stored as so-called chunks in a vector database.
The search yields one or more text chunks that are similar in meaning to the question; these are ultimately sent to an LLM together with the question so that the model has sufficient context to answer it. Thanks to RAG, information that is explicitly contained in the source documents can thus be made easily accessible to LLMs.
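The retrieval step can be sketched in a few lines of Python. The bag-of-words "embedding" and the example chunks below are purely illustrative stand-ins; real systems use a trained embedding model and a dedicated vector database, but the matching logic is the same:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector. A real system would
    # call a trained embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Stand-in "vector database": pre-embedded text chunks from internal documents.
chunks = [
    "Error report 17: conveyor belt motor overheated at customer site.",
    "Error report 23: hydraulic pump seal replaced after leakage.",
    "Maintenance schedule for all machines in hall 3.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question: str, k: int = 1) -> list[str]:
    # Rank all chunks by semantic similarity to the question, return top k.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

context = retrieve("Has a motor ever overheated before?")
# The best-matching chunk(s) would then be sent to the LLM with the question.
```

Note that this search only ever returns individual chunks, which is exactly the limitation discussed next.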
Imagine, for example, a scenario in which a machine builder documents its troubleshooting assignments at customers' premises in the form of error reports. Among other things, these contain information about the customer, the machine, the defective components, and a narrative description of the problem and its solution.
If these reports are stored in a vector database, the system can find helpful context for questions such as: "The following error has occurred: (...). Has a similar error already occurred in the past? If so, how can it be rectified?"
However, such systems reach their limits when questions target information that is implicitly available but not explicitly stated. The question "Which components are particularly frequently affected by errors?", for example, is answered by the documents as a whole, but not by any single text section or small group of them. Knowledge graphs, used in combination with "conventional" RAG, are one option for closing this gap.
Meaning and functionality of knowledge graphs
For the purposes of this article, knowledge graphs can essentially be regarded as a data structure that maps a network of entities (i.e. objects, people, organizations, etc.) as nodes and their relationships to each other as edges (see Figure 1). Graph databases store this information and make it accessible via special query languages and corresponding visualizations.
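As a minimal mental model, such a graph can be represented as a set of subject-relation-object triples: each triple is one directed edge between two entity nodes. The entities and relation names below are invented purely for illustration:

```python
# A tiny knowledge graph as (subject, relation, object) triples.
# Nodes are the entities; each triple is one directed, labeled edge.
triples = {
    ("Alice", "WORKS_AT", "Acme Corp"),
    ("Acme Corp", "SUBSIDIARY_OF", "Acme Holding"),
    ("Alice", "KNOWS", "Bob"),
}

def edges_of(entity: str) -> list[tuple]:
    """All edges touching an entity, in either direction."""
    return [t for t in triples if entity in (t[0], t[2])]
```

Graph databases add persistence, indexing, and query languages on top of exactly this kind of structure.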
Knowledge graphs have traditionally been used in search engines and natural language processing, for example. In RAG solutions, they can help link information across documents and thus map overarching concepts, which are then passed to the LLM to answer complex user questions. Depending on the initial situation, knowledge graphs can be integrated into RAG solutions in various ways. Below, we present two particularly interesting and accessible variants as examples.
Figure 1: Example knowledge graph
How Knowledge Graphs can improve RAG applications
Variant 1: Graph as enriched vector database
In this variant, text chunks including their embeddings are saved as nodes in a graph, their sequence in the original document is mapped via relationships, and, where appropriate, they are related to other entities.
Figure 2 sketches a section of a graph as it might look for the mechanical engineering example. Here, the text chunks (Chunk) are linked in sequence via NEXT relationships and each assigned to an error report (Report) via a BELONGS_TO relationship.
Some graph databases offer the functionality of a vector database and thus enable a vector search for chunks matching a user question. The graph then has the advantage of quickly providing related information in addition to the matching chunks, such as the preceding and following chunks or the metadata of the specific error report.
This becomes a real advantage if additional, easy-to-extract entities from the documents are mapped as such in the graph. In the case of error reports, these are, for example, the affected components, shown in the figure via Component nodes and the REFERS_TO relationship.
Suitable queries can then extract information from the graph, for example, how frequently a component occurs across all error reports. Answering the aforementioned question "Which components are particularly frequently affected by errors?" thus becomes possible. It is also conceivable to add further information from other (structured) data sets to the graph, e.g. from a component database that also contains information about manufacturers. This can, in theory, systematically expand the knowledge base available to an LLM.
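The idea can be sketched with an in-memory stand-in for the graph. The node and relation names follow Figure 2 (NEXT, BELONGS_TO, REFERS_TO); the data itself is made up, and a real implementation would run equivalent queries against a graph database:

```python
from collections import Counter

# Edges of the Variant 1 graph: chunks linked by NEXT (document order),
# BELONGS_TO (their error report), and REFERS_TO (mentioned components).
edges = [
    ("chunk_1", "NEXT", "chunk_2"),
    ("chunk_2", "NEXT", "chunk_3"),
    ("chunk_1", "BELONGS_TO", "report_17"),
    ("chunk_2", "BELONGS_TO", "report_17"),
    ("chunk_3", "BELONGS_TO", "report_23"),
    ("chunk_1", "REFERS_TO", "motor"),
    ("chunk_2", "REFERS_TO", "motor"),
    ("chunk_3", "REFERS_TO", "pump_seal"),
]

def context_for(chunk: str) -> dict:
    """After a vector hit on `chunk`, pull related info from the graph:
    neighboring chunks and the report the chunk belongs to."""
    return {
        "next":     [o for s, r, o in edges if s == chunk and r == "NEXT"],
        "previous": [s for s, r, o in edges if o == chunk and r == "NEXT"],
        "report":   [o for s, r, o in edges if s == chunk and r == "BELONGS_TO"],
    }

def component_frequency() -> Counter:
    """How often each component is referenced across all error reports --
    an aggregate no single chunk could answer."""
    return Counter(o for s, r, o in edges if r == "REFERS_TO")
```

In a graph database, `context_for` and `component_frequency` would each be a short query in the database's query language rather than Python list comprehensions.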
In practice, however, building a graph in this way can involve a great deal of manual effort. Whether this is worthwhile must be weighed up for each application, also against alternatives such as integrating relational databases directly into the RAG application.
Figure 2: Text structure and additional information represented as a graph
Variant 2: Generated graph as an additional module
It is possible to use the advantages of a graph for RAG without having to create and maintain it manually. In this case, the graph is generated automatically from the source documents by an LLM; its subject is not the text itself, but the automatically recognized entities and relations contained in it.
Figure 3 shows an example of what a section of such a graph could look like for an error report: automatically recognized entities such as machines (Machine), model series (Model), and customers or companies (Company), together with the relationships between them (OF_TYPE, LOCATED_AT, SUBSIDIARY_OF).
In this variant, the graph supplements a "conventional" vector database rather than replacing it. This means that in the finished RAG application, the vector search over the text embeddings cannot serve as an entry point for searching within the graph; the relevant information must instead be obtained from the graph separately.
The process for a RAG application using this variant can roughly look like this: a user asks a question, from which an LLM instructed to do so extracts the entities it contains. The extracted entities are then used to generate a query in the query language of the graph database, which returns the relevant nodes, their direct neighbors, and their relationships. Finally, this result is packaged together with the chunks from the vector database into a prompt and delivered to an LLM as context for the question. The LLM can then answer the user's question with a wealth of knowledge that ideally goes far beyond one or a few chunks.
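The steps above can be sketched with stubbed components. Here, `extract_entities` stands in for an instructed LLM call, `graph_context` for a query to the graph database, and the graph content echoes Figure 3; all names and data are hypothetical:

```python
# Stand-in for the automatically generated graph: entity -> (relation, neighbor).
# In a real system this lives in a graph database and is built by an LLM upstream.
GRAPH = {
    "Machine_A12": [("OF_TYPE", "Model_X"), ("LOCATED_AT", "Acme Corp")],
    "Acme Corp": [("SUBSIDIARY_OF", "Acme Holding")],
}

def extract_entities(question: str) -> list[str]:
    # Stub: a real implementation prompts an LLM to return the entity names
    # mentioned in the question.
    return [e for e in GRAPH if e.lower() in question.lower()]

def graph_context(entities: list[str]) -> list[str]:
    # Stub for the graph-database query: relevant nodes, their direct
    # neighbors, and the relationships between them, rendered as facts.
    return [f"{e} -{r}-> {n}" for e in entities for r, n in GRAPH[e]]

def build_prompt(question: str, chunks: list[str], graph_facts: list[str]) -> str:
    # Package graph facts and vector-search chunks as context for the LLM.
    return (
        "Answer using the following context.\n"
        "Graph facts:\n" + "\n".join(graph_facts) + "\n"
        "Text chunks:\n" + "\n".join(chunks) + "\n"
        f"Question: {question}"
    )

question = "Which model is Machine_A12 and where is it located?"
prompt = build_prompt(
    question,
    chunks=["Error report 17 mentions Machine_A12 ..."],  # from the vector search
    graph_facts=graph_context(extract_entities(question)),
)
# `prompt` would now be sent to the answering LLM.
```

The sketch makes the division of labor visible: the vector search contributes verbatim text, while the graph contributes linked facts that no single chunk contains.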
This variant can be implemented with comparatively little manual effort, as the main work of creating the graph is automated. In theory, this offers many advantages, e.g. the graph can be continuously expanded or useful connections can be discovered that were not known in advance.
In practice, many challenges have to be overcome before these benefits can be realized. For example, an LLM must build the graph in a meaningful and targeted way while avoiding hallucinations. Syntactically correct, meaningful and, where necessary, chained queries must also be generated for the graph database. Furthermore, correctly prompting the LLM with the query results in the context of the user's question is essential.
All of these aspects bring with them an LLM-related uncertainty that needs to be managed. In addition, invoking LLMs at multiple points incurs additional costs for every user question. Overall, the same applies here: for a specific use case, the challenges must be weighed against the potential.
Figure 3: Automatically recognized entities and relationships in a graph
Conclusion: RAG applications can benefit from knowledge graphs in many ways
In summary, RAG applications can benefit from knowledge graphs in different ways. What the approaches have in common is that they can potentially answer more questions, and above all overarching ones. Given the right starting point, the result can be very impressive and profitable.
However, it should also be noted that LLMs in general, RAG, and their combined use with knowledge graphs are evolving rapidly. It is therefore worth keeping an eye on these developments, some of which are also referred to as "GraphRAG". The variants described here are by no means set in stone, and other ways of achieving similar benefits may emerge in the future, for example if fine-tuning LLMs becomes simpler and more economical.
It is therefore important to consider each application individually and to include all the options available at the time.
We will be happy to work with you to find out what options are available for your use case, design a solution and implement it. You will benefit from our project experience in the areas of LLM and RAG as well as from our diverse experience in the areas of data science, data engineering and cloud development. We look forward to hearing from you.
About the author:
Max Kühn has been involved with data science & BI since his studies in business informatics and focuses on data engineering in his work as a consultant. He supports companies in laying the foundations for data-driven decisions.