DocHub: Facilitating Comprehension of Documents via Structured Sensemaking with Large Language Models
Python, PyTorch, TypeScript, React, LangChain, Retrieval-Augmented Generation (RAG), Large Language Model (LLM), Natural Language Processing (NLP), Human-Computer Interaction (HCI), Visualization, Sensemaking, Human-AI Interaction
Apr 8, 2024
Traditional conversational LLM interfaces often struggle with complex document comprehension: their responses are verbose and unstructured, and their linear, turn-by-turn interaction model makes it hard to keep track of earlier context. To bridge this gap, we developed DocHub, an interactive system that integrates large language models with a structured sensemaking framework.
DocHub extracts document structure and transforms it into dynamic node-link diagrams, allowing users to navigate information at multiple levels of granularity. Key features include:
- Diagram View: Visualizes document structure using ClassNodes for categorization and DeepenableNodes for section summaries.
- DeepenedNodes: Let users initiate in-depth, context-specific dialogues with the LLM by dragging from existing nodes.
- InstantOp: A context-sensitive popup providing real-time AI assistance for text selected directly within the Document View.
- Non-linear Abstraction: Organizes information to prevent context loss and reduce cognitive overload during deep analytical tasks.
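The node types above can be sketched as a small data model. This is a minimal illustration, not DocHub's actual implementation: all names (`DiagramNode`, `deepen`, the `kind` values) are hypothetical, and only the behavior described above (a DeepenedNode created by dragging from an existing node, inheriting its context) is taken from the source.

```typescript
// Hypothetical sketch of a node model for a DocHub-like diagram view.
// Names and fields are illustrative assumptions, not the real codebase.

type NodeKind = "class" | "deepenable" | "deepened";

interface DiagramNode {
  id: string;
  kind: NodeKind;
  label: string;      // category name, section title, or user question
  summary?: string;   // LLM-generated summary, if any
  parentId?: string;  // link back to the node this one was dragged from
}

// Creating a DeepenedNode from an existing node: the new node records
// the user's question and keeps a link to its source so the follow-up
// dialogue with the LLM can reuse that node's context.
function deepen(source: DiagramNode, question: string): DiagramNode {
  return {
    id: `${source.id}-deep`,
    kind: "deepened",
    label: question,
    parentId: source.id,
  };
}
```

Keeping `parentId` on each deepened node is what makes the abstraction non-linear: the diagram is a tree of focused dialogues rather than one long chat transcript, so context is scoped per branch.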
A within-subjects user study showed that DocHub significantly improves comprehension efficiency, reducing the average time to grasp a complex document from 24.16 minutes to 15.73 minutes while increasing the amount of information participants acquired.