As a solo technical writer who spends a large chunk of my time writing technical documents, I’ve been curious about using Large Language Models (LLMs) to streamline the writing process.
More specifically, I wanted to explore how LLMs could help draft large documents, particularly by providing an independent perspective on structure, flow, missing pieces, and other details easily overlooked when writing long, dense, and labor-intensive material.
So, I conducted an experiment.
I wrote a User Guide for Google’s NotebookLM product. If you’re unfamiliar with it, NotebookLM is a type of RAG (Retrieval Augmented Generation) model that allows you to interrogate a suite of documents. For example, you could upload 100+ documents about a specific project, save them as a repository, and then ask the AI questions that it can answer by referencing that repository. (If you’re interested, there’s a free NotebookLM user guide available here on the Klariti site.)
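If the RAG idea is new to you, the sketch below may help. It’s a deliberately crude illustration of the general pattern – chunk the sources, retrieve the chunks most relevant to a question, answer from those – and makes no claims about NotebookLM’s actual internals; production systems typically use vector embeddings rather than the word-overlap scoring used here.

```python
# A toy illustration of the RAG pattern: split sources into chunks,
# retrieve the chunks most relevant to a question, and hand only those
# chunks to the model as context. This is a conceptual sketch, not
# NotebookLM's implementation; real systems typically use embeddings.

def chunk(text: str, size: int = 500) -> list[str]:
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def score(question: str, passage: str) -> int:
    """Crude relevance score: count words the passage shares with the question."""
    q_words = set(question.lower().split())
    return sum(1 for word in passage.lower().split() if word in q_words)

def retrieve(question: str, documents: list[str], k: int = 3) -> list[str]:
    """Return the k chunks most relevant to the question."""
    chunks = [c for doc in documents for c in chunk(doc)]
    return sorted(chunks, key=lambda c: score(question, c), reverse=True)[:k]

# The retrieved chunks become the only context the model answers from,
# which is how a 100-document repository stays within a model's limits.
docs = ["NotebookLM lets you upload sources and ask questions about them...",
        "To create a notebook, click New Notebook and give it a name..."]
context = "\n---\n".join(retrieve("How do I create a notebook?", docs))
prompt = f"Answer using only these excerpts:\n{context}\n\nQuestion: How do I create a notebook?"
print(prompt)
```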
My goal was to see whether some of the most popular LLMs – ChatGPT and other well-known models – could provide useful feedback on the document. For context, the guide is 180 pages long.
The results were interesting for different reasons.
4 Prompts to Ask LLMs to Review Long Documents
Before you start, you can ask each of the LLMs to suggest prompts that describe how best to ask it to review your document.
I’d recommend this if you’re relatively new to crafting prompts.
There’s no right way to ask an LLM to review your document. However, to get started, you can try one of the prompts below and then tweak the results. It’s an iterative process. Play with them and you’ll get a feel for what works.
Prompt #1 (Simple):
Please review the attached document (or: “Here’s the link to the document: [link]”).
This approach is the least effective. You’ll only get a generic set of recommendations.
Prompt #2 (More Specific):
I’ve attached a 150-page document [or: “Here’s the link: [link]”]. I’m particularly interested in your assessment of the following:
- Clarity and Conciseness: Is the document easy to understand? Are there any sections that are overly verbose or confusing?
- Structure and Organization: Is the document logically organized? Does the flow make sense? Are there any suggestions for improvement in structuring the content?
- Argumentation/Analysis (if applicable): If the document presents an argument or analysis, how strong is it? Are there any weaknesses in the reasoning or evidence?
- Target Audience: Is the document appropriate for its intended audience? Why or why not?
- Key Takeaways: What are the most important points that the reader should take away from this document?
Please provide a detailed review, including specific examples from the text to support your points.
Prompt #3 (Focus on Specific Aspects):
I’m working on a 200-page report [or: “Here’s the link: [link]”] and need your help focusing on a few key areas. Specifically, I’d like you to review:
- The Literature Review (Chapters 2 & 3): Is the literature review comprehensive and up-to-date? Does it effectively synthesize the existing research?
- The Methodology (Chapter 4): Is the methodology clearly explained and justified? Are there any potential weaknesses in the approach?
- The Conclusion (Chapter 7): Does the conclusion effectively summarize the findings and their implications? Are there any recommendations for future research?
Please provide specific feedback on these sections, including any suggestions for improvement.
Prompt #4 (With Specific Instructions for Formatting the Response):
Please review the attached document [or: “Here’s the link: [link]”]. I’d like your feedback in the following format:
- Executive Summary (max 200 words): A brief overview of the document and its key findings.
- Strengths: What are the document’s main strengths?
- Weaknesses: What are the document’s main weaknesses?
- Suggestions for Improvement: Specific recommendations for improving the document. Please include page numbers or section references where applicable.
Be as detailed as possible in your feedback.
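If you’d rather script a review than paste prompts into a chat window, here’s a minimal sketch using the official OpenAI Python SDK. The model name and the plain-text export of the guide (guide.txt) are my own assumptions, and a very long document may still exceed the model’s context window – the same limitation that tripped up some of the LLMs below.

```python
# A minimal sketch of running Prompt #4 programmatically via the OpenAI
# Python SDK. "guide.txt" (a plain-text export of the document) and the
# model name are assumptions - adjust both for your own setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("guide.txt", encoding="utf-8") as f:
    document = f.read()

review_prompt = (
    "Please review the document below. I'd like your feedback in this format:\n"
    "- Executive Summary (max 200 words)\n"
    "- Strengths\n"
    "- Weaknesses\n"
    "- Suggestions for Improvement, with section references where applicable\n"
    "Be as detailed as possible in your feedback.\n\n"
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: substitute any long-context model
    messages=[{"role": "user", "content": review_prompt + document}],
)
print(response.choices[0].message.content)
```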
Using a combination of these prompts, I asked each of the following LLMs to review my document.
Claude
Sadly, Claude couldn’t handle the document’s size, so it was disqualified. While it suggested breaking the document into smaller chunks, that defeated the purpose of reviewing the document as a whole.
Google Gemini
Next, I tried Google Gemini. Unfortunately, there was no way to upload the entire MS Word document – hardly surprising, I suppose, as it’s a Microsoft rival – so I had to paste the entire document into the chat field. This looked messy, and as a result a fair bit of the response was about… improving the formatting of the document.
Here are some specific examples of areas for improvement:
- “Instead of: ‘Why This Matters: I’ve found that getting started with NotebookLM is incredibly easy…’ Try: ‘Getting started with NotebookLM is straightforward. For example, to create a new notebook, simply click the ‘New Notebook’ button and…’ (Then show a screenshot).”
- “Instead of: ‘AI-Powered Summaries and Insights: Automatically generate summaries of long documents or notes.’ Try: ‘AI-Powered Summaries: Quickly condense lengthy documents into concise summaries, saving you valuable time. For example, you can summarize a 50-page research paper into a few key bullet points.’ (Then show a screenshot of the summary feature in action).”
DeepSeek
Then I tested DeepSeek. I was intrigued to see how it would handle the document. It performed quite well, providing very good direction and critiques, exceeding my expectations. It also offered more in-depth information than the other LLMs.
Here’s an example of some of the areas for improvement:
- Limited Focus on Technical Writers: While the guide mentions technical writers as a target audience, it doesn’t provide enough specific examples or workflows tailored to their needs. Adding a dedicated section for technical writers (e.g., how to use NotebookLM for API documentation or user manuals) would enhance its relevance.
- Lack of Calls-to-Action: The guide could benefit from more explicit calls-to-action, such as encouraging users to try specific features or explore advanced options. For example, after explaining the chat panel, you could prompt users to “Try asking a question about your notes now!”
- Incomplete Next Steps: The “Next Steps” section is brief and generic. It could be expanded to include actionable recommendations, such as “Create your first thematic notebook” or “Experiment with the AI-powered podcast feature.”
ChatGPT
Next up was ChatGPT. You have a few options for uploading the actual document: Google Drive, MS OneDrive, or directly from your computer.
I uploaded the file and asked it to review the document with a simple prompt. It gave relatively good advice, but DeepSeek’s feedback was noticeably better. I’ve recently noticed that ChatGPT seems a bit stingy with its word count in responses.
Here’s an example of some of the areas for improvement and suggested revisions. As you can see, while the suggestion is fine, a little more detail would really have helped.
- Better Organization for Troubleshooting & FAQs:
The Troubleshooting section is currently buried near the end.
There is no structured FAQ section.
Fix: Move troubleshooting earlier and add an FAQ addressing:
– “Why can’t I upload a document?”
– “Does NotebookLM work offline?”
– “How secure is my data?”
At that point, DeepSeek seemed to be the winner.
AI Studio from Google
However, there’s one other LLM worth considering: AI Studio from Google. I mention this as it’s often overlooked and doesn’t get the media coverage you’d expect.
It had one catch, though: it wouldn’t accept Microsoft Word documents.
As a workaround, I saved the document as a Google Doc on my Google Drive, pointed AI Studio towards it, and imported the file. This extra step wasn’t a big deal, but it was a limitation. You could, of course, paste the entire document into the chat field, but you’d lose all of the formatting, and the LLM would then spend much of its response commenting on formatting issues.
I then asked AI Studio to review the document: it returned a one-page summary of areas for improvement. This was surprisingly concise. In contrast with the other LLMs, it took a different angle. Essentially, it was saying: here are the main areas, now which one do you want me to review in more depth? It’s a very different approach from, say, Claude, which wants you to spoon-feed the document bit by bit and so never gets a bird’s-eye view of the entire document.
Example of feedback from AI Studio:
- Clarity and Accuracy: Ensuring the instructions are easy to understand and technically correct.
- Organization: Evaluating the structure and flow of the guide.
- Completeness: Checking if all key features are covered and if there are any gaps.
- Actionability: Assessing if the guide provides practical tips and examples.
- Writing Style: Ensuring the writing is concise, engaging, and appropriate for the target audience.
- Troubleshooting: Verifying the troubleshooting section covers relevant problems and provides useful solutions.
- Value Proposition: Checking if the guide effectively highlights the benefits of using NotebookLM.
- Grammar and Spelling: Proofreading for any errors.
Next, I asked it to elaborate on each of the eight items on the list.
This is where things got interesting. The quality of the responses for each item was excellent. At this point, it became a close competition between DeepSeek and AI Studio. It’s worth noting that I asked DeepSeek for an overall review, not a section-by-section breakdown.
Bonus Point!
AI Studio was the only LLM to find actual mistakes in the document. In fairness, I asked the other LLMs simply to review the material, not to check its technical accuracy. However, AI Studio flagged a few areas where I’d ‘hallucinated’, describing buttons and features that didn’t exist.
It’s worth remembering, though, that AI Studio is a Google product, as is NotebookLM, so it’s more familiar with NotebookLM’s workings than the others.
Here’s a taste of the feedback it gave me.
Specific Comments & Suggestions:
- Introduction: “NotebookLM is an AI-driven web-based tool designed to help you organize, analyze, and interact with notes and documents. It’s not for casual writing or blog posts, rather more suited for academic, technical writing or for RFP projects.” – Consider rephrasing this to better define the ideal uses. Suggestion: NotebookLM excels at supporting complex projects requiring synthesis and analysis of multiple sources, such as research papers, technical documentation, and proposals. It is designed for deep work, rather than short form or creative writing.
- Key Features and Benefits: “Ask questions about your notes and get instant ‘AI-generated’ answers.” – Better to say “AI Powered answers”, or even “Answers powered by AI.”
- System Requirements: “Storage: While NotebookLM stores your notes and documents in the cloud, ensure your device has enough storage for any offline files or exports you may download.” – This might confuse users since NotebookLM is cloud-based. Consider removing or rephrasing. Suggestion: While NotebookLM primarily operates in the cloud, ensure you have sufficient storage on your device if you plan to export your notes or download source documents.
- What to Do After Setting Up Your Account: “Click on ‘New Notebook’ and give it a name (e.g., ‘Project Research’ or ‘Meeting Notes’).” – Great to provide examples, but consider more relevant examples, such as Case Study Notes, or Document Outlines.
- Sources Panel: “Upload Sources: Click the ‘Upload’ button to add documents, PDFs, or text files.” – The “Upload” button does not exist. Suggestion: Click the “Add Source” button to add documents, PDFs, or text files.
Technical Considerations when A/B Testing LLMs
I found that if you run several LLMs in browser tabs at the same time, the machine tends to grind to a halt, or at least slows down considerably. I’m not sure of the exact technical reason, but holding large documents in working memory may strain the web browser.
I’d suggest keeping only two tabs open when running split tests between LLMs, unless of course you have a very powerful machine. I open AI Studio and DeepSeek, then close all other tabs.
Takeaways
So, where am I now? I’ve narrowed down the contenders to two: DeepSeek and AI Studio.
My key takeaways are:
- If you’re using LLMs to review documentation, consider splitting the review tasks between different LLMs to see which is best suited for your material.
- Some, like Claude, may not support reviewing large documents.
- Google Gemini worked fine with Google Docs, but its feedback wasn’t as impressive as the others’.
- ChatGPT was good, but its responses felt somewhat limited.
- DeepSeek and AI Studio were the clear standouts.
In upcoming articles, I’ll continue using DeepSeek and AI Studio to review a set of long documents.
If you’re interested, please follow along and send me your questions over on LinkedIn.