Understanding LLMs Deeper

What LLMs Are Good At
  • Explaining physics, math, and engineering concepts in multiple ways
  • Generating practice problems and quiz questions tailored to your syllabus
  • Debugging code and suggesting targeted fixes
  • Drafting lab reports and technical documentation
  • Brainstorming project architectures and research directions
  • Translating technical documentation between languages
When NOT to Trust LLMs
  • Blindly copying unverified code into production or assignments
  • Assuming citations are real — hallucinated references are still common
  • Using outdated information without cross-checking training cutoffs
  • Submitting AI-generated text as your own work (academic integrity!)
  • Trusting LLMs for critical engineering calculations — use Wolfram Alpha
  • Trusting "reasoning" models for guaranteed correct answers
// What's New in 2026

LLMs have transitioned from standalone chat tools to core infrastructure — they now power agentic workflows, tool-calling pipelines, MCP servers, and multi-step autonomous systems far beyond simple Q&A. Multimodality is now standard — top models handle text, images, audio, and video in a single prompt. Reasoning models are a new category entirely: they spend "thinking time" before answering, trading latency for accuracy on hard problems.

How LLMs Actually Work (Updated)

  • Trained on billions to trillions of parameters across text, code, images, and audio
  • Support 100,000–1,000,000+ token context windows — entire textbooks in one prompt
  • Multimodality is now standard across top models
  • Reasoning models differ fundamentally: they reason internally before outputting
  • Fine-tuning is now cheap — LoRA/QLoRA adapt models in days without supercomputer access
  • Chain-of-thought prompting improves performance by up to 61 percentage points on math benchmarks — but only on models with 100B+ parameters
  • CoT on reasoning models gives only marginal benefit — they already reason internally; forcing CoT adds 20–80% latency with little gain
  • Open-source models (Llama 4, DeepSeek, Qwen3) now rival closed models on many benchmarks
  • RAG (Retrieval-Augmented Generation) lets LLMs answer from your own documents

Reasoning Models: A New Category

Standard and reasoning models are now two distinct classes:

Standard LLMs
  • Examples: GPT-5.3, Gemini Flash, Mistral
  • Speed: Fast, generates immediately
  • Best for: Writing, summarization, chat
  • CoT prompting: Helps significantly
  • Thinking visible? No
Reasoning LLMs
  • Examples: o3, DeepSeek R1, Gemini 2.5 Pro, Qwen3
  • Speed: Slower (thinks before answering)
  • Best for: Math, code, logic, multi-step problems
  • CoT prompting: Marginal gain, adds latency
  • Thinking visible? Yes — blocks exposed

Local LLMs — Run Offline

You can run powerful models entirely on your own machine, with no subscriptions and full privacy:

Ollama
One command to pull and run 100+ models. Cross-platform. OpenAI-compatible API. ollama run llama3
LM Studio
Best GUI for beginners. Model discovery, easy quantization settings, local API server.
GPT4All
Desktop app, beginner-friendly. Includes local RAG — chat with your own files.
Top Local Models
Llama 4, DeepSeek V3.2, Qwen3-80B, Mistral Large 3, Gemma 3, NVIDIA Nemotron 3.

Popular Tools Compared (April 2026)

Tool Best For Free Tier (2026) Student Edge
ChatGPT Tutoring, summarization, agentic tasks GPT-5.3, ~10 msgs/5 hrs File uploads, image gen (2–3/day on free)
Google Gemini Research, Docs/Slides integration Gemini 2.5 Flash Up to 12 months AI Pro for verified students
GitHub Copilot AI code completion & debugging Free for everyone since 2024 Integrated in VS Code, JetBrains, Neovim
Cursor AI-first IDE, deep codebase context 50 premium completions/mo VS Code fork; GPT-4, Claude, Gemini, Grok
Perplexity Research with citations 5 Pro + 3 Deep Research/day Source-linked answers, literature reviews
Claude Long documents, code analysis, writing Free tier available 200K context window; free is limited
DeepSeek Coding, math, reasoning Free — no account needed Most capable fully-free reasoning model
Claude Code Complex agentic coding, architecture $20/mo (Pro required) CLI agentic tool; 200K context; native terminal
NotebookLM Studying from your own PDFs/notes Free Upload lectures; AI answers from your sources
Wolfram Alpha Math, physics, verified calculations Free (limited) Do NOT use LLMs for safety-critical math

Prompting Tips

Core Techniques
  • Add "think step by step" to improve reasoning in standard models
  • Few-shot prompting: give 2–3 examples — often more effective than long instructions
  • Specify your role: "You are an expert thermodynamics professor"
  • Ask for multiple approaches to see trade-offs
  • Iterative refinement beats single-shot — start broad, then refine
  • Break complex problems into subtasks — one prompt per subtask
Advanced Tips
  • Use negative constraints: "Don't use recursion" sharpens code generation
  • For code, always specify language, framework version, and constraints
  • Ask the model to explain its reasoning after giving an answer
  • "What are common mistakes when doing X?" often beats "How do I do X?"
  • DeepSeek is particularly strong at competitive-programming-style algorithm problems
  • Use AI to document existing code — paste a function, ask for full docstrings

50 Additional Facts & Tips

Understanding LLMs Deeper
  • LLMs don't "look things up" — they generate text based on learned patterns
  • Temperature controls creativity: 0 = deterministic, 1+ = creative
  • Context window = the LLM's working memory; anything outside is forgotten
  • Model size ≠ model quality; training data quality and RLHF matter equally
  • RLHF is why models feel helpful and safe, not just fluent
  • Hallucination remains unsolved even in 2026 frontier models
Coding with AI
  • Scaffold, then understand — let AI generate structure, you fill in understanding
  • GitHub Copilot now has Agent Mode for multi-step tasks in VS Code
  • Always run AI-generated code in a sandbox first, never on production
  • LLMs are excellent for writing regex — notoriously hard to write manually
  • Use AI to generate test cases — better at edge cases than expected
  • Ask AI to review code for security: SQL injection, XSS, etc. as a first pass

Academic Use — Do's & Don'ts

Do

  • Use LLMs to understand, not replace — ask it to explain the concept, then solve the problem yourself
  • Generate practice problems and immediately attempt them before checking solutions
  • Use NotebookLM to chat with your own lecture slides and PDFs
  • Cross-check any citation with Google Scholar before submitting
  • Use Wolfram Alpha or MATLAB for physics/math calculations — not ChatGPT
  • Disclose AI assistance according to your institution's policy

Don't

  • Use an LLM to write your lab report — write it yourself, use AI only for clarity
  • Use AI for exam simulations you plan to submit — that's cheating
  • Blindly trust code solutions — understand every line before using it
  • Paste API keys or personal data into public LLM interfaces
  • Assume "reasoning" models are always correct — confident, detailed, wrong answers exist
  • Trust free tier data privacy — check training opt-out settings
The Future (What's Coming)

Agentic AI is the 2026 paradigm shift — models that plan, use tools, browse, write code, and execute autonomously. MCP (Model Context Protocol) is the new standard letting LLMs persistently access your apps, CRMs, and databases. The competitive landscape means free tiers keep improving — GPT-5.3 is now on ChatGPT's free plan, unthinkable a year ago. The bottleneck is no longer capability — it's consistency, reliability, and system design.

Scroll to track progress
Scroll Progress
0%
of this page viewed