LLMs at Oxide: Navigating Risks and Responsibilities in Reading, Writing, and Coding
LLM use at Oxide is evaluated carefully across three key domains: reading, writing, and programming. Each presents distinct benefits and risks that must be understood and managed.

LLMs excel as readers. They can instantly process large documents, making them powerful tools for summarizing technical materials like datasheets and specifications, or for answering specific questions about them. Somewhat ironically, they are also effective at assessing how much an LLM contributed to a document's creation. However, using hosted LLMs like ChatGPT, Claude, or Gemini requires strict attention to data privacy: there is a real risk that uploaded documents will be used to train future model versions, even when this is not clearly communicated. Some platforms enable training on user data by default and require an explicit opt-out. OpenAI's "Improve the model for everyone" setting is a prime example, and its framing casts privacy-conscious choices as selfish. Users must actively disable such features to protect sensitive information. Even when privacy is ensured, LLMs should never replace actual human reading, especially in contexts where deep understanding is expected. For instance, while LLMs can assist in evaluating candidate materials during hiring, they should only support, never replace, the human judgment that is essential to the process.

As writers, LLMs are far less reliable. Their output often feels generic, clichéd, or riddled with subtle signs of automation, and this undermines authenticity. When readers detect LLM-generated text, they may question not just the writing but the thinking behind it: if the prose was auto-generated, how can they trust that the ideas are truly understood? This breaks a fundamental social contract of writing, namely that writing requires more intellectual effort than reading. When that balance is disrupted, readers may feel misled, especially if they spend more time deciphering flawed logic than the writer spent crafting it.
In extreme cases, this produces a kind of cognitive dissonance: a sense of confusion because the text appears meaningful but lacks coherence. At Oxide, where strong writing is a core hiring criterion, this risk is especially acute. Everyone here can write well, and we value the integrity of our own voices, so we generally avoid using LLMs to produce final written content. That said, LLMs can still play a supporting role, such as brainstorming or drafting initial ideas, provided the final work reflects personal thought and responsibility.

In programming, LLMs are remarkably effective. They can generate code quickly, especially for experimental, auxiliary, or throwaway tasks. For production code, however, caution is essential: the closer code is to shipped systems, the more scrutiny it demands. Even seemingly safe tasks like writing tests can go wrong if not carefully reviewed, because LLMs can produce plausible-sounding but incorrect or nonsensical code.

When LLMs are used in code generation, the engineer remains fully responsible for the result. Self-review is non-negotiable: no code should be offered for peer review that its author has not personally vetted. And once code is in the review loop, re-generating it in response to feedback breaks the iterative review process. The goal is not to outsource thinking, but to enhance it. LLMs are tools, not replacements, and the best results come when engineers use them with rigor, empathy, and a clear sense of ownership.
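To make the "plausible-sounding but incorrect" failure mode concrete, here is a minimal sketch of the kind of subtle bug that reads correctly at a glance. The function and its flaw are invented for illustration, not drawn from any real Oxide code:

```python
# Hypothetical example of plausible-looking generated code with a subtle bug.
def ranges_overlap(a_start, a_end, b_start, b_end):
    """Return True if half-open range [a_start, a_end) overlaps [b_start, b_end)."""
    # Bug: <= should be < for half-open ranges, so ranges that merely
    # touch at an endpoint are incorrectly reported as overlapping.
    return a_start <= b_end and b_start <= a_end

# The obvious test passes, which is exactly why self-review must probe edge cases:
assert ranges_overlap(0, 5, 3, 8)    # genuine overlap: correct
assert ranges_overlap(0, 5, 5, 10)   # merely adjacent: wrongly reported as overlap
```

The docstring, the naming, and the happy-path test all look right; only deliberate scrutiny of the boundary condition reveals the error. This is why personal vetting of generated code, rather than skimming it, is non-negotiable.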
