Solving Hallucinated Image URLs in LLM-Generated Code with ASTs and Babel
Scale AI recently encountered a unique challenge while building a platform that allows users to create web apps using simple prompts: hallucinated image URLs in AI-generated frontend code. These URLs often pointed to invalid or non-existent images on the web, leading to issues in the final application. To address this problem, the team explored various solutions, including regex patterns and Large Language Models (LLMs), before settling on a more robust approach using Abstract Syntax Trees (ASTs). The Problem In frontend web development, images are frequently used and can appear in several forms: - HTML <img> tags: <img src="https://example.com/image1.png" alt="Image 1" /> - Framework-specific image tags: <Image src="https://cdn.site.com/pic.jpg" width={300} height={200} /> - Custom components: <ProfileImage src="https://userpics.com/avatar.png" /> - Dynamic references: Images can be defined in JavaScript objects and then used in JSX elements. These variations make it challenging to reliably extract and replace image URLs using simple regex patterns. Similarly, while LLMs can handle complex tasks, they introduce overheads such as cost, latency, and non-deterministic behavior, making them less ideal for this specific task. Solution: Using Abstract Syntax Trees (ASTs) ASTs provide a structured representation of code, breaking it down into syntactic components arranged in a tree structure. This makes it easier to navigate and manipulate different parts of the code, including image URLs, without worrying about the varied forms they might take. Steps to Solve the Problem Extract Image Nodes: Identify all occurrences of image nodes in the code. Generate Captions: Use context and additional information to generate relevant captions for the images. Generate Images: Utilize a text-to-image model to create images based on the generated captions. Replace Hallucinated URLs: Replace the invalid or fake URLs with the newly generated image URLs. Code Implementation Finding Image Nodes: Construct an AST from the source code using the @babel/parser library. Traverse the AST to locate JSX elements and JavaScript object expressions that represent images. ```javascript const babelParser = require('@babel/parser'); const traverse = require('@babel/traverse').default; const generator = require('@babel/generator').default; const t = require('@babel/types'); function findImageNodes(code) { const ast = babelParser.parse(code, { sourceType: 'module', plugins: ['jsx', 'typescript'] }); const candidateNodes = []; traverse(ast, { ObjectExpression(path) { const properties = path.node.properties; properties.forEach((prop) => { if (prop.type === 'ObjectProperty' && (prop.key.name === 'src' || prop.key.name === 'url' || prop.key.name === 'image')) { const nodeSourceCode = generator(path.node).code; candidateNodes.push(nodeSourceCode); } }); }, JSXElement(path) { const openingElement = path.node.openingElement; const tagName = openingElement.name && openingElement.name.name; if (tagName === 'img' || tagName === 'Image') { const nodeSourceCode = generator(openingElement).code; const existingSrcAttr = openingElement.attributes.find((attr) => attr.name.name === 'src'); if (existingSrcAttr) { if (existingSrcAttr.value.type === 'StringLiteral') { candidateNodes.push(nodeSourceCode); } } else { candidateNodes.push(nodeSourceCode); } } } }); return candidateNodes; } ``` Updating Image Nodes with Metadata: After generating images and captions, update the AST nodes with the new metadata. Convert the updated AST back to source code. ```javascript function updateImageNodesWithMetaData(code, nodeResults) { const ast = babelParser.parse(code, { sourceType: 'module', plugins: ['jsx', 'typescript'] }); const resultMap = new Map(nodeResults); traverse(ast, { ObjectExpression(path) { const properties = path.node.properties; const nodeSourceCode = generator(path.node).code; if (resultMap.has(nodeSourceCode)) { properties.forEach((prop) => { if (prop.type === 'ObjectProperty' && (prop.key.name === 'src' || prop.key.name === 'url' || prop.key.name === 'image')) { prop.value = t.stringLiteral(resultMap.get(nodeSourceCode)['image_url']); const existingAlt = properties.find((p) => p.key.name === 'alt'); if (existingAlt) { existingAlt.value = t.stringLiteral(resultMap.get(nodeSourceCode)['description']); } else { const altProperty = t.objectProperty( t.identifier('alt'), t.stringLiteral(resultMap.get(nodeSourceCode)['description']) ); path.node.properties.push(altProperty); } } }); } }, JSXElement(path) { const openingElement = path.node.openingElement; const nodeSourceCode = generator(openingElement).code; if (resultMap.has(nodeSourceCode)) { const srcAttr = openingElement.attributes.find((attr) => attr.type === 'JSXAttribute' && attr.name.name === 'src'); if (srcAttr) { srcAttr.value = t.stringLiteral(resultMap.get(nodeSourceCode)['image_url']); const existingAltAttr = openingElement.attributes.find((attr) => attr.type === 'JSXAttribute' && attr.name.name === 'alt'); if (existingAltAttr) { existingAltAttr.value = t.stringLiteral(resultMap.get(nodeSourceCode)['description']); } else { openingElement.attributes.push( t.jsxAttribute( t.jsxIdentifier('alt'), t.stringLiteral(resultMap.get(nodeSourceCode)['description']) ) ); } } } } }); return generator(ast).code; } ``` Example Use Case Consider the following input code with hallucinated image URLs: ```javascript export default function EcommerceTShirts() { const [menuOpen, setMenuOpen] = useState(false); const [searchOpen, setSearchOpen] = useState(false); const hats = [ { id: 1, name: "Simple Hat", price: "$24.99", image: "/hat.jpg", alt: "existing" }, { id: 2, name: "New Hat", price: "$19.99", image: "/hat-new.jpg" }, ]; return ( Featured T-Shirts [ { id: 1, name: "Graphic Tee", price: "$24.99", image: "/tshirt1.jpg" }, { id: 2, name: "Pocket Tee", price: "$19.99", image: "/tshirt2.jpg" }, { id: 3, name: "V-Neck Tee", price: "$22.99", image: "/tshirt3.jpg" }, { id: 4, name: "Crew Neck Tee", price: "$21.99", image: "/tshirt4.jpg" }, ].map((shirt) => ( {shirt.name} {shirt.price} Add to Cart )) {hats.map((hat) => ( ))} ); } ``` After applying the findImageNodes and updateImageNodesWithMetaData functions, the output code with valid and relevant image URLs is: ```javascript export default function EcommerceTShirts() { const [menuOpen, setMenuOpen] = useState(false); const [searchOpen, setSearchOpen] = useState(false); const hats = [ { id: 1, name: "Simple Hat", price: "$24.99", image: "s3.productx.hat.jpg", alt: "A simple classic hat in a neutral tone displayed on a flat surface" }, { id: 2, name: "New Hat", price: "$19.99", image: "s3.productx.hat-new.jpg", alt: "A modern-style hat with a minimalist logo on the front panel" }, ]; return ( Featured T-Shirts [ { id: 1, name: "Graphic Tee", price: "$24.99", image: "s3.productx.tshirt1.jpg", alt: "A white graphic t-shirt featuring bold abstract art" }, { id: 2, name: "Pocket Tee", price: "$19.99", image: "s3.productx.tshirt2.jpg", alt: "A soft grey t-shirt with a front chest pocket" }, { id: 3, name: "V-Neck Tee", price: "$22.99", image: "s3.productx.tshirt3.jpg", alt: "A navy V-neck t-shirt made from breathable cotton" }, { id: 4, name: "Crew Neck Tee", price: "$21.99", image: "s3.productx.tshirt4.jpg", alt: "A classic black crew neck t-shirt with a tailored fit" }, ].map((shirt) => ( {shirt.name} {shirt.price} Add to Cart )) {hats.map((hat) => ( ))} ); } ``` Additional Use Cases ASTs are versatile and can be applied to various other tasks in the realm of code manipulation: 1. Linting for LLMs: Aider’s blog post details how ASTs can be used to lint code generated by LLMs, ensuring it meets quality standards. 2. Code Maintenance and Manipulation at Scale: GritQL, a tool that uses ASTs under the hood, demonstrates how ASTs can streamline large-scale code maintenance and manipulation. By leveraging ASTs, Scale AI successfully addressed the issue of hallucinated image URLs, ensuring that the generated code is not only syntactically correct but also semantically meaningful and visually appealing. Industry Insights and Company Profile Industry experts commend the use of ASTs for their precision and efficiency in handling complex code structures. This approach is seen as a key method for improving the reliability of AI-generated code, especially as LLMs become more prevalent in development workflows. Scale AI, founded in 2016, has emerged as a leader in the data-labeling and annotation space, providing essential training data for AI and machine learning applications. The company’s strategic partnership and significant investment from Meta underscore the growing importance of high-quality data in the AI landscape. Meta’s investment will help Scale AI continue its growth and innovation, while also benefiting Meta’s ambitious AI projects. For more updates on the latest AI stories and developments, connect with us on LinkedIn and follow Zeniteq. Subscribe to our newsletter and YouTube channel to stay informed about the evolving world of generative AI.
