Fine-Tuning Llama-3.1 Model to Predict Dutch Rental Property Values Based on Descriptions
Fine-Tuning a Large Language Model to Predict the Rental Value of a Dwelling

In the Netherlands, rental property values are regulated by a set of rules that determine the maximum allowable rent based on the dwelling's characteristics and quality. These rules are complex and can be found on the Huurcommissie's website. To simplify this process, we developed an open-source Python package called woningwaardering (which translates to "home valuation"). This tool uses a point system to assess a dwelling's value, assigning points based on its features and quality. The maximum rent for a property is directly linked to the total number of points it receives.

Ed Donner recently fine-tuned a Llama-3.1-8B model to predict Amazon product prices from their descriptions. His work was an inspiration for me to explore whether a similar approach could be applied to predict the points assigned to a dwelling based on its textual description, rather than using the traditional woningwaardering package.

Method

At Woonstad Rotterdam, the social housing organization where I work, we manage approximately 60,000 dwellings. We have implemented rigorous data quality checks to identify dwellings with near-perfect data records, leveraging our open-source pyspark-testframework package. This allowed us to compile a high-quality dataset of over 40,000 properties, each with detailed and accurate descriptions and their corresponding point values.

To train the model, we selected a large language model (LLM) that could handle the complexity of natural language processing required to interpret property descriptions. We chose to fine-tune a Llama-3.1-8B model, given its proven effectiveness in handling similar regression tasks. The training dataset consisted of property descriptions paired with their respective point values as calculated by the woningwaardering package.

Fine-Tuning Process

The fine-tuning process began by preprocessing the property descriptions to clean and standardize the text.
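
To make the cleaning and pairing step concrete, here is a minimal sketch of how a description could be standardized and combined with its point value into one training record. It is not the code used in the project: the prompt wording, the "prompt"/"completion" field names, the function names, and the sample description are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical prompt template; the wording used in the project is not given.
PROMPT_TEMPLATE = (
    "How many woningwaardering points does this dwelling receive?\n\n"
    "{description}\n\n"
    "Points: "
)


def clean_description(text: str) -> str:
    """Standardize a raw description: normalize Unicode, drop control
    characters, and collapse runs of whitespace into single spaces."""
    # NFC keeps characters such as the superscript in "m²" intact.
    text = unicodedata.normalize("NFC", text)
    # Replace control and format characters (Unicode categories C*) with spaces.
    text = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in text
    )
    # \s also matches non-breaking spaces in Python 3 string patterns.
    return re.sub(r"\s+", " ", text).strip()


def build_example(description: str, points: int) -> dict:
    """Pair a cleaned description with its point value (assumed record layout)."""
    prompt = PROMPT_TEMPLATE.format(description=clean_description(description))
    return {"prompt": prompt, "completion": str(points)}


# Invented sample input, only to show the shape of a record.
raw = "  Appartement\u00a0met 2   slaapkamers,\n\n65 m² woonoppervlak.\t "
example = build_example(raw, 142)
print(example["prompt"] + example["completion"])
```

Writing the target as a short numeric completion keeps the task close to ordinary text generation, so the predicted points can be read back by parsing the first number the model produces.
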
We then split the dataset into training, validation, and testing sets. The training set was used to teach the model to recognize patterns in the descriptions that correspond to the assigned point values. During the validation phase, we optimized the model's hyperparameters to improve its performance. Finally, the test set was used to evaluate the model's accuracy and reliability.

Results

After several iterations of fine-tuning, the model achieved impressive results. It demonstrated a high level of accuracy in predicting the point values for dwellings based on their descriptions. The model's predictions were compared against those generated by the woningwaardering package, and in many cases, the values were remarkably close. However, there were also instances where the model deviated significantly, which provided valuable insights into areas needing improvement.

One key advantage of using the LLM for rental value prediction is its ability to process and analyze large volumes of data quickly. This could save significant time and resources for organizations like Woonstad Rotterdam, which need to regularly assess and update rental values for a substantial number of properties. Additionally, the model's flexibility allows for continuous updates as new data becomes available, ensuring that predictions remain current and relevant.

Challenges and Limitations

Despite the promising results, several challenges and limitations emerged during the project. One of the primary issues was ensuring the diversity and representativeness of the training data. While our dataset was extensive, some rare or unique property features were underrepresented, leading to less accurate predictions for those types of dwellings. Future work will focus on expanding the dataset to include a wider range of property types and characteristics.

Another challenge was interpreting the model's decision-making process.
LLMs are often considered black boxes due to their complex internal structures, making it difficult to understand why certain predictions were made. This lack of transparency can be problematic for regulatory and auditing purposes. Techniques such as feature importance analysis and explanatory models will be explored to make the model's output more interpretable.

Conclusion

The experiment to fine-tune a large language model for predicting rental values based on property descriptions shows significant promise. It offers a faster and more flexible alternative to traditional methods, potentially saving time and resources for social housing organizations. While the model has achieved high accuracy in many cases, addressing the identified challenges will be crucial for its widespread adoption and reliability. Further research and development are needed to refine the model and ensure it meets the stringent requirements of the Dutch rental market.
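
For readers who want to reproduce the comparison described under Results, where model predictions are checked against the points computed by the woningwaardering package, the following sketch shows a few simple error metrics. It is illustrative only: the function name, the tolerance of 5 points, and the sample numbers are assumptions, not figures from the project.

```python
from statistics import mean


def evaluate(predicted, reference, tolerance=5):
    """Compare predicted points with reference points from the rule-based package.

    Returns the mean absolute error, the largest single error, and the share
    of dwellings whose prediction lies within `tolerance` points.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return {
        "mae": mean(errors),
        "max_error": max(errors),
        "within_tolerance": sum(e <= tolerance for e in errors) / len(errors),
    }


# Invented numbers, only to show how the metrics behave.
predicted = [140, 152, 98, 171]
reference = [142, 150, 120, 170]
metrics = evaluate(predicted, reference)
print(metrics)
```

Reporting the largest error next to the average matters here, because a few large deviations can hide behind a low mean, and those are the cases most relevant for a regulated maximum rent.
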
