Predictive data: Revolutionizing preconstruction planning

By Sherman Wong

Preconstruction planning has been, and continues to be, one of the most challenging aspects of a building project’s life cycle. Design professionals often rely on yesterday’s data to plan tomorrow’s projects. However, historical data has proven unreliable because it neither accounts for present market conditions nor tracks the trends affecting costs. Nevertheless, architects and other design professionals are expected to provide a project budget and stick to it.

Thanks to modern data science and predictive analytics, those involved in the construction planning phases are now able to supplement historical data with reliable projections of future expenses. Predictive cost data was developed by using a hybrid methodology combining classical econometric techniques with contemporary data mining procedures to address the shortcomings of traditional forecast information.


Images courtesy RSMeans data

Until the economic crash of 2008, construction professionals often relied on historical prices and localization factors to provide reasonably accurate costs to build. While these expenses and factors are helpful when putting a budget together, stakeholders have increasingly voiced dissatisfaction with their accuracy (or lack thereof). Roughly 98 percent of construction projects go over budget. (For more information, read “98 Percent of Construction Projects Go Over Budget. These Robots Could Fix That” by Luke Dormehl in Digital Trends.) Further, market volatility and a shrinking construction labor pool have made it harder to rely on past data for budgetary purposes.

Volatility can be brought about by labor shortages, tariffs, and natural disasters. Another contributing factor is that the construction industry shows some of the lowest technology adoption rates.

Prior to 2008, projects moved forward without major concerns about volatile costs. During and following the recession, a large number of contractors were forced to leave the construction industry. When owners and builders were able to begin planning for regrowth, the construction labor force had been reduced by three-fifths.

Historical building costs and factors used in previous years became obsolete. More importantly, the concerns of boards of directors and investors about escalating prices grew sharply. This led to a higher standard of accountability for construction and design professionals to manage and adhere to forecasted budgets, as material, labor, and equipment rates account for 79 percent of total construction costs on average. (Calculated from historical RSMeans data.) Overhead and profit make up the remaining 21 percent, including workers’ compensation, state and federal unemployment costs, social security, and public liability expenses, as well as an estimated profit percentage for material and equipment for the installing contractor. There is a clear need for diligent management of construction material and labor costs.
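To put the 79/21 split in concrete terms, the short sketch below works through the arithmetic for a single project. The two shares come from the figures quoted above; the project budget, the 10 percent rise, and the assumption that overhead and profit stay fixed in dollar terms are all hypothetical and chosen only for illustration.

```python
# Average shares quoted in the article (from historical RSMeans data).
DIRECT_SHARE = 0.79           # material, labor, and equipment
OVERHEAD_PROFIT_SHARE = 0.21  # overhead and profit


def total_cost_increase(total_cost: float, direct_rise: float) -> float:
    """Added cost when direct costs rise by `direct_rise` (0.10 = 10 percent).

    Assumption (not from the article): overhead and profit stay fixed
    in dollar terms, so only the direct share of the budget moves.
    """
    return total_cost * DIRECT_SHARE * direct_rise


budget = 10_000_000.0  # hypothetical project budget
added = total_cost_increase(budget, 0.10)

print(f"Direct costs:        {budget * DIRECT_SHARE:,.0f}")
print(f"Overhead and profit: {budget * OVERHEAD_PROFIT_SHARE:,.0f}")
print(f"Added cost from a 10 percent rise in direct costs: {added:,.0f}")
print(f"Share of the original budget: {added / budget:.1%}")
```

Under these assumptions, most of any swing in material, labor, or equipment prices passes straight through to the total, which is why those categories draw the closest scrutiny.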

When current data is used at the capital planning stage, typically six to 24 months before construction starts, it becomes impossible to maintain an accurate estimate by the time the project breaks ground. Throughout the planning phase and all the way through construction, numerous unknowns could cause unforeseen cost increases. Material prices can fluctuate greatly year over year based on interactions of various commodities and sheer construction volume. Without a reliable method to keep track of all the moving parts, blown budgets, broken processes, and finger-pointing ensue. This can not only slow a project greatly but also bring it to a halt.
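A rough way to see how an estimate drifts over a six- to 24-month planning window is to compound an assumed escalation rate over the gap. The sketch below does only that; the 5 percent annual rate and the starting estimate are hypothetical placeholders, not figures from RSMeans data, and a steady rate is exactly the simplification that volatile markets tend to break.

```python
def escalate(current_cost: float, annual_rate: float, months: int) -> float:
    """Project a cost forward by compounding an annual escalation rate.

    `annual_rate` is a fraction (0.05 = 5 percent per year). The rate is
    assumed to be constant over the whole period, which is a simplification.
    """
    return current_cost * (1 + annual_rate) ** (months / 12)


estimate_today = 1_000_000.0  # hypothetical estimate at capital planning
assumed_rate = 0.05           # hypothetical annual escalation rate

for months in (6, 12, 24):
    projected = escalate(estimate_today, assumed_rate, months)
    gap = projected - estimate_today
    print(f"{months:>2} months out: {projected:,.0f} (gap of {gap:,.0f})")
```

Even a modest, steady rate opens a visible gap by groundbreaking; a sharp swing in a single commodity would widen it further.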


Traditional forecasting data, developed during a time of far less computing power and limited availability of ‘big data,’ simply does not meet today’s needs for accurate planning and budgeting. Traditional economic forecast methods do not predict market swings or sharp cost escalations well. Although based on econometric principles and modeling techniques, predictive cost data differs from traditional econometric forecasts in two ways.

First, traditional forecasts are based on macroeconomic theory, even when analysis of historical values of those economic indicators demonstrates them to be statistically insignificant predictors. Predictive cost models disregard theory altogether and are based exclusively on data-driven empirical evidence.

This evidence is the result of extensive exploratory data analysis and pattern-seeking visualizations of historical cost information with economic and market indicators. This approach, clearly an update to the centuries-old, theory-driven process, has been extensively researched and validated by Edward Leamer, professor of global economics and management at the University of California, Los Angeles (UCLA). (Read Macroeconomic Patterns and Stories by Edward E. Leamer, published in 2009 by Springer-Verlag.) Only economic indicators that have ‘proven themselves’ in exploratory analysis become candidates for model development, testing, validation, and resulting predictive cost estimates.
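The screening idea can be illustrated with a very small sketch: compare each candidate indicator against a historical cost series and keep only those that track it closely. This is not the RSMeans methodology. The indicator names and all numbers are invented, and a plain correlation threshold stands in for the formal significance testing a real model would need.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def screen_indicators(cost, indicators, threshold=0.7):
    """Keep indicators whose absolute correlation with cost meets the threshold.

    The threshold is a crude, hypothetical stand-in for a significance test.
    """
    return [
        name
        for name, series in indicators.items()
        if abs(pearson(series, cost)) >= threshold
    ]


# Invented data for illustration only.
cost_index = [100, 102, 105, 109, 114, 120]
candidates = {
    "steel_price": [50, 51, 53, 55, 58, 61],   # moves with the cost index
    "unrelated_series": [7, 3, 8, 2, 9, 4],    # no clear relationship
}

print(screen_indicators(cost_index, candidates))
```

An indicator that fails the screen is dropped no matter how plausible it sounds in theory, which is the distinction the paragraph above draws.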

Second, predictive cost data uses data mining techniques and principles to improve traditional econometric modeling practices. This family of processes and analyses has evolved since the 1990s from a mix of classic statistical principles and more contemporary computer science and machine learning methods.

Data mining methodology is specifically designed to analyze observational data instead of experimental information. A robust methodology, data mining takes advantage of recent increases in computing power, visualization techniques, and updated statistical procedures to find patterns and determine drivers of construction material and labor cost changes. Measures of these drivers and their relationships to each other and to construction costs, along with their associated lead or lag times, are represented in a statistical algorithm predicting future values for a defined material and location.
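The notion of a driver with a lead time can be sketched in a few lines: shift the driver series by different lags, keep the lag at which it lines up best with cost, then fit a straight line and project forward. This is a toy example under stated assumptions, not the algorithm behind predictive cost data. It uses one driver, a simple least-squares line, and invented numbers built so that cost follows the driver two periods later.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_pairs(driver, cost, lag):
    """Align driver[t - lag] with cost[t]; both series must be the same length."""
    return driver[: len(driver) - lag], cost[lag:]


def best_lead_time(driver, cost, max_lag):
    """Lag (in periods) at which the driver correlates most strongly with cost."""
    scores = {
        lag: abs(pearson(*lagged_pairs(driver, cost, lag)))
        for lag in range(max_lag + 1)
    }
    return max(scores, key=scores.get)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


# Invented series: from the third period on, cost = 2 * driver two periods earlier + 10.
driver = [1, 3, 2, 5, 4, 7, 6, 9, 8, 11]
cost = [13, 15, 12, 16, 14, 20, 18, 24, 22, 28]

lag = best_lead_time(driver, cost, max_lag=3)
slope, intercept = fit_line(*lagged_pairs(driver, cost, lag))

# The most recent `lag` driver values have no matching cost yet, so they
# give forecasts for the next `lag` periods.
forecasts = [intercept + slope * d for d in driver[len(driver) - lag:]]

print(f"lead time: {lag} periods")
print(f"forecasts for the next {lag} periods: {[round(f, 1) for f in forecasts]}")
```

The lead time is what makes a driver useful for planning: a driver that moves two periods ahead of cost gives two periods of warning. A production model would combine many drivers per material and location and validate them on held-out data.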
