## Multiple criteria decision analysis application in real estate

Lately, I’ve been toying around with the idea of buying a new house. I recently had a discussion with real estate agent and loan specialist, and have been compiling information on how to assess the affordability of a new home. Buying a property is something I have been putting off since I started working at CheckSum in December 2011, Retrospectively, I should have done so around 2014 when prices were significantly lower. I have a few incentives for wanting to move by this point, and I think it’s about time to see what is currently on the market. I started browsing the real estate registries in the area, and quickly became overwhelmed by the amount of options and multiple criteria which I would need to consider when choosing an ideal location. I had the idea to start a spreadsheet and somehow come up with a ranking system which scores various factors I considered important (What are the monthly payments? How many square feet? How long would it take to drive to work? … etc).

I went on to search the internet, and discovered that MCDA (multiple criteria decision analysis) is a real discipline, and various people had submitted Microsoft Excel templates for performing MCDA analysis, however I found the templates generally inadequate for my particular situation. Take for instance, this template (link) provided on timeatlas.com. This excel spreadsheet provides a macro interface for creating criteria, and assigning weight factors to those criteria. For instance, they provide an example MCDA application for evaluating web browsers in terms of plug-ins, support, documentation, and speed.

The interpretation of this chart is fairly straight-forward. You define criteria, assign each criterion a factor weight, and then record the data for a particular browser. At the end, the score for a web browser is computed by multiplying all the factor weights by the criteria recorded for that browser; and then summing up all the results to produce your final score.

This sort of analysis is severely limited, for instance, in the case where you want a property within a certain price range or number of square feet with maximum and minimum limits. For this sort of analysis, you would not want to use a linear metric as this example does, but rather a metric which scores highest when the value is within a certain range. For instance, if my criteria is to find a house with an area between 1,400 sq. ft. to 1,900 sq. ft., you would want the score for this criterion to be higher if it is within this range than if it is outside. For this, you would need to apply a non-linear equation to the data.

Being that the templates I found online were lacking and not suited for my purpose, I decided to create my own method for MCDA. For this, I considered three possibilities for non-linear equations: 1) exponential curves, 2) parabolic curves, and 3) bell curves. My goal with each of these methods was to create ways of computing formulas which accept a weight, low limits, and high limits as variables, then applies a mathematical equation to the data to assign a score. In the following sections, I discuss the three MCDA methods I came up with, and later provide my own Microsoft Excel MCDA template.

## Method 1. Exponential curves

An example exponential curve is the following equation y = 2x. This equation usually has a curve which explodes to infinity as x gets large, which is shown in the following example:

I found using exponential functions, you can meet the criteria by implementing a piece-wise exponential function. A function I came up with is the following:

In this equation, a is the ideal value for the criteria. The value p is the criterion weight, and k is a factor which determines how steeply the equation drops off.

Figure 3 was plot with the parameters k = 0.2, p = 2, a = 5. The equation produces a piece-wise graph of two exponential curves which intersect at coordinates xayp. The slopes of the exponential curves slightly before and after yp are ±k.

I wasn’t very fond of how sharply the equation peaked. In most cases, you would want a function which remained relatively flat between two limits. Through experimenting, I determined a way of combining piece-wise exponential in a way which allowed for no variance

In Equation 2, a is your minimum limit, and b is your maximum limit.

Figure 3 was plot with the parameters k = 0.1, p = 2, a = 5, and b = 10. The graph consists of two exponential curves with a flat area. Between x = a and x = b, the graph above shows that the value of the weight remains constant at p. For x slightly lower than a, the slope of the exponential line is k, and for x lightly larger than b, the slope of the exponential line is -k. For data that lands between a and b, the score remains at a constant 1. As x approaches ±∞, the value of y approaches 0.

## Method 2. Parabolic curves

A simple parabolic curve is the equation yx2.

The next idea was to use parabolas to define weight values. To do this, I came up with the following equation:

In this equation, a is the lower limit, b is the upper limit, and p is the criterion weight. Unlike the equations for the exponential curves, Equation 3 does not have an extra k parameter as in the exponential function in Equation 2, the rate at which the parabola slopes is entirely determined by the upper and lower limits. By taking the derivative, we find that when the curve crosses the axis, the slope is ±4p/(– a).

Equation 3 was plot using p = 2, a = 5, and b = 10. The result is an inverted parabola which crosses the x axis at xa and xb.  The slope of the parabola at these points is ±1.6. The parabola peaks at coordinates x = (ab)/2, yp (in this example, {x,y}={7.5,2}.

For data points which far exceed the optimal values, the weights tend towards positive or negative infinity. In some situations, you may want an exceptionally high score for one criterion to overshadow a bad score for another criterion. Unlike the exponential plot in Figure 4, the function goes off to infinity as x gets large. For that reason, I was less interested in this method, and chose to look for a different technique.

## Method 3. Bell curves

Bell curves, to include the famous Gaussian function, are very commonly used in statistical techniques, which makes them quite appealing for this application. In most applications, the functions use powers of e (2.7182818….), however for this case, we are using powers of 2 which are easier to visualize. The following is an example equation and plot of a simple bell curve:

Instead of using piece-wise exponential functions, and since the exponential functions I had come up with earlier seemed to resemble bell curves (see Figure 4), I thought I’d try to implement an MDCA scheme using various bell curves. After some experimentation, I came up wtih the following function:

In this equation, a is the lower limit, b is the upper limit, and p is the criterion weight. Like the parabolic function in Equation 3, the slope is computed by how wide the limits are spaced.

Equation 7 was plot using p = 2, a = 5, and b = 10. The resulting graph is a bell curve with a peak at coordinates =(b)/2, = p (in this example, {x,y}={7.5,2}). One useful feature of this graph is that at the coordinates xa and xb, the value of the function at these points is exactly one-half the value at the peak. Computing the slope of the curve at the limits can be a bit more tedious, since the derivative of Equation is quite complex. However, plugging in xa into the derivative function yields:

Evaluating the slope equation using the parameters used in Figure 7, the slope at xa is approximately 0.5545, which means that at the point xa, an decrease of 1 for the data point will lead to a score around 0.5 less.

I eventually settled on this method since the function results are bounded unlike the parabolic method, and it didn’t require peace-wise equations or an extra slope variable as did the exponential method.

## A new Microsoft Excel MCDA template

Using Method 3, I developed a new template for computing MCDA scores for properties. I implemented the MDCA scheme described in Method 3 into the housing data, played with the min and max values, weights, and tested the data against various different properties. Figure 9 shows some example data. The columns B through F contain the data I collected for various properties, and the columns H through L contain the MCDA weight scores for all of the data.

The fields in columns H though J starting in row 5 implement the bell curve equation shown in Equation 5. In rows 2 through 4 starting in column H, the user can enter the min (a), max (b), and weight (p), which are plugged into the equation.

When choosing values for the Min, Max, and Weight, consider that data points that are half-way between max and min will receive a score equal to the weight of that column; and data points equal to the max or min will receive half that amount. Therefore, if you you for instance want houses in 2013 to receive the most points, choose your max and min such that 2013 is half-way between (this may require choosing a max date which is some time in the future). You will want to space out the max and min enough so that the score for properties below the max and min doesn’t drop off too sharp. After plugging in data and playing with the numbers, you can eventually feel confident that the MCDA scheme is able to predict which properties you will likely be more interested in. Figure 10. Showing the field formulas implementing the bell curve equation in Equation 5. The Min, Max, and Weight fill into the equation as a, b, and p, respectively.

Notice in the cell field, I include dollar signs before the numbers in the references for the parameter fields. This dollar sign prevents the auto-increment feature in Excel from messing up the link if you copy the cell to another location in the table. The fields were designed so you should only need to fill in the equation for one cell, in this case, H5, and then the equation can be dragged down all columns and rows to copy the equation to adjacent cells.

After rows H though L are completed, the final MCDA score is computed by summing up all the scores for all criteria to the right.