Finding the Median Value for Fields in Access 2007
This content is outdated and is no longer being maintained. It is provided as a courtesy for individuals who are still using these technologies. This page may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.
Summary: Learn how to calculate the median value in Microsoft Office Access 2007 fields by importing a custom module that contains a DMedian function. (7 printed pages)
Applies to:2007 Microsoft Office system, Microsoft Office Access 2007
Adapted from Access Cookbook, 2nd Edition by Ken Getz, Paul Litwin, and Andy Baron. Copyright © 2004, O'Reilly Media, Inc. All rights reserved. Used with permission.
Ken Getz, MCW Technologies, LLC
Office Access 2007 does not provide a built-in function that calculates the median of a set of values, although if it did, the function might work something like the built-in DSum and DMax functions. This article demonstrates a median function, written in Visual Basic for Applications (VBA), that you can use in your own applications.
To try the function, first load the frmMedian form from the sample database, FindMedianValue.accdb. Choose a table and a field in that table by using the combo boxes on the form. After you choose a field, frmMedian uses the DMedian function to calculate the median value, and then displays that value in a text box. The form displays an error message if you choose a field with a nonnumeric data type, and displays the string "(Null)" if the median value is Null.
Follow these steps to use the DMedian function in your own applications:
Import the basMedian module from FindMedianValue.accdb into your database.
Use the following table to determine what syntax to use to call the DMedian function. (It is similar to the built-in DAvg function.)
Name of field for which to calculate median
Name of a table or query
Optional WHERE clause to limit the rows considered
"CategoryID = 1"
Make sure that each parameter is delimited with quotes. The third parameter is optional. For example, you might enter the following statement in the Immediate window. The function call should return 18, assuming that you use the data in the sample database, FindMedianValue.accdb.
This example uses the DAO type library, and you’ll need to include the reference to the most current version of DAO in your own applications to take advantage of this code. Use the References command in the Tools menu to add the necessary reference, if you want to use this code in your own database.
The DMedian function in basMedian in FindMedianValue.accdb looks and acts similar to the built-in DAvg domain function. The algorithm that calculates the median, however, is more complicated, and uses the following steps:
Sort the dataset on the specified field.
Find the middle row of the dataset and return the value of the field. If there are an odd number of rows, this will be the value in a single row. If there are an even number of rows, there is no middle row, so the function finds the mean of the values in the two rows that straddle the middle. You could modify the function to pick an existing value instead.
After it declares a few variables, the DMedian function creates a recordset that is based on the three parameters that are passed to it (strField, strDomain, and varCriteria), as shown in the following complete source code for the function.
Public Function DMedian( _ ByVal strField As String, ByVal strDomain As String, _ Optional ByVal strCriteria As String) As Variant ' Purpose: ' To calculate the median value ' for a field in a table or query. ' In: ' strField: the field. ' strDomain: the table or query. ' strCriteria: an optional WHERE clause to ' apply to the table or query. ' Out: ' Return value: the median, if successful; ' Otherwise, an Error value. Dim db As DAO.Database Dim rstDomain As DAO.Recordset Dim strSQL As String Dim varMedian As Variant Dim intFieldType As Integer Dim intRecords As Integer Const errAppTypeError = 3169 On Error GoTo HandleErr Set db = CurrentDb() ' Initialize return value. varMedian = Null ' Build SQL string for recordset. strSQL = "SELECT " & strField & " FROM " & strDomain ' Only use a WHERE clause if one is passed in. If Len(strCriteria) > 0 Then strSQL = strSQL & " WHERE " & strCriteria End If strSQL = strSQL & " ORDER BY " & strField Set rstDomain = db.OpenRecordset(strSQL, dbOpenSnapshot) ' Check the data type of the median field. intFieldType = rstDomain.Fields(strField).Type Select Case intFieldType Case dbByte, dbInteger, dbLong, _ dbCurrency, dbSingle, dbDouble, dbDate ' Numeric field. If Not rstDomain.EOF Then rstDomain.MoveLast intRecords = rstDomain.RecordCount ' Start from the first record. rstDomain.MoveFirst If (intRecords Mod 2) = 0 Then ' Even number of records. ' No middle record, so move to the ' record right before the middle. rstDomain.Move ((intRecords \ 2) - 1) varMedian = rstDomain.Fields(strField) ' Now move to the next record, the ' one right after the middle. rstDomain.MoveNext ' And average the two values. varMedian = _ (varMedian + rstDomain.Fields(strField)) / 2 ' Make sure you return a date, even when ' averaging two dates. If intFieldType = dbDate And Not IsNull(varMedian) Then varMedian = CDate(varMedian) End If Else ' Odd number or records. ' Move to the middle record and return its value. rstDomain.Move ((intRecords \ 2)) varMedian = rstDomain.Fields(strField) End If Else ' No records; return Null. varMedian = Null End If Case Else ' Non-numeric field; so raise an app error. Err.Raise errAppTypeError End Select DMedian = varMedian ExitHere: On Error Resume Next rstDomain.Close Set rstDomain = Nothing Exit Function HandleErr: ' Return an error value. DMedian = CVErr(Err.Number) Resume ExitHere End Function
Building the SQL string that defines the recordset is straightforward except for the construction of the optional WHERE clause. Because strCriteria is defined as an optional parameter (by using the Optional keyword), DMedian checks whether a value was passed by determining whether the string length is greater than zero.
' Build SQL string for recordset. strSQL = "SELECT " & strField & " FROM " & strDomain ' Only use a WHERE clause if one is passed in. If Len(strCriteria) > 0 Then strSQL = strSQL & " WHERE " & strCriteria End If strSQL = strSQL & " ORDER BY " & strField
Once DMedian builds the SQL string, it creates a recordset based on that SQL string.
Next, DMedian checks the data type of the field; it will calculate the median only for numeric and date/time fields. If you pass any other data type to DMedian, the function forces an error by calling the Err.Raise method and then uses the special CVErr function in its error handler to send the error state back to the calling procedure.
' Check the data type of the median field. intFieldType = rstDomain.Fields(strField).Type Select Case intFieldType Case dbByte, dbInteger, dbLong, dbCurrency, _ dbSingle, dbDouble, dbDate ' Code removed here... Case Else ' Non-numeric field; so raise an app error. Err.Raise errAppTypeError End Select ' Code removed here... ExitHere: On Error Resume Next rstDomain.Close Set rstDomain = Nothing Exit Function HandleErr: ' Return an error value. DMedian = CVErr(Err.Number) Resume ExitHere End Function
If the field is numeric, the DMedian function checks to see whether there are any rows in the recordset by using the following If...Then statement, and then returns Null if there are no rows.
' Numeric field. If Not rstDomain.EOF Then ' Code removed here... Else ' No records; return Null. varMedian = Null End If
If there are rows, the function moves to the end of the recordset to get a count of the total number of records. This is necessary because the RecordCount property returns only the number of rows that have already been visited. The following is the code.
If the number of records is even, DMedian moves to the record just before the middle by using the Move method, which can move an arbitrary number of records from the current record. The number of records to move forward is calculated by using the following formula.
This tells Access to divide the total number of records by 2 and then subtract 1 from the result (because you are starting from the first record). For example, if you are on the first of 500 records, you would move (500 \ 2 - 1) = (250 - 1) = 249 records forward, which would bring you to the 250th record. Once the function has moved the specified number of records, it is a simple matter to grab the value of the 250th and 251st records and divide the result by 2. The following code shows this part of the function.
' Start from the first record. rstDomain.MoveFirst If (intRecords Mod 2) = 0 Then ' Even number of records. No middle record, so move ' to the record right before the middle. rstDomain.Move ((intRecords \ 2) - 1) varMedian = rstDomain.Fields(strField) ' Now move to the next record, the one right after ' the middle. rstDomain.MoveNext ' Average the two values. varMedian = (varMedian + rstDomain.Fields(strField)) / 2
Because DMedian supports dates, the function uses the following code to make sure that it returns a date value when it takes the average of two dates.
' Make sure you return a date, even when ' averaging two dates. If intFieldType = dbDate And Not IsNull(varMedian) Then varMedian = CDate(varMedian) End If
The code for an odd number of rows is simpler, because the code merely needs to return the value of the middle row.
Else ' Odd number of records. Move to the middle record ' and return its value. rstDomain.Move ((intRecords \ 2)) varMedian = rstDomain.Fields(strField) End If
Finally, acbDMedian returns the median value to the calling procedure.
The median, like the average (or arithmetic mean), is known statistically as a measure of central tendency. In other words, both measures estimate the middle of a set of data. The mean represents the mathematical average value; the median represents the middle-most value. For many datasets, these two measures are the same or very close to each other. Sometimes, however, depending on how the data is distributed, the mean and median will report widely varying values. In these cases, many people favor the median as a better "average" than the mean.
Calculating the median requires sorting the dataset, so it can be rather slow on large datasets. In contrast, calculating the mean does not require a sort, and so will typically be faster.