Categorizing bank transactions in Excel

19,293

Solution 1

Introduction

As I indicated earlier the solution below is over-engineered for your immediate needs since

it is designed for my needs which are more varied than those you list in your question:

  • I am in the process of changing banks so I have two current (checking) accounts and two credit card accounts. I also have a number of savings accounts. I hold the electronic statements for all these accounts, which have different formats, across several workbooks.
  • Your example statement is very neat compared with those I receive. These are some descriptions from a recent MasterCard statement which I tidy to a preferred format of "Organisation Name, Location".

.

SAINSBURY'S S/MKT MONKS CROSS
Amazon *Mktplce EU-UK AMAZON.CO.UK LUX
WRAP LOUGHBOROUGH
SAINSBURYS PETROL MONKS CROSS
  • Like you I categorise transactions.
  • Some organisations bill annually in advance or quarterly in advance or arrears. My income varies from month to month. For these transactions, I apportion the amount over the appropriate months so the result better reflects my true financial situation.

My solution to these multiple requirements is to have a control routine for each account which knows where is it and what each column is used for. These call general routines which accept workbook, worksheet, etc as parameters and perform the necessary conversions and additions. At the heart of these conversions and additions is a worksheet I have called "Rules" which has three columns:

RuleType       A code such as "OrgCat" 
In-keyword     A string, such as "Starbucks", to be found in a text column
Out-keyword    A string, such as "Coffee", to be returned if the In-keyword
               is found

Other rule types that I use include:

"OrgOrg"   Convert an organisation name used in the source statement to my
           preferred name for the organisation.
"CatPer"   Return a code identifying the apportioning rule for a category. For
           example, "Utility" returns "B3" (Back 3) because my utility bills
           are issued for three months in arrears.

In your question, you have a "scenerio version" of your account and "need version" of your account. I assume you have manually created a "need version" of your account so you can see what it looks like. I have provided a macro, CopyFromAcctToRule(), that works down a "need version" of your Account, validating and extracting rules of type "OrgCat". If it finds no errors, it outputs the extracted rules to worksheet "Rule" and converts the "need version" to the "scenerio version". If you have not created a "need version", I suspect the easiest approach is to create a partial "need version" like this:

| A        | B             | C       | D           | E         |
================================================================
| Date     | Description   | Amount  | Category    | Rule      |
| 04/08/12 | Starbucks NYC |  -$5.42 | Coffee      | Starbucks |
| 04/09/12 | Wal-Mart 468  | -$54.32 | Supermarket | Wal-Mart  |
| 04/10/12 | Starbucks SF  |  -$3.68 |             |           |
| 04/11/12 | Wal-Mart 512  |-$123.45 |             |           |

That is, find the first Starbucks and fill in its Category and Rule; find the first Wal-Mart and fill in its Category and Rule; and so on. Run CopyFromAcctToRule() and it will display error messages in column "G" for inconsistencies and organisations you have missed. For one-offs, fill in the Category but leave the Rule empty. Repeat, fixing errors and running CopyFromAcctToRule() until it finds no errors and creates worksheet "Rule". Note: missing Categories will not be added at this stage; that happens below.

I have provided a macro, FillDerivedCol(), with a demonstration of how I use it by completing the Category column of a "scenerio version" account. If you do not want to create a partial "need version", FillDerivedCol() offers an alternative approach. If it cannot find a Category for a Description, it copies the Description to the bottom of the worksheet "Rule". For example, suppose you have misspelt the Rule against Starbucks, "Rule" would be amended to:

| A        | B             | C            |
===========================================
| Type     | In keyword    | Out keyword  |
| OrgCat   | Sarbucks      | Coffee       |
| OrgCat   | Wal-Mart      | Supermarket  |
| OrgCat   | Starbucks NYC |              |
| OrgCat   | Starbucks SF  |              |              

That is, there will be one new row for each branch of Starbucks. Here, the easiest approach is to correct the Sarbucks row and delete the new rows. However, if it was a new organisation, you could edit the In-keyword to remove the branch information and enter the Category in the Out-keyword column. Warning: I was over the limit of 30,000 characters for an answer. I have had to edit these routines to remove diagnostic code. I hope I have not introduced any errors while doing this.

I hope this is useful. Best of luck.

Global

These global constants and routine are used by both the macros mentioned above. I place them in their own module but that is your choice.

Option Explicit
  ' I use constant for objects such as column numbers which are fixed
  ' for long periods but which might change. Any code using a column
  ' that has moved can be updated by changing the constant.
  Public Const ColRuleType As Long = 1
  Public Const ColRuleKeywordIn As Long = 2
  Public Const ColRuleKeywordOut As Long = 3
  Public Const ColRuleLast As Long = 3
  Public Const RowRuleDataFirst As Long = 2

  ' Rules are accumulated in this array by CopyFromAcctToRule
  ' Rules are loaded to this array by UpdateNewTransactions
  ' See GetRuleDetails() for a description of this array.
  Public RuleData() As Variant
Public Sub GetRuleDetails(ByVal RuleType As String, ByVal SrcText As String, _
                          ByRef KeywordIn As String, ByRef KeywordOut As String, _
                          Optional ByRef RowRuleSrc As Long)

  ' This routine performs a case-insensive search of a list of in-keywords for
  ' one that is present in SrcText.  If one is found, it returns the in-keyword
  ' and the matching out-keyword.

  ' This routine uses the previously prepared array RuleData.  Since RuleData
  ' is to be loaded to, or has been loaded from, a worksheet, the first
  ' dimension is for the rows and the second dimension is for the columns.

  ' RuleData has three columns:
  '  * RuleType: a code identifying a type of rule.  Only rows in RuleData for
  '    which this column matches the parameter RuleType will be considered.
  '  * KeywordIn: a string.  The first row in RuleData where the value of this
  '    column is contained within parameter SrcText is the selected Rule.
  '  * KeywordOut: a string.

  ' Input parameters
  '  * RuleType: Foe example, the rule type "OrgCat" will return a
  '    category for an organisation.
  '  * SrcText: The text field to be searched for the in keyword.

  ' Output parameters
  '  * KeywordIn: The value from the KeywordIn column of RuleData for the first
  '    row of RuleData of the required RuleType for which the KeywordIn value can
  '    be found in Desc.  The value in SrcText may be of any case although it is
  '    likely to be capitalised.  This value is the preferred display value.
  '  * KeywordOut: The value from the KeywordOut column of RuleData of the
  '    selected row.  For this routine, KeywordOut is a string with no
  '    significance.  It is the calling routine that understands the rule type.
  '  * RowRuleSrc: Only used during build of RuleData so the caller can access
  '    non-standard data held in RuleData during build.

  Dim LCSrcText As String
  Dim RowRuleCrnt As Long

  LCSrcText = LCase(SrcText)
  For RowRuleCrnt = RowRuleDataFirst To UBound(RuleData, 1)
    If RuleData(RowRuleCrnt, ColRuleKeywordIn) = "" Then
      ' Empty row.  This indicated end of table during build
      KeywordIn = ""
      KeywordOut = ""
      Exit Sub
    End If
    If RuleType = RuleData(RowRuleCrnt, ColRuleType) Then
      ' This row is for the required type of rule
      If InStr(1, LCSrcText, _
                  LCase(RuleData(RowRuleCrnt, ColRuleKeywordIn))) <> 0 Then
        ' Have found first rule with KeywordIn contained within SrcText
        KeywordIn = RuleData(RowRuleCrnt, ColRuleKeywordIn)
        KeywordOut = RuleData(RowRuleCrnt, ColRuleKeywordOut)
        If Not IsEmpty(RowRuleSrc) Then
          RowRuleSrc = RowRuleCrnt
        End If
        Exit Sub
      End If
    End If
  Next
  ' No rule found
  KeywordIn = ""
  KeywordOut = ""

End Sub

Extract rules and convert account from Need to Scenerio style

See the Introduction for details of how I would use this routine. Once you have built worksheet "Rules" for the existing transactions, this code will probably be of no further value. I would place it in its own module so it can be archived and deleted after use. This code assumes the worksheets "Rule" and "Matt's Acct" are in the same workbook. I suggest you make a copy of your account, create worksheet "Rule" and then run CallCopyFromAcctRule() on the copy account and assess the result. Warning: you use "rule" where I use "in-keyword"; I have tried to be consistent in my comments and error messages but cannot guarantee that I have.

Option Explicit
Sub CallCopyFromAcctRule()

  ' This routine exists simply to make it easy to change the names of the
  ' worksheets accessed by CallCopyFromAcctRule.

  Call CopyFromAcctToRule("Rule", "Matt's Acct")

End Sub
Sub CopyFromAcctToRule(ByVal Rule As String, ByVal Acct As String)

  ' * This routine builds the worksheet Rule from worksheet Acct.
  ' * It works down worksheet Acct extracting rules from rows where
  '   there is both a Rule and a Category.  Note: this routine does not
  '   distinguish between Manual and Automatic Categories although, if both are
  '   present, they must be the same.
  ' * The routine checks for a variety of error and possible error conditions.
  '   Error and warning messages are placed in columns defined by ColAcctError
  '   and ColAcctWarn.
  ' * If any errors are found, the routine does not change either worksheet
  '   Acct, apart from adding error messages, or worksheet Rule.
  ' * If no errors are found, worksheet Rule is cleared and the contents of
  '   RuleData written to it.
  ' * If no errors are found, any warning added to worksheet Acct are discarded
  '   and the following additional changes made:
  '    * The values in the Automatic category column are merged into the Manual
  '      category column which is relabelled "Category".
  '    * The Rule and Automatic category columns are cleared.

  Dim ColAcctCatAuto As Long
  Dim ColAcctCatMan As Long
  Dim ColAcctCrnt As Long
  Dim ColAcctDesc As Long
  Dim ColAcctError As Long
  Dim ColAcctRule As Long
  Dim ColAcctWarn As Long
  Dim ColRuleRowSrc As Long
  Dim DescCrnt As String
  Dim ErrorFoundAll As Boolean
  Dim ErrorFoundCrnt As Boolean
  Dim KeywordInCrnt As String
  Dim KeywordInRetn As String
  Dim KeywordOutCrnt As String
  Dim KeywordOutRetn As String
  Dim RowAcctCrnt As Long
  Dim RowAcctDataFirst As Long
  Dim RowAcctLast As Long
  Dim RowRuleCrntMax As Long
  Dim RowRuleSrc As Long

  ' These column values must be changed if the true value do not match those
  ' in the example in the question.
  ColAcctDesc = 2
  ColAcctCatMan = 4
  ColAcctRule = 5
  ColAcctCatAuto = 6
  ColAcctError = 8
  ColAcctWarn = 9
  ColRuleRowSrc = ColRuleLast + 1
  RowAcctDataFirst = 2

  With Worksheets(Acct)
    RowAcctLast = .Cells.SpecialCells(xlCellTypeLastCell).Row

    ' Size the array for the output data ready to be loaded to worksheet
    ' Rule with rows as the first dimension.  Allow for the maximum number of
    ' rows because an array cannot be resized to change the number of
    ' elements in the first dimension.  Allow an extra column for use during
    ' the build process.
    ReDim RuleData(1 To RowAcctLast, 1 To ColRuleRowSrc)
    RuleData(1, ColRuleType) = "Type"
    RuleData(1, ColRuleKeywordIn) = "In keyword"
    RuleData(1, ColRuleKeywordOut) = "Out keyword"
    RowRuleCrntMax = 1      ' Last currently used row

    With .Cells(1, ColAcctError)
      .Value = "Error"
      .Font.Bold = True
    End With
    With .Cells(1, ColAcctWarn)
      .Value = "Warning"
      .Font.Bold = True
    End With

    ErrorFoundAll = False
    For RowAcctCrnt = RowAcctDataFirst To RowAcctLast
      .Cells(RowAcctCrnt, ColAcctError).Value = ""  ' Clear any error or warning
      .Cells(RowAcctCrnt, ColAcctWarn).Value = ""   ' from previous run
      ErrorFoundCrnt = False
      ' Determine Category, if any
      If .Cells(RowAcctCrnt, ColAcctCatMan).Value = "" Then
        ' There is no manual category.
        If .Cells(RowAcctCrnt, ColAcctCatAuto).Value <> "" Then
          KeywordOutCrnt = .Cells(RowAcctCrnt, ColAcctCatAuto).Value
        Else
          ' Neither manual nor automatic category
          KeywordOutCrnt = ""
        End If
      Else
        ' There is a manual category.  Is it consistent with automatic category?
        KeywordOutCrnt = .Cells(RowAcctCrnt, ColAcctCatMan).Value
        If .Cells(RowAcctCrnt, ColAcctCatAuto).Value <> "" Then
          ' Automatic category exists.  It must be the same
          ' as the manual category to be valid.
          If LCase(KeywordOutCrnt) <> _
                             LCase(.Cells(RowAcctCrnt, ColAcctCatAuto).Value) Then
            ErrorFoundCrnt = True
            .Cells(RowAcctCrnt, ColAcctError).Value = _
                                       "Manual and automatic categories different"
          End If
        End If
      End If
      If Not ErrorFoundCrnt Then
        ' Match Rule, if any, against Category, if any
        KeywordInCrnt = .Cells(RowAcctCrnt, ColAcctRule).Value
        If KeywordInCrnt <> "" Then
          ' This row has keyword
          If KeywordOutCrnt = "" Then
            ' Rule but no Category
            DescCrnt = .Cells(RowAcctCrnt, ColAcctDesc).Value
            Call GetRuleDetails("OrgCat", DescCrnt, KeywordInRetn, KeywordOutRetn)
            If KeywordInRetn <> "" Then
              ' Rule found that would generate a category for this Keyword.
              ' No warning necessary
            Else
              ' No rule found that would generate a category for this keyword
              ErrorFoundCrnt = True
              .Cells(RowAcctCrnt, ColAcctError).Value = _
                            "There is no existing rule that would " & _
                            "generate a Category from this Rule"
            End If
          Else
            ' Both Rule and Category found
            ' Is match already recorded?
            DescCrnt = .Cells(RowAcctCrnt, ColAcctDesc).Value
            Call GetRuleDetails("OrgCat", DescCrnt, KeywordInRetn, _
                                                   KeywordOutRetn, RowRuleSrc)
            If KeywordInRetn <> "" Then
              If KeywordInCrnt <> KeywordInRetn Then
                ' A different rule would be applied to this Description
                If InStr(1, LCase(DescCrnt), LCase(KeywordInCrnt)) = 0 Then
                  ' The current Rule is not within the Description
                  ErrorFoundCrnt = True
                  .Cells(RowAcctCrnt, ColAcctError).Value = _
                      "The Rule in column " & Chr(64 + ColAcctRule) & _
                      " is not within the Description.  The Rule " & _
                      "from row " & RowRuleSrc & " would generate " & _
                      "the required Category '" & KeywordOutRetn & _
                      "' from this Description"
                Else
                  ' The current Rule is within the Description
                  If LCase(KeywordOutRetn) = LCase(KeywordOutCrnt) Then
                    ' It would generate the same category
                    ErrorFoundCrnt = True
                    .Cells(RowAcctCrnt, ColAcctError).Value = _
                      "The Rule in column " & Chr(64 + ColAcctRule) & _
                      " is within the Description but the Rule from " & _
                      "row " & RowRuleSrc & " would be selected to " & _
                      "generate the required Category '" & _
                      KeywordOutRetn & "' from this Description"
                  Else
                    ' It would generate a different category
                    ErrorFoundCrnt = True
                    .Cells(RowAcctCrnt, ColAcctError).Value = _
                      "The Rule in column " & Chr(64 + ColAcctRule) & _
                      " is within the Description but the Rule from " & _
                      "row " & RowRuleSrc & " would be selected to " & _
                      "generate Category '" & KeywordOutRetn & _
                      "', not Category '" & KeywordOutCrnt & _
                      "', from this " & "Description"
                  End If
                End If
              Else
                ' Rule already recorded
                If LCase(KeywordOutRetn) = LCase(KeywordOutCrnt) Then
                  ' Rule already recorded for this category. No action required.
                Else
                  ' Rule already recorded but not for this category
                  ErrorFoundCrnt = True
                  .Cells(RowAcctCrnt, ColAcctError).Value = _
                                "The rule from row " & RowRuleSrc & _
                                " would generate category """ & _
                                KeywordOutRetn & """ for this Rule"
                End If
              End If
            Else
              ' New rule
              RowRuleCrntMax = RowRuleCrntMax + 1
              RuleData(RowRuleCrntMax, ColRuleType) = "OrgCat"
              RuleData(RowRuleCrntMax, ColRuleKeywordOut) = KeywordOutCrnt
              RuleData(RowRuleCrntMax, ColRuleKeywordIn) = KeywordInCrnt
              RuleData(RowRuleCrntMax, ColRuleRowSrc) = RowAcctCrnt
            End If
          End If  ' If CatCrnt = ""
        Else
          ' No keyword
          If KeywordOutCrnt = "" Then
            ' No Keyword and no Category
            DescCrnt = .Cells(RowAcctCrnt, ColAcctDesc).Value
            If DescCrnt = "" Then
              ' Probably a blank line.  Ignore
            Else
              ' Would an existing rule generate a Category for Description
              Call GetRuleDetails("OrgCat", DescCrnt, KeywordInRetn, KeywordOutRetn)
              If KeywordInRetn = "" Then
                ' No rule found that would generate a category
                ' for this description
                .Cells(RowAcctCrnt, ColAcctError).Value = _
                          "There is no rule that would generate " & _
                          "a Category from this Description"
              Else
                ' Rule found that would generate a category for
                ' this description.
              End If
            End If
          Else
            ' No Keyword but have Category
            ' Check for a rule that would give current category
            ' from current description
            DescCrnt = .Cells(RowAcctCrnt, ColAcctDesc).Value
            Call GetRuleDetails("OrgCat", DescCrnt, KeywordInRetn, _
                                                   KeywordOutRetn, RowRuleSrc)
            If KeywordInRetn <> "" Then
              ' Have found a rule for the description
              If LCase(KeywordOutRetn) = LCase(KeywordOutCrnt) Then
                ' Rule generates current category
              Else
                ' Rule does not generate current category
                ErrorFoundCrnt = True
                .Cells(RowAcctCrnt, ColAcctError).Value = _
                     "The rule from row " & RuleData(RowRuleSrc, ColRuleRowSrc) & _
                     " would generate Category '" & KeywordOutRetn & _
                     "' from this Description"
              End If
            Else
              ' There is no rule for this Description.  This is not necessarily
              ' an error.  The category may have to be set manually.
              .Cells(RowAcctCrnt, ColAcctWarn).Value = _
                            "There is no rule that would generate " & _
                            "this Category from this Description"
            End If
          End If    ' If KeywordOutCrnt = ""
        End If      ' KeywordInCrnt <> ""
      End If  ' If Not ErrorFoundCrnt
      If ErrorFoundCrnt Then
        ErrorFoundAll = True
      End If
    Next
  End With

  If ErrorFoundAll Then
     Exit Sub
  End If

  ' No errors found

  ' Clear existing contents from worksheet Rule and load with RuleData
  With Worksheets(Rule)
    .Cells.EntireRow.Delete
    .Range(.Cells(1, 1), .Cells(RowRuleCrntMax, _
                                        ColRuleKeywordOut)).Value = RuleData
    .Range("A1:C1").Font.Bold = True
    .Columns.AutoFit
  End With

  With Worksheets(Acct)
    ' Merge values from automatic category column into manual category column
    For RowAcctCrnt = 2 To RowAcctLast
      If .Cells(RowAcctCrnt, ColAcctCatMan).Value = "" Then
        ' There is no manual category so set to automatic category.
        .Cells(RowAcctCrnt, ColAcctCatMan).Value = _
                                    .Cells(RowAcctCrnt, ColAcctCatAuto).Value
      End If
    Next
    ' Clear automatic category
    .Columns(ColAcctCatAuto).ClearContents
    ' Change column heading
    With .Cells(1, ColAcctCatMan)
      .Value = "Category"
      .Font.Bold = True
    End With
    ' Clear Error and Warning columns
    .Columns(ColAcctError).ClearContents   ' Only heading to clear
    .Columns(ColAcctWarn).ClearContents
    ' Clear Rule column
    .Columns(ColAcctRule).ClearContents
  End With

End Sub

Completing the Category column of your scenerio version account

This demonstrates how I fill in the Category column for new transactions.

Option Explicit
Sub CallFillDerivedCol()

  ' I use FillDerivedCol() on worksheets loaded with transactions for different
  ' accounts.  They are in different workbooks, different worksheets and have
  ' different columns.  This routine exists to call FillDerivedCol() for my
  ' test version of your account

  Call FillDerivedCol(ActiveWorkbook, "Rule", _
                      ActiveWorkbook, "Matt's Acct", "OrgCat", 2, 4)

  ' For this example, I had the rules and the account in same workbook.  To
  ' have them in different workbooks, as I normally do, you will need something
  ' like:

  '  Dim PathCrnt As String
  '  Dim WBookOrig As Workbook
  '  Dim WBookOther As Workbook

  '  Set WBookOrig = ActiveWorkbook
  '  PathCrnt = ActiveWorkbook.Path & "\"
  '  Set WBookOther = Workbooks.Open(PathCrnt & "xxxxxxx")

  '  Call FillDerivedCol(WBookOrig, "Rule", _
  '                      WBookOther, "Matt's Acct", "OrgCat", 2, 4)

  '  WBookOther.Close SaveChanges:=True

End Sub
Sub FillDerivedCol(ByVal WBookRule As Workbook, ByVal WSheetRule As String, _
                   ByVal WBookTrans As Workbook, ByVal WSheetTrans As String, _
                   ByVal RuleType As String, _
                   ByVal ColSrc As Long, ByVal ColDest As Long)

  ' Fill any gaps in WBookTrans.Worksheets(WSheetTrans).Columns(ColDest) based on
  ' rules in worksheet WBookRule.Worksheets(WSheetRule).

  ' WBook.Worksheets(WSheetTrans).Columns(ColSrc) is a text field which
  ' contains in-keywords.  Rules of type RuleType convert in-keywords to
  ' out-keywords which are the values required for .Columns(ColDest).

  Dim CellEmptyDest As Range
  Dim KeywordIn As String
  Dim KeywordOut As String
  Dim MissingRule() As Variant
  Dim RowAcctCrnt As Long
  Dim RowAcctPrev As Long
  Dim RowMissingCrntMax As Long
  Dim RowRuleLast As Long

  ' Load array RuleData from worksheet Rule
  With WBookRule.Worksheets(WSheetRule)
    RowRuleLast = .Cells(Rows.Count, 1).End(xlUp).Row
    RuleData = .Range(.Cells(1, 1), .Cells(RowRuleLast, ColRuleLast)).Value
  End With

  ' * Prepare MissingRule() in case any calls to GetRuleDetails() fails to
  '   find a known in-keyword in WBook.Worksheets(WSheetName).Columns(ColDest).
  ' * The number of occurrences of the first dimension cannot be changed. 500
  '   is intended to be more occurrences than could possible be needed. If
  '   more than 500 missing rules are found, only the first 500 will be added
  '   to worksheet "Rule"  This routine can be immediately run again to add
  '   another 500 missing rules.
  ReDim MissingRule(1 To 500, 1 To ColRuleLast)
  RowMissingCrntMax = 0

  With WBookTrans
    With .Worksheets(WSheetTrans)
      RowAcctPrev = 1
      ' Find the next empty cell in column ColDest for a transaction row
      Set CellEmptyDest = .Columns(ColDest).Find(What:="", _
                      After:=.Cells(RowAcctPrev, ColDest), LookIn:=xlFormulas, _
                      LookAt:=xlWhole, SearchOrder:=xlByRows, _
                      SearchDirection:=xlNext, MatchCase:=False, _
                      SearchFormat:=False)
      Do While True
        If CellEmptyDest Is Nothing Then
          ' No empty cell found in column.  This is not a realistic situation
          ' because it would require every row in the worksheet to have a value.
          Exit Do
        End If
        RowAcctCrnt = CellEmptyDest.Row
        If RowAcctCrnt < RowAcctPrev Then
          ' Have looped back to the top.  This is not a realistic situation
          ' because it would require every row in the worksheet to have a value.
          Exit Do
        End If
        If .Cells(RowAcctCrnt, ColSrc).Value = "" Then
          ' This row has no value in either the source or the destination
          ' columns.  Assume all transactions finished
          Exit Do
        End If
        Call GetRuleDetails(RuleType, .Cells(RowAcctCrnt, ColSrc).Value, _
                                                          KeywordIn, KeywordOut)
        If KeywordIn = "" Then
          ' No in-keyword found within source column.  Add source column value
          ' to MissingData for user to edit.
          If RowMissingCrntMax >= UBound(MissingRule, 1) Then
            ' All available rows in MissingRule already used
          Else
            RowMissingCrntMax = RowMissingCrntMax + 1
            MissingRule(RowMissingCrntMax, ColRuleType) = RuleType
            MissingRule(RowMissingCrntMax, ColRuleKeywordIn) = _
                                              .Cells(RowAcctCrnt, ColSrc).Value
          End If
        Else
          .Cells(RowAcctCrnt, ColDest).Value = KeywordOut
        End If
        RowAcctPrev = RowAcctCrnt
        Set CellEmptyDest = .Columns(ColDest).FindNext(CellEmptyDest)
      Loop
    End With
  End With

  If RowMissingCrntMax > 0 Then
    ' Transactions found for which no rule exists.  Add to worksheet "Rule"
    ' for attention by the user.
    With WBookRule.Worksheets(WSheetRule)
      RowRuleLast = .Cells(Rows.Count, 1).End(xlUp).Row
      .Range(.Cells(RowRuleLast + 1, 1), _
             .Cells(RowRuleLast + RowMissingCrntMax, ColRuleLast)).Value _
                                                                 = MissingRule
    End With
  End If

End Sub

Solution 2

I was also looking for an auto-categorization process. The options above seem really powerful but way more complicated than what I wanted.

My idea is simple: develop a set of categorization rules based on keywords. If a keyword is found on the description, the rule is applied and the category set. Not happy with the idea of using VBA or PowerShell, kept looking around and found the following post:

how-to-group-excel-items-based-on-custom-rules by John Bustos (please credit him)

John's solution uses a very simple approach:

  1. Rules are defined in two columns (Keyword - Category) - if we assume that they are in columns F and G:

    Column F     Column G
    Keyword      Category
    Starbucks    Coffee shop
    Wal-Mart     Supermarket
    Safeway      Supermarket
    In-N-Out     Fast Food
    Comcast      Internet Service
    Verizon      Mobile Phone Service
    
  2. Then add this ARRAY formula to the cell where you want to insert the category pointing to the cell that you want to check for the rule (let's assume is cell A2):

    =IFERROR(INDEX(G$2:G$7,MATCH(TRUE,ISNUMBER(SEARCH(F$2:F$7,A2)),0)),"Other")
    

    Remember to use CTRL+SHIFT+ENTER to make sure that it goes in as an array formula. If you have more rules, you will need to change the range height. Afterwards you can simply populate down the formula to all the rows that you need to categorize. Also, the categorization uses the first rule and sticks to that, so if you have two different keywords present in one of the target cells, the first keyword categorization rule will be applied. The rules have to be created manually, when a cell shows "Other" it means that there are no keywords found.

Finally, the credit goes to John Bustos, he is the one that provided the solution here. I found his solution to be simple and extremely easy to implement so I wanted to include it here because searching by "auto categorization in excel" did not come up with it immediately. I had to try other search words.

Share:
19,293
Matt
Author by

Matt

Updated on June 05, 2022

Comments

  • Matt
    Matt almost 2 years

    I have an exported list of bank transactions in Excel, which I'd like to categorize as quickly and simply as possible. I suspect this is possible just with array formulas, but a VBA function would be equally useful.

    The Scenario

    My spreadsheet header looks like this:

    | A    | B           | C      | D        |
    ==========================================
    | Date | Description | Amount | Category |
    ------------------------------------------
    

    Date, Description and Amount come pre-populated from my bank. I fill in the category for each transaction in column D.

    This is fine, but time-consuming, because each category has to be entered individually and manually.

    The Need

    I want categories to auto-populate in rows where I haven't entered a manual category, based on rules that are dynamically generated and applied.

    The output I want looks like this:

    | A        | B             | C       | D           | E         | F           |
    ==============================================================================
    | Date     | Description   | Amount  | Manual cat. | Rule      | Auto cat.   |
    ------------------------------------------------------------------------------
    | 04/08/12 | Starbucks NYC |  -$5.42 | Coffee      | starbucks | Coffee      |
    | 04/09/12 | Wal-Mart 468  | -$54.32 | Supermarket | wal-mart  | Supermarket |
    | 04/10/12 | Starbucks SF  |  -$3.68 |             |           | Starbucks   |
    

    As you can see, I have entered 'manual' categories in Column D. Wherever I have done this, I have entered a categorization 'rule' in Column E. Excel is then using my entries to auto-populate Column F,

    The logic is simple:

    • Where I enter a manual category, Excel does two things:

      1. Populates my manual category in Column F.
      2. Create a rule using the text entered in Column E.
    • Where Excel comes across a description that contains text used in one of my rules, it populates the relevant category in Column F.

    The Benefit

    This would make it very simple to whiz down the list of transactions, adding categories and associated rules. Transactions without a category would be auto-populated where a rule applies, and left blank where a rule does not apply. Transactions where a rule has been applied giving the wrong category could be corrected, and a new rule supplied.

    My best attempt so far

    I have created a way of doing this using only formulas, but it has three drawbacks:

    1. It necessitates the creation of as many columns as transactions.
    2. There is no convenient way of listing the categories and associated rules.
    3. There is no way of changing the order the rules are applied in.
    • brettdj
      brettdj about 12 years
      This is off topic for Stack Overflow. As a non programming question it is better suited for Super User
    • Matt
      Matt about 12 years
      I have amended the question to remove the focus on a formula-only approach. VBA solutions gratefully accepted.
    • Tony Dallimore
      Tony Dallimore about 12 years
      My system for handling and merging the statements for my various accounts is old and creaking at the seams. I have decided to use your question as an excuse to replace it. It will be a day or two before I have finished because I have other tasks. Someone else may produce a satisfactory solution earlier but if not I will have something for you later this week.
  • Matt
    Matt about 12 years
    Thanks for your answer. I'm really looking for something that automatically creates and applies the rules based on a part of the cell.
  • Tony Dallimore
    Tony Dallimore about 12 years
    Do you mean that you want "starbucks" anywhere in the description to result in the category "coffee shop"? That is what my system does but I do not know how to do it with formulae. I can probably offer you a VBA solution. Perhaps some one else can offer you a more sophisticated formula based solution.
  • assylias
    assylias about 12 years
    I might use that one day ;) Just one comment: I prefer using enum vs. constants for columns, as they provide 3 things: namespace, auto increment, ease to insert a new column without having to change anything.
  • Tony Dallimore
    Tony Dallimore about 12 years
    @assylias. I had not thought of using enums in that way. Thanks for the suggestion.