Strip out non-numeric characters in SELECT

10,716

Solution 1

You mentioned using a regular expression for this. It is true that Access' db engine doesn't support regular expressions directly. However, it seems you are willing to use a VBA user-defined function in your query ... and a UDF can use a regular expression approach. That approach should be simple, easy, and faster performing than iterating through each character of the input string and storing only those characters you want to keep in a new output string.

Public Function OnlyDigits(ByVal pInput As String) As String
    Static objRegExp As Object

    If objRegExp Is Nothing Then
        Set objRegExp = CreateObject("VBScript.RegExp")
        With objRegExp
            .Global = True
            .Pattern = "[^\d]"
        End With
    End If
    OnlyDigits = objRegExp.Replace(pInput, vbNullString)
End Function

Here is an example of that function in the Immediate window with "x" characters as proxies for your invisible characters. (Any characters not included in the "digits" character class will be discarded.)

? OnlyDigits("x1x23x")
123

If that is the output you want, just use the function in your query.

SELECT OnlyDigits(SomeCol) FROM SomeTable;

Solution 2

There is no RegEx in Access, at least not in SQL. If you venture to VBA, you might as well use a custom StripNonNumeric VBA function in the SQL statement.

e.g. SELECT StripNonNumeric(SomeCol) as SomeCol from SomeTable

Function StripNonNumeric(str)
      keep = "0123456789"
      outstr = ""
      For i = 1 to len(str)
          strChar = mid(str,i,1)
          If instr(keep,strChar) Then
              outstr = outstr & strChar
          End If
      Next
      StripNonNumeric = outstr
  End Function
Share:
10,716

Related videos on Youtube

Danny Beckett
Author by

Danny Beckett

I'm a PHP/JS developer & C#/XAML programmer from Liverpool . I lived in Amsterdam for a couple of years, and Oslo for a little while too. OS windows-server centos GUI c# .net wpf visual-studio Web php html5 css3 javascript jquery SQL mysql mssql t-sql oracle pl-sql sybase-asa sqlanywhere HTTPd apache iis

Updated on July 11, 2022

Comments

  • Danny Beckett
    Danny Beckett almost 2 years

    In an MS Access 2007 project report, I have the following (redacted) query:

    SELECT SomeCol FROM SomeTable
    

    The problem is, that SomeCol apparently contains some invisible characters. For example, I see one result returned as 123456 but SELECT LEN(SomeCol) returns 7. When I copy the result to Notepad++, it shows as ?123456.

    The column is set to TEXT. I have no control over this data type, so I can't change it.

    How can I modify my SELECT query to strip out anything non-numeric. I suspect RegEx is the way to go... alternatively, is there a CAST or CONVERT function?

  • Danny Beckett
    Danny Beckett over 11 years
    This is perfect! Just for anyone else who happens upon this question. Simply go to the Create tab, and where it says Macro, press the arrow and select Module. Copy this code into the new module, save and close. Then use as SELECT StripNonNumeric(SomeCol) FROM SomeTable. Very happy, thank you Richard!
  • Matt Donnan
    Matt Donnan over 11 years
    +1 As you say RegEx isn't available in SQL but can be used in Access VBA. As a slight change although I don't imagine it would either increase or decrease the run time, you could always drop the keep string and use the built in IsNumeric function instead.
  • Danny Beckett
    Danny Beckett over 11 years
    SELECT VAL(SomeCol) FROM SomeTable WHERE SomeCol = '123456' returns 0 (where there is some invisible character before the 1 - the WHERE by itself works)
  • Fionnuala
    Fionnuala over 11 years
    The first IIf(IsNumeric([atext] means that val is applied to the text if the text does not have an odd start character, val is applied to Mid([atext],2), that is, remove the start character, when the text does have an odd, non-numeric, start character. The above sql was tested.
  • Danny Beckett
    Danny Beckett over 11 years
    And if the character is not at position 1? What then? It might be in the middle of the string, e.g.
  • Danny Beckett
    Danny Beckett over 11 years
    After seeing your answer, I agree it's better, since it doesn't require every character to be looped through (useful when theres up to 9 characters per string, and 9,500 rows). Thanks!