pyspark regex string matching


Just translate your requirements instead of using dot-star soup, and add anchors:

# [a-zA-Z].T[0-9].[a-zA-Z]
mydf2 = mydf1.where(r'col1 rlike "^[a-zA-Z.]+\\.T[0-9]+\\.[a-zA-Z.]+$"')

See a demo on regex101.com.
Please note that I have also added the dot to the character classes (is this a requirement?); otherwise your second string won't be matched. If this is not what you want, remove it from the classes. Note that the backslashes are doubled because the Spark SQL parser consumes one level of escaping inside the string literal.
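The effect of including the dot in the character classes can be checked outside Spark with Python's `re` module, which uses the same syntax as the Java regex engine behind `rlike` for these constructs. This is a sketch; `loose` and `strict` are illustrative names:

```python
import re

samples = ["abc.T01.xyz", "abc.def.T01.xyz", "abc.def.ghi.xyz"]

# Pattern from the answer, with the dot included in the character classes:
# any number of dot-separated segments is allowed on each side of ".T<digits>."
loose = re.compile(r"^[a-zA-Z.]+\.T[0-9]+\.[a-zA-Z.]+$")
print([s for s in samples if loose.fullmatch(s)])
# ['abc.T01.xyz', 'abc.def.T01.xyz']

# Without the dot in the classes, only one segment is allowed on each side
strict = re.compile(r"^[a-zA-Z]+\.T[0-9]+\.[a-zA-Z]+$")
print([s for s in samples if strict.fullmatch(s)])
# ['abc.T01.xyz']
```

Pick whichever variant matches the requirement; the third sample string is rejected by both because it has no `.T<digits>.` segment.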

Author: akn (Senior Developer - BigData)

Updated on June 04, 2022

Comments

  • akn, almost 2 years ago

    I have strings in a dataframe in the following format:

    abc.T01.xyz
    abc.def.T01.xyz
    abc.def.ghi.xyz
    

    I need to filter the rows where the string matches this expression:

    [a-zA-Z].T[0-9].[a-zA-Z]
    

    I have used the following command, but it also returns strings of the form [a-zA-Z].[a-zA-Z].T[0-9].[a-zA-Z], which I don't want in my result.

    mydf2 = mydf1.where('col1 rlike ".*\.T.*\..*"')
    mydf2.show()
    

    I am missing something in my regex.
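    The over-matching described above can be reproduced with Python's `re` module: `.*` also matches dots, so it cannot restrict how many dot-separated segments appear around `.T<digits>.`. A minimal sketch:

```python
import re

# Unanchored pattern from the question; ".*" also matches dots,
# so extra dot-separated segments like "def." are absorbed.
pattern = re.compile(r".*\.T.*\..*")

wanted = bool(pattern.search("abc.T01.xyz"))        # matches, as intended
unwanted = bool(pattern.search("abc.def.T01.xyz"))  # also matches - the extra segment is absorbed
no_t = bool(pattern.search("abc.def.ghi.xyz"))      # no ".T" segment, so no match
print(wanted, unwanted, no_t)
```

This is why the accepted answer replaces `.*` with explicit character classes and adds `^`/`$` anchors.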