Regex capitalize first letter every word, also after a special character like a dash

94,556

Solution 1

+1 for word boundaries. Here is a comparable JavaScript solution, which also accounts for possessives:

var re = /(\b[a-z](?!\s))/g;
var s = "fort collins, croton-on-hudson, harper's ferry, coeur d'alene, o'fallon"; 
s = s.replace(re, function(x){return x.toUpperCase();});
console.log(s); // "Fort Collins, Croton-On-Hudson, Harper's Ferry, Coeur D'Alene, O'Fallon"

Solution 2

A simple solution is to use word boundaries:

#\b[a-z0-9-_]+#i

Alternatively, you can match words only when they follow one of a few specific characters (whitespace, a dash, an underscore) or the start of the string:

#([\s\-_]|^)([a-z0-9-_]+)#i
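Both patterns above only find the words; the capitalization itself has to happen in a replacement step. A minimal JavaScript sketch of the second pattern follows. The function name `capitalizeWords` is made up for illustration, and it assumes that `-` and `_` should act as separators, so they are left out of the word character class (otherwise `for-stackoverflow` would match as a single word).

```javascript
// Hypothetical helper: JavaScript adaptation of #([\s\-_]|^)([a-z0-9-_]+)#i.
// Assumption: "-" and "_" are dropped from the word class so they act as separators.
function capitalizeWords(input) {
  return input.replace(/(^|[\s\-_])([a-z0-9]+)/gi, (match, separator, word) =>
    // Keep the separator as-is and uppercase only the first character of the word.
    separator + word.charAt(0).toUpperCase() + word.slice(1)
  );
}

console.log(capitalizeWords("this is a test for-stackoverflow"));
// "This Is A Test For-Stackoverflow"
```

The `g` flag is needed in JavaScript so that every word is replaced, not just the first one.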

Solution 3

If you want to use pure regular expressions, you must use the \u escape in the replacement string (where the engine supports it).

To transform this string:

This Is A Test For-stackoverflow

into

This Is A Test For-Stackoverflow

Use (.+)-(.+) to capture the text before and after the "-", then use this as the replacement:

$1-\u$2

In bash, you can use sed (\u is a GNU sed extension; this version captures a single character on each side of the dash):

echo "This Is A Test For-stackoverflow" | sed 's/\(.\)-\(.\)/\1-\u\2/'
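JavaScript replacement strings do not support \u, so the same idea needs a replacement function there. A rough sketch, with `capitalizeAfterDash` as a made-up name:

```javascript
// Hypothetical helper: uppercase the character right after each dash.
// JavaScript has no \u in replacement strings, so a callback does the case change.
function capitalizeAfterDash(input) {
  return input.replace(/-(.)/g, (match, ch) => "-" + ch.toUpperCase());
}

console.log(capitalizeAfterDash("This Is A Test For-stackoverflow"));
// "This Is A Test For-Stackoverflow"
```

Unlike the sed command above, which replaces only the first match per line, the `g` flag here handles every dash in the string.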

Solution 4

You don't actually need to match the full string. Just match the first lowercase letter after each word boundary, like this:

'~\b([a-z])~'
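For anyone using JavaScript, here is a small sketch of the same pattern with slash delimiters and the `g` flag. The name `capitalizeAtBoundaries` is only illustrative. Note that an apostrophe also creates a word boundary, so contractions get a capital after the apostrophe.

```javascript
// Hypothetical helper: uppercase every lowercase ASCII letter that follows a word boundary.
function capitalizeAtBoundaries(input) {
  return input.replace(/\b([a-z])/g, (letter) => letter.toUpperCase());
}

console.log(capitalizeAtBoundaries("this is a test for-stackoverflow"));
// "This Is A Test For-Stackoverflow"

// Caveat: the apostrophe counts as a boundary too.
console.log(capitalizeAtBoundaries("don't stop"));
// "Don'T Stop"
```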

Solution 5

For JavaScript, here’s a solution that works across different languages and alphabets:

const originalString = "this is a test for-stackoverflow"
const processedString = originalString.replace(/(?:^|\s|[-"'([{])+\S/g, (c) => c.toUpperCase())

It matches any non-whitespace character \S that is preceded by the start of the string ^, whitespace \s, or any of the characters -"'([{, and replaces the match with its uppercase variant.
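To see how the expression above behaves, here it is wrapped in a function (the name `titleCaseAny` is made up for this sketch) and run on a few inputs. Since it relies on \S and toUpperCase() rather than [a-z], non-Latin letters are handled as well. Because the apostrophe is in the list of leading characters, possessives are capitalized too.

```javascript
// Hypothetical wrapper around the regex from this answer.
function titleCaseAny(input) {
  // The match includes the leading separator(s); uppercasing them is harmless.
  return input.replace(/(?:^|\s|[-"'([{])+\S/g, (c) => c.toUpperCase());
}

console.log(titleCaseAny("this is a test for-stackoverflow"));
// "This Is A Test For-Stackoverflow"

console.log(titleCaseAny("привет мир"));
// "Привет Мир"

// Caveat: a letter after an apostrophe is capitalized as well.
console.log(titleCaseAny("harper's ferry"));
// "Harper'S Ferry"
```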

Author: Simmer

Updated on January 23, 2021

Comments

  • Simmer
    Simmer over 3 years

    I use #(\s|^)([a-z0-9-_]+)#i to capitalize the first letter of every word. I also want it to capitalize a letter that comes after a special character like a dash (-).

    Now it shows:

    This Is A Test For-stackoverflow
    

    And I want this:

    This Is A Test For-Stackoverflow
    

    Any suggestions/samples for me?

    I'm not a pro, so please keep it simple enough for me to understand.

  • Kobi
    Kobi almost 13 years
    @Tim - I took artistic freedom and didn't change the way the OP matches letters. It's possible Simmer wants the letters as output, to change their colors or whatnot. Also, I didn't give it that much thought, I only had 4 minutes :P
  • Stalin Gino
    Stalin Gino over 9 years
    In JS, I've added the g flag, as in /\b([a-z])/g, to capitalize each word.
  • Danish
    Danish over 8 years
    I like your lovely answer, @StalinGino. I must say this is the only one I was able to understand.
  • Polopollo
    Polopollo about 8 years
    toUpperCase is capitalizing the whole word. Here is the solution: s.replace(re, function(x){return x.charAt(0).toUpperCase() + x.slice(1);});
  • Pravin W
    Pravin W almost 8 years
    Can someone please add a jsfiddle example? It would be helpful.
  • adam-beck
    adam-beck about 7 years
    @Polopollo, in this case the regex matches only one letter at a time, although it does so globally. So there is no need for that extra code, and it should work as is.
  • adam-beck
    adam-beck about 7 years
    This will not work as OP has asked since a single character would not get capitalized. Just for anybody who comes to this question like I did.
  • JohnK
    JohnK almost 7 years
    Which language's regex is this for?
  • Kobi
    Kobi almost 7 years
    @JohnK - Both of these are simple enough and should work in all languages. # is a separator here, so your language may need "\\b[a-z0-9-_]+" and an IgnoreCase flag.
  • Anderas
    Anderas about 6 years
    I fear this doesn't work: word boundaries include things like '. So don't becomes Don'T
  • Guido Bouman
    Guido Bouman about 6 years
    @Anderas that's what the negative lookahead is for: (?!\s) checks if it's not a character before whitespace. On the other hand, this fails when a word like don't is followed by a non-whitespace, non-alphanumeric character like a comma, period or exclamation mark. It would be better to use a word boundary in the lookahead: /(\b[a-z](?!\b))/g;
  • davemyron
    davemyron almost 5 years
    @GuidoBouman: Your suggested regex fails for Coeur D'Alene and O'Fallon though.
  • anubhava
    anubhava almost 4 years
    That is as per the requirements. Check all other answers as well.