# Creating a regex for finding credit card numbers with grep

Ugly regex:

grep '\(^\|[^0-9]\)\{1\}\([345]\{1\}[0-9]\{3\}\|6011\)\{1\}[-]\?[0-9]\{4\}[-]\?[0-9]\{2\}[-]\?[0-9]\{2\}-\?[0-9]\{1,4\}\($\|[^0-9]\)\{1\}' filename
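
The pattern above is easier to follow when split into named pieces. The sketch below rebuilds it in extended-regex syntax (`grep -E`), where grouping, alternation, and `?` need no backslashes (in basic syntax, `\|` and `\?` are GNU extensions). The variable names and the sample lines are made up for illustration and are not from the original command, and the redundant `{1}` quantifiers are dropped.

```shell
#!/bin/sh
# Hypothetical variable names; each piece mirrors one part of the pattern above.
left='(^|[^0-9])'                       # start of line, or one non-digit character
prefix='([345][0-9]{3}|6011)'           # first four digits: 3xxx, 4xxx, 5xxx, or 6011
mid='-?[0-9]{4}-?[0-9]{2}-?[0-9]{2}-?'  # next eight digits, hyphens optional
tail='[0-9]{1,4}'                       # last one to four digits (13 to 16 in total)
right='($|[^0-9])'                      # end of line, or one non-digit character

cc_regex="${left}${prefix}${mid}${tail}${right}"

# Made-up sample data: only the first two lines contain card-like numbers.
grep -E -n "$cc_regex" <<'EOF'
name=alice card=4111111111111111 ok
5500-0000-0000-0004
order id 1234567
phone 555-867-5309
EOF
```

With this input, grep should report lines 1 and 2 and skip the order id and the phone number, since neither has a four-digit prefix followed by enough digits.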


Lately, I’ve been working with the security team at my employer to catch people storing information they shouldn’t on our database server (SSNs, credit card numbers, etc.). This involved dumping all our databases into a flat file (about a gig of text) and doing some mining. I was given a pre-built regex, but it didn’t work with grep, and I’m enough of a command-line geek that I’d rather do things ‘my way’ than just write a Perl script or something. So I had to make my own regex to find credit card numbers, because I didn’t find anything effective in a quick search online. This brings to mind a hoary old chestnut of a quote, over-used but still very true:

Some people, when confronted with a problem, think "I know, I'll use regular expressions."
Now they have two problems.

-Jamie Zawinski, in comp.lang.emacs


All that notwithstanding, I found a few resources that came in handy for this project.

• This PHP tutorial has a very good description of a basic spec for valid credit card numbers.

I am trying to understand what the (^|[^0-9]){1} and ($|[^0-9]){1} are for in the expression. The ^ and $ seem limiting, since they would require the credit card number to be on its own line, and I'm not sure what the [^0-9] at the start and end is for.
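
One way to see what the two boundary groups do is to run the pattern with and without them. Because `^` and `$` each sit inside an alternation with `[^0-9]`, the number does not have to be alone on its line; each group only requires that whatever sits next to the number is a line edge or a non-digit, so a card-like run buried inside a longer string of digits is not reported. The sketch below uses extended-regex syntax; `core`, `bounded`, `count_matches`, and the sample lines are hypothetical names and data chosen for this illustration.

```shell
#!/bin/sh
# The digits of the card number, without any boundary check.
core='([345][0-9]{3}|6011)-?[0-9]{4}-?[0-9]{2}-?[0-9]{2}-?[0-9]{1,4}'
# The same pattern wrapped in the two boundary groups from the question.
bounded='(^|[^0-9])'"$core"'($|[^0-9])'

# Made-up sample: a card mid-line, a card alone on a line, and a
# 20-digit run that merely contains card-like digits.
sample='card 4111111111111111 on file
4111111111111111
tracking 99411111111111111199'

# Count the lines of the sample that match the given pattern.
count_matches() {
  printf '%s\n' "$sample" | grep -E -c "$1"
}

echo "core only:       $(count_matches "$core")"     # 3: the long run matches too
echo "with boundaries: $(count_matches "$bounded")"  # 2: the long run is rejected
```

The first two lines match either way: one through the spaces around the number, the other through `^` and `$`. Only the boundary groups keep the 20-digit tracking number out of the results.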